Python EMRJobRunner.get_s3_keys Method Code Examples

This article collects typical usage examples of the mrjob.emr.EMRJobRunner.get_s3_keys method in Python. If you are wondering how EMRJobRunner.get_s3_keys is used in practice, the selected examples here may help. You can also look at usage examples of the class that contains this method, mrjob.emr.EMRJobRunner.


Two code examples of the EMRJobRunner.get_s3_keys method are shown below, sorted by popularity by default.

Example 1: reducer_init

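# Required import: from mrjob.emr import EMRJobRunner [as alias]
# get_s3_keys is a method of EMRJobRunner, so call it on a runner instance.
# This snippet also uses the json and StringIO modules (import json, import StringIO; Python 2).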
    def reducer_init(self):
        # AWS_ACCESS_KEY and AWS_SECRET_KEY are assumed to be defined elsewhere in the module
        emr = EMRJobRunner(aws_access_key_id=AWS_ACCESS_KEY, aws_secret_access_key=AWS_SECRET_KEY)
        idf_parts = emr.get_s3_keys('s3://6885public/jeffchan/term-idfs/')
        self.word_to_idf = dict()
        for part in idf_parts:
            # Not named `json`, which would shadow the json module used below
            contents = part.get_contents_as_string()
            for line in StringIO.StringIO(contents):
                pair = json.loads(line)
                self.word_to_idf[pair['term']] = pair['idf']
Developer ID: jeffchan, project: asciiclass, lines of code: 11, source file: mr_tf.py

Example 2: reducer_init

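# Required import: from mrjob.emr import EMRJobRunner [as alias]
# get_s3_keys is a method of EMRJobRunner, so call it on a runner instance.
# This snippet also uses StringIO (from StringIO import StringIO; Python 2) and JSONValueProtocol (from mrjob.protocol import JSONValueProtocol).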
    def reducer_init(self):
        self.idfs = {}

        # Iterate through the files in the bucket provided by the user
        if self.options.aws_access_key_id and self.options.aws_secret_access_key:
            emr = EMRJobRunner(aws_access_key_id=self.options.aws_access_key_id,
                               aws_secret_access_key=self.options.aws_secret_access_key)
        else:
            emr = EMRJobRunner()

        for key in emr.get_s3_keys("s3://" + self.options.idf_loc):
            # Load the whole file first, then read it line-by-line: otherwise,
            # chunks may not be even lines
            for line in StringIO(key.get_contents_as_string()):
                term_idf = JSONValueProtocol.read(line)[1]  # parse the line as a JSON object
                self.idfs[term_idf['term']] = term_idf['idf']
Developer ID: myw, project: dataiap, lines of code: 18, source file: mr_tfidf_per_sender_aws.py
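
Both examples do the same thing once a key's contents have been downloaded: read the text line by line, parse each line as JSON, and store the 'term' and 'idf' fields in a dictionary. The following is a minimal Python 3 sketch of that parsing step only, with no S3 access. The function name parse_idf_lines is hypothetical and does not come from either project, and the sketch assumes each line is a standalone JSON object with 'term' and 'idf' keys, as in Example 1.

```python
import io
import json


def parse_idf_lines(contents):
    """Build a term -> idf mapping from newline-delimited JSON text.

    `contents` stands in for the string that the examples obtain from
    key.get_contents_as_string(); this helper is illustrative only.
    """
    word_to_idf = {}
    # Load the whole string first, then iterate line by line, so that
    # every record handed to the JSON parser is a complete line.
    for line in io.StringIO(contents):
        line = line.strip()
        if not line:
            continue  # skip blank lines rather than fail on them
        pair = json.loads(line)
        word_to_idf[pair['term']] = pair['idf']
    return word_to_idf


if __name__ == '__main__':
    sample = '{"term": "spam", "idf": 1.5}\n{"term": "eggs", "idf": 0.25}\n'
    print(parse_idf_lines(sample))
```

Keeping the parsing in a separate function like this makes it possible to test the logic on a plain string before wiring it to keys returned by get_s3_keys.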


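Note: The mrjob.emr.EMRJobRunner.get_s3_keys examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code remains with the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.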