

Python Article.properties method code examples

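This article collects typical usage examples of the amcat.models.Article.properties method in Python. If you are wondering how Article.properties is used in practice, or are looking for examples of it, the selected code example here may help. You can also look into further usage examples of amcat.models.Article, the class this method belongs to.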


One code example of Article.properties is shown below. Examples are sorted by popularity by default.

Example 1: get_articles

# Required import: from amcat.models import Article [as alias]
# Or: from amcat.models.Article import properties [as alias]

# ......... part of the code is omitted here .........
                max_id = max(max_id, int(row[AID]))
                self.n_rows += 1
                if not self.n_rows % 10000000:
                    logging.info(".. scanned {self.n_rows} rows".format(**locals()))
            self.maxid = max_id
            
        logging.info("{self.n_rows} rows, max ID {max_id}, allocating memory for hashes".format(**locals()))

        hashes = ctypes.create_string_buffer(max_id*28)
        NULL_HASH = b'\x00' * 28
        orphans = "PLENTY"
        passno = 1

        if self._continue:
            logging.info("Continuing from previous migration, getting state from DB")
            with conn().cursor('migration-continue') as c:
                c.itersize = 10000  # number of records to buffer on the client
                c.execute("SELECT article_id, hash FROM articles")
                i = 0
                while True:
                    rows = c.fetchmany(10000)
                    if not rows:
                        break
                    i += len(rows)
                    if not i % 1000000:
                        logging.info("Retrieved {i} rows...".format(**locals()))
                    for (aid, hash) in rows:
                        offset = (aid - 1) * 28
                        hashes[offset:offset+28] = hash
            self.n_rows -= i
            logging.info("Continuing migration, {i} articles retrieved, up to {self.n_rows} to go".format(**locals()))
        
        while orphans:
            norphans = len(orphans) if isinstance(orphans, list) else orphans
            logging.info("*** Pass {passno}, #orphans {norphans}".format(**locals()))
            passno += 1

            if orphans == "PLENTY":
                r = csv.reader(open(fn))
                next(r) # skip header
                todo = r
            else:
                todo = orphans
            
            orphans = []
            MAX_ORPHANS_BUFFER = 50000
            
            for i, row in enumerate(todo):
                if not i % 1000000:
                    norphans = len(orphans) if isinstance(orphans, list) else orphans
                    logging.info("Row {i}, #orphans: {norphans}".format(**locals()))

                aid = int(row[AID])
                
                offset = (aid - 1) * 28
                stored_hash = hashes[offset:offset+28]
                if stored_hash != NULL_HASH:
                    continue
                
                parent_id = _int(row[index['parent_article_id']])
                if (parent_id == aid) or (parent_id in SKIP_PARENTS):
                    parent_id = None
                if parent_id:
                    poffset = (parent_id - 1) * 28
                    parent_hash = hashes[poffset:poffset+28]
                    if parent_hash == NULL_HASH:
                        # it's an orphan, can't process it now, so either buffer or re-iterate
                        if orphans != "PLENTY": # try to buffer
                            if len(orphans) > MAX_ORPHANS_BUFFER:
                                orphans = "PLENTY"
                            else:
                                orphans.append(row)
                        continue
                    parent_hash = binascii.hexlify(parent_hash).decode("ascii")
                else:
                    parent_hash = None

                date = row[index['date']]
                date = date.split("+")[0]
                date = datetime.strptime(date[:19], '%Y-%m-%d %H:%M:%S')

                
                a = Article(
                    project_id = row[index['project_id']],
                    date = date,
                    title = row[index['headline']],
                    url = row[index['url']] or None,
                    text = row[index['text']],
                    parent_hash=parent_hash)
                
                a.properties = {v: row[index[v]] for v in PROP_FIELDS if row[index[v]]}
                a.properties['medium'] = media[int(row[index['medium_id']])]
                a.properties['uuid'] = str(a.properties['uuid'])
                props = json.dumps(a.properties)
            
                hash = amcates.get_article_dict(a)['hash']
                hashes[offset:offset+28] = binascii.unhexlify(hash)

                yield (a.project_id, aid, a.date, a.title, a.url, a.text,
                       hash2binary(hash), hash2binary(a.parent_hash), props)
Developer ID: amcat, Project: amcat, Lines of code: 104, Source: migrate_34_35.py
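
The example above keeps one 28-byte hash per article in a flat buffer, indexed by `(aid - 1) * 28`, and re-scans the rows until every child has seen its parent's hash. The following is a minimal, self-contained sketch of that idea only, not the project's actual code. The names `slot`, `toy_hash` and `hash_in_passes` are hypothetical, `toy_hash` is a stand-in for `amcates.get_article_dict(a)['hash']`, and the use of SHA-224 (whose digest happens to be 28 bytes) is an assumption. The sketch also assumes that no parent ID exceeds the largest article ID, and it adds a no-progress check that the original snippet does not show.

```python
import ctypes
import hashlib

HASH_LEN = 28  # slot width used in the example; a SHA-224 digest is 28 bytes
NULL_HASH = b"\x00" * HASH_LEN  # an all-zero slot means "not hashed yet"


def slot(article_id):
    """Return the byte range holding the hash of a 1-based article ID."""
    offset = (article_id - 1) * HASH_LEN
    return slice(offset, offset + HASH_LEN)


def toy_hash(text, parent_hash):
    """Hypothetical stand-in for the real article hash (assumption)."""
    h = hashlib.sha224(text.encode("utf-8"))
    if parent_hash is not None:
        h.update(parent_hash)
    return h.digest()


def hash_in_passes(rows):
    """Hash (article_id, parent_id or None, text) rows, parents first.

    Returns the hash buffer and the number of passes that were needed.
    """
    max_id = max(aid for aid, _, _ in rows)
    hashes = ctypes.create_string_buffer(max_id * HASH_LEN)
    todo, passes = list(rows), 0
    while todo:
        passes += 1
        orphans = []
        for aid, parent_id, text in todo:
            parent_hash = None
            if parent_id is not None:
                parent_hash = hashes[slot(parent_id)]
                if parent_hash == NULL_HASH:
                    # Parent not hashed yet: retry this row in the next pass.
                    orphans.append((aid, parent_id, text))
                    continue
            hashes[slot(aid)] = toy_hash(text, parent_hash)
        if len(orphans) == len(todo):
            # Guard added for this sketch: nothing could be resolved.
            raise ValueError("no progress: missing or cyclic parents")
        todo = orphans
    return hashes, passes


rows = [(1, 2, "child"), (2, None, "root"), (3, 1, "grandchild")]
hashes, passes = hash_in_passes(rows)
```

Unlike the original, this sketch always buffers orphans in memory. The example above instead falls back to re-reading the CSV file once more than `MAX_ORPHANS_BUFFER` orphans have accumulated, which bounds memory use at the cost of another full scan.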


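Note: the amcat.models.Article.properties example in this article was compiled by 纯净天空 from open-source code and documentation platforms such as Github/MSDocs. The code snippet was selected from an open-source project contributed by its developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.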