This page collects a typical usage example of the LazyCorpusLoader.instances method (class nltk.corpus.util.LazyCorpusLoader) in Python. LazyCorpusLoader defers loading a corpus until it is first accessed, so a call such as instances() is forwarded to the underlying corpus reader, here PropbankCorpusReader. One code example is shown below.
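As background for the example, the following is a minimal sketch of the lazy-loading idea, assuming a simple proxy that builds the real reader on first attribute access. LazyLoader and FakeReader are hypothetical names used only for illustration; NLTK's own LazyCorpusLoader differs in detail.

```python
class LazyLoader:
    """Toy stand-in for a lazy corpus loader (not NLTK's actual implementation)."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable that builds the real reader
        self._reader = None      # nothing has been loaded yet

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, i.e. for attributes
        # that belong to the wrapped reader rather than to the loader itself.
        if name.startswith("_"):
            raise AttributeError(name)
        if self._reader is None:
            self._reader = self._factory()  # load on first use
        return getattr(self._reader, name)


class FakeReader:
    """Hypothetical reader that counts how many times it was constructed."""

    created = 0

    def __init__(self):
        FakeReader.created += 1

    def instances(self):
        return ["inst-1", "inst-2"]


corpus = LazyLoader(FakeReader)
created_before = FakeReader.created  # the reader has not been built yet
first_call = corpus.instances()      # first access builds the reader
created_after = FakeReader.created
```

Creating the loader costs nothing; the reader is constructed once, on the first call to instances(), and reused afterwards.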
Example 1: from_nltk
# Required imports. Role, Propbank and PropbankInstance are defined in the
# source project, not in NLTK.
from collections import defaultdict

import nltk
from nltk.corpus.reader import CategorizedBracketParseCorpusReader, PropbankCorpusReader
from nltk.corpus.util import LazyCorpusLoader


def from_nltk(cls):
    """Returns a fully populated Propbank with the help of NLTK's interface"""
    ptb = LazyCorpusLoader(
        'ptb',
        CategorizedBracketParseCorpusReader,
        r'wsj/\d\d/wsj_\d\d\d\d.mrg',
        cat_file='allcats.txt'
    )
    propbank_ptb = LazyCorpusLoader(
        'propbank', PropbankCorpusReader,
        'prop.txt', r'frames/.*\.xml', 'verbs.txt',
        lambda filename: filename.upper(),
        ptb
    )  # Must be defined *after* ptb corpus.

    # Map each roleset id to its Role object.
    role_dict = {}
    for roleset_xml in propbank_ptb.rolesets():
        role = Role.fromxml(roleset_xml)
        role_dict[role.roleset_id] = role

    # Index instances by (file number, sentence number), then by predicate word(s).
    instance_dict = defaultdict(dict)
    pb_instances = propbank_ptb.instances()
    for instance in pb_instances:
        instance.fileid = instance.fileid.lower()
        file_num = instance.fileid.split("/")[-1].split(".")[0].replace("wsj_", "")
        sentnum = str(instance.sentnum)
        predicate = instance.predicate
        tree = instance.tree
        if isinstance(predicate, nltk.corpus.reader.propbank.PropbankTreePointer):
            key = Propbank.pointer_to_word(predicate, tree)
        elif isinstance(predicate, nltk.corpus.reader.propbank.PropbankSplitTreePointer):
            key = tuple([Propbank.pointer_to_word(p, tree) for p in predicate.pieces])
        else:
            ### TODO: Investigate when this is the case ###
            # (presumably a PropbankChainTreePointer)
            #assert False
            continue
        pb_instance = PropbankInstance(instance.fileid, file_num, sentnum, key, instance.roleset, instance.arguments)
        instance_dict[(file_num, sentnum)][key] = pb_instance
    return Propbank(role_dict, instance_dict)
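
The indexing step in the example can be tried without any corpus data. The sketch below assumes file ids of the form wsj/00/wsj_0001.mrg and uses made-up records in place of real PropBank instances; file_number, records and index are hypothetical names introduced for illustration.

```python
from collections import defaultdict


def file_number(fileid):
    """Extract the numeric part of a WSJ file id ('wsj/00/wsj_0001.mrg' -> '0001')."""
    return fileid.lower().split("/")[-1].split(".")[0].replace("wsj_", "")


# Hypothetical (fileid, sentnum, predicate word, roleset) records
# standing in for the objects returned by instances().
records = [
    ("WSJ/00/WSJ_0001.MRG", 0, "join", "join.01"),
    ("WSJ/00/WSJ_0001.MRG", 1, "name", "name.01"),
    ("WSJ/00/WSJ_0003.MRG", 0, "use", "use.01"),
]

# Outer key: (file number, sentence number); inner key: predicate word.
index = defaultdict(dict)
for fileid, sentnum, word, roleset in records:
    index[(file_number(fileid), str(sentnum))][word] = roleset
```

Using defaultdict(dict) means the inner dictionary for a sentence is created on first assignment, so several predicates in the same sentence end up under one (file number, sentence number) key.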