This article collects typical usage examples of the nltk.compat.urlopen method in Python. If you are wondering how compat.urlopen is used in practice, or are looking for working examples of it, the selected code examples below may help. You can also read further about nltk.compat, the module that provides this method.
Two code examples of compat.urlopen are shown below, sorted by popularity by default.
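Before the examples, here is a minimal sketch of the basic call pattern. It assumes that compat.urlopen behaves like the standard library's urllib.request.urlopen on Python 3, so it imports the standard-library function directly. The file name index.xml and its contents are made up for illustration, and a file:// URL is used so that no network access is needed.

```python
import pathlib
import tempfile
from urllib.request import urlopen  # assumed stand-in for nltk.compat.urlopen

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file; any readable resource would do.
    sample = pathlib.Path(tmp) / 'index.xml'
    sample.write_bytes(b'<nltk_data/>')

    # as_uri() turns the absolute path into a file:// URL.
    with urlopen(sample.as_uri()) as response:
        payload = response.read()  # the response is a binary file-like object

print(payload)
```

The object returned by urlopen is file-like, which is why both examples below can return it to a caller or pass it straight to a parser.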
Example 1: _open
# Required import: from nltk import compat [as alias]
# Or: from nltk.compat import urlopen [as alias]
# find, path, normalize_resource_url and split_resource_url are
# module-level names in nltk.data, where this helper is defined.
def _open(resource_url):
    """
    Helper function that returns an open file object for a resource,
    given its resource URL. If the given resource URL uses the "nltk:"
    protocol, or uses no protocol, then use ``nltk.data.find`` to find
    its path, and open it with the given mode; if the resource URL
    uses the 'file' protocol, then open the file with the given mode;
    otherwise, delegate to ``urllib2.urlopen``.

    :type resource_url: str
    :param resource_url: A URL specifying where the resource should be
        loaded from. The default protocol is "nltk:", which searches
        for the file in the NLTK data package.
    """
    resource_url = normalize_resource_url(resource_url)
    protocol, path_ = split_resource_url(resource_url)
    if protocol is None or protocol.lower() == 'nltk':
        # Search the configured NLTK data directories.
        return find(path_, path + ['']).open()
    elif protocol.lower() == 'file':
        # urllib might not use mode='rb', so handle this one ourselves:
        return find(path_, ['']).open()
    else:
        # Any other protocol (e.g. http) is fetched with urlopen.
        return urlopen(resource_url)
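The three-way dispatch in _open can be sketched without NLTK installed. The snippet below is an illustrative approximation, not NLTK's implementation: split_protocol and choose_opener are hypothetical helpers, and the returned labels simply name which branch would handle the URL.

```python
def split_protocol(resource_url):
    """Split 'proto:rest' into (proto, rest); proto is None when absent.

    Hypothetical helper. It does not special-case Windows drive letters
    such as 'C:\\data', which a real implementation would need to handle.
    """
    protocol, sep, rest = resource_url.partition(':')
    if not sep:
        return None, resource_url
    return protocol.lower(), rest


def choose_opener(resource_url):
    """Name the branch that would open the given resource URL."""
    protocol, _ = split_protocol(resource_url)
    if protocol is None or protocol == 'nltk':
        return 'data-package'  # look the path up in the data directories
    if protocol == 'file':
        return 'local-file'    # open the local file directly
    return 'urlopen'           # everything else goes to urlopen


for url in ('corpora/abc', 'nltk:corpora/abc',
            'file:///tmp/abc.txt', 'https://example.com/abc.txt'):
    print(url, '->', choose_opener(url))
```

Only the last branch reaches urlopen, so local lookups never depend on urllib's handling of file modes.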
Example 2: _update_index
# Required import: from nltk import compat [as alias]
# Or: from nltk.compat import urlopen [as alias]
# This method also uses the time module, nltk.internals, and
# ElementTree (xml.etree.ElementTree).
def _update_index(self, url=None):
    """A helper function that ensures that self._index is
    up-to-date. If the index is older than self.INDEX_TIMEOUT,
    then download it again."""
    # Check if the index is already up-to-date. If so, do nothing.
    if not (self._index is None or url is not None or
            time.time() - self._index_timestamp > self.INDEX_TIMEOUT):
        return
    # If a URL was specified, then update our URL.
    self._url = url or self._url
    # Download the index file.
    self._index = nltk.internals.ElementWrapper(
        ElementTree.parse(compat.urlopen(self._url)).getroot())
    self._index_timestamp = time.time()
    # Build a dictionary of packages.
    packages = [Package.fromxml(p) for p in
                self._index.findall('packages/package')]
    self._packages = dict((p.id, p) for p in packages)
    # Build a dictionary of collections.
    collections = [Collection.fromxml(c) for c in
                   self._index.findall('collections/collection')]
    self._collections = dict((c.id, c) for c in collections)
    # Replace identifiers with actual children in collection.children.
    # Rebuild the list instead of deleting while enumerating, which
    # would skip the entry after each removed one.
    for collection in self._collections.values():
        resolved = []
        for child_id in collection.children:
            if child_id in self._packages:
                resolved.append(self._packages[child_id])
            elif child_id in self._collections:
                resolved.append(self._collections[child_id])
            else:
                print('removing collection member with no package: {}'.format(child_id))
        collection.children[:] = resolved
    # Fill in collection.packages for each collection.
    for collection in self._collections.values():
        packages = {}
        queue = [collection]
        # The queue grows while it is walked, so nested collections
        # are expanded until only packages remain.
        for child in queue:
            if isinstance(child, Collection):
                queue.extend(child.children)
            else:
                packages[child.id] = child
        collection.packages = packages.values()
    # Flush the status cache
    self._status_cache.clear()
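The parse-then-flatten steps of _update_index can be tried in isolation. The sketch below is a simplified illustration, not the downloader's code: INDEX_XML is a made-up index document (the real index has more attributes), load_index and flatten are hypothetical helpers, and an io.BytesIO object stands in for the file-like response that urlopen would return. It assumes collections do not contain each other cyclically.

```python
import io
from xml.etree import ElementTree

# Hypothetical, heavily reduced index document.
INDEX_XML = b"""<?xml version="1.0"?>
<nltk_data>
  <packages>
    <package id="punkt"/>
    <package id="stopwords"/>
  </packages>
  <collections>
    <collection id="tokenizers">
      <item ref="punkt"/>
    </collection>
    <collection id="all">
      <item ref="tokenizers"/>
      <item ref="stopwords"/>
      <item ref="missing"/>
    </collection>
  </collections>
</nltk_data>
"""


def load_index(fileobj):
    """Parse an index from any binary file-like object."""
    root = ElementTree.parse(fileobj).getroot()
    packages = {p.get('id') for p in root.findall('packages/package')}
    collections = {
        c.get('id'): [item.get('ref') for item in c.findall('item')]
        for c in root.findall('collections/collection')
    }
    return packages, collections


def flatten(collection_id, packages, collections):
    """Return the sorted package ids reachable from a collection."""
    found = set()
    queue = [collection_id]
    for name in queue:  # items appended during the loop are visited too
        if name in collections:
            queue.extend(collections[name])
        elif name in packages:
            found.add(name)
        # ids that are neither a package nor a collection are skipped
    return sorted(found)


packages, collections = load_index(io.BytesIO(INDEX_XML))
print(flatten('all', packages, collections))  # ['punkt', 'stopwords']
```

Because ElementTree.parse accepts any file-like object, swapping io.BytesIO(INDEX_XML) for a urlopen response would leave the rest of the sketch unchanged.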