

Python treebank.TreebankWordDetokenizer Code Examples

This article collects typical usage examples of the nltk.tokenize.treebank.TreebankWordDetokenizer class in Python. If you are wondering how treebank.TreebankWordDetokenizer is used in practice, or are looking for working examples of it, the selected code examples here may help. You can also look further into usage examples of the module that contains it, nltk.tokenize.treebank.


Five code examples of treebank.TreebankWordDetokenizer are shown below, sorted by popularity by default.

Example 1: __init__

# Required import: from nltk.tokenize import treebank [as alias]
# Or: from nltk.tokenize.treebank import TreebankWordDetokenizer [as alias]
def __init__(self, *args, **kwargs):
        if 'tokenize' in kwargs:
            raise TypeError('``TreebankEncoder`` does not take keyword argument ``tokenize``.')

        if 'detokenize' in kwargs:
            raise TypeError('``TreebankEncoder`` does not take keyword argument ``detokenize``.')

        try:
            import nltk

            # Required for moses
            nltk.download('perluniprops')
            nltk.download('nonbreaking_prefixes')

            from nltk.tokenize.treebank import TreebankWordTokenizer
            from nltk.tokenize.treebank import TreebankWordDetokenizer
        except ImportError:
            print("Please install NLTK. " "See the docs at http://nltk.org for more information.")
            raise

        super().__init__(
            *args,
            tokenize=TreebankWordTokenizer().tokenize,
            detokenize=TreebankWordDetokenizer().detokenize,
            **kwargs) 
Developer: PetrochukM, Project: PyTorch-NLP, Lines of code: 27, Source file: treebank_encoder.py

Example 2: __init__

# Required import: from nltk.tokenize import treebank [as alias]
# Or: from nltk.tokenize.treebank import TreebankWordDetokenizer [as alias; this snippet refers to it as Detok]
def __init__(self, config, train_data_loader, eval_data_loader, vocab, is_train=True, model=None):
        self.config = config
        self.epoch_i = 0
        self.train_data_loader = train_data_loader
        self.eval_data_loader = eval_data_loader
        self.vocab = vocab
        self.is_train = is_train
        self.model = model
        self.detokenizer = Detok()

        if config.emotion or config.infersent or config.context_input_only:
            self.botmoji = Botmoji()
            self.botsent = Botsent(config.dataset_dir.joinpath('train'), version=1, explained_var=0.95)

        # Info for saving epoch metrics to a csv file
        if self.config.mode == 'train':
            self.pandas_path = os.path.join(config.save_path, "metrics.csv")
            self.outfile_dict = {k: getattr(config, k) for k in OUTPUT_FILE_PARAMS}
            self.df = pd.DataFrame()

        self.save_priming_sentences() 
Developer: natashamjaques, Project: neural_chat, Lines of code: 23, Source file: solver.py

Example 3: get_detokenize

# Required import: from nltk.tokenize import treebank [as alias]
# Or: from nltk.tokenize.treebank import TreebankWordDetokenizer [as alias]
def get_detokenize():
    return lambda x: TreebankWordDetokenizer().detokenize(x) 
Developer: ConvLab, Project: ConvLab, Lines of code: 4, Source file: utils.py

Example 4: get_dekenize

# Required import: from nltk.tokenize import treebank [as alias]
# Or: from nltk.tokenize.treebank import TreebankWordDetokenizer [as alias]
def get_dekenize():
    return lambda x: TreebankWordDetokenizer().detokenize(x) 
Developer: snakeztc, Project: NeuralDialog-ZSDG, Lines of code: 4, Source file: utils.py
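
Examples 3 and 4 both return a lambda that builds a new TreebankWordDetokenizer on every call. The sketch below is one possible variant, not code from either project: it builds the detokenizer once and returns its bound detokenize method. It assumes NLTK is installed; the sample token list is made up for illustration.

```python
from nltk.tokenize.treebank import TreebankWordDetokenizer


def get_detokenize():
    """Return a callable that joins a list of tokens back into a string."""
    # Build the detokenizer once; every call to the returned function reuses it.
    detokenizer = TreebankWordDetokenizer()
    return detokenizer.detokenize


detokenize = get_detokenize()

# Illustrative input: tokens as a Treebank-style tokenizer might produce them.
text = detokenize(["Hello", ",", "world", "!"])
print(text)
```

The returned callable takes the same token list as the lambdas above, so it could be dropped in wherever those are used, assuming callers do not rely on getting a fresh instance per call.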

Example 5: detokenize

# Required import: from nltk.tokenize import treebank [as alias]
# Or: from nltk.tokenize.treebank import TreebankWordDetokenizer [as alias]
def detokenize(line):
    """
    Detokenizes the processed CNN/DM dataset to recover the original dataset,
    e.g. converts "-LRB-" back to "(" and "-RRB-" back to ")".
    """
    line = line.strip().replace("``", '"').replace("''", '"').replace("`", "'")
    twd = TreebankWordDetokenizer()
    s_list = [
        twd.detokenize(x.strip().split(" "), convert_parentheses=True)
        for x in line.split("<S_SEP>")
    ]
    return " ".join(s_list) 
Developer: microsoft, Project: nlp-recipes, Lines of code: 14, Source file: cnndm.py
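
Before Example 5 calls the detokenizer, it does two things in plain Python: it normalizes the quote tokens and splits the line on the "<S_SEP>" marker. The sketch below isolates only those two steps so they can be inspected without NLTK. The helper names normalize_quotes and split_sentences and the sample line are hypothetical, introduced here for illustration.

```python
SENTENCE_SEPARATOR = "<S_SEP>"  # sentence marker used in Example 5


def normalize_quotes(line):
    """Map Treebank-style quote tokens back to plain ASCII quotes."""
    # Replace the two-character forms first, then any remaining lone backtick.
    return line.strip().replace("``", '"').replace("''", '"').replace("`", "'")


def split_sentences(line):
    """Split a line on the separator and return one token list per sentence."""
    return [part.strip().split(" ") for part in line.split(SENTENCE_SEPARATOR)]


# Hypothetical input in the tokenized, separator-delimited form.
raw = "`` hello '' , he said . <S_SEP> it 's fine ."
cleaned = normalize_quotes(raw)
sentences = split_sentences(cleaned)

print(cleaned)
print(sentences)
```

Each inner list in sentences is what Example 5 passes to twd.detokenize(..., convert_parentheses=True), after which the detokenized sentences are joined with single spaces.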


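Note: The nltk.tokenize.treebank.TreebankWordDetokenizer examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.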