Python SolrClient.get_industry_term_field_analysis Method Code Examples

This article collects typical usage examples of the SolrClient.SolrClient.get_industry_term_field_analysis method in Python. If you are wondering how SolrClient.get_industry_term_field_analysis is used in practice, or are looking for a working example of it, the selected code example here may help. You can also look further into usage examples of the class the method belongs to, SolrClient.SolrClient.

One code example of the SolrClient.get_industry_term_field_analysis method is shown below. Examples are sorted by popularity by default.
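
The example below calls the method but does not show its body. Judging only by the method name and by the "normalising terms" log message in the example, it presumably runs a term through the analysis chain of a Solr field. As a rough sketch under that assumption, the following builds a request URL for Solr's field analysis handler (`/analysis/field`). The base URL, the core name `mycore`, the field name `industryTerm` and the helper `build_field_analysis_url` are all hypothetical and are not taken from SPTR.

```python
from urllib.parse import urlencode


def build_field_analysis_url(base_url, core, field_name, term):
    """Build a URL for Solr's field analysis handler (hypothetical helper).

    Assumption: the core exposes the standard /analysis/field handler.
    """
    params = {
        "analysis.fieldname": field_name,  # field whose analyzer is applied
        "analysis.fieldvalue": term,       # text to run through the index-time analyzer
        "wt": "json",                      # ask for a JSON response
    }
    return "{}/{}/analysis/field?{}".format(base_url.rstrip("/"), core, urlencode(params))


# Hypothetical server, core and field names, for illustration only
url = build_field_analysis_url(
    "http://localhost:8983/solr", "mycore", "industryTerm", "Hot Rolled Coil"
)
print(url)
```

Sending this request and reading the final token stream out of the response would be one way to obtain a normalised term. Whether SPTR's SolrClient works this way is not confirmed by the example.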

Example 1: TaggingProcessor

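# Required import: from SolrClient import SolrClient [as alias]
# Or: from SolrClient.SolrClient import get_industry_term_field_analysis [as alias]
# Standard-library and third-party modules used by the code shown below
import os
import nltk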

#.........part of the code is omitted here.........
        
        if not self.dict_tagging:
            self._logger.info("dictionary tagging is set to false. Disable dictionary tagging.")
            return
        
        self._logger.info("Dictionary tagging is enabled.")
        
        try:
            self.dictionary_file = config['DICTIONARY_TAGGER']['dictionary_file']
        except KeyError:
            self._logger.exception("Oops! 'dictionary_file' is set incorrectly in config file. Default to use default csv file in config dir.")
            self.dictionary_file = os.path.join(os.path.dirname(__file__), '..','config','Steel-Terminology-Tata-Steel.csv')
        
        
        try:
            # Config values are strings: only "true" (case-insensitive) enables fuzzy matching
            fuzzy_setting = config['DICTIONARY_TAGGER']['dict_tagger_fuzzy_matching']
            self.dict_tagger_fuzzy_matching = fuzzy_setting.strip().lower() == "true"
        except KeyError:
            self._logger.exception("Oops! 'dict_tagger_fuzzy_matching' is set incorrectly in config file. Default to False.")
            self.dict_tagger_fuzzy_matching = False
        
        try:
            self.dict_tagger_sim_threshold = float(config['DICTIONARY_TAGGER']['dict_tagger_sim_threshold'])
        except (KeyError, ValueError):
            # KeyError: the option is missing; ValueError: the value is not a number
            self._logger.exception("Oops! 'dict_tagger_sim_threshold' is set incorrectly in config file. Default to 0.95.")
            self.dict_tagger_sim_threshold = 0.95
        
        self.dict_terms = load_terms_from_csv(self.dictionary_file)
        
        self._logger.info("normalising terms from dictionary...")
        self.dict_terms = [self.solrClient.get_industry_term_field_analysis(dict_term) for dict_term in self.dict_terms]
        self._logger.info("dictionary terms are normalised and loaded successfully. Total dictionary term size is [%s]", str(len(self.dict_terms)))
        
        if self.dict_tagger_fuzzy_matching:
            self._logger.info("loading into Trie nodes for fuzzy matching...")
            self.dict_terms_trie = TrieNode()
            for normed_term in self.dict_terms:
                self.dict_terms_trie.insert(normed_term)
            self._logger.info("loaded into Trie nodes successfully.")
        else:
            self.dict_terms_trie = TrieNode()
        
    def load_grammars(self):
        grammars=[]
        
        pos_sequences = read_by_line(self.pos_sequences_file)
        for sequence_str in pos_sequences:
            grammars.append(sequence_str.replace('\n','').strip())
        
        return grammars
    
    def parsing_candidates_regexp(self, text_pos_tokens,candidate_grammar):
        cp = nltk.RegexpParser(candidate_grammar)
        
        candidate_chunk=cp.parse(text_pos_tokens)    
        term_candidates=set()
        for node_a in candidate_chunk:
            if isinstance(node_a, nltk.Tree):
                if node_a.label() == 'TermCandidate':
                    term_tokens=[]
                    for node_b in node_a:
                        if node_b[0] == '"':
                            #TODO: find a more elegant way to deal with spurious POS tagging for quotes
                            continue

Developer: jerrygaoLondon, project: SPTR, lines of code: 70, source file: TaggingProcessor.py
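
The example above reads each tagger setting inside its own try/except block. As an alternative sketch, assuming `config` is a standard-library `configparser.ConfigParser` (the example does not show how `config` is created), the typed getters with a `fallback` argument give the same defaults in less code. The helper name `read_tagger_settings` and the sample values are hypothetical.

```python
import configparser

SECTION = "DICTIONARY_TAGGER"


def read_tagger_settings(config):
    """Return (fuzzy_matching, sim_threshold), using the example's defaults."""
    try:
        # fallback is used when the section or the option is missing
        fuzzy = config.getboolean(SECTION, "dict_tagger_fuzzy_matching", fallback=False)
    except ValueError:
        # the option exists but is not a recognised boolean
        fuzzy = False
    try:
        threshold = config.getfloat(SECTION, "dict_tagger_sim_threshold", fallback=0.95)
    except ValueError:
        # the option exists but is not a number
        threshold = 0.95
    return fuzzy, threshold


# Sample configuration, for illustration only
sample = configparser.ConfigParser()
sample.read_string("""
[DICTIONARY_TAGGER]
dict_tagger_fuzzy_matching = True
dict_tagger_sim_threshold = 0.9
""")

configured = read_tagger_settings(sample)
defaults = read_tagger_settings(configparser.ConfigParser())
print(configured, defaults)
```

Note that `getboolean` also accepts yes/no, on/off and 1/0, which is broader than the "true"/"false" string comparison used in the example.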

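Note: The SolrClient.SolrClient.get_industry_term_field_analysis example in this article was compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippet was selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.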