Java Analyzer.getReuseStrategy Method Code Examples

This article collects typical usage examples of the Java method org.apache.lucene.analysis.Analyzer.getReuseStrategy. If you are wondering how Analyzer.getReuseStrategy is used in practice, or are looking for working examples of it, the selected code examples here may help. You can also look further into usage examples of the enclosing class, org.apache.lucene.analysis.Analyzer.


A total of 3 code examples of the Analyzer.getReuseStrategy method are shown below, sorted by popularity by default.

Example 1: QueryAutoStopWordAnalyzer

import org.apache.lucene.analysis.Analyzer; // import the package/class the method depends on
/**
 * Creates a new QueryAutoStopWordAnalyzer with stopwords calculated for the
 * given selection of fields from terms with a document frequency greater than
 * the given maxDocFreq
 *
 * @param delegate Analyzer whose TokenStream will be filtered
 * @param indexReader IndexReader to identify the stopwords from
 * @param fields Selection of fields to calculate stopwords for
 * @param maxDocFreq Terms with a document frequency above this value are treated as stopwords
 * @throws IOException Can be thrown while reading from the IndexReader
 */
public QueryAutoStopWordAnalyzer(
    Analyzer delegate,
    IndexReader indexReader,
    Collection<String> fields,
    int maxDocFreq) throws IOException {
  super(delegate.getReuseStrategy());
  this.delegate = delegate;
  
  for (String field : fields) {
    Set<String> stopWords = new HashSet<>();
    Terms terms = MultiFields.getTerms(indexReader, field);
    CharsRefBuilder spare = new CharsRefBuilder();
    if (terms != null) {
      // Lucene 4.x signature: the argument is an optional TermsEnum to reuse (none here)
      TermsEnum te = terms.iterator(null);
      BytesRef text;
      while ((text = te.next()) != null) {
        if (te.docFreq() > maxDocFreq) {
          spare.copyUTF8Bytes(text);
          stopWords.add(spare.toString());
        }
      }
    }
    stopWordsPerField.put(field, stopWords);
  }
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 37, Source: QueryAutoStopWordAnalyzer.java
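
The selection rule inside the loop of Example 1 (a term becomes a stopword when its document frequency is strictly greater than maxDocFreq) can be tried out without an index. The following is only a minimal sketch under stated assumptions: the class StopWordSelectionSketch, the method selectStopWords, and the in-memory map of document frequencies are hypothetical stand-ins for the IndexReader and TermsEnum used above, not part of Lucene.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch; it does not use Lucene.
public class StopWordSelectionSketch {

    // Returns every term whose document frequency is strictly greater than
    // maxDocFreq, mirroring the `te.docFreq() > maxDocFreq` check in Example 1.
    public static Set<String> selectStopWords(Map<String, Integer> docFreqByTerm, int maxDocFreq) {
        Set<String> stopWords = new HashSet<>();
        for (Map.Entry<String, Integer> entry : docFreqByTerm.entrySet()) {
            if (entry.getValue() > maxDocFreq) {
                stopWords.add(entry.getKey());
            }
        }
        return stopWords;
    }

    public static void main(String[] args) {
        // Assumed sample data: term -> number of documents containing it.
        Map<String, Integer> docFreqByTerm = new HashMap<>();
        docFreqByTerm.put("the", 950);
        docFreqByTerm.put("and", 800);
        docFreqByTerm.put("lucene", 40);

        Set<String> stopWords = selectStopWords(docFreqByTerm, 500);
        System.out.println("the is a stopword: " + stopWords.contains("the"));
        System.out.println("lucene is a stopword: " + stopWords.contains("lucene"));
    }
}
```

Because the comparison is strict, a term whose document frequency equals maxDocFreq is not treated as a stopword.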

Example 2: LimitTokenCountAnalyzer

import org.apache.lucene.analysis.Analyzer; // import the package/class the method depends on
/**
 * Build an analyzer that limits the maximum number of tokens per field.
 * @param delegate the analyzer to wrap
 * @param maxTokenCount max number of tokens to produce
 * @param consumeAllTokens whether all tokens from the delegate should be consumed even if maxTokenCount is reached.
 */
public LimitTokenCountAnalyzer(Analyzer delegate, int maxTokenCount, boolean consumeAllTokens) {
  super(delegate.getReuseStrategy());
  this.delegate = delegate;
  this.maxTokenCount = maxTokenCount;
  this.consumeAllTokens = consumeAllTokens;
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 13, Source: LimitTokenCountAnalyzer.java

Example 3: ShingleAnalyzerWrapper

import org.apache.lucene.analysis.Analyzer; // import the package/class the method depends on
/**
 * Creates a new ShingleAnalyzerWrapper
 *
 * @param delegate Analyzer whose TokenStream is to be filtered
 * @param minShingleSize Min shingle (token ngram) size
 * @param maxShingleSize Max shingle size
 * @param tokenSeparator Used to separate input stream tokens in output shingles
 * @param outputUnigrams Whether or not the filter shall pass the original
 *        tokens to the output stream
 * @param outputUnigramsIfNoShingles Overrides the behavior of outputUnigrams==false for those
 *        times when no shingles are available (because there are fewer than
 *        minShingleSize tokens in the input stream).
 *        Note that if outputUnigrams==true, then unigrams are always output,
 *        regardless of whether any shingles are available.
 * @param fillerToken filler token to use when positionIncrement is more than 1
 */
public ShingleAnalyzerWrapper(
    Analyzer delegate,
    int minShingleSize,
    int maxShingleSize,
    String tokenSeparator,
    boolean outputUnigrams,
    boolean outputUnigramsIfNoShingles,
    String fillerToken) {
  super(delegate.getReuseStrategy());
  this.delegate = delegate;

  if (maxShingleSize < 2) {
    throw new IllegalArgumentException("Max shingle size must be >= 2");
  }
  this.maxShingleSize = maxShingleSize;

  if (minShingleSize < 2) {
    throw new IllegalArgumentException("Min shingle size must be >= 2");
  }
  if (minShingleSize > maxShingleSize) {
    throw new IllegalArgumentException
      ("Min shingle size must be <= max shingle size");
  }
  this.minShingleSize = minShingleSize;

  this.tokenSeparator = (tokenSeparator == null ? "" : tokenSeparator);
  this.outputUnigrams = outputUnigrams;
  this.outputUnigramsIfNoShingles = outputUnigramsIfNoShingles;
  this.fillerToken = fillerToken;
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 47, Source: ShingleAnalyzerWrapper.java
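
All three examples share one pattern: the wrapping analyzer passes delegate.getReuseStrategy() to its superclass constructor, so the wrapper reuses components under the same policy as the analyzer it wraps. The sketch below illustrates only that constructor pattern. Every type in it (ReuseStrategy, Analyzer, SimpleAnalyzer, WrappingAnalyzer) is a simplified, hypothetical stand-in declared locally so the example runs without Lucene on the classpath; none of them is the real Lucene class of the same name.

```java
// Hypothetical stand-alone sketch of the "inherit the delegate's reuse strategy" pattern.
public class ReuseStrategySketch {

    // Stand-in for a reuse policy object; it only carries a label.
    public static final class ReuseStrategy {
        public final String name;

        public ReuseStrategy(String name) {
            this.name = name;
        }
    }

    // Stand-in for an analyzer base class that remembers the strategy it was built with.
    public abstract static class Analyzer {
        private final ReuseStrategy reuseStrategy;

        protected Analyzer(ReuseStrategy reuseStrategy) {
            this.reuseStrategy = reuseStrategy;
        }

        public final ReuseStrategy getReuseStrategy() {
            return reuseStrategy;
        }
    }

    // A plain analyzer constructed with an explicit strategy.
    public static final class SimpleAnalyzer extends Analyzer {
        public SimpleAnalyzer(ReuseStrategy strategy) {
            super(strategy);
        }
    }

    // A wrapper that adopts whatever strategy its delegate uses,
    // as the three constructors above do with super(delegate.getReuseStrategy()).
    public static final class WrappingAnalyzer extends Analyzer {
        private final Analyzer delegate;

        public WrappingAnalyzer(Analyzer delegate) {
            super(delegate.getReuseStrategy());
            this.delegate = delegate;
        }

        public Analyzer getDelegate() {
            return delegate;
        }
    }

    public static void main(String[] args) {
        ReuseStrategy perField = new ReuseStrategy("per-field");
        Analyzer inner = new SimpleAnalyzer(perField);
        Analyzer wrapper = new WrappingAnalyzer(inner);

        // The wrapper holds the very same strategy instance as its delegate.
        System.out.println(wrapper.getReuseStrategy() == inner.getReuseStrategy());
    }
}
```

Passing the delegate's strategy through, rather than picking a fresh one, keeps the wrapper from caching components under a different policy than the analyzer that actually builds them.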


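Note: The org.apache.lucene.analysis.Analyzer.getReuseStrategy method examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as Github/MSDocs. The code snippets were selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the corresponding project's License; do not reproduce without permission.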