Java Analyzer.getReuseStrategy Method Code Examples

This article collects typical usage examples of the Java method org.apache.lucene.analysis.Analyzer.getReuseStrategy. If you are wondering how Analyzer.getReuseStrategy is used in practice or what a call to it looks like, the selected examples here may help. You can also look at further usage examples of its enclosing class, org.apache.lucene.analysis.Analyzer.


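Three code examples of Analyzer.getReuseStrategy are shown below, sorted by popularity by default. In each one, a wrapper analyzer passes delegate.getReuseStrategy() to its superclass constructor, so the wrapper reuses its components under the same strategy as the analyzer it wraps.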

Example 1: QueryAutoStopWordAnalyzer

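import java.io.IOException;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.analysis.Analyzer; // class that declares getReuseStrategy()
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.CharsRefBuilder;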
/**
 * Creates a new QueryAutoStopWordAnalyzer with stopwords calculated for the
 * given selection of fields from terms with a document frequency greater than
 * the given maxDocFreq.
 *
 * @param delegate Analyzer whose TokenStream will be filtered
 * @param indexReader IndexReader to identify the stopwords from
 * @param fields Selection of fields to calculate stopwords for
 * @param maxDocFreq Document frequency a term must exceed to be treated as a stopword
 * @throws IOException Can be thrown while reading from the IndexReader
 */
public QueryAutoStopWordAnalyzer(
    Analyzer delegate,
    IndexReader indexReader,
    Collection<String> fields,
    int maxDocFreq) throws IOException {
  // Reuse components under the same strategy as the wrapped analyzer.
  super(delegate.getReuseStrategy());
  this.delegate = delegate;
  
  for (String field : fields) {
    Set<String> stopWords = new HashSet<>();
    Terms terms = MultiFields.getTerms(indexReader, field);
    CharsRefBuilder spare = new CharsRefBuilder();
    if (terms != null) {
      // Lucene 4.x signature; later versions take no argument: terms.iterator()
      TermsEnum te = terms.iterator(null);
      BytesRef text;
      while ((text = te.next()) != null) {
        if (te.docFreq() > maxDocFreq) {
          spare.copyUTF8Bytes(text);
          stopWords.add(spare.toString());
        }
      }
    }
    stopWordsPerField.put(field, stopWords);
  }
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 37, Source: QueryAutoStopWordAnalyzer.java
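The selection rule in Example 1 (a term becomes a stopword when its document frequency is greater than maxDocFreq) can be tried without an index. The following is a minimal sketch under stated assumptions: the class DocFreqStopWords and its select method are hypothetical names made up for illustration, each document is modeled as a plain set of terms, and nothing here is part of Lucene's API.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, Lucene-free illustration of the "docFreq > maxDocFreq" rule.
class DocFreqStopWords {

  /** Returns the terms that occur in more than maxDocFreq documents. */
  static Set<String> select(List<Set<String>> docs, int maxDocFreq) {
    // Count, for each term, how many documents contain it.
    Map<String, Integer> docFreq = new LinkedHashMap<>();
    for (Set<String> doc : docs) {
      for (String term : doc) {
        docFreq.merge(term, 1, Integer::sum);
      }
    }
    // Keep only terms whose document frequency exceeds the threshold.
    Set<String> stopWords = new HashSet<>();
    for (Map.Entry<String, Integer> entry : docFreq.entrySet()) {
      if (entry.getValue() > maxDocFreq) {
        stopWords.add(entry.getKey());
      }
    }
    return stopWords;
  }

  public static void main(String[] args) {
    List<Set<String>> docs = List.of(
        Set.of("the", "quick", "fox"),
        Set.of("the", "lazy", "dog"),
        Set.of("the", "fox", "sleeps"));
    // "the" occurs in 3 documents, "fox" in 2; only "the" exceeds 2.
    System.out.println(select(docs, 2)); // prints [the]
  }
}
```

The comparison is strict, as in the constructor above, so a term that appears in exactly maxDocFreq documents is not treated as a stopword.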

Example 2: LimitTokenCountAnalyzer

import org.apache.lucene.analysis.Analyzer; // class that declares getReuseStrategy()
/**
 * Build an analyzer that limits the maximum number of tokens per field.
 * @param delegate the analyzer to wrap
 * @param maxTokenCount max number of tokens to produce
 * @param consumeAllTokens whether all tokens from the delegate should be consumed even if maxTokenCount is reached.
 */
public LimitTokenCountAnalyzer(Analyzer delegate, int maxTokenCount, boolean consumeAllTokens) {
  // Reuse components under the same strategy as the wrapped analyzer.
  super(delegate.getReuseStrategy());
  this.delegate = delegate;
  this.maxTokenCount = maxTokenCount;
  this.consumeAllTokens = consumeAllTokens;
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 13, Source: LimitTokenCountAnalyzer.java
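The two parameters documented in Example 2, maxTokenCount and consumeAllTokens, can be illustrated with a plain iterator. This is only a sketch under assumptions: TokenLimitSketch and limit are hypothetical names, tokens are modeled as strings, and the code does not claim to reproduce the behavior of Lucene's actual token filter beyond what the Javadoc above states.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, Lucene-free illustration of a token-count limit.
class TokenLimitSketch {

  /**
   * Returns at most maxTokenCount tokens from the source. If consumeAllTokens
   * is true, the rest of the source is still read (and discarded) after the
   * limit is reached; otherwise reading stops at the limit.
   */
  static List<String> limit(Iterator<String> tokens, int maxTokenCount, boolean consumeAllTokens) {
    List<String> kept = new ArrayList<>();
    while (tokens.hasNext()) {
      if (kept.size() < maxTokenCount) {
        kept.add(tokens.next());   // still under the limit: keep the token
      } else if (consumeAllTokens) {
        tokens.next();             // over the limit: drain without keeping
      } else {
        break;                     // over the limit: stop reading early
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Iterator<String> source = List.of("a", "b", "c", "d").iterator();
    System.out.println(limit(source, 2, false)); // prints [a, b]
    System.out.println(source.hasNext());        // prints true: "c" and "d" were never read
  }
}
```

In both modes the caller sees the same output. The flag only decides whether the underlying source is read to its end.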

Example 3: ShingleAnalyzerWrapper

import org.apache.lucene.analysis.Analyzer; // class that declares getReuseStrategy()
/**
 * Creates a new ShingleAnalyzerWrapper
 *
 * @param delegate Analyzer whose TokenStream is to be filtered
 * @param minShingleSize Min shingle (token ngram) size
 * @param maxShingleSize Max shingle size
 * @param tokenSeparator Used to separate input stream tokens in output shingles
 * @param outputUnigrams Whether or not the filter shall pass the original
 *        tokens to the output stream
 * @param outputUnigramsIfNoShingles Overrides the behavior of outputUnigrams==false for those
 *        times when no shingles are available (because there are fewer than
 *        minShingleSize tokens in the input stream).
 *        Note that if outputUnigrams==true, then unigrams are always output,
 *        regardless of whether any shingles are available.
 * @param fillerToken filler token to use when positionIncrement is more than 1
 */
public ShingleAnalyzerWrapper(
    Analyzer delegate,
    int minShingleSize,
    int maxShingleSize,
    String tokenSeparator,
    boolean outputUnigrams,
    boolean outputUnigramsIfNoShingles,
    String fillerToken) {
  // Reuse components under the same strategy as the wrapped analyzer.
  super(delegate.getReuseStrategy());
  this.delegate = delegate;

  if (maxShingleSize < 2) {
    throw new IllegalArgumentException("Max shingle size must be >= 2");
  }
  this.maxShingleSize = maxShingleSize;

  if (minShingleSize < 2) {
    throw new IllegalArgumentException("Min shingle size must be >= 2");
  }
  if (minShingleSize > maxShingleSize) {
    throw new IllegalArgumentException
      ("Min shingle size must be <= max shingle size");
  }
  this.minShingleSize = minShingleSize;

  this.tokenSeparator = (tokenSeparator == null ? "" : tokenSeparator);
  this.outputUnigrams = outputUnigrams;
  this.outputUnigramsIfNoShingles = outputUnigramsIfNoShingles;
  this.fillerToken = fillerToken;
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 47, Source: ShingleAnalyzerWrapper.java
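To make the size parameters in Example 3 concrete, here is a small sketch of building shingles (token n-grams) from a token list. It rests on several assumptions: ShingleSketch and shingles are hypothetical names, the argument checks mirror the constructor above, the outputUnigramsIfNoShingles and fillerToken options are not modeled, and the output order is not claimed to match Lucene's own shingle filter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Lucene-free illustration of shingle (token n-gram) generation.
class ShingleSketch {

  static List<String> shingles(List<String> tokens, int minShingleSize, int maxShingleSize,
                               String tokenSeparator, boolean outputUnigrams) {
    // Same argument checks as the constructor shown in Example 3.
    if (maxShingleSize < 2) {
      throw new IllegalArgumentException("Max shingle size must be >= 2");
    }
    if (minShingleSize < 2) {
      throw new IllegalArgumentException("Min shingle size must be >= 2");
    }
    if (minShingleSize > maxShingleSize) {
      throw new IllegalArgumentException("Min shingle size must be <= max shingle size");
    }
    String separator = (tokenSeparator == null ? "" : tokenSeparator);

    List<String> out = new ArrayList<>();
    for (int start = 0; start < tokens.size(); start++) {
      if (outputUnigrams) {
        out.add(tokens.get(start)); // pass the original token through
      }
      // Emit every shingle of an allowed size that starts at this token.
      for (int size = minShingleSize;
           size <= maxShingleSize && start + size <= tokens.size();
           size++) {
        out.add(String.join(separator, tokens.subList(start, start + size)));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> tokens = List.of("please", "divide", "this");
    // prints [please, please divide, divide, divide this, this]
    System.out.println(shingles(tokens, 2, 2, " ", true));
  }
}
```

With outputUnigrams set to false and fewer than minShingleSize tokens in the input, this sketch returns an empty list. That is the situation the outputUnigramsIfNoShingles parameter in the Javadoc above addresses.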


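Note: The org.apache.lucene.analysis.Analyzer.getReuseStrategy examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code belongs to the original authors. For distribution and use, refer to the corresponding project's license. Do not reproduce without permission.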