This article collects typical usage examples of the Java method org.apache.lucene.analysis.Analyzer.getReuseStrategy. If you have been wondering what Analyzer.getReuseStrategy does, how to call it, or where to find real-world examples, the curated code samples below should help. You can also read further about the enclosing class, org.apache.lucene.analysis.Analyzer.
The following presents 3 code examples of Analyzer.getReuseStrategy, ordered by popularity.
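Every example below follows the same pattern: a wrapping Analyzer passes its delegate's ReuseStrategy to the superclass constructor, so the wrapper caches and reuses token stream components the same way the wrapped analyzer does. Here is a minimal sketch of that pattern; the class name PassThroughAnalyzerWrapper is invented for illustration and is not part of Lucene:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;

public final class PassThroughAnalyzerWrapper extends AnalyzerWrapper {
  private final Analyzer delegate;

  public PassThroughAnalyzerWrapper(Analyzer delegate) {
    super(delegate.getReuseStrategy()); // adopt the delegate's strategy instead of the default
    this.delegate = delegate;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate; // same analysis for every field
  }

  @Override
  protected TokenStreamComponents wrapComponents(String fieldName, TokenStreamComponents components) {
    return components; // this sketch adds no extra filtering
  }
}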
Example 1: QueryAutoStopWordAnalyzer
import java.io.IOException;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.analysis.Analyzer; // import the class whose method is demonstrated
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.CharsRefBuilder;
/**
 * Creates a new QueryAutoStopWordAnalyzer with stopwords calculated for the
 * given selection of fields from terms with a document frequency greater than
 * the given maxDocFreq.
 *
 * @param delegate Analyzer whose TokenStream will be filtered
 * @param indexReader IndexReader to identify the stopwords from
 * @param fields Selection of fields to calculate stopwords for
 * @param maxDocFreq Document frequency terms should be above in order to be stopwords
 * @throws IOException Can be thrown while reading from the IndexReader
 */
public QueryAutoStopWordAnalyzer(
    Analyzer delegate,
    IndexReader indexReader,
    Collection<String> fields,
    int maxDocFreq) throws IOException {
  super(delegate.getReuseStrategy()); // reuse token streams the same way as the delegate
  this.delegate = delegate;

  for (String field : fields) {
    Set<String> stopWords = new HashSet<>();
    Terms terms = MultiFields.getTerms(indexReader, field);
    CharsRefBuilder spare = new CharsRefBuilder();
    if (terms != null) {
      TermsEnum te = terms.iterator(null);
      BytesRef text;
      while ((text = te.next()) != null) {
        // Any term that occurs in more than maxDocFreq documents becomes a stopword.
        if (te.docFreq() > maxDocFreq) {
          spare.copyUTF8Bytes(text);
          stopWords.add(spare.toString());
        }
      }
    }
    stopWordsPerField.put(field, stopWords);
  }
}
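A hypothetical usage sketch follows. The index path "/path/to/index" and the field "body" are placeholders, and the Directory/IndexReader opening calls vary slightly across Lucene versions:

import java.io.IOException;
import java.nio.file.Paths;
import java.util.Arrays;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.query.QueryAutoStopWordAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.FSDirectory;

public class QueryAutoStopWordDemo {
  public static void main(String[] args) throws IOException {
    try (IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("/path/to/index")))) {
      // Terms of the "body" field occurring in more than 100 documents become stopwords.
      Analyzer analyzer = new QueryAutoStopWordAnalyzer(
          new StandardAnalyzer(), reader, Arrays.asList("body"), 100);
      System.out.println(analyzer.getReuseStrategy()); // same strategy object as the delegate's
    }
  }
}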
Example 2: LimitTokenCountAnalyzer
import org.apache.lucene.analysis.Analyzer; // import the class whose method is demonstrated
/**
 * Build an analyzer that limits the maximum number of tokens per field.
 *
 * @param delegate the analyzer to wrap
 * @param maxTokenCount max number of tokens to produce
 * @param consumeAllTokens whether all tokens from the delegate should be consumed even if maxTokenCount is reached
 */
public LimitTokenCountAnalyzer(Analyzer delegate, int maxTokenCount, boolean consumeAllTokens) {
  super(delegate.getReuseStrategy()); // inherit the delegate's reuse strategy
  this.delegate = delegate;
  this.maxTokenCount = maxTokenCount;
  this.consumeAllTokens = consumeAllTokens;
}
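A hypothetical usage sketch; the 10,000-token cap and the StandardAnalyzer delegate are arbitrary example choices:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.miscellaneous.LimitTokenCountAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class LimitTokenCountDemo {
  public static void main(String[] args) {
    // Emit at most the first 10,000 tokens per field; stop reading the delegate's
    // stream early because consumeAllTokens is false.
    Analyzer limited = new LimitTokenCountAnalyzer(new StandardAnalyzer(), 10000, false);
    System.out.println(limited.getReuseStrategy()); // strategy shared with the delegate
  }
}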
Example 3: ShingleAnalyzerWrapper
import org.apache.lucene.analysis.Analyzer; // import the class whose method is demonstrated
/**
 * Creates a new ShingleAnalyzerWrapper.
 *
 * @param delegate Analyzer whose TokenStream is to be filtered
 * @param minShingleSize Min shingle (token ngram) size
 * @param maxShingleSize Max shingle size
 * @param tokenSeparator Used to separate input stream tokens in output shingles
 * @param outputUnigrams Whether or not the filter shall pass the original
 *        tokens to the output stream
 * @param outputUnigramsIfNoShingles Overrides the behavior of outputUnigrams==false for those
 *        times when no shingles are available (because there are fewer than
 *        minShingleSize tokens in the input stream).
 *        Note that if outputUnigrams==true, then unigrams are always output,
 *        regardless of whether any shingles are available.
 * @param fillerToken filler token to use when positionIncrement is more than 1
 */
public ShingleAnalyzerWrapper(
    Analyzer delegate,
    int minShingleSize,
    int maxShingleSize,
    String tokenSeparator,
    boolean outputUnigrams,
    boolean outputUnigramsIfNoShingles,
    String fillerToken) {
  super(delegate.getReuseStrategy()); // keep the delegate's reuse strategy
  this.delegate = delegate;

  if (maxShingleSize < 2) {
    throw new IllegalArgumentException("Max shingle size must be >= 2");
  }
  this.maxShingleSize = maxShingleSize;

  if (minShingleSize < 2) {
    throw new IllegalArgumentException("Min shingle size must be >= 2");
  }
  if (minShingleSize > maxShingleSize) {
    throw new IllegalArgumentException("Min shingle size must be <= max shingle size");
  }
  this.minShingleSize = minShingleSize;

  this.tokenSeparator = (tokenSeparator == null ? "" : tokenSeparator);
  this.outputUnigrams = outputUnigrams;
  this.outputUnigramsIfNoShingles = outputUnigramsIfNoShingles;
  this.fillerToken = fillerToken;
}
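A hypothetical usage sketch; the shingle sizes, separator, and filler token below are arbitrary example values:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class ShingleWrapperDemo {
  public static void main(String[] args) {
    // Produce 2- and 3-token shingles alongside the original unigrams.
    Analyzer shingles = new ShingleAnalyzerWrapper(
        new StandardAnalyzer(),
        2,     // minShingleSize: smallest token n-gram to emit
        3,     // maxShingleSize: largest token n-gram to emit
        " ",   // tokenSeparator between tokens inside a shingle
        true,  // outputUnigrams: also emit the original single tokens
        false, // outputUnigramsIfNoShingles: irrelevant when outputUnigrams is true
        "_");  // fillerToken used where position increments exceed 1
    System.out.println(shingles.getReuseStrategy()); // strategy taken from the delegate
  }
}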