Java AttributeSource.addAttribute Method Code Examples

This article collects typical usage examples of the org.apache.lucene.util.AttributeSource.addAttribute method in Java. If you are wondering how AttributeSource.addAttribute is used in practice, the curated examples here may help. You can also look further into usage examples of the enclosing class, org.apache.lucene.util.AttributeSource.


Ten code examples of the AttributeSource.addAttribute method are shown below, sorted by popularity by default.
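
Most of the examples rely on one property of addAttribute: it returns the instance that is already registered for the requested class, and only creates and registers a new one when none exists. The following is a minimal sketch of that get-or-create behaviour in plain Java. MiniAttributeSource and its factory parameter are hypothetical stand-ins invented for illustration, not Lucene's class or signature, and returning null from getAttribute for a missing type is an assumption of this model.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for Lucene's AttributeSource. It only models
// "return the existing instance, otherwise create and register one".
class MiniAttributeSource {
    private final Map<Class<?>, Object> attributes = new LinkedHashMap<>();

    /** Returns the registered instance for type, creating it with factory on first use. */
    <T> T addAttribute(Class<T> type, Supplier<T> factory) {
        Object existing = attributes.get(type);
        if (existing == null) {
            existing = factory.get();
            attributes.put(type, existing);
        }
        return type.cast(existing);
    }

    /** Returns the registered instance, or null when the type was never added (model assumption). */
    <T> T getAttribute(Class<T> type) {
        return type.cast(attributes.get(type));
    }

    /** Reports whether an instance is registered for type. */
    boolean hasAttribute(Class<?> type) {
        return attributes.containsKey(type);
    }
}
```

Calling addAttribute twice with the same class yields the same object in this model, which is why the examples can cache the returned reference in a field and keep reading from it.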

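Example 1: accept

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
@Override
public boolean accept(AttributeSource source) {
  if (termAtt == null) {
    // addAttribute returns the already registered instance if there is one,
    // so the reference only needs to be fetched once
    termAtt = source.addAttribute(CharTermAttribute.class);
  }
  try {
    // We don't care about the date itself, only that the term parses as one
    Date date = dateFormat.parse(termAtt.toString());
    if (date != null) {
      return true;
    }
  } catch (ParseException e) {
    // Not a date: fall through and reject the token
  }

  return false;
}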
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 17, Source file: DateRecognizerSinkFilter.java

Example 2: setAttributeSource

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
/**
 * Sets attributeSource to a new instance.
 */
void setAttributeSource(AttributeSource attributeSource) {
  if (this.attributeSource != attributeSource) {
    this.attributeSource = attributeSource;
    // getAttribute only looks up an attribute that is already registered
    termAttribute = attributeSource.getAttribute(TermToBytesRefAttribute.class);
    // addAttribute registers the attribute first if it is missing
    posIncrAttribute = attributeSource.addAttribute(PositionIncrementAttribute.class);
    offsetAttribute = attributeSource.addAttribute(OffsetAttribute.class);
    payloadAttribute = attributeSource.getAttribute(PayloadAttribute.class);
  }
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 13, Source file: FieldInvertState.java

Example 3: accept

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
@Override
public boolean accept(AttributeSource source) {
  if (typeAtt == null) {
    typeAtt = source.addAttribute(TypeAttribute.class);
  }
  
  //check to see if this is a Category
  return (typeToMatch.equals(typeAtt.type()));
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 10, Source file: TokenTypeSinkFilter.java

Example 4: delegatingAttributeFactory

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
/** Make this tokenizer get attributes from the delegate token stream. */
private static final AttributeFactory delegatingAttributeFactory(final AttributeSource source) {
    return new AttributeFactory() {
        @Override
        public AttributeImpl createAttributeInstance(Class<? extends Attribute> attClass) {
            return (AttributeImpl) source.addAttribute(attClass);
        }
    };
}
 
Developer ID: baidu, Project: Elasticsearch, Lines of code: 10, Source file: NumericTokenizer.java

Example 5: appendPayloads

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
private void appendPayloads(String[] tags, int length) {
    for (int i = 0; i < length; i++) {
        AttributeSource attrs = tokenAttrs.get(i);
        if (tags[i] != null) {
            // addAttribute already returns the existing instance when one is registered,
            // so no separate hasAttribute/getAttribute check is needed
            PayloadAttribute payloadAtt = attrs.addAttribute(PayloadAttribute.class);
            // StandardCharsets.UTF_8 (java.nio.charset) avoids the checked UnsupportedEncodingException
            BytesRef bytesRef = new BytesRef(tags[i].toUpperCase(Locale.getDefault()).getBytes(StandardCharsets.UTF_8));
            payloadAtt.setPayload(bytesRef);
        }
    }
}
 
Developer ID: jprante, Project: elasticsearch-analysis-opennlp, Lines of code: 15, Source file: OpenNLPTokenFilter.java

Example 6: delegatingAttributeFactory

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
/** Make this Tokenizer get attributes from the delegate token stream. */
private static final AttributeFactory delegatingAttributeFactory(final AttributeSource source) {
    return new AttributeFactory() {
        @Override
        public AttributeImpl createAttributeInstance(Class<? extends Attribute> attClass) {
            return (AttributeImpl) source.addAttribute(attClass);
        }
    };
}
 
Developer ID: shaie, Project: lucenelab, Lines of code: 10, Source file: XMLParsingTokenizer.java

Example 7: FuzzyTermsEnum

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
/**
 * Constructor for enumeration of all terms from specified <code>reader</code> which share a prefix of
 * length <code>prefixLength</code> with <code>term</code> and which have a fuzzy similarity &gt;
 * <code>minSimilarity</code>.
 * <p>
 * After calling the constructor the enumeration is already pointing to the first 
 * valid term if such a term exists. 
 * 
 * @param terms Delivers terms.
 * @param atts {@link AttributeSource} created by the rewrite method of {@link MultiTermQuery}
 * that contains information about competitive boosts during rewrite. It is also used
 * to cache DFAs between segment transitions.
 * @param term Pattern term.
 * @param minSimilarity Minimum required similarity for terms from the reader. Pass an integer value
 *        representing edit distance. Passing a fraction is deprecated.
 * @param prefixLength Length of required common prefix. Default value is 0.
 * @param transpositions Whether a transposition of two adjacent characters counts as a single edit.
 * @throws IOException if there is a low-level IO error
 */
public FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, 
    final float minSimilarity, final int prefixLength, boolean transpositions) throws IOException {
  if (minSimilarity >= 1.0f && minSimilarity != (int)minSimilarity)
    throw new IllegalArgumentException("fractional edit distances are not allowed");
  if (minSimilarity < 0.0f)
    throw new IllegalArgumentException("minimumSimilarity cannot be less than 0");
  if(prefixLength < 0)
    throw new IllegalArgumentException("prefixLength cannot be less than 0");
  this.terms = terms;
  this.term = term;

  // convert the string into a utf32 int[] representation for fast comparisons
  final String utf16 = term.text();
  this.termText = new int[utf16.codePointCount(0, utf16.length())];
  for (int cp, i = 0, j = 0; i < utf16.length(); i += Character.charCount(cp))
         termText[j++] = cp = utf16.codePointAt(i);
  this.termLength = termText.length;
  this.dfaAtt = atts.addAttribute(LevenshteinAutomataAttribute.class);

  //The prefix could be longer than the word.
  //It's kind of silly though.  It means we must match the entire word.
  this.realPrefixLength = prefixLength > termLength ? termLength : prefixLength;
  // if minSimilarity >= 1, we treat it as number of edits
  if (minSimilarity >= 1f) {
    this.minSimilarity = 0; // just driven by number of edits
    maxEdits = (int) minSimilarity;
    raw = true;
  } else {
    this.minSimilarity = minSimilarity;
    // calculate the maximum k edits for this similarity
    maxEdits = initialMaxDistance(this.minSimilarity, termLength);
    raw = false;
  }
  if (transpositions && maxEdits > LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE) {
    throw new UnsupportedOperationException("with transpositions enabled, distances > " 
      + LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE + " are not supported ");
  }
  this.transpositions = transpositions;
  this.scale_factor = 1.0f / (1.0f - this.minSimilarity);

  this.maxBoostAtt = atts.addAttribute(MaxNonCompetitiveBoostAttribute.class);
  bottom = maxBoostAtt.getMaxNonCompetitiveBoost();
  bottomTerm = maxBoostAtt.getCompetitiveTerm();
  bottomChanged(null, true);
}
 
Developer ID: lamsfoundation, Project: lams, Lines of code: 64, Source file: FuzzyTermsEnum.java
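
The constructor above converts the term text from UTF-16 into an int[] of code points before comparing terms. That loop can be tried on its own; the sketch below copies it into a hypothetical helper class, CodePointDemo, which is not part of Lucene. It shows why the array length comes from codePointCount rather than length(): a character outside the Basic Multilingual Plane occupies two char values but only one code point.

```java
// Hypothetical helper for illustration; the loop mirrors the one in the constructor above.
class CodePointDemo {
    /** Converts a UTF-16 string into an array holding one int per Unicode code point. */
    static int[] toCodePoints(String utf16) {
        int[] out = new int[utf16.codePointCount(0, utf16.length())];
        // Advance i by one or two chars, depending on whether cp needed a surrogate pair.
        for (int cp, i = 0, j = 0; i < utf16.length(); i += Character.charCount(cp)) {
            out[j++] = cp = utf16.codePointAt(i);
        }
        return out;
    }
}
```

For the input "a\uD83D\uDE00b" (four char values, with U+1F600 stored as a surrogate pair) the helper returns three code points. On Java 8 and later, utf16.codePoints().toArray() produces the same array.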

Example 8: suggestSimilar

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
/**
 * Provide spelling corrections based on several parameters.
 *
 * @param term The term to suggest spelling corrections for
 * @param numSug The maximum number of spelling corrections
 * @param ir The index reader to fetch the candidate spelling corrections from
 * @param docfreq The minimum document frequency a potential suggestion needs to have in order to be included
 * @param editDistance The maximum edit distance candidates are allowed to have
 * @param accuracy The minimum accuracy a suggested spelling correction needs to have in order to be included
 * @param spare a scratch buffer for characters
 * @return a collection of spelling corrections sorted by <code>ScoreTerm</code>'s natural order.
 * @throws IOException If I/O related errors occur
 */
protected Collection<ScoreTerm> suggestSimilar(Term term, int numSug, IndexReader ir, int docfreq, int editDistance,
                                               float accuracy, final CharsRefBuilder spare) throws IOException {
  
  AttributeSource atts = new AttributeSource();
  MaxNonCompetitiveBoostAttribute maxBoostAtt =
    atts.addAttribute(MaxNonCompetitiveBoostAttribute.class);
  Terms terms = MultiFields.getTerms(ir, term.field());
  if (terms == null) {
    return Collections.emptyList();
  }
  FuzzyTermsEnum e = new FuzzyTermsEnum(terms, atts, term, editDistance, Math.max(minPrefix, editDistance-1), true);
  final PriorityQueue<ScoreTerm> stQueue = new PriorityQueue<>();
  
  BytesRef queryTerm = new BytesRef(term.text());
  BytesRef candidateTerm;
  ScoreTerm st = new ScoreTerm();
  BoostAttribute boostAtt =
    e.attributes().addAttribute(BoostAttribute.class);
  while ((candidateTerm = e.next()) != null) {
    final float boost = boostAtt.getBoost();
    // ignore uncompetitive hits
    if (stQueue.size() >= numSug && boost <= stQueue.peek().boost)
      continue;
    
    // ignore exact match of the same term
    if (queryTerm.bytesEquals(candidateTerm))
      continue;
    
    int df = e.docFreq();
    
    // check docFreq if required
    if (df <= docfreq)
      continue;
    
    final float score;
    final String termAsString;
    if (distance == INTERNAL_LEVENSHTEIN) {
      // delay creating strings until the end
      termAsString = null;
      // undo FuzzyTermsEnum's scale factor for a real scaled lev score
      score = boost / e.getScaleFactor() + e.getMinSimilarity();
    } else {
      spare.copyUTF8Bytes(candidateTerm);
      termAsString = spare.toString();
      score = distance.getDistance(term.text(), termAsString);
    }
    
    if (score < accuracy)
      continue;
    
    // add new entry in PQ
    st.term = BytesRef.deepCopyOf(candidateTerm);
    st.boost = boost;
    st.docfreq = df;
    st.termAsString = termAsString;
    st.score = score;
    stQueue.offer(st);
    // possibly drop entries from queue
    st = (stQueue.size() > numSug) ? stQueue.poll() : new ScoreTerm();
    maxBoostAtt.setMaxNonCompetitiveBoost((stQueue.size() >= numSug) ? stQueue.peek().boost : Float.NEGATIVE_INFINITY);
  }
    
  return stQueue;
}
 
Developer ID: europeana, Project: search, Lines of code: 78, Source file: DirectSpellChecker.java
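
The loop in suggestSimilar keeps only the best numSug candidates: it skips anything that cannot beat the weakest queued entry, offers the rest, and drops the head of the queue once the size limit is exceeded. Below is a minimal sketch of that bounded top-N pattern. It assumes plain Float scores instead of ScoreTerm objects, and the TopN class and largest method are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of a size-capped priority queue; not taken from DirectSpellChecker.
class TopN {
    /** Returns the n largest scores in descending order, using a min-heap capped at n entries. */
    static List<Float> largest(float[] scores, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        PriorityQueue<Float> queue = new PriorityQueue<>(); // natural order: smallest at the head
        for (float score : scores) {
            // Once the queue is full, skip values that cannot beat the weakest entry.
            if (queue.size() >= n && score <= queue.peek()) {
                continue;
            }
            queue.offer(score);
            if (queue.size() > n) {
                queue.poll(); // evict the current minimum
            }
        }
        List<Float> result = new ArrayList<>(queue);
        result.sort(Collections.reverseOrder());
        return result;
    }
}
```

The original goes one step further and reuses the evicted ScoreTerm for the next candidate instead of allocating a new object, and it publishes the queue's current minimum through maxBoostAtt so the enumeration can skip non-competitive terms early.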

Example 9: createState

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
private static AttributeSource.State createState(AttributeSource a, Tok state, int tokenEnd) {
  a.clearAttributes();
  CharTermAttribute termAtt = a.addAttribute(CharTermAttribute.class);
  char[] tokChars = state.token.toString().toCharArray();
  termAtt.copyBuffer(tokChars, 0, tokChars.length);
  int tokenStart = tokenEnd - state.token.length();
  for (Entry<String, String> e : state.attr.entrySet()) {
    String k = e.getKey();
    if (k.equals("i")) {
      // position increment
      int incr = Integer.parseInt(e.getValue());
      PositionIncrementAttribute posIncr = a.addAttribute(PositionIncrementAttribute.class);
      posIncr.setPositionIncrement(incr);
    } else if (k.equals("s")) {
      tokenStart = Integer.parseInt(e.getValue());
    } else if (k.equals("e")) {
      tokenEnd = Integer.parseInt(e.getValue());
    } else if (k.equals("y")) {
      TypeAttribute type = a.addAttribute(TypeAttribute.class);
      type.setType(e.getValue());
    } else if (k.equals("f")) {
      FlagsAttribute flags = a.addAttribute(FlagsAttribute.class);
      int f = Integer.parseInt(e.getValue(), 16);
      flags.setFlags(f);
    } else if (k.equals("p")) {
      PayloadAttribute p = a.addAttribute(PayloadAttribute.class);
      byte[] data = hexToBytes(e.getValue());
      if (data != null && data.length > 0) {
        p.setPayload(new BytesRef(data));
      }
    } else {
      // unknown attribute
    }
  }
  // handle offset attr
  OffsetAttribute offset = a.addAttribute(OffsetAttribute.class);
  offset.setOffset(tokenStart, tokenEnd);
  State resState = a.captureState();
  a.clearAttributes();
  return resState;
}
 
Developer ID: europeana, Project: search, Lines of code: 42, Source file: SimplePreAnalyzedParser.java

Example 10: suggestSimilar

import org.apache.lucene.util.AttributeSource; // import the package/class the method depends on
/**
 * Provide spelling corrections based on several parameters.
 *
 * @param term The term to suggest spelling corrections for
 * @param numSug The maximum number of spelling corrections
 * @param ir The index reader to fetch the candidate spelling corrections from
 * @param docfreq The minimum document frequency a potential suggestion needs to have in order to be included
 * @param editDistance The maximum edit distance candidates are allowed to have
 * @param accuracy The minimum accuracy a suggested spelling correction needs to have in order to be included
 * @param spare a scratch buffer for characters
 * @return a collection of spelling corrections sorted by <code>ScoreTerm</code>'s natural order.
 * @throws IOException If I/O related errors occur
 */
protected Collection<ScoreTerm> suggestSimilar(Term term, int numSug, IndexReader ir, int docfreq, int editDistance,
                                               float accuracy, final CharsRef spare) throws IOException {
  
  AttributeSource atts = new AttributeSource();
  MaxNonCompetitiveBoostAttribute maxBoostAtt =
    atts.addAttribute(MaxNonCompetitiveBoostAttribute.class);
  Terms terms = MultiFields.getTerms(ir, term.field());
  if (terms == null) {
    return Collections.emptyList();
  }
  FuzzyTermsEnum e = new FuzzyTermsEnum(terms, atts, term, editDistance, Math.max(minPrefix, editDistance-1), true);
  final PriorityQueue<ScoreTerm> stQueue = new PriorityQueue<ScoreTerm>();
  
  BytesRef queryTerm = new BytesRef(term.text());
  BytesRef candidateTerm;
  ScoreTerm st = new ScoreTerm();
  BoostAttribute boostAtt =
    e.attributes().addAttribute(BoostAttribute.class);
  while ((candidateTerm = e.next()) != null) {
    final float boost = boostAtt.getBoost();
    // ignore uncompetitive hits
    if (stQueue.size() >= numSug && boost <= stQueue.peek().boost)
      continue;
    
    // ignore exact match of the same term
    if (queryTerm.bytesEquals(candidateTerm))
      continue;
    
    int df = e.docFreq();
    
    // check docFreq if required
    if (df <= docfreq)
      continue;
    
    final float score;
    final String termAsString;
    if (distance == INTERNAL_LEVENSHTEIN) {
      // delay creating strings until the end
      termAsString = null;
      // undo FuzzyTermsEnum's scale factor for a real scaled lev score
      score = boost / e.getScaleFactor() + e.getMinSimilarity();
    } else {
      UnicodeUtil.UTF8toUTF16(candidateTerm, spare);
      termAsString = spare.toString();
      score = distance.getDistance(term.text(), termAsString);
    }
    
    if (score < accuracy)
      continue;
    
    // add new entry in PQ
    st.term = BytesRef.deepCopyOf(candidateTerm);
    st.boost = boost;
    st.docfreq = df;
    st.termAsString = termAsString;
    st.score = score;
    stQueue.offer(st);
    // possibly drop entries from queue
    st = (stQueue.size() > numSug) ? stQueue.poll() : new ScoreTerm();
    maxBoostAtt.setMaxNonCompetitiveBoost((stQueue.size() >= numSug) ? stQueue.peek().boost : Float.NEGATIVE_INFINITY);
  }
    
  return stQueue;
}
 
Developer ID: pkarmstr, Project: NYBC, Lines of code: 78, Source file: DirectSpellChecker.java


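Note: The org.apache.lucene.util.AttributeSource.addAttribute examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code remains with the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.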