Java CharTermAttribute.buffer Method Code Examples

This article collects typical usage examples of the org.apache.lucene.analysis.tokenattributes.CharTermAttribute.buffer method in Java. If you are wondering how CharTermAttribute.buffer is used in practice, the selected examples here may help. You can also look further into usage examples of the enclosing interface, org.apache.lucene.analysis.tokenattributes.CharTermAttribute.


Five code examples of the CharTermAttribute.buffer method are shown below, sorted by popularity by default.
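Before the examples, it may help to see why every caller pairs buffer() with a length. The array returned by buffer() is reused between tokens and is usually larger than the current term, so only its first length() characters are meaningful. The sketch below models that contract with plain Java so it runs without Lucene on the classpath. ReusableTermBuffer and BufferContractDemo are hypothetical stand-ins written for illustration, not Lucene classes.

```java
import java.util.Arrays;

// Hypothetical stand-in for a reusable term buffer; NOT a Lucene class.
final class ReusableTermBuffer {
    private char[] chars = new char[16]; // capacity is larger than most terms
    private int length;                  // number of valid chars in the array

    void set(String term) {
        if (term.length() > chars.length) {
            // Grow before writing, similar in spirit to resizeBuffer(int).
            chars = Arrays.copyOf(chars, term.length());
        }
        term.getChars(0, term.length(), chars, 0);
        length = term.length();
    }

    char[] buffer() { return chars; }
    int length() { return length; }
}

final class BufferContractDemo {
    // Reads only the valid prefix of the backing array.
    static String readTerm(ReusableTermBuffer t) {
        return new String(t.buffer(), 0, t.length());
    }

    // Reads the whole array, including stale chars left by earlier terms.
    static String readWholeArray(ReusableTermBuffer t) {
        return new String(t.buffer());
    }

    public static void main(String[] args) {
        ReusableTermBuffer t = new ReusableTermBuffer();
        t.set("running");
        t.set("cat"); // overwrites only the first three chars
        System.out.println(readTerm(t));                 // the current term
        System.out.println(readWholeArray(t).length());  // full array capacity
    }
}
```

With a real CharTermAttribute the equivalent read is new String(termAtt.buffer(), 0, termAtt.length()), which is the form Examples 3 and 5 use.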

Example 1: stemHinglish

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute; // import the package/class the method depends on
public static void stemHinglish(CharTermAttribute termAtt)
{
    char[] buffer   = termAtt.buffer();
    String strInput = termAtt.toString();
    //System.out.println("Before " + strInput + " " + termAtt.toString());
    Iterator itr			= lsRegexs.iterator();
    while (itr.hasNext())
    {
            List<Object> lsInputs 	= (List<Object>)itr.next();
            Matcher matcher		= ((Pattern)lsInputs.get(0)).matcher(strInput);
            if (matcher.matches())
            {
                    Matcher replMatcher	= ((Pattern)lsInputs.get(1)).matcher(strInput);
                    strInput		= replMatcher.replaceAll((String)lsInputs.get(2));
            }
    }

    //strInput = strInput.trim();
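    // The rewritten term may be longer than the current backing array, so grow it first;
    // resizeBuffer returns the (possibly reallocated) array to write into.
    buffer = termAtt.resizeBuffer(strInput.length());
    strInput.getChars(0, strInput.length(), buffer, 0);
    termAtt.setLength(strInput.length());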
    //System.out.println("After " + strInput + " " + termAtt.toString());
}
 
Developer ID: Mangu-Singh-Rajpurohit, Project: hinglish-stemmer, Lines of code: 26, Source file: HinglishTokenFilter.java

Example 2: walkTokens

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute; // import the package/class the method depends on
private String[] walkTokens() throws IOException {
    List<String> wordList = new ArrayList<>();
    while (input.incrementToken()) {
        CharTermAttribute textAtt = input.getAttribute(CharTermAttribute.class);
        OffsetAttribute offsetAtt = input.getAttribute(OffsetAttribute.class);
        char[] buffer = textAtt.buffer();
        // Size the String from the term length; the offset span refers to the original text and can differ.
        String word = new String(buffer, 0, textAtt.length());
        wordList.add(word);
        AttributeSource attrs = input.cloneAttributes();
        tokenAttrs.add(attrs);
    }
    return wordList.toArray(new String[0]);
}
 
Developer ID: jprante, Project: elasticsearch-analysis-opennlp, Lines of code: 18, Source file: OpenNLPTokenFilter.java
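A point worth keeping in mind when reading a term out of buffer(): offsets describe a span in the original text, while the term length describes the characters currently in the buffer. The two can diverge once a filter such as a stemmer has shortened the term. The sketch below illustrates this with plain values. OffsetVersusLength and its helpers are hypothetical names, and the numbers assume a stemmer that turned "running" (offsets 0 to 7) into "run".

```java
final class OffsetVersusLength {
    // Sizes the String from the offset span of the original text.
    static String readByOffsets(char[] buffer, int startOffset, int endOffset) {
        return new String(buffer, 0, endOffset - startOffset);
    }

    // Sizes the String from the current term length.
    static String readByLength(char[] buffer, int termLength) {
        return new String(buffer, 0, termLength);
    }

    public static void main(String[] args) {
        char[] buffer = new char[16];        // reusable backing array
        "run".getChars(0, 3, buffer, 0);     // stemmed term now in the buffer
        int termLength = 3;
        int startOffset = 0, endOffset = 7;  // span of "running" in the input

        // The offset-based read picks up four chars beyond the term.
        System.out.println(readByOffsets(buffer, startOffset, endOffset).length());
        System.out.println(readByLength(buffer, termLength));
    }
}
```

When the token stream comes straight from a tokenizer the two lengths usually agree, which is why the mismatch can go unnoticed until a filter is added upstream.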

Example 3: handleTokenStream

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute; // import the package/class the method depends on
private void handleTokenStream(Map<Integer, List<Token>> tokenPosMap, TokenStream tokenStream) throws IOException {
    tokenStream.reset();
    int pos = 0;

    CharTermAttribute charTermAttribute = getCharTermAttribute(tokenStream);
    OffsetAttribute offsetAttribute = getOffsetAttribute(tokenStream);
    TypeAttribute typeAttribute = getTypeAttribute(tokenStream);
    PositionIncrementAttribute positionIncrementAttribute = getPositionIncrementAttribute(tokenStream);

    while (tokenStream.incrementToken()) {
        if (null == charTermAttribute || null == offsetAttribute) {
            return;
        }
        Token token = new Token(charTermAttribute.buffer(), 0, charTermAttribute.length(),
                offsetAttribute.startOffset(), offsetAttribute.endOffset());
        if (null != typeAttribute) {
            token.setType(typeAttribute.type());
        }
        pos += null != positionIncrementAttribute ? positionIncrementAttribute.getPositionIncrement() : 1;
        if (!tokenPosMap.containsKey(pos)) {
            tokenPosMap.put(pos, new LinkedList<Token>());
        }
        tokenPosMap.get(pos).add(token);
    }
    tokenStream.end();   // signal end-of-stream before closing, per the TokenStream workflow
    tokenStream.close();
}
 
Developer ID: smalldirector, Project: solr-multilingual-analyzer, Lines of code: 27, Source file: MultiLangTokenizer.java
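The position bookkeeping in Example 3 can be looked at in isolation: each token's position increment is added to a running position, and tokens with an increment of 0 (synonyms, for instance) land in the same slot as the previous token. Below is a minimal sketch of that grouping using only the JDK. PositionGrouping and groupByPosition are hypothetical names, and terms are plain strings rather than Lucene Token objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PositionGrouping {
    // Groups terms by absolute position, accumulating position increments.
    static Map<Integer, List<String>> groupByPosition(String[] terms, int[] posIncrements) {
        Map<Integer, List<String>> byPos = new TreeMap<>();
        int pos = 0;
        for (int i = 0; i < terms.length; i++) {
            pos += posIncrements[i];
            // computeIfAbsent replaces the containsKey/put/get sequence.
            byPos.computeIfAbsent(pos, k -> new ArrayList<>()).add(terms[i]);
        }
        return byPos;
    }

    public static void main(String[] args) {
        String[] terms = {"quick", "fast", "fox"};
        int[] increments = {1, 0, 1}; // "fast" shares a position with "quick"
        System.out.println(groupByPosition(terms, increments));
        // prints {1=[quick, fast], 2=[fox]}
    }
}
```

A TreeMap is used here only so the printed order is deterministic; any Map works for the grouping itself.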

Example 4: setText

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute; // import the package/class the method depends on
void setText(final CharTermAttribute token) {
  this.token = token;
  this.buffer = token.buffer();
  this.length = token.length();
}
 
Developer ID: europeana, Project: search, Lines of code: 6, Source file: ICUTransformFilter.java

Example 5: toFormattedString

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute; // import the package/class the method depends on
@Override
public String toFormattedString(Field f) throws IOException {
  Map<String,Object> map = new LinkedHashMap<>();
  map.put(VERSION_KEY, VERSION);
  if (f.fieldType().stored()) {
    String stringValue = f.stringValue();
    if (stringValue != null) {
      map.put(STRING_KEY, stringValue);
    }
    BytesRef binaryValue = f.binaryValue();
    if (binaryValue != null) {
      map.put(BINARY_KEY, Base64.byteArrayToBase64(binaryValue.bytes, binaryValue.offset, binaryValue.length));
    }
  }
  TokenStream ts = f.tokenStreamValue();
  if (ts != null) {
    List<Map<String,Object>> tokens = new LinkedList<>();
    while (ts.incrementToken()) {
      Iterator<Class<? extends Attribute>> it = ts.getAttributeClassesIterator();
      String cTerm = null;
      String tTerm = null;
      Map<String,Object> tok = new TreeMap<>();
      while (it.hasNext()) {
        Class<? extends Attribute> cl = it.next();
        Attribute att = ts.getAttribute(cl);
        if (att == null) {
          continue;
        }
        if (cl.isAssignableFrom(CharTermAttribute.class)) {
          CharTermAttribute catt = (CharTermAttribute)att;
          cTerm = new String(catt.buffer(), 0, catt.length());
        } else if (cl.isAssignableFrom(TermToBytesRefAttribute.class)) {
          TermToBytesRefAttribute tatt = (TermToBytesRefAttribute)att;
          tTerm = tatt.getBytesRef().utf8ToString();
        } else {
          if (cl.isAssignableFrom(FlagsAttribute.class)) {
            tok.put(FLAGS_KEY, Integer.toHexString(((FlagsAttribute)att).getFlags()));
          } else if (cl.isAssignableFrom(OffsetAttribute.class)) {
            tok.put(OFFSET_START_KEY, ((OffsetAttribute)att).startOffset());
            tok.put(OFFSET_END_KEY, ((OffsetAttribute)att).endOffset());
          } else if (cl.isAssignableFrom(PayloadAttribute.class)) {
            BytesRef p = ((PayloadAttribute)att).getPayload();
            if (p != null && p.length > 0) {
              tok.put(PAYLOAD_KEY, Base64.byteArrayToBase64(p.bytes, p.offset, p.length));
            }
          } else if (cl.isAssignableFrom(PositionIncrementAttribute.class)) {
            tok.put(POSINCR_KEY, ((PositionIncrementAttribute)att).getPositionIncrement());
          } else if (cl.isAssignableFrom(TypeAttribute.class)) {
            tok.put(TYPE_KEY, ((TypeAttribute)att).type());
          } else {
            tok.put(cl.getName(), att.toString());
          }
        }
      }
      String term = null;
      if (cTerm != null) {
        term = cTerm;
      } else {
        term = tTerm;
      }
      if (term != null && term.length() > 0) {
        tok.put(TOKEN_KEY, term);
      }
      tokens.add(tok);
    }
    map.put(TOKENS_KEY, tokens);
  }
  return JSONUtil.toJSON(map, -1);
}
 
Developer ID: europeana, Project: search, Lines of code: 70, Source file: JsonPreAnalyzedParser.java


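Note: The org.apache.lucene.analysis.tokenattributes.CharTermAttribute.buffer examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code remains with the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.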