

Java PartOfSpeechAttribute Class Code Examples

This article collects typical usage examples of the Java class org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute. If you are wondering how the PartOfSpeechAttribute class is used in practice, the selected examples below may help.


The PartOfSpeechAttribute class belongs to the org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes package. Four code examples of the class are shown below, sorted by popularity by default.

Example 1: kuromojineologd

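import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.BaseFormAttribute;
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.InflectionAttribute;
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute; // import the required package/class
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.ReadingAttribute;

private List<TokenElement> kuromojineologd(String src) throws IOException {
    tokenizer.setReader(new StringReader(src));
    List<TokenElement> tokens = new ArrayList<>();
    // Register the attributes that are read for each token.
    BaseFormAttribute baseAttr = tokenizer.addAttribute(BaseFormAttribute.class);
    CharTermAttribute charAttr = tokenizer.addAttribute(CharTermAttribute.class);
    PartOfSpeechAttribute posAttr = tokenizer.addAttribute(PartOfSpeechAttribute.class);
    ReadingAttribute readAttr = tokenizer.addAttribute(ReadingAttribute.class);
    OffsetAttribute offsetAttr = tokenizer.addAttribute(OffsetAttribute.class);
    InflectionAttribute inflectionAttr = tokenizer.addAttribute(InflectionAttribute.class);
    try {
        tokenizer.reset();
        while (tokenizer.incrementToken()) {
            String surface = charAttr.toString();
            tokens.add(new TokenElement(surface,
                    getTagList(posAttr, inflectionAttr),
                    offsetAttr.startOffset(),
                    readAttr.getReading()
            ));
        }
        tokenizer.end(); // finish the stream, as the TokenStream workflow expects
    } finally {
        tokenizer.close(); // always close so the tokenizer can be reused after a failure
    }
    return tokens;
}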
 
Developer: redpen-cc, Project: redpen, Lines of code: 22, Source file: NeologdJapaneseTokenizer.java

Example 2: create

import org.apache.lucene.analysis.TokenStream;
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute; // import the required package/class
@Override
public TokenStream create(TokenStream tokenStream) {
    final PartOfSpeechAttribute posAtt = tokenStream.addAttribute(PartOfSpeechAttribute.class);
    return new PosConcatenationFilter(tokenStream, posTags, new PartOfSpeechSupplier() {
        @Override
        public String get() {
            return posAtt.getPartOfSpeech();
        }
    });
}
 
Developer: codelibs, Project: elasticsearch-analysis-kuromoji-neologd, Lines of code: 11, Source file: PosConcatenationFilterFactory.java
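Example 2 hands the filter a supplier rather than a fixed string, so the part of speech is looked up again for every token. The following is a minimal, stand-alone sketch of that idea. It does not use Lucene: `PosHolder` is a hypothetical stand-in for a mutable attribute such as PartOfSpeechAttribute, `java.util.function.Supplier` replaces the project's PartOfSpeechSupplier, and the tag strings are illustrative values only.

```java
import java.util.function.Supplier;

class PosSupplierSketch {

    /** Hypothetical stand-in for a mutable token attribute. */
    static final class PosHolder {
        private String partOfSpeech;

        String getPartOfSpeech() {
            return partOfSpeech;
        }

        void setPartOfSpeech(String partOfSpeech) {
            this.partOfSpeech = partOfSpeech;
        }
    }

    /** Returns a supplier that reads the holder's current value on every call. */
    static Supplier<String> supplierFor(PosHolder holder) {
        return holder::getPartOfSpeech;
    }

    public static void main(String[] args) {
        PosHolder holder = new PosHolder();
        Supplier<String> supplier = supplierFor(holder);

        holder.setPartOfSpeech("名詞-一般");   // simulate the first token
        System.out.println(supplier.get());

        holder.setPartOfSpeech("動詞-自立");   // simulate advancing to the next token
        System.out.println(supplier.get());
    }
}
```

Because the supplier captures the holder and not its value, each call to `get()` reflects whatever the holder contains at that moment.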

Example 3: tokenize

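import java.io.Reader;
import java.util.List;
import com.google.common.collect.Lists;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.BaseFormAttribute;
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute; // import the required package/class
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.ReadingAttribute;

private List<Token> tokenize(Reader reader)
{
    List<Token> list = Lists.newArrayList();
    try (TokenStream tokenStream = japaneseAnalyzer.tokenStream("", reader)) {
        BaseFormAttribute baseAttr = tokenStream.addAttribute(BaseFormAttribute.class);
        CharTermAttribute charAttr = tokenStream.addAttribute(CharTermAttribute.class);
        PartOfSpeechAttribute posAttr = tokenStream.addAttribute(PartOfSpeechAttribute.class);
        ReadingAttribute readAttr = tokenStream.addAttribute(ReadingAttribute.class);

        tokenStream.reset();
        while (tokenStream.incrementToken()) {
            // Copy the attribute values, because the attributes are reused for the next token.
            Token token = new Token();
            token.setCharTerm(charAttr.toString());
            token.setBaseForm(baseAttr.getBaseForm());
            token.setReading(readAttr.getReading());
            token.setPartOfSpeech(posAttr.getPartOfSpeech());
            // Keep only tokens whose part of speech passes the project's filter.
            if (isOkPartsOfSpeech(token)) {
                list.add(token);
            }
        }
        tokenStream.end(); // finish the stream before try-with-resources closes it
    }
    catch (Exception e) {
        // Errors are logged and the tokens collected so far are returned.
        logger.error("neologd error", e);
    }
    return list;
}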
 
Developer: toyama0919, Project: embulk-filter-kuromoji, Lines of code: 28, Source file: NeologdPageOutput.java
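Example 3 calls isOkPartsOfSpeech without showing its body. As a rough illustration of what such a check could look like, the sketch below keeps only tags whose top-level category is in an allowed set. This is an assumption, not the project's implementation: the class name, the method names `isAllowed` and `keepAllowed`, and the sample tags are all hypothetical, and plain strings are used instead of Token objects so the code runs without Lucene.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class PosFilterSketch {

    /** Returns true when the first hyphen-separated segment is in the allowed set. */
    static boolean isAllowed(String partOfSpeech, Set<String> allowedTopLevel) {
        if (partOfSpeech == null) {
            return false;
        }
        String topLevel = partOfSpeech.split("-", 2)[0];
        return allowedTopLevel.contains(topLevel);
    }

    /** Keeps only the tags accepted by isAllowed, preserving their order. */
    static List<String> keepAllowed(List<String> posTags, Set<String> allowedTopLevel) {
        return posTags.stream()
                .filter(pos -> isAllowed(pos, allowedTopLevel))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative values: keep nouns and verbs, drop particles.
        Set<String> allowed = new HashSet<>(Arrays.asList("名詞", "動詞"));
        List<String> tags = Arrays.asList("名詞-一般", "助詞-格助詞-一般", "動詞-自立");
        System.out.println(keepAllowed(tags, allowed)); // prints [名詞-一般, 動詞-自立]
    }
}
```

Matching on the top-level segment only is a design choice for this sketch; a real filter might compare the full tag or several levels of it.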

Example 4: getTagList

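import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.InflectionAttribute;
import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute; // import the required package/class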
private List<String> getTagList(PartOfSpeechAttribute posAttr, InflectionAttribute inflectionAttr) {
    List<String> posList = new ArrayList<>();
    posList.addAll(Arrays.asList(posAttr.getPartOfSpeech().split("-")));
    String form = inflectionAttr.getInflectionForm() == null ? "*" : inflectionAttr.getInflectionForm();
    String type = inflectionAttr.getInflectionType() == null ? "*" : inflectionAttr.getInflectionType();
    posList.add(type);
    posList.add(form);
    return posList;
}
 
Developer: redpen-cc, Project: redpen, Lines of code: 10, Source file: NeologdJapaneseTokenizer.java
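To see what getTagList produces without setting up a tokenizer, the same logic can be sketched on plain strings. The class and method names below are hypothetical, and the sample tag strings are illustrative values assumed to resemble the hyphen-joined output of getPartOfSpeech(); only the split-and-append steps mirror Example 4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PosTagSketch {

    /**
     * Splits a hyphen-joined part-of-speech string and appends the inflection
     * type and then the inflection form, using "*" for null values.
     */
    static List<String> toTagList(String partOfSpeech, String inflectionType, String inflectionForm) {
        List<String> tags = new ArrayList<>(Arrays.asList(partOfSpeech.split("-")));
        tags.add(inflectionType == null ? "*" : inflectionType);
        tags.add(inflectionForm == null ? "*" : inflectionForm);
        return tags;
    }

    public static void main(String[] args) {
        // A tag without inflection data: both placeholders become "*".
        System.out.println(toTagList("名詞-固有名詞-地域-一般", null, null));
        // prints [名詞, 固有名詞, 地域, 一般, *, *]

        // A tag with inflection type and form supplied.
        System.out.println(toTagList("動詞-自立", "五段・カ行イ音便", "基本形"));
        // prints [動詞, 自立, 五段・カ行イ音便, 基本形]
    }
}
```

The resulting list has a variable number of part-of-speech segments followed by exactly two inflection entries, so callers that need the inflection data can read it from the end of the list.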


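Note: The org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute class examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright of the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.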