Java PartOfSpeechAttribute Class Code Examples

This article collects typical usage examples of the Java class org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute. They may help if you want to see how PartOfSpeechAttribute is used in real projects.


The PartOfSpeechAttribute class belongs to the org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes package. Four code examples of the class are shown below, sorted by popularity by default.

Example 1: kuromojineologd

import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute; // import the required package/class
private List<TokenElement> kuromojineologd(String src) throws IOException {
    tokenizer.setReader(new StringReader(src));
    List<TokenElement> tokens = new ArrayList<>();
    // Register the attributes to read; this must happen before reset().
    CharTermAttribute charAttr = tokenizer.addAttribute(CharTermAttribute.class);
    PartOfSpeechAttribute posAttr = tokenizer.addAttribute(PartOfSpeechAttribute.class);
    ReadingAttribute readAttr = tokenizer.addAttribute(ReadingAttribute.class);
    OffsetAttribute offsetAttr = tokenizer.addAttribute(OffsetAttribute.class);
    InflectionAttribute inflectionAttr = tokenizer.addAttribute(InflectionAttribute.class);
    tokenizer.reset();
    while (tokenizer.incrementToken()) {
        String surface = charAttr.toString();
        tokens.add(new TokenElement(surface,
                getTagList(posAttr, inflectionAttr),
                offsetAttr.startOffset(),
                readAttr.getReading()
        ));
    }
    tokenizer.end();   // finish the stream (end-of-stream state) before closing
    tokenizer.close();
    return tokens;
}
 
Developer: redpen-cc, Project: redpen, Lines of code: 22, Source: NeologdJapaneseTokenizer.java

Example 2: create

import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute; // import the required package/class
@Override
public TokenStream create(TokenStream tokenStream) {
    final PartOfSpeechAttribute posAtt = tokenStream.addAttribute(PartOfSpeechAttribute.class);
    return new PosConcatenationFilter(tokenStream, posTags, new PartOfSpeechSupplier() {
        @Override
        public String get() {
            return posAtt.getPartOfSpeech();
        }
    });
}
 
Developer: codelibs, Project: elasticsearch-analysis-kuromoji-neologd, Lines of code: 11, Source: PosConcatenationFilterFactory.java

Example 3: tokenize

import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute; // import the required package/class
private List<Token> tokenize(Reader reader)
{
    List<Token> list = Lists.newArrayList();
    try (TokenStream tokenStream = japaneseAnalyzer.tokenStream("", reader)) {
        BaseFormAttribute baseAttr = tokenStream.addAttribute(BaseFormAttribute.class);
        CharTermAttribute charAttr = tokenStream.addAttribute(CharTermAttribute.class);
        PartOfSpeechAttribute posAttr = tokenStream.addAttribute(PartOfSpeechAttribute.class);
        ReadingAttribute readAttr = tokenStream.addAttribute(ReadingAttribute.class);

        tokenStream.reset();
        while (tokenStream.incrementToken()) {
            Token token = new Token();
            token.setCharTerm(charAttr.toString());
            token.setBaseForm(baseAttr.getBaseForm());
            token.setReading(readAttr.getReading());
            token.setPartOfSpeech(posAttr.getPartOfSpeech());
            if (!isOkPartsOfSpeech(token)) {
                continue;
            }
            list.add(token);
        }
        tokenStream.end(); // finish the stream; try-with-resources then closes it
    }
    catch (Exception e) {
        logger.error("neologd error", e);
    }
    return list;
}
 
Developer: toyama0919, Project: embulk-filter-kuromoji, Lines of code: 28, Source: NeologdPageOutput.java

Example 4: getTagList

import org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute; // import the required package/class
private List<String> getTagList(PartOfSpeechAttribute posAttr, InflectionAttribute inflectionAttr) {
    List<String> posList = new ArrayList<>();
    posList.addAll(Arrays.asList(posAttr.getPartOfSpeech().split("-")));
    String form = inflectionAttr.getInflectionForm() == null ? "*" : inflectionAttr.getInflectionForm();
    String type = inflectionAttr.getInflectionType() == null ? "*" : inflectionAttr.getInflectionType();
    posList.add(type);
    posList.add(form);
    return posList;
}
 
Developer: redpen-cc, Project: redpen, Lines of code: 10, Source: NeologdJapaneseTokenizer.java
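
To make the tag-list logic of getTagList easier to try without a Lucene dependency, the following is a minimal sketch that works on plain strings instead of attributes. The class name PosTagSketch, the method toTagList, and the sample tag strings are assumptions made for illustration; they are not part of redpen or of the neologd library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PosTagSketch {

    // Hypothetical stand-in for getTagList: takes the raw strings that
    // PartOfSpeechAttribute and InflectionAttribute would return.
    static List<String> toTagList(String partOfSpeech, String inflectionType, String inflectionForm) {
        // The part-of-speech string is hyphen-separated; split it into levels.
        List<String> tags = new ArrayList<>(Arrays.asList(partOfSpeech.split("-")));
        // Missing inflection data is represented by "*", type first, then form.
        tags.add(inflectionType == null ? "*" : inflectionType);
        tags.add(inflectionForm == null ? "*" : inflectionForm);
        return tags;
    }

    public static void main(String[] args) {
        // Illustrative IPADIC-style tags (assumed sample values).
        System.out.println(toTagList("名詞-固有名詞-地域-一般", null, null));
        System.out.println(toTagList("動詞-自立", "五段・カ行イ音便", "基本形"));
    }
}
```

As in Example 4, the inflection type is appended before the inflection form, so the last two elements of the list always hold those two values or "*" placeholders.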


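Note: The org.codelibs.neologd.ipadic.lucene.analysis.ja.tokenattributes.PartOfSpeechAttribute class examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.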