

Java POSTaggerME.tag Method Code Examples

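This article collects typical usage examples of the opennlp.tools.postag.POSTaggerME.tag method in Java. If you are wondering how POSTaggerME.tag is used in practice, or are looking for working examples, the selected snippets below may help. You can also look further into usage examples of the enclosing class, opennlp.tools.postag.POSTaggerME.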


Seven code examples of the POSTaggerME.tag method are shown below, sorted by popularity by default.

Example 1: doRun

import opennlp.tools.postag.POSTaggerME; // import the package/class the method depends on
@Override
public List<Word> doRun(Language language, String sentence) {
    Tokenizer tokenizer = new TokenizerME(getTokenizerModel(language));
    POSTaggerME tagger = new POSTaggerME(getPOSModel(language));
    String[] tokens = tokenizer.tokenize(sentence);
    String[] tags = tagger.tag(tokens);

    PartOfSpeechSet posSet = PartOfSpeechSet.getPOSSet(language);

    List<Word> words = new ArrayList<>();
    for (int i = 0; i < tokens.length; i++) {
        words.add(new Word(posSet.valueOf(tags[i]), tokens[i]));
    }

    return words;
}
 
Developer: Lambda-3, Project: Stargraph, Lines of code: 17, Source: OpenNLPAnnotator.java
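
POSTaggerME.tag(String[]) returns one tag per input token, which is why the loop above can index tokens and tags with the same counter. The following is a minimal sketch of that pairing step using only the JDK. The class TaggedTokens and its pair method are hypothetical names introduced for illustration, and the tags in main are hand-written placeholders, not output from a real model.

```java
import java.util.ArrayList;
import java.util.List;

public class TaggedTokens {
    // Hypothetical helper: joins each token with the tag at the same index.
    public static List<String> pair(String[] tokens, String[] tags) {
        // Guard against misaligned arrays instead of failing mid-loop.
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException(
                    "Expected one tag per token, got " + tokens.length
                            + " tokens and " + tags.length + " tags");
        }
        List<String> pairs = new ArrayList<>(tokens.length);
        for (int i = 0; i < tokens.length; i++) {
            pairs.add(tokens[i] + "/" + tags[i]);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Placeholder data standing in for tokenizer and tagger output.
        String[] tokens = {"cities", "had", "newspapers"};
        String[] tags = {"NNS", "VBD", "NNS"};
        for (String pair : pair(tokens, tags)) {
            System.out.println(pair);
        }
    }
}
```

In real code the two arrays would come from tokenizer.tokenize(...) and tagger.tag(...), as in the example above.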

Example 2: testLemmatizing

import opennlp.tools.postag.POSTaggerME; // import the package/class the method depends on
@Test
public void testLemmatizing() throws Exception {
    try (InputStream posInputStream = this.getClass().getResourceAsStream("/nlp/en-pos-maxent.bin");
         InputStream lemmaInputStream = getClass().getResourceAsStream("/nlp/en-lemmatizer.dict")) {

        POSModel model = new POSModel(posInputStream);
        POSTaggerME tagger = new POSTaggerME(model);

        String sent[] = new String[]{"Fine", "Most", "large", "cities", "in", "the", "US", "had",
                "morning", "and", "afternoon", "newspapers", "."};
        String tags[] = tagger.tag(sent);

        DictionaryLemmatizer dictionaryLemmatizer = new DictionaryLemmatizer(lemmaInputStream);

        String[] lemmatize = dictionaryLemmatizer.lemmatize(sent, tags);

        logger.info("lemmas: {}", Arrays.asList(lemmatize));

    }
}
 
Developer: bpark, Project: chlorophytum-semantics, Lines of code: 21, Source: NlpTest.java

Example 3: POSExample

import opennlp.tools.postag.POSTaggerME; // import the package/class the method depends on
public void POSExample() {
    try (InputStream input = new FileInputStream(
            new File("en-pos-maxent.bin"));) {

        // To lower case example
        String lowerCaseVersion = sentence.toLowerCase();
        out.println(lowerCaseVersion);

        // Pull out tokens
        List<String> list = new ArrayList<>();
        Scanner scanner = new Scanner(sentence);
        while (scanner.hasNext()) {
            list.add(scanner.next());
        }
        // Convert list to an array; a zero-length seed array avoids a
        // stray null element when the sentence has no tokens
        String[] words = list.toArray(new String[0]);

        // Build model
        POSModel posModel = new POSModel(input);
        POSTaggerME posTagger = new POSTaggerME(posModel);

        // Tag words
        String[] posTags = posTagger.tag(words);
        for (int i = 0; i < posTags.length; i++) {
            out.println(words[i] + " - " + posTags[i]);
        }

        // Find top sequences
        Sequence sequences[] = posTagger.topKSequences(words);
        for (Sequence sequence : sequences) {
            out.println(sequence);
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }
}
 
Developer: PacktPublishing, Project: Machine-Learning-End-to-Endguide-for-Java-developers, Lines of code: 38, Source: NLPExamples.java
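
The example above splits the sentence on whitespace with a Scanner before handing the tokens to the tagger. The sketch below isolates that step using only the JDK; the class WhitespaceTokens and its split method are hypothetical names, not part of OpenNLP. It passes new String[0] to toArray, so an empty sentence yields an empty array, whereas a pre-sized new String[1] would come back holding a single null element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class WhitespaceTokens {
    // Hypothetical helper: splits text on whitespace, the default Scanner delimiter.
    public static String[] split(String sentence) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(sentence)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        // A zero-length array makes toArray allocate exactly tokens.size() slots.
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String token : split("Most large cities had newspapers .")) {
            System.out.println(token);
        }
    }
}
```

Whitespace splitting only works well when punctuation is already separated from words, as in the sample sentence here; for raw text, a trained tokenizer such as the TokenizerME used in Example 1 is the more usual choice.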

Example 4: testPosTagging

import opennlp.tools.postag.POSTaggerME; // import the package/class the method depends on
@Test
public void testPosTagging() throws Exception {
    try (InputStream inputStream = this.getClass().getResourceAsStream("/nlp/en-pos-maxent.bin")) {
        POSModel model = new POSModel(inputStream);
        POSTaggerME tagger = new POSTaggerME(model);

        String sent[] = new String[]{"Most", "large", "cities", "in", "the", "US", "had",
                "morning", "and", "afternoon", "newspapers", "."};
        String tags[] = tagger.tag(sent);

        logger.info("tags: {}", Arrays.asList(tags));
    }
}
 
Developer: bpark, Project: chlorophytum-semantics, Lines of code: 14, Source: NlpTest.java

Example 5: testTagger

import opennlp.tools.postag.POSTaggerME; // import the package/class the method depends on
public String[] testTagger() {
    String[] tags = {};
    try (InputStream modelIn = BasicActions.class.getClassLoader()
            .getResourceAsStream(Consts.EN_POS_MODEL)) {

        POSModel posModel = new POSModel(modelIn);
        POSTaggerME tagger = new POSTaggerME(posModel);
        tags = tagger.tag(testTokenizer());
        System.out.println(Arrays.toString(tags));
    } catch (IOException e) {
        e.printStackTrace();
    }
    return tags;
}
 
Developer: 5agado, Project: knowledge-extraction, Lines of code: 15, Source: BasicActions.java

Example 6: annotate

import opennlp.tools.postag.POSTaggerME; // import the package/class the method depends on
/**
 * Annotates the document using the Apache OpenNLP tools.
 *
 * @param component the component to annotate.
 */
@Override
public void annotate(Blackboard blackboard, DocumentComponent component) {

    // set up the annotator
    setup();

    // Language tag used to retrieve the datasets
    String langTag = component.getLanguage().getLanguage();

    // Split the text into sentences
    SentenceModel sentModel = getSentenceModel(langTag + "-sent");

    SentenceDetectorME sentenceDetector = new SentenceDetectorME(sentModel);
    String sentences[] = sentenceDetector.sentDetect(component.getText());

    // Get the right models
    TokenizerModel tokenModel = getTokenizerModel(langTag + "-token");
    POSModel posModel = getPOSTaggerModel(langTag + "-pos-maxent");

    // Build the tokenizer and tagger once and reuse them for every sentence
    Tokenizer tokenizer = new TokenizerME(tokenModel);
    POSTaggerME tagger = new POSTaggerME(posModel);

    // Iterate through sentences and produce the distilled objects, 
    // i.e. a sentence object with pos-tagged and stemmed tokens.
    for (String sentenceString : sentences) {

        // the distilled sentence object
        Sentence sentence = new Sentence(sentenceString,
                "" + sentenceCounter++);
        sentence.setLanguage(component.getLanguage());

        // Tokenize the sentence
        String[] tokens = tokenizer.tokenize(sentenceString);

        // POS tag the tokens
        String[] tags = tagger.tag(tokens);

        // put the features detected by OpenNLP in the distiller's
        // sentence
        for (int i = 0; i < tokens.length; i++) {
            Token t = new Token(tokens[i]);
            t.setPoS(tags[i]);
            sentence.addToken(t);

        } // for 
        ((DocumentComposite) component).addComponent(sentence);

    } // for (String sentenceString : sentences)
}
 
Developer: ailab-uniud, Project: distiller-CORE, Lines of code: 54, Source: OpenNlpBootstrapperAnnotator.java

Example 7: posTag

import opennlp.tools.postag.POSTaggerME; // import the package/class the method depends on
private String[] posTag(String[] tokens) {
  POSTaggerME posTagger = new POSTaggerME(posModel);
  return posTagger.tag(tokens);
}
 
Developer: languagetool-org, Project: languagetool, Lines of code: 5, Source: EnglishChunker.java


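Note: The opennlp.tools.postag.POSTaggerME.tag examples in this article were collected by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code remains with the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.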