Java AbstractSequenceClassifier.classifyToCharacterOffsets Method Code Examples

This article collects typical usage examples of the Java method edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyToCharacterOffsets. If you are wondering how classifyToCharacterOffsets is called in practice and what its results look like, the curated examples here may help. You can also look up usage examples of its enclosing class, edu.stanford.nlp.ie.AbstractSequenceClassifier.

Three code examples of AbstractSequenceClassifier.classifyToCharacterOffsets follow, sorted by popularity by default.
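For orientation: in all three examples the method returns a List<Triple<String, Integer, Integer>>, where each triple holds an entity label, a begin character offset, and an end character offset that is passed straight to substring (so it is treated as exclusive). The sketch below shows how such results can be grouped by label. It is a minimal illustration only: `Span` and `groupByLabel` are hypothetical stand-ins so the code runs without Stanford NLP or a model file, and the offsets are written by hand rather than produced by a classifier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OffsetSpanSketch {

    /** Hypothetical stand-in for Triple<String, Integer, Integer>; not a Stanford NLP class. */
    static final class Span {
        final String label;
        final int begin; // inclusive character offset
        final int end;   // exclusive character offset

        Span(String label, int begin, int end) {
            this.label = label;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Groups the surface text of each span under its label, keeping first-seen order. */
    static Map<String, List<String>> groupByLabel(List<Span> spans, String text) {
        Map<String, List<String>> byLabel = new LinkedHashMap<>();
        for (Span s : spans) {
            byLabel.computeIfAbsent(s.label, k -> new ArrayList<>())
                   .add(text.substring(s.begin, s.end));
        }
        return byLabel;
    }

    public static void main(String[] args) {
        String text = "I was born in Springfield and grew up in Boston.";
        // Hand-written spans for illustration; a real classifier would produce these.
        List<Span> spans = new ArrayList<>();
        spans.add(new Span("LOCATION", 14, 25));
        spans.add(new Span("LOCATION", 41, 47));
        System.out.println(groupByLabel(spans, text)); // {LOCATION=[Springfield, Boston]}
    }
}
```

With real Stanford NLP output, the same loop would read item.first(), item.second(), and item.third() instead of the fields of the stand-in class.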

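Example 1: testConvertNERtoCLAVIN

import static org.junit.Assert.assertEquals;

import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.Properties;

import org.junit.Test;

import edu.stanford.nlp.ie.AbstractSequenceClassifier; // import the package/class the method depends on
import edu.stanford.nlp.ie.crf.CRFClassifier;
import edu.stanford.nlp.util.CoreMap;
import edu.stanford.nlp.util.Triple;
// LocationOccurrence and convertNERtoCLAVIN come from the CLAVIN / CLAVIN-NERD
// project; their imports are omitted here.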
/**
 * Checks conversion of Stanford NER output format into
 * {@link com.bericotech.clavin.resolver.ClavinLocationResolver}
 * input format.
 *
 * @throws IOException
 */
@Test
public void testConvertNERtoCLAVIN() throws IOException {
    InputStream mpis = this.getClass().getClassLoader().getResourceAsStream("models/english.all.3class.distsim.prop");
    Properties mp = new Properties();
    mp.load(mpis);
    AbstractSequenceClassifier<CoreMap> namedEntityRecognizer =
            CRFClassifier.getJarClassifier("/models/english.all.3class.distsim.crf.ser.gz", mp);

    String text = "I was born in Springfield and grew up in Boston.";
    List<Triple<String, Integer, Integer>> entitiesFromNER = namedEntityRecognizer.classifyToCharacterOffsets(text);

    List<LocationOccurrence> locationsForCLAVIN = convertNERtoCLAVIN(entitiesFromNER, text);
    assertEquals("wrong number of entities", 2, locationsForCLAVIN.size());
    assertEquals("wrong text for first entity", "Springfield", locationsForCLAVIN.get(0).getText());
    assertEquals("wrong position for first entity", 14, locationsForCLAVIN.get(0).getPosition());
    assertEquals("wrong text for second entity", "Boston", locationsForCLAVIN.get(1).getText());
    assertEquals("wrong position for second entity", 41, locationsForCLAVIN.get(1).getPosition());
}

Developer: Berico-Technologies, Project: CLAVIN-NERD, Lines of code: 26, Source: StanfordExtractorTest.java
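The test above calls convertNERtoCLAVIN, whose body is not shown on this page. Judging only from the assertions (two results, each with a text and a start position), a conversion of this kind plausibly keeps the LOCATION triples and records each name with its begin offset. The sketch below illustrates that idea under those assumptions; `Span`, `Occurrence`, and `locationsOnly` are hypothetical names, not CLAVIN or Stanford NLP classes, and the real method may behave differently.

```java
import java.util.ArrayList;
import java.util.List;

final class NerToOccurrenceSketch {

    /** Hypothetical stand-in for a (label, begin, end) triple from the classifier. */
    static final class Span {
        final String label;
        final int begin;
        final int end;

        Span(String label, int begin, int end) {
            this.label = label;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Hypothetical stand-in for a location mention: the name plus its start offset. */
    static final class Occurrence {
        final String text;
        final int position;

        Occurrence(String text, int position) {
            this.text = text;
            this.position = position;
        }
    }

    /** Keeps only LOCATION spans and pairs each name with its begin offset. */
    static List<Occurrence> locationsOnly(List<Span> spans, String text) {
        List<Occurrence> out = new ArrayList<>();
        for (Span s : spans) {
            if ("LOCATION".equalsIgnoreCase(s.label)) {
                out.add(new Occurrence(text.substring(s.begin, s.end), s.begin));
            }
        }
        return out;
    }
}
```

Filtering on the label matters because a 3-class model also emits PERSON and ORGANIZATION spans, which a location resolver has no use for.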

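Example 2: getCharacters

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

import edu.stanford.nlp.ie.AbstractSequenceClassifier; // import the package/class the method depends on
import edu.stanford.nlp.ie.crf.CRFClassifier;
import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.util.Triple;
// Person, Genderize, GenderizeIoAPI and NameGender come from the project and its
// genderize.io client; their imports are omitted here.
/**
 * Extracts the distinct PERSON entities from the input file and builds
 * a Person (first name, optional last name, gender) for each.
 *
 * @return an ArrayList with one Person per distinct name found
 */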
public ArrayList<Person> getCharacters() {
    ArrayList<Person> people = new ArrayList<Person>();
    Genderize api = GenderizeIoAPI.create();

    AbstractSequenceClassifier<CoreLabel> classifier;
    String fileContents;
    List<Triple<String, Integer, Integer>> list;
    HashSet<String> existingNames;

    try {
        classifier = CRFClassifier.getClassifier(CLASSIFIER);
        fileContents = IOUtils.slurpFile(filename);
        list = classifier.classifyToCharacterOffsets(fileContents);

        existingNames = new HashSet<String>();
        for (Triple<String, Integer, Integer> item : list) {
            if (item.first().equals("PERSON")) {
                String nameStr = fileContents.substring(item.second(),
                                                        item.third());
                // "\\s+" also matches \n and \r, so one pass collapses all whitespace
                nameStr = nameStr.replaceAll("\\s+", " ").trim();

                // Set.add returns false for duplicates, so each name is handled once
                if (existingNames.add(nameStr)) {
                    String[] names = nameStr.split(" ");
                    Person p = new Person();
                    p.setFirstname(names[0]);
                    if (names.length > 1) {
                        p.setLastname(names[1]);
                    }

                    NameGender gender = api.getGender(p.getFirstname());
                    if (gender.getGender() != null) {
                        p.setGender(gender.isMale() ? male : female);
                    } else {
                        p.setGender(getRandomGender());
                    }

                    people.add(p);
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

    return people;
}

Developer: kennanmeyer, Project: SE-410-Project, Lines of code: 57, Source: PersonImporter.java
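The name handling in this example (normalize whitespace, skip duplicates, split into first and last name) can be isolated from the classifier and tested on its own. The sketch below is one possible arrangement, not the project's code: `NameCleanupSketch` and its methods are hypothetical, and it deliberately differs from the example in one respect, taking the last token as the surname instead of the second token, so a three-part name such as "Mary Jane Watson" yields "Watson".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class NameCleanupSketch {

    /** Collapses every run of whitespace (including line breaks) into one space. */
    static String normalize(String raw) {
        return raw.replaceAll("\\s+", " ").trim();
    }

    /** Returns the normalized mentions without duplicates, in first-seen order. */
    static List<String> distinctNames(List<String> rawMentions) {
        Set<String> seen = new LinkedHashSet<>();
        for (String raw : rawMentions) {
            String name = normalize(raw);
            if (!name.isEmpty()) {
                seen.add(name); // add() ignores names that are already present
            }
        }
        return new ArrayList<>(seen);
    }

    /** Splits a normalized name into {first, last}; last is "" for single-token names. */
    static String[] firstAndLast(String name) {
        String[] parts = name.split(" ");
        String last = parts.length > 1 ? parts[parts.length - 1] : "";
        return new String[] {parts[0], last};
    }
}
```

Using a LinkedHashSet keeps the order in which names first appear in the text, which a plain HashSet does not guarantee.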

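Example 3: resolveStanfordEntities

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ie.AbstractSequenceClassifier; // import the package/class the method depends on
import edu.stanford.nlp.ie.crf.CRFClassifier;
import edu.stanford.nlp.util.CoreMap;
import edu.stanford.nlp.util.Triple;
// ClavinException, ClavinLocationResolver, LocationOccurrence, LuceneGazetteer,
// ResolvedLocation, TextUtils and convertNERtoCLAVIN come from the CLAVIN /
// CLAVIN-NERD project; their imports are omitted here.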
/**
 * Sometimes, you might already be using Stanford NER elsewhere in
 * your application, and you'd like to just pass the output from
 * Stanford NER directly into CLAVIN, without having to re-run the
 * input through Stanford NER just to use CLAVIN. This example
 * shows you how to very easily do exactly that.
 *
 * @throws IOException
 * @throws ClavinException
 */
private static void resolveStanfordEntities() throws IOException, ClavinException {

    /*#####################################################################
     *
     * Start with Stanford NER -- no need to get CLAVIN involved for now.
     *
     *###################################################################*/

    // instantiate Stanford NER entity extractor
    InputStream mpis = WorkflowDemoNERD.class.getClassLoader().getResourceAsStream("models/english.all.3class.distsim.prop");
    Properties mp = new Properties();
    mp.load(mpis);
    AbstractSequenceClassifier<CoreMap> namedEntityRecognizer =
            CRFClassifier.getJarClassifier("/models/english.all.3class.distsim.crf.ser.gz", mp);

    // Unstructured text file about Somalia to be geoparsed
    File inputFile = new File("src/test/resources/sample-docs/Somalia-doc.txt");

    // Grab the contents of the text file as a String
    String inputString = TextUtils.fileToString(inputFile);

    // extract entities from input text using Stanford NER
    List<Triple<String, Integer, Integer>> entitiesFromNER = namedEntityRecognizer.classifyToCharacterOffsets(inputString);

    /*#####################################################################
     *
     * Now, CLAVIN comes into play...
     *
     *###################################################################*/

    // convert Stanford NER output to ClavinLocationResolver input
    List<LocationOccurrence> locationsForCLAVIN = convertNERtoCLAVIN(entitiesFromNER, inputString);

    // instantiate the CLAVIN location resolver
    ClavinLocationResolver clavinLocationResolver = new ClavinLocationResolver(new LuceneGazetteer(new File("./IndexDirectory")));

    // resolve location entities extracted from input text
    List<ResolvedLocation> resolvedLocations = clavinLocationResolver.resolveLocations(locationsForCLAVIN, 1, 1, false);

    // Display the ResolvedLocations found for the location names
    for (ResolvedLocation resolvedLocation : resolvedLocations)
        System.out.println(resolvedLocation);
}

Developer: Berico-Technologies, Project: CLAVIN-NERD, Lines of code: 54, Source: WorkflowDemoNERD.java
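Examples 1 and 3 both pass the result of getResourceAsStream directly to Properties.load. That call returns null when the resource is missing, which surfaces as a NullPointerException inside load, and the stream is never closed. A more defensive variant might look like the sketch below; `PropsLoadingSketch` and its method names are hypothetical, and only JDK classes are used.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

final class PropsLoadingSketch {

    /** Loads properties from a classpath resource, failing clearly if it is absent. */
    static Properties loadFromClasspath(Class<?> anchor, String resource) throws IOException {
        return load(anchor.getClassLoader().getResourceAsStream(resource), resource);
    }

    /** Loads properties from the stream and closes it; a null stream means "not found". */
    static Properties load(InputStream in, String name) throws IOException {
        if (in == null) {
            throw new FileNotFoundException("resource not found on classpath: " + name);
        }
        try (InputStream stream = in) { // closed even if load() throws
            Properties props = new Properties();
            props.load(stream);
            return props;
        }
    }
}
```

The returned Properties object could then be handed to CRFClassifier.getJarClassifier exactly as in the examples.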

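Note: The edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyToCharacterOffsets examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code remains with the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.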