This article collects typical usage examples of the Java method info.ephyra.nlp.SnowballStemmer.stemAllTokens. If you are wondering how SnowballStemmer.stemAllTokens is used in practice, the selected examples below may help. You can also look further into usage examples of the enclosing class, info.ephyra.nlp.SnowballStemmer.
Four code examples of the SnowballStemmer.stemAllTokens method are shown below, sorted by popularity by default.
Example 1: add
import info.ephyra.nlp.SnowballStemmer; // import the package/class the method depends on
/**
 * Adds a word to the dictionary.
 *
 * @param word the word to add
 */
public void add(String word) {
    if (word != null) {
        word = NETagger.tokenizeWithSpaces(word.trim().toLowerCase());
        word = SnowballStemmer.stemAllTokens(word);
        // add whole word
        if (word.length() > 0) words.add(word);
        // add tokens of word
        String[] tokens = word.split(" ");
        if (tokens.length > maxTokens) maxTokens = tokens.length;
        for (int p = 0; p < tokens.length; p++)
            if (tokens[p].length() > 0) this.tokens.add(tokens[p]);
    }
}
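The call pattern above suggests that stemAllTokens takes a string of space-delimited tokens and returns the same string with each token stemmed. The following is a minimal, self-contained sketch of that idea, not Ephyra's implementation. The class name `TokenStemSketch` and the `toyStem` function are hypothetical, and `toyStem` is a deliberately crude stand-in for a real Snowball stemmer.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class TokenStemSketch {

    // Hypothetical toy stemmer: strips one trailing "s" from tokens longer
    // than three characters. This is NOT the Snowball algorithm.
    public static String toyStem(String token) {
        return (token.length() > 3 && token.endsWith("s"))
                ? token.substring(0, token.length() - 1)
                : token;
    }

    // Applies the given stemmer to every whitespace-delimited token and
    // joins the results with single spaces.
    public static String stemAllTokens(String tokens, UnaryOperator<String> stemmer) {
        return Arrays.stream(tokens.trim().split("\\s+"))
                .filter(t -> !t.isEmpty())   // "".split(...) yields one empty token
                .map(stemmer)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // prints "united state dollar"
        System.out.println(stemAllTokens("united states dollars", TokenStemSketch::toyStem));
    }
}
```

Passing the stemmer as a `UnaryOperator<String>` keeps the token handling separate from the stemming algorithm, so a real stemmer could be swapped in without touching the splitting and joining logic.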
Example 2: HashDictionary
import info.ephyra.nlp.SnowballStemmer; // import the package/class the method depends on
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

/**
 * Creates a <code>HashDictionary</code> from a list of words in a file.
 *
 * @param fileName file containing a list of words
 * @throws IOException if the list could not be read from the file
 */
public HashDictionary(String fileName) throws IOException {
    this();
    if (fileName != null) {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
            String line;
            // readLine() returns null at end of file
            while ((line = in.readLine()) != null) {
                // read and normalize word
                String word = line.trim();
                if (word.startsWith("//")) continue; // skip comments
                word = NETagger.tokenizeWithSpaces(word.toLowerCase());
                word = SnowballStemmer.stemAllTokens(word);
                // add whole word
                if (word.length() > 0) words.add(word);
                // add tokens of word
                String[] tokens = word.split(" ");
                if (tokens.length > maxTokens) maxTokens = tokens.length;
                for (int p = 0; p < tokens.length; p++)
                    if (tokens[p].length() > 0) this.tokens.add(tokens[p]);
            }
        }
    }
}
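The file-reading part of this constructor can be tried out on its own. The sketch below reads a word list from any `Reader`, skips blank lines and `//` comment lines, and lower-cases each entry. The class name `WordListReaderSketch` and the method `readWords` are hypothetical, and the Ephyra tokenizing and stemming steps are left out so that the example runs without the library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordListReaderSketch {

    // Reads one entry per line, skipping blank lines and lines that start
    // with "//". I/O failures are rethrown as UncheckedIOException.
    public static List<String> readWords(Reader source) {
        List<String> words = new ArrayList<>();
        // try-with-resources closes the reader on both success and failure
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) { // null marks end of input
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (word.isEmpty() || word.startsWith("//")) continue;
                words.add(word);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        String data = "// cities\nNew York\n\n  Paris  \n";
        // prints [new york, paris]
        System.out.println(readWords(new StringReader(data)));
    }
}
```

Accepting a `Reader` rather than a file name makes the parsing testable with a `StringReader`. A caller with a file on disk could pass a `FileReader` instead.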
Example 3: contains
import info.ephyra.nlp.SnowballStemmer; // import the package/class the method depends on
/**
 * Looks up a word.
 *
 * @param word the word to look up
 * @return <code>true</code> iff the word was found
 */
public boolean contains(String word) {
    word = NETagger.tokenizeWithSpaces(word.trim().toLowerCase());
    word = SnowballStemmer.stemAllTokens(word);
    return words.contains(word);
}
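The lookup only works because it normalizes the query in exactly the same way as the insertion code does. The sketch below illustrates that symmetry with a hypothetical class, `NormalizedDictionarySketch`. Its `normalize` method is an assumed stand-in for the tokenize-and-stem step (it only trims, lower-cases and collapses whitespace). It also returns `false` for a `null` query, a design choice that the method above does not make.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class NormalizedDictionarySketch {

    private final Set<String> words = new HashSet<>();

    // Hypothetical stand-in for tokenizing and stemming: trims, lower-cases
    // and collapses runs of whitespace into single spaces.
    static String normalize(String word) {
        return word.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    // Stores the normalized form of the word; null and empty input is ignored.
    public void add(String word) {
        if (word == null) return;
        String key = normalize(word);
        if (!key.isEmpty()) words.add(key);
    }

    // Normalizes the query the same way as add() before the set lookup.
    public boolean contains(String word) {
        return word != null && words.contains(normalize(word));
    }

    public static void main(String[] args) {
        NormalizedDictionarySketch dict = new NormalizedDictionarySketch();
        dict.add("  New   York ");
        System.out.println(dict.contains("NEW YORK")); // prints true
        System.out.println(dict.contains("Boston"));   // prints false
    }
}
```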
Example 4: fuzzyContains
import info.ephyra.nlp.SnowballStemmer; // import the package/class the method depends on
/**
 * Does a fuzzy lookup for a word. The specified word w is considered as
 * contained in the dictionary if there is a word W in the dictionary such
 * that <code>LevenshteinDistance(w, W) &lt;= maxDistance</code>.
 *
 * @param word the word to look up
 * @param maxDistance the maximum Levenshtein edit distance for fuzzy
 *                    comparison
 * @return <code>true</code> iff the word was found
 */
public boolean fuzzyContains(String word, int maxDistance) {
    word = NETagger.tokenizeWithSpaces(word.trim().toLowerCase());
    word = SnowballStemmer.stemAllTokens(word);
    if (maxDistance == 0) return this.words.contains(word);
    else if (this.words.contains(word)) return true;
    Iterator<String> wordIter = this.words.iterator();
    while (wordIter.hasNext())
        if (getLevenshteinDistance(word, wordIter.next(), maxDistance, true, 1, 1) <= maxDistance) return true;
    return false;
}
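The six-argument getLevenshteinDistance helper is not shown in this excerpt, so its exact behavior is unknown here. As a rough illustration of the fuzzy comparison, the sketch below uses a plain unit-cost Levenshtein distance with two rolling rows. The class `FuzzyLookupSketch` and both method names are hypothetical, and the length-difference shortcut is an added assumption rather than something taken from the original code.

```java
import java.util.List;

public class FuzzyLookupSketch {

    // Classic dynamic-programming Levenshtein distance with unit costs for
    // insertion, deletion and substitution, keeping only two rows in memory.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // reuse the two rows
        }
        return prev[b.length()];
    }

    // Returns true if any candidate is within maxDistance edits of the query.
    public static boolean fuzzyContains(Iterable<String> words, String query, int maxDistance) {
        for (String w : words) {
            // The length difference is a lower bound on the edit distance,
            // so candidates that differ too much in length can be skipped.
            if (Math.abs(w.length() - query.length()) > maxDistance) continue;
            if (levenshtein(query, w) <= maxDistance) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));                       // prints 3
        System.out.println(fuzzyContains(List.of("paris", "london"), "pariz", 1));  // prints true
    }
}
```

As in the original method, the scan is linear in the number of dictionary entries, so an exact `contains` check before the loop remains a worthwhile fast path.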