Java SnowballStemmer.stemAllTokens Method Code Examples

This article collects typical usage examples of the info.ephyra.nlp.SnowballStemmer.stemAllTokens method in Java. If you are wondering how SnowballStemmer.stemAllTokens is used in practice, the selected examples here may help. You can also look further into usage examples of the enclosing class, info.ephyra.nlp.SnowballStemmer.


Four code examples of the SnowballStemmer.stemAllTokens method are shown below, sorted by popularity by default.

Example 1: add

import info.ephyra.nlp.SnowballStemmer; // import the package/class this method depends on
/**
 * Adds a word to the dictionary.
 * 
 * @param word the word to add
 */
public void add(String word) {
	if (word != null) {
		word = NETagger.tokenizeWithSpaces(word.trim().toLowerCase());
		word = SnowballStemmer.stemAllTokens(word);
		
		// add whole word
		if (word.length() > 0) words.add(word);
		
		// add tokens of word
		String[] tokens = word.split(" ");
		if (tokens.length > maxTokens) maxTokens = tokens.length;
		for (int p = 0; p < tokens.length; p++)
			if (tokens[p].length() > 0) this.tokens.add(tokens[p]);
	}
}
 
Developer: claritylab, Project: lucida, Lines of code: 21, Source: HashDictionary.java
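
The examples on this page pass a space-delimited token string to stemAllTokens and get a string back, but none of them shows what the method does internally. The following is a minimal sketch of token-wise stemming under that assumption. The class name StemAllTokensSketch and the helper stemToken are hypothetical, and the toy suffix rules are a stand-in for illustration only; they are not the Snowball algorithm and not the Ephyra implementation.

```java
import java.util.StringJoiner;

public class StemAllTokensSketch {

    // Hypothetical stand-in for a real stemmer: strips two common English
    // suffixes. This is NOT the Snowball algorithm.
    static String stemToken(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s") && !token.endsWith("ss")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Stems every token of a space-delimited string and joins the stems
    // with single spaces. Empty tokens (from repeated spaces) are dropped.
    static String stemAllTokens(String tokens) {
        StringJoiner joined = new StringJoiner(" ");
        for (String token : tokens.split(" ")) {
            if (!token.isEmpty()) {
                joined.add(stemToken(token));
            }
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Lowercase first, as the dictionary examples on this page do.
        String normalized = "Walking Dogs".trim().toLowerCase();
        System.out.println(stemAllTokens(normalized)); // prints "walk dog"
    }
}
```

Because the result is again a space-delimited string, callers can store it whole and also split it on " " to index the individual stems, which is the pattern the add method uses.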

Example 2: HashDictionary

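import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import info.ephyra.nlp.SnowballStemmer; // import the package/class this method depends on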
/**
 * Creates a <code>HashDictionary</code> from a list of words in a file.
 * 
 * @param fileName file containing a list of words
 * @throws IOException if the list could not be read from the file
 */
public HashDictionary(String fileName) throws IOException {
	this();
	
	if (fileName != null) {
		File file = new File(fileName);
		
		// try-with-resources closes the reader even if reading fails
		try (BufferedReader in = new BufferedReader(new FileReader(file))) {
			String line;
			// readLine() returns null at end of file
			while ((line = in.readLine()) != null) {
				// read and normalize word
				String word = line.trim();
				if (word.startsWith("//")) continue;  // skip comments
				word = NETagger.tokenizeWithSpaces(word.toLowerCase());
				word = SnowballStemmer.stemAllTokens(word);
				
				// add whole word
				if (word.length() > 0) words.add(word);
				
				// add tokens of word
				String[] tokens = word.split(" ");
				if (tokens.length > maxTokens) maxTokens = tokens.length;
				for (int p = 0; p < tokens.length; p++)
					if (tokens[p].length() > 0) this.tokens.add(tokens[p]);
			}
		}
	}
}
 
Developer: claritylab, Project: lucida, Lines of code: 34, Source: HashDictionary.java

Example 3: contains

import info.ephyra.nlp.SnowballStemmer; // import the package/class this method depends on
/**
 * Looks up a word.
 * 
 * @param word the word to look up
 * @return <code>true</code> iff the word was found
 */
public boolean contains(String word) {
	word = NETagger.tokenizeWithSpaces(word.trim().toLowerCase());
	word = SnowballStemmer.stemAllTokens(word);
	
	return words.contains(word);
}
 
Developer: claritylab, Project: lucida, Lines of code: 13, Source: HashDictionary.java

Example 4: fuzzyContains

import java.util.Iterator;

import info.ephyra.nlp.SnowballStemmer; // import the package/class this method depends on
/**
 * Does a fuzzy lookup for a word. The specified word w is considered as
 * contained in the dictionary if there is a word W in the dictionary such
 * that <code>LevenshteinDistance(w, W) &lt;= maxDistance</code>.
 * 
 * @param word the word to look up
 * @param maxDistance the maximum Levenshtein edit distance for fuzzy
 *            comparison
 * @return <code>true</code> iff the word was found
 */
public boolean fuzzyContains(String word, int maxDistance) {
	word = NETagger.tokenizeWithSpaces(word.trim().toLowerCase());
	word = SnowballStemmer.stemAllTokens(word);
	
	if (maxDistance == 0) return this.words.contains(word);
	else if (this.words.contains(word)) return true;
	
	Iterator<String> wordIter = this.words.iterator();
	while (wordIter.hasNext())
		if (getLevenshteinDistance(word, wordIter.next(), maxDistance, true, 1, 1) <= maxDistance) return true;
	
	return false;
}
 
Developer: claritylab, Project: lucida, Lines of code: 24, Source: HashDictionary.java
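
The fuzzy lookup relies on a getLevenshteinDistance helper whose body and extra parameters are not shown on this page. As a rough illustration of the idea only, here is a sketch of the standard two-row Levenshtein computation with unit costs, plus a linear scan over a word collection. The class name LevenshteinSketch and both method signatures are assumptions made for this example and may differ from the helper in HashDictionary.java.

```java
import java.util.Arrays;
import java.util.Collection;

public class LevenshteinSketch {

    // Classic dynamic-programming edit distance with unit costs for
    // insertion, deletion and substitution, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1], prev[j]) + 1;
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns true if any stored word is within maxDistance edits of word.
    static boolean fuzzyContains(Collection<String> words, String word, int maxDistance) {
        for (String candidate : words) {
            // The length difference is a lower bound on the edit distance,
            // so clearly distant candidates can be skipped cheaply.
            if (Math.abs(candidate.length() - word.length()) > maxDistance) {
                continue;
            }
            if (levenshtein(word, candidate) <= maxDistance) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Collection<String> words = Arrays.asList("walk", "dog");
        System.out.println(fuzzyContains(words, "wolk", 1)); // prints "true"
        System.out.println(fuzzyContains(words, "cat", 1));  // prints "false"
    }
}
```

The scan is linear in the dictionary size, so the exact-match check that fuzzyContains performs first is a cheap way to avoid it for words that are already present.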


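Note: The info.ephyra.nlp.SnowballStemmer.stemAllTokens examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.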