

Java IKAnalyzer Class Code Examples

This article collects typical usage examples of the Java class org.wltea.analyzer.lucene.IKAnalyzer. If you have been wondering what the IKAnalyzer class is for or how to use it, the curated code examples below should help.


The IKAnalyzer class belongs to the org.wltea.analyzer.lucene package. Fifteen code examples of the class are presented below, ordered by popularity by default.
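Most of the examples below follow the same tokenization pattern: construct an IKAnalyzer, obtain a TokenStream, then iterate over its tokens through a CharTermAttribute. As a quick orientation, here is a minimal, self-contained sketch of that pattern (assuming the Lucene 4.x-era TokenStream API used in examples 4 through 8; the class name and sample text are illustrative only):

import java.io.IOException;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class IKAnalyzerQuickStart {
	public static void main(String[] args) throws IOException {
		// true enables IK's "smart" (coarse-grained) segmentation;
		// false produces the finest-grained split.
		IKAnalyzer analyzer = new IKAnalyzer(true);
		TokenStream ts = analyzer.tokenStream("field", "IK Analyzer是一个开源的中文分词工具包");
		CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
		ts.reset(); // required before the first incrementToken()
		while (ts.incrementToken()) {
			System.out.println(term.toString());
		}
		ts.end(); // finalize end-of-stream state
		ts.close();
		analyzer.close();
	}
}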

Example 1: getContent

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
/**
 * Get the content
 * 
 * @return the content
 */
@Field(store = Store.YES, index = Index.TOKENIZED, analyzer = @Analyzer(impl = IKAnalyzer.class))
@Lob
public String getContent() {
	if (pageNumber != null) {
		String[] pageContents = getPageContents();
		if (pageNumber < 1) {
			pageNumber = 1;
		}
		if (pageNumber > pageContents.length) {
			pageNumber = pageContents.length;
		}
		return pageContents[pageNumber - 1];
	} else {
		return content;
	}
}
 
Developer: justinbaby | Project: my-paper | Lines: 22 | Source: Article.java

Example 2: segment

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
/**
 * Tokenize a piece of text and add each token with its position to urlInfo
 * @param text the text to tokenize
 */
private void segment(String text) {
	IKAnalyzer analyzer = new IKAnalyzer(true);
	StringReader reader = new StringReader(text);
	TokenStream tokenStream = analyzer.tokenStream("*", reader);
	TermAttribute termAtt = tokenStream.getAttribute(TermAttribute.class);
	
	try {
		while (tokenStream.incrementToken()) {
			location++;
			String term = termAtt.term();
			urlInfo.putURLLocation(term, location);
		}
	}
	catch(IOException exp) {
		exp.printStackTrace();
	}
}
 
Developer: uraplutonium | Project: hadoop-distributed-crawler | Lines: 22 | Source: URLAnalyzer.java

Example 3: queryIndex

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
public static LuceneSearchResult queryIndex(String keyword, int offset, int pagesize){
	// build the query
	List<Query> querys = new ArrayList<>();
	if (keyword!=null && keyword.trim().length()>0) {
		try {
			//Analyzer analyzer = new SmartChineseAnalyzer();
			Analyzer analyzer = new IKAnalyzer();
			QueryParser parser = new QueryParser(ExcelUtil.KEYWORDS, analyzer);
			Query shopNameQuery = parser.parse(keyword);
			querys.add(shopNameQuery);
		} catch (ParseException e) {
			e.printStackTrace();
		}
	}
	LuceneSearchResult result = search(querys, offset, pagesize);
	return result;
}
 
Developer: xuxueli | Project: xxl-search | Lines: 18 | Source: LuceneUtil.java

Example 4: analyze

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
public static ArrayList<String> analyze(final String content) {
  try {
    ArrayList<String> _xblockexpression = null;
    {
      final IKAnalyzer ikAnalyzer = new IKAnalyzer(true);
      final TokenStream ts = ikAnalyzer.tokenStream("field", content);
      final CharTermAttribute ch = ts.<CharTermAttribute>addAttribute(CharTermAttribute.class);
      ts.reset();
      final ArrayList<String> words = CollectionLiterals.<String>newArrayList();
      while (ts.incrementToken()) {
        String _string = ch.toString();
        words.add(_string);
      }
      ts.end();
      ts.close();
      _xblockexpression = words;
    }
    return _xblockexpression;
  } catch (Throwable _e) {
    throw Exceptions.sneakyThrow(_e);
  }
}
 
Developer: East196 | Project: maker | Lines: 23 | Source: My.java

Example 5: main

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
public static void main(final String[] args) {
  try {
    final IKAnalyzer ikAnalyzer = new IKAnalyzer(true);
    final String text = "lucene分析器使用分词器和过滤器构成一个“管道”,文本在流经这个管道后成为可以进入索引的最小单位,因此,一个标准的分析器有两个部分组成,一个是分词器tokenizer,它用于将文本按照规则切分为一个个可以进入索引的最小单位。另外一个是TokenFilter,它主要作用是对切出来的词进行进一步的处理(如去掉敏感词、英文大小写转换、单复数处理)等。lucene中的Tokenstram方法首先创建一个tokenizer对象处理Reader对象中的流式文本,然后利用TokenFilter对输出流进行过滤处理";
    final TokenStream ts = ikAnalyzer.tokenStream("field", text);
    final CharTermAttribute ch = ts.<CharTermAttribute>addAttribute(CharTermAttribute.class);
    ts.reset();
    while (ts.incrementToken()) {
      String _string = ch.toString();
      String _plus = (_string + " | ");
      InputOutput.<String>print(_plus);
    }
    ts.end();
    ts.close();
  } catch (Throwable _e) {
    throw Exceptions.sneakyThrow(_e);
  }
}
 
Developer: East196 | Project: maker | Lines: 19 | Source: IKAnalyzerDemo.java
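The demo text tokenized in example 5 itself describes the Lucene analyzer architecture: an Analyzer is a pipeline in which one Tokenizer splits the text into candidate tokens and zero or more TokenFilters post-process them. For reference, here is a minimal, hypothetical sketch of such a pipeline (assuming Lucene 4.x, where createComponents receives a Reader; IKAnalyzer plugs its own tokenizer into this same hook):

import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.util.Version;

// Hypothetical pipeline: a whitespace tokenizer followed by a lower-casing filter.
public class SimplePipelineAnalyzer extends Analyzer {
	@Override
	protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
		Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_4_9, reader); // step 1: split the text into tokens
		TokenStream filtered = new LowerCaseFilter(Version.LUCENE_4_9, source); // step 2: normalize each token
		return new TokenStreamComponents(source, filtered);
	}
}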

Example 6: tokenizer

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
public List<ElementDict> tokenizer(String str) {
	List<ElementDict> list = new ArrayList<ElementDict>();
	IKAnalyzer analyzer = new IKAnalyzer(true);
	try {
		TokenStream stream = analyzer.tokenStream("", str);
		CharTermAttribute cta = stream.addAttribute(CharTermAttribute.class);
		stream.reset();
		int index = -1;
		while (stream.incrementToken()) {
			if ((index = isContain(cta.toString(), list)) >= 0) {
				list.get(index).setFreq(list.get(index).getFreq() + 1);
			}
			else {
				list.add(new ElementDict(cta.toString(), 1));
			}
		}
		analyzer.close();
	} catch (IOException e) {
		e.printStackTrace();
	} 
	return list;
}
 
Developer: lyssym | Project: MapReduce | Lines: 23 | Source: TextCosine.java

Example 7: tokenizer

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
/**
 * Tokenize the given sentence and collect the resulting tokens
 * @param str the text to tokenize
 * @return the list of Chinese tokens
 */
public List<String> tokenizer(String str) {
	List<String> list = new ArrayList<String>();
	IKAnalyzer analyzer = new IKAnalyzer(true);
	try {
		TokenStream stream = analyzer.tokenStream("", str);
		CharTermAttribute cta = stream.addAttribute(CharTermAttribute.class);
		stream.reset();
		while (stream.incrementToken()) {
			list.add(cta.toString());
		}
		analyzer.close();
	} catch (IOException e) {
		e.printStackTrace();
	} 
	return list;
}
 
Developer: lyssym | Project: textmining | Lines: 22 | Source: Segment.java

Example 8: tokenizer

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
public List<TermDict> tokenizer()
{
	List<TermDict> terms = new ArrayList<TermDict>();
	IKAnalyzer analyzer = new IKAnalyzer(true);
	try {
		TokenStream stream = analyzer.tokenStream("", this.tokens);
		CharTermAttribute cta = stream.addAttribute(CharTermAttribute.class);
		stream.reset();
		int index = -1;
		while (stream.incrementToken()) 
		{
			if ((index = isContain(cta.toString(), terms)) >= 0)
			{
				terms.get(index).setFreq(terms.get(index).getFreq()+1);
			}
			else 
			{
				terms.add(new TermDict(cta.toString(), 1));
			}
		}
		analyzer.close();
	} catch (IOException e) {
		e.printStackTrace();
	}
	return terms;
}
 
Developer: lyssym | Project: textmining | Lines: 27 | Source: SimHash.java

Example 9: search

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
public final TopDocs search(String keyword) {
	System.out.println("正在检索关键字 : " + keyword);
	try {
		Analyzer analyzer = new IKAnalyzer();
		QueryParser parser = new QueryParser(Version.LUCENE_4_9, field, analyzer);
		// wrap the keyword in a Query object
		query = parser.parse(keyword);
		Date start = new Date();
		TopDocs results = searcher.search(query, 5 * 2);
		Date end = new Date();
		System.out.println("检索完成,用时" + (end.getTime() - start.getTime()) + "毫秒");
		return results;
	} catch (Exception e) {
		e.printStackTrace();
	}
	return null;
}
 
Developer: irfen | Project: lucene-example | Lines: 18 | Source: LuceneIKSearch.java

Example 10: testIKAnalyzer

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
@Test
public void testIKAnalyzer() throws Exception {
	@SuppressWarnings("resource")
	Analyzer analyzer = new IKAnalyzer(); // IK tokenizer
	TokenStream token = analyzer.tokenStream("a", new StringReader(TEXT));
	token.reset();

	CharTermAttribute term = token.addAttribute(CharTermAttribute.class); // term text
	OffsetAttribute offset = token.addAttribute(OffsetAttribute.class); // offset data

	while (token.incrementToken()) {
		System.out.println(term + "   " + offset.startOffset() + "   " + offset.endOffset());
	}

	token.end();
	token.close();
}
 
Developer: irfen | Project: lucene-example | Lines: 18 | Source: IKAnalyzerTest.java

Example 11: getFullTextQuery

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
/**
 * Build a full-text query object
 * @param q the query keyword
 * @param fields the fields to search
 * @return the full-text BooleanQuery
 */
public BooleanQuery getFullTextQuery(String q, String... fields){
	Analyzer analyzer = new IKAnalyzer();
	BooleanQuery query = new BooleanQuery();
	try {
		if (StringUtils.isNotBlank(q)){
			for (String field : fields){
				QueryParser parser = new QueryParser(Version.LUCENE_36, field, analyzer);   
				query.add(parser.parse(q), Occur.SHOULD);
			}
		}
	} catch (ParseException e) {
		e.printStackTrace();
	}
	return query;
}
 
Developer: cncduLee | Project: bbks | Lines: 22 | Source: BaseDaoImpl.java

Example 12: indexInit

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
public void indexInit() throws Exception {
	Analyzer analyzer = new IKAnalyzer();
//	Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_44);
	this.indexSettings = new LuceneIndexSettings(analyzer);
	this.indexSettings.createFSDirectory("f:\\file");
	this.luceneIndex = new LuceneIndex(this.indexSettings);
	this.luceneIndexSearch = new LuceneIndexSearch(indexSettings, new LuceneResultCollector(indexSettings));
}
 
Developer: zhangjikai | Project: sdudoc | Lines: 9 | Source: LuceneIndexManager.java

Example 13: getTitle

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
/**
 * Get the title
 * 
 * @return the title
 */
@Field(store = Store.YES, index = Index.TOKENIZED, analyzer = @Analyzer(impl = IKAnalyzer.class))
@NotEmpty
@Length(max = 200)
@Column(nullable = false)
public String getTitle() {
	return title;
}
 
Developer: justinbaby | Project: my-paper | Lines: 13 | Source: Article.java

Example 14: getName

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
/**
 * Get the name
 * 
 * @return the name
 */
@JsonProperty
@Field(store = Store.YES, index = Index.TOKENIZED, analyzer = @Analyzer(impl = IKAnalyzer.class))
@NotEmpty
@Length(max = 200)
@Column(nullable = false)
public String getName() {
	return name;
}
 
Developer: justinbaby | Project: my-paper | Lines: 14 | Source: Product.java

Example 15: search

import org.wltea.analyzer.lucene.IKAnalyzer; // import the required package/class
@SuppressWarnings("unchecked")
@Transactional(readOnly = true)
public Page<Article> search(String keyword, Pageable pageable) {
	if (StringUtils.isEmpty(keyword)) {
		return new Page<Article>();
	}
	if (pageable == null) {
		pageable = new Pageable();
	}
	try {
		String text = QueryParser.escape(keyword);
		QueryParser titleParser = new QueryParser(Version.LUCENE_35, "title", new IKAnalyzer());
		titleParser.setDefaultOperator(QueryParser.AND_OPERATOR);
		Query titleQuery = titleParser.parse(text);
		FuzzyQuery titleFuzzyQuery = new FuzzyQuery(new Term("title", text), FUZZY_QUERY_MINIMUM_SIMILARITY);
		Query contentQuery = new TermQuery(new Term("content", text));
		Query isPublicationQuery = new TermQuery(new Term("isPublication", "true"));
		BooleanQuery textQuery = new BooleanQuery();
		BooleanQuery query = new BooleanQuery();
		textQuery.add(titleQuery, Occur.SHOULD);
		textQuery.add(titleFuzzyQuery, Occur.SHOULD);
		textQuery.add(contentQuery, Occur.SHOULD);
		query.add(isPublicationQuery, Occur.MUST);
		query.add(textQuery, Occur.MUST);
		FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
		FullTextQuery fullTextQuery = fullTextEntityManager.createFullTextQuery(query, Article.class);
		fullTextQuery.setSort(new Sort(new SortField[] { new SortField("isTop", SortField.STRING, true), new SortField(null, SortField.SCORE), new SortField("createDate", SortField.LONG, true) }));
		fullTextQuery.setFirstResult((pageable.getPageNumber() - 1) * pageable.getPageSize());
		fullTextQuery.setMaxResults(pageable.getPageSize());
		return new Page<Article>(fullTextQuery.getResultList(), fullTextQuery.getResultSize(), pageable);
	} catch (ParseException e) {
		e.printStackTrace();
	}
	return new Page<Article>();
}
 
Developer: justinbaby | Project: my-paper | Lines: 36 | Source: SearchServiceImpl.java


Note: The org.wltea.analyzer.lucene.IKAnalyzer class examples in this article were compiled by 纯净天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The code snippets are taken from open-source projects contributed by many developers, and copyright remains with the original authors. Before distributing or using this code, please consult the License of the corresponding project; do not reproduce this article without permission.