Java BreakIterator.getSentenceInstance Method Code Examples

This article collects typical usage examples of the java.text.BreakIterator.getSentenceInstance method in Java. If you are wondering how BreakIterator.getSentenceInstance is used in practice, or what real calls to it look like, the selected examples here may help. You can also look further into usage examples of its enclosing class, java.text.BreakIterator.


A total of 15 code examples of the BreakIterator.getSentenceInstance method are shown below, sorted by popularity by default.

Example 1: getBoundaryScanner

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.Locale;
private static BoundaryScanner getBoundaryScanner(Field field) {
    final FieldOptions fieldOptions = field.fieldOptions();
    final Locale boundaryScannerLocale = fieldOptions.boundaryScannerLocale();
    switch(fieldOptions.boundaryScannerType()) {
    case SENTENCE:
        if (boundaryScannerLocale != null) {
            return new BreakIteratorBoundaryScanner(BreakIterator.getSentenceInstance(boundaryScannerLocale));
        }
        return DEFAULT_SENTENCE_BOUNDARY_SCANNER;
    case WORD:
        if (boundaryScannerLocale != null) {
            return new BreakIteratorBoundaryScanner(BreakIterator.getWordInstance(boundaryScannerLocale));
        }
        return DEFAULT_WORD_BOUNDARY_SCANNER;
    default:
        if (fieldOptions.boundaryMaxScan() != SimpleBoundaryScanner.DEFAULT_MAX_SCAN
                || fieldOptions.boundaryChars() != SimpleBoundaryScanner.DEFAULT_BOUNDARY_CHARS) {
            return new SimpleBoundaryScanner(fieldOptions.boundaryMaxScan(), fieldOptions.boundaryChars());
        }
        return DEFAULT_SIMPLE_BOUNDARY_SCANNER;
    }
}
 
Developer ID: justor, Project: elasticsearch_my, Lines of code: 23, Source: FastVectorHighlighter.java
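
The selection logic in Example 1 can be tried without the Elasticsearch or Lucene classes. The sketch below is a minimal stand-in: `BoundaryIteratorSketch` and `ScannerType` are hypothetical names, and falling back to `Locale.ROOT` when no locale is configured is an assumption of this sketch (the original returns predefined default scanners instead).

```java
import java.text.BreakIterator;
import java.util.Locale;

final class BoundaryIteratorSketch {
    /** Hypothetical stand-in for the boundary scanner type used in Example 1. */
    enum ScannerType { SENTENCE, WORD }

    /** Returns a sentence or word iterator; a null locale falls back to Locale.ROOT (an assumption). */
    static BreakIterator forType(ScannerType type, Locale locale) {
        Locale effective = (locale != null) ? locale : Locale.ROOT;
        switch (type) {
            case SENTENCE:
                return BreakIterator.getSentenceInstance(effective);
            case WORD:
                return BreakIterator.getWordInstance(effective);
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    /** Counts the segments (gaps between consecutive boundaries) found in text. */
    static int countSegments(BreakIterator it, String text) {
        it.setText(text);
        int count = 0;
        it.first();
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }
}
```

For instance, `countSegments(forType(ScannerType.SENTENCE, Locale.US), text)` counts the sentences in `text`, while the `WORD` variant also counts whitespace and punctuation runs as segments.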

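Example 2: DocumentWordTokenizer

import java.text.BreakIterator; // import the package/class this method depends on
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.Segment;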
/**
 * Creates a new DocumentWordTokenizer to work on a document
 * @param document The document to spell check
 */
public DocumentWordTokenizer(Document document) {
  this.document = document;
  //Create a text segment over the entire document
  text = new Segment();
  sentenceIterator = BreakIterator.getSentenceInstance();
  try {
    document.getText(0, document.getLength(), text);
    sentenceIterator.setText(text);
    // robert: use text.getBeginIndex(), not 0, for segment's first offset
    currentWordPos = getNextWordStart(text, text.getBeginIndex());
    //If the current word pos is -1 then the string was all white space
    if (currentWordPos != -1) {
      currentWordEnd = getNextWordEnd(text, currentWordPos);
      nextWordPos = getNextWordStart(text, currentWordEnd);
    } else {
      moreTokens = false;
    }
  } catch (BadLocationException ex) {
    moreTokens = false;
  }
}
 
Developer ID: Thecarisma, Project: powertext, Lines of code: 26, Source: DocumentWordTokenizer.java
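
Example 2 passes a Swing `Segment` (a `CharacterIterator`) to `setText`, and its comment stresses starting from `text.getBeginIndex()` rather than 0. A short sketch of why, using the JDK's `StringCharacterIterator` in place of a `Segment`: the boundaries a `BreakIterator` reports are indices within the iterator's own range, so the first one is the begin index. The class and method names are hypothetical.

```java
import java.text.BreakIterator;
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class RangeBoundariesSketch {
    /** Collects the sentence boundaries reported for the sub-range [begin, end) of text. */
    static List<Integer> boundaries(String text, int begin, int end) {
        // Iterator restricted to [begin, end), positioned at begin
        CharacterIterator range = new StringCharacterIterator(text, begin, end, begin);
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(range);
        List<Integer> result = new ArrayList<>();
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            result.add(b); // indices are relative to the full text, not to the range
        }
        return result;
    }
}
```

Calling `boundaries(text, 3, text.length())` therefore yields a list whose first entry is 3, not 0.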

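Example 3: splitSentences

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.ArrayList;
import java.util.Locale;
import java.util.stream.Stream;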
/**
 * Splits string into sentences by line breaks and punctuation marks.
 *
 * @param text the text to be split
 * @return Sentences as string array
 * @see java.text.BreakIterator#getSentenceInstance(Locale)
 */
private static String[] splitSentences(String text) {
    BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.GERMAN);
    iterator.setText(text);
    ArrayList<String> sentenceList = new ArrayList<>(text.length() / 6); // Avg word length in german is 5.7
    int start = iterator.first();
    for (int end = iterator.next();
         end != BreakIterator.DONE;
         start = end, end = iterator.next()) {

        String sentence = text.substring(start, end).trim();
        // Exclude empty sentences
        if (!sentence.isEmpty()) {
            // Split on any line break (\R, Java 8+) so "\r\n" leaves no stray '\r'
            Stream.of(sentence.split("\\R"))
                    .filter(s -> !s.isEmpty())
                    .forEach(sentenceList::add);
        }
    }
    sentenceList.trimToSize(); // Release unused capacity

    // Convert ArrayList to array
    return sentenceList.toArray(new String[0]);
}
 
Developer ID: AudiophileDev, Project: T2M, Lines of code: 32, Source: TextAnalyser.java
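
A self-contained variant of the splitting logic in Example 3, offered as a sketch rather than the project's code. It assumes Java 8 or later, takes the locale as a parameter, and splits on the regex linebreak matcher `\R` so that Windows line endings do not leave stray `\r` characters. `SentenceLineSplitter` is a hypothetical name.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SentenceLineSplitter {
    /** Splits text at sentence boundaries, then at line breaks, dropping blank pieces. */
    static List<String> split(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // \R matches \n, \r\n, \r and other Unicode line terminators
            for (String line : text.substring(start, end).split("\\R")) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    out.add(trimmed);
                }
            }
        }
        return out;
    }
}
```

Returning a `List` avoids the final array conversion; callers that need an array can use `toArray(new String[0])`.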

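Example 4: splitBySentence

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;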
private static String[] splitBySentence(String text) {
    List<String> sentences = new ArrayList<String>();
    // Use Locale.US since the customizer is setting the default (US) locale text only:
    BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
    it.setText(text);
    int start = it.first();
    int end;
    while ((end = it.next()) != BreakIterator.DONE) {
        sentences.add(text.substring(start, end));
        start = end;
    }
    return sentences.toArray(new String[sentences.size()]);
}
 
Developer ID: apache, Project: incubator-netbeans, Lines of code: 14, Source: LocalizedBundleInfo.java

Example 5: testSingleSentences

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.Locale;
public void testSingleSentences() throws Exception {
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new CustomSeparatorBreakIterator(randomSeparator());
    assertSameBreaks("a", expected, actual);
    assertSameBreaks("ab", expected, actual);
    assertSameBreaks("abc", expected, actual);
    assertSameBreaks("", expected, actual);
}
 
Developer ID: justor, Project: elasticsearch_my, Lines of code: 9, Source: CustomSeparatorBreakIteratorTests.java

Example 6: testSliceEnd

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.Locale;
public void testSliceEnd() throws Exception {
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new CustomSeparatorBreakIterator(randomSeparator());
    assertSameBreaks("a000", 0, 1, expected, actual);
    assertSameBreaks("ab000", 0, 1, expected, actual);
    assertSameBreaks("abc000", 0, 1, expected, actual);
    assertSameBreaks("000", 0, 0, expected, actual);
}
 
Developer ID: justor, Project: elasticsearch_my, Lines of code: 9, Source: CustomSeparatorBreakIteratorTests.java

Example 7: testSliceStart

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.Locale;
public void testSliceStart() throws Exception {
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new CustomSeparatorBreakIterator(randomSeparator());
    assertSameBreaks("000a", 3, 1, expected, actual);
    assertSameBreaks("000ab", 3, 2, expected, actual);
    assertSameBreaks("000abc", 3, 3, expected, actual);
    assertSameBreaks("000", 3, 0, expected, actual);
}
 
Developer ID: justor, Project: elasticsearch_my, Lines of code: 9, Source: CustomSeparatorBreakIteratorTests.java

Example 8: testSliceMiddle

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.Locale;
public void testSliceMiddle() throws Exception {
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new CustomSeparatorBreakIterator(randomSeparator());
    assertSameBreaks("000a000", 3, 1, expected, actual);
    assertSameBreaks("000ab000", 3, 2, expected, actual);
    assertSameBreaks("000abc000", 3, 3, expected, actual);
    assertSameBreaks("000000", 3, 0, expected, actual);
}
 
Developer ID: justor, Project: elasticsearch_my, Lines of code: 9, Source: CustomSeparatorBreakIteratorTests.java
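
Examples 5 to 8 rely on an `assertSameBreaks` helper that is not shown on this page. One way such a comparison could work, as a sketch only (the helpers below are hypothetical and are not the Elasticsearch test code): collect every boundary each iterator reports for the same text and compare the two lists.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

final class BreakComparisonSketch {
    /** Walks the iterator forward from first() and records every boundary. */
    static List<Integer> allBreaks(BreakIterator it, String text) {
        it.setText(text);
        List<Integer> breaks = new ArrayList<>();
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            breaks.add(b);
        }
        return breaks;
    }

    /** True when both iterators report the same forward boundaries for text. */
    static boolean sameBreaks(String text, BreakIterator expected, BreakIterator actual) {
        return allBreaks(expected, text).equals(allBreaks(actual, text));
    }
}
```

This checks forward iteration only; a fuller test would presumably also exercise `previous()`, `following()` and `preceding()`.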

Example 9: useSentenceIterator

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.Locale;
public void useSentenceIterator(String source) {
    BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
    iterator.setText(source);
    int start = iterator.first();
    for (int end = iterator.next();
         end != BreakIterator.DONE;
         start = end, end = iterator.next()) {
        System.out.println(source.substring(start, end));
    }
}
 
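Developer ID: PacktPublishing, Project: Java-Data-Science-Cookbook, Lines of code: 11, Source: SentenceDetection.java

Example 10: DocLocale

import java.text.BreakIterator; // import the package/class this method depends on
import java.text.Collator;
import java.util.Locale;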
/**
 * Constructor
 */
DocLocale(DocEnv docenv, String localeName, boolean useBreakIterator) {
    this.docenv = docenv;
    this.localeName = localeName;
    this.useBreakIterator = useBreakIterator;
    locale = getLocale();
    if (locale == null) {
        docenv.exit();
    } else {
        Locale.setDefault(locale); // NOTE: updating global state
    }
    collator = Collator.getInstance(locale);
    sentenceBreaker = BreakIterator.getSentenceInstance(locale);
}
 
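Developer ID: SunburstApps, Project: OpenJSharp, Lines of code: 17, Source: DocLocale.java

Example 11: getSegmentAt

import java.text.BreakIterator; // import the package/class this method depends on
import javax.accessibility.AccessibleText;
import javax.swing.text.BadLocationException;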
/**
 * Returns the Segment at <code>index</code> representing either
 * the word or sentence as identified by <code>part</code>, or
 * null if a valid word/sentence can't be found. The offset
 * will point to the start of the word/sentence in the array, and
 * the modelOffset will point to the location of the word/sentence
 * in the model.
 */
private IndexedSegment getSegmentAt(int part, int index)
    throws BadLocationException {

    IndexedSegment seg = getParagraphElementText(index);
    if (seg == null) {
        return null;
    }
    BreakIterator iterator;
    switch (part) {
    case AccessibleText.WORD:
        iterator = BreakIterator.getWordInstance(getLocale());
        break;
    case AccessibleText.SENTENCE:
        iterator = BreakIterator.getSentenceInstance(getLocale());
        break;
    default:
        return null;
    }
    seg.first();
    iterator.setText(seg);
    int end = iterator.following(index - seg.modelOffset + seg.offset);
    if (end == BreakIterator.DONE) {
        return null;
    }
    if (end > seg.offset + seg.count) {
        return null;
    }
    int begin = iterator.previous();
    if (begin == BreakIterator.DONE ||
        begin >= seg.offset + seg.count) {
        return null;
    }
    seg.modelOffset = seg.modelOffset + begin - seg.offset;
    seg.offset = begin;
    seg.count = end - begin;
    return seg;
}
 
Developer ID: AdoptOpenJDK, Project: openjdk-jdk10, Lines of code: 46, Source: AccessibleHTML.java
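
The `following`/`previous` pair used in Example 11 is a common way to find the segment that contains a given offset. A minimal sketch on a plain `String` follows; the class and method names are hypothetical, and returning null for out-of-range offsets is a choice made for this sketch.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class SentenceAtOffsetSketch {
    /** Returns the sentence containing offset, or null if offset is outside text. */
    static String sentenceAt(String text, int offset, Locale locale) {
        if (offset < 0 || offset >= text.length()) {
            return null; // following() would reject offsets outside the text
        }
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        int end = it.following(offset); // first boundary strictly after offset
        if (end == BreakIterator.DONE) {
            return null;
        }
        int begin = it.previous();      // boundary immediately before 'end'
        if (begin == BreakIterator.DONE) {
            return null;
        }
        return text.substring(begin, end);
    }
}
```

Note that a sentence returned this way keeps its trailing whitespace, because the boundary falls at the start of the next sentence.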

Example 12: BreakIteratorTest

import java.text.BreakIterator; // import the package/class this method depends on
public BreakIteratorTest()
{
    characterBreak = BreakIterator.getCharacterInstance();
    wordBreak = BreakIterator.getWordInstance();
    lineBreak = BreakIterator.getLineInstance();
    sentenceBreak = BreakIterator.getSentenceInstance();
}
 
Developer ID: AdoptOpenJDK, Project: openjdk-jdk10, Lines of code: 8, Source: BreakIteratorTest.java
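
The four factory methods called in Example 12 return iterators that segment the same text at different granularities. A small sketch, with a hypothetical class and helper name, that cuts a string into the pieces each iterator delimits. It passes `Locale.US` explicitly so the result does not depend on the JVM's default locale.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class GranularitySketch {
    /** Cuts text into the pieces delimited by the iterator's boundaries. */
    static List<String> segments(BreakIterator it, String text) {
        it.setText(text);
        List<String> parts = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            parts.add(text.substring(start, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        String text = "Hi there.";
        Locale locale = Locale.US;
        System.out.println(segments(BreakIterator.getCharacterInstance(locale), text));
        System.out.println(segments(BreakIterator.getWordInstance(locale), text));
        System.out.println(segments(BreakIterator.getLineInstance(locale), text));
        System.out.println(segments(BreakIterator.getSentenceInstance(locale), text));
    }
}
```

The word iterator treats whitespace and punctuation as segments of their own, which is why word-based code usually filters the pieces afterwards.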

Example 13: BreakIteratorSentenceSplitter

import java.text.BreakIterator; // import the package/class this method depends on
/**
 * Constructor for the default locale.
 */
public BreakIteratorSentenceSplitter() {
    boundary = BreakIterator.getSentenceInstance();
}
 
Developer ID: takun2s, Project: smile_1.5.0_java7, Lines of code: 7, Source: BreakIteratorSentenceSplitter.java
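
Example 13 shows only the constructor. The sketch below suggests how such a splitter might be completed, assuming a `split` method that trims each sentence; the class is hypothetical and is not the smile implementation. The no-argument `getSentenceInstance()` uses the JVM's default locale, while the second constructor takes a locale explicitly.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SimpleSentenceSplitter {
    private final BreakIterator boundary;

    /** Uses the default locale, like the constructor in Example 13. */
    SimpleSentenceSplitter() {
        this.boundary = BreakIterator.getSentenceInstance();
    }

    /** Uses an explicit locale (an addition made for this sketch). */
    SimpleSentenceSplitter(Locale locale) {
        this.boundary = BreakIterator.getSentenceInstance(locale);
    }

    /** Splits text into trimmed, non-empty sentences. */
    String[] split(String text) {
        boundary.setText(text); // the same iterator is reused for every call
        List<String> sentences = new ArrayList<>();
        int start = boundary.first();
        for (int end = boundary.next(); end != BreakIterator.DONE; start = end, end = boundary.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences.toArray(new String[0]);
    }
}
```

Because the iterator holds the current text and position as state, an instance of this class should not be shared between threads without synchronization.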

Example 14: testFirstPosition

import java.text.BreakIterator; // import the package/class this method depends on
import java.util.Locale;
/** the current position must be ignored, initial position is always first() */
public void testFirstPosition() throws Exception {
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new CustomSeparatorBreakIterator(randomSeparator());
    assertSameBreaks("000ab000", 3, 2, 4, expected, actual);
}
 
Developer ID: justor, Project: elasticsearch_my, Lines of code: 7, Source: CustomSeparatorBreakIteratorTests.java

Example 15: init

import java.text.BreakIterator; // import the package/class this method depends on
/**
 * Initializes the sentenceIterator
 */
protected void init() {
  sentenceIterator = BreakIterator.getSentenceInstance();
  sentenceIterator.setText(text);
}
 
Developer ID: Thecarisma, Project: powertext, Lines of code: 8, Source: AbstractWordFinder.java


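Note: The java.text.BreakIterator.getSentenceInstance examples in this article were compiled by 純淨天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The snippets were selected from open-source projects contributed by their developers, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.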