PHP CharsetConverter::utf8_strlen Method Code Examples

This article collects typical usage examples of the TYPO3\CMS\Core\Charset\CharsetConverter::utf8_strlen method in PHP. If you are wondering how CharsetConverter::utf8_strlen is used in practice, the selected examples below may help. You can also look further into usage examples of the class that contains the method, TYPO3\CMS\Core\Charset\CharsetConverter.


Two code examples of the CharsetConverter::utf8_strlen method are shown below, sorted by popularity by default.

Example 1: addWords

 /**
  * Add word to word-array
  * This function should be used to make sure CJK sequences are split up in the right way
  *
  * @param array $words Array of accumulated words (passed by reference)
  * @param string $wordString Complete input string from which to extract the word
  * @param int $start Start position of the word in the input string (byte offset, as used by substr)
  * @param int $len Length of the word string from the start position (in bytes)
  * @return void
  */
 public function addWords(&$words, &$wordString, $start, $len)
 {
     // Get word out of string:
     $theWord = substr($wordString, $start, $len);
     // Get next chars unicode number and find type:
     $bc = 0;
     $cp = $this->utf8_ord($theWord, $bc);
     list($cType) = $this->charType($cp);
     // If string is a CJK sequence we follow this algorithm:
     /*
     		DESCRIPTION OF (CJK) ALGORITHM

     		Continuous letters and numbers make up words. Spaces and symbols
     		separate letters and numbers into words. This is sufficient for
     		all western text.

     		CJK doesn't use spaces or separators to separate words, so the only
     		way to really find out what constitutes a word would be to have a
     		dictionary and advanced heuristics. Instead, we form pairs from
     		consecutive characters, in such a way that searches will find only
     		characters that appear more-or-less the right sequence. For example:

     		ABCDE => AB BC CD DE

     		This works okay since both the index and the search query is split
     		in the same manner, and since the set of characters is huge so the
     		extra matches are not significant.

     		(Hint taken from ZOPEs chinese user group)

     		[Kasper: As far as I can see this will only work well with or-searches!]
     */
     if ($cType == 'cjk') {
         // Find total string length:
         $strlen = $this->csObj->utf8_strlen($theWord);
         // Traverse string length and add words as pairs of two chars:
         for ($a = 0; $a < $strlen; $a++) {
             if ($strlen == 1 || $a < $strlen - 1) {
                 $words[] = $this->csObj->utf8_substr($theWord, $a, 2);
             }
         }
     } else {
         // Normal "single-byte" chars:
         // Remove chars:
         foreach ($this->lexerConf['removeChars'] as $skipJoin) {
             $theWord = str_replace($this->csObj->UnumberToChar($skipJoin), '', $theWord);
         }
         // Add word:
         $words[] = $theWord;
     }
 }
Developer: khanhdeux, project: typo3test, lines of code: 50, source file: Lexer.php
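
The pairing rule described in the comment of Example 1 (ABCDE => AB BC CD DE) can be tried out without TYPO3. The following is a minimal sketch, not the TYPO3 implementation: the function name cjkBigrams is hypothetical, and it splits the string into characters with preg_split and the /u modifier instead of calling CharsetConverter::utf8_strlen and utf8_substr. It assumes valid UTF-8 input.

```php
<?php
// Hypothetical standalone helper, not part of TYPO3.
// Splits a UTF-8 string into overlapping pairs of consecutive characters,
// mirroring the loop in addWords(): a single character is returned as-is.
function cjkBigrams(string $text): array
{
    // Split into individual UTF-8 characters; preg_split returns false
    // if the input is not valid UTF-8.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return [];
    }

    $count = count($chars);
    if ($count === 1) {
        // A lone character still becomes one "word".
        return [$chars[0]];
    }

    $pairs = [];
    for ($i = 0; $i < $count - 1; $i++) {
        $pairs[] = $chars[$i] . $chars[$i + 1];
    }
    return $pairs;
}

echo implode(' ', cjkBigrams('ABCDE')), "\n"; // AB BC CD DE
```

Because the index and the search query would both be split with the same function, a query such as 京都 matches any indexed text that contains those two characters next to each other.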

Example 2: utf8_strlenForNonEmptyAsciiOnlyStringReturnsNumberOfCharacters

 /**
  * @test
  */
 public function utf8_strlenForNonEmptyAsciiOnlyStringReturnsNumberOfCharacters()
 {
     $this->assertEquals(10, $this->subject->utf8_strlen('good omens'));
 }
Developer: plan2net, project: TYPO3.CMS, lines of code: 7, source file: CharsetConverterTest.php
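
The test in Example 2 only covers an ASCII string, where the number of bytes and the number of characters are the same. The sketch below illustrates why a UTF-8-aware length function matters for multi-byte text. It is an illustration only: utf8CharCount is a hypothetical helper, not the code inside CharsetConverter, and it assumes the input is valid UTF-8. It counts every byte that is not a continuation byte (bit pattern 10xxxxxx), which gives one count per character.

```php
<?php
// Hypothetical helper, not TYPO3's implementation.
// Counts characters in a valid UTF-8 string by skipping continuation bytes.
function utf8CharCount(string $text): int
{
    $count = 0;
    $byteLength = strlen($text); // strlen() counts bytes, not characters
    for ($i = 0; $i < $byteLength; $i++) {
        // Continuation bytes have the top two bits set to 10.
        if ((ord($text[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

echo strlen('good omens'), ' ', utf8CharCount('good omens'), "\n"; // 10 10
echo strlen('東京'), ' ', utf8CharCount('東京'), "\n";             // 6 2
```

For the CJK branch in Example 1, a byte-based length would make the loop run three times per character, which is why that code asks for the character count instead.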


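Note: The TYPO3\CMS\Core\Charset\CharsetConverter::utf8_strlen examples in this article were collected by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code remains with the original authors. Consult each project's License before distributing or using the code. Do not reproduce without permission.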