

PHP CharsetConverter::UnumberToChar Method Code Examples

This article collects typical usage examples of the TYPO3\CMS\Core\Charset\CharsetConverter::UnumberToChar method in PHP. If you are wondering how CharsetConverter::UnumberToChar is used in practice, or are looking for a working example, the code selected here may help. You can also look at usage examples of the class that contains the method, TYPO3\CMS\Core\Charset\CharsetConverter.


One code example of the CharsetConverter::UnumberToChar method is shown below. Examples are sorted by popularity by default.

Example 1: addWords

 /**
  * Add word to word-array
  * This function should be used to make sure CJK sequences are split up in the right way
  *
  * @param array $words Array of accumulated words (passed by reference)
  * @param string $wordString Complete input string from which to extract the word
  * @param int $start Start position of the word in the input string
  * @param int $len Length of the word string from the start position
  * @return void
  */
 public function addWords(&$words, &$wordString, $start, $len)
 {
     // Get word out of string:
     $theWord = substr($wordString, $start, $len);
     // Get next chars unicode number and find type:
     $bc = 0;
     $cp = $this->utf8_ord($theWord, $bc);
     list($cType) = $this->charType($cp);
     // If string is a CJK sequence we follow this algorithm:
     /*
     		DESCRIPTION OF (CJK) ALGORITHM

     		Continuous letters and numbers make up words. Spaces and symbols
     		separate letters and numbers into words. This is sufficient for
     		all western text.

     		CJK doesn't use spaces or separators to separate words, so the only
     		way to really find out what constitutes a word would be to have a
     		dictionary and advanced heuristics. Instead, we form pairs from
     		consecutive characters, in such a way that searches will find only
     		characters that appear more-or-less the right sequence. For example:

     		ABCDE => AB BC CD DE

     		This works okay since both the index and the search query is split
     		in the same manner, and since the set of characters is huge so the
     		extra matches are not significant.

     		(Hint taken from ZOPEs chinese user group)

     		[Kasper: As far as I can see this will only work well with or-searches!]
     */
     if ($cType == 'cjk') {
         // Find total string length:
         $strlen = $this->csObj->utf8_strlen($theWord);
         // Traverse string length and add words as pairs of two chars:
         for ($a = 0; $a < $strlen; $a++) {
             if ($strlen == 1 || $a < $strlen - 1) {
                 $words[] = $this->csObj->utf8_substr($theWord, $a, 2);
             }
         }
     } else {
         // Normal "single-byte" chars:
         // Remove chars:
         foreach ($this->lexerConf['removeChars'] as $skipJoin) {
             $theWord = str_replace($this->csObj->UnumberToChar($skipJoin), '', $theWord);
         }
         // Add word:
         $words[] = $theWord;
     }
 }
Developer ID: khanhdeux, project: typo3test, lines of code: 50, source file: Lexer.php
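
The two branches of addWords can be tried outside TYPO3 with a minimal sketch like the one below. It assumes PHP 7.2 or later with the mbstring extension. The function names splitIntoPairs and removeCodePoints are hypothetical helpers made up for illustration, not TYPO3 API. mb_chr stands in for UnumberToChar on the assumption, based on how the example uses it, that the method turns a Unicode code point number into a UTF-8 character.

```php
<?php
declare(strict_types=1);

/**
 * Split a UTF-8 string into overlapping two-character pairs
 * (ABCDE => AB BC CD DE), mirroring the CJK branch of addWords.
 * Hypothetical helper, not part of TYPO3.
 */
function splitIntoPairs(string $text): array
{
    $length = mb_strlen($text, 'UTF-8');
    if ($length <= 1) {
        // A single character is kept as its own "word"; an empty string yields nothing.
        return $length === 1 ? [$text] : [];
    }
    $pairs = [];
    for ($i = 0; $i < $length - 1; $i++) {
        $pairs[] = mb_substr($text, $i, 2, 'UTF-8');
    }
    return $pairs;
}

/**
 * Remove every character given as a Unicode code point from a word,
 * mirroring the non-CJK branch of addWords.
 * Assumption: mb_chr() plays the role of UnumberToChar here.
 */
function removeCodePoints(string $word, array $codePoints): string
{
    foreach ($codePoints as $codePoint) {
        $char = mb_chr($codePoint, 'UTF-8');
        if ($char !== false) {
            $word = str_replace($char, '', $word);
        }
    }
    return $word;
}
```

For example, splitIntoPairs('日本語') yields the pairs '日本' and '本語', and removeCodePoints("co-op's", [0x2D, 0x27]) yields 'coops'. The code points 0x2D (hyphen) and 0x27 (apostrophe) are illustrative values only; the actual contents of lexerConf['removeChars'] are not shown in the example above.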


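Note: The TYPO3\CMS\Core\Charset\CharsetConverter::UnumberToChar examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code remains with the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.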