This page collects typical usage examples of the PHP method TYPO3\CMS\Core\Charset\CharsetConverter::utf8_strlen. If you are wondering how CharsetConverter::utf8_strlen is used in practice, the selected code examples here may help. You can also look at further usage examples of its containing class, TYPO3\CMS\Core\Charset\CharsetConverter.
The 2 code examples of CharsetConverter::utf8_strlen shown below are sorted by popularity by default.
Example 1: addWords
/**
 * Add word to word-array
 * This function should be used to make sure CJK sequences are split up in the right way
 *
 * @param array $words Array of accumulated words
 * @param string $wordString Complete input string from which to extract the word
 * @param int $start Start position of the word in the input string
 * @param int $len The length of the word string from the start position
 * @return void
 * @todo Define visibility
 */
public function addWords(&$words, &$wordString, $start, $len)
{
    // Get word out of string:
    $theWord = substr($wordString, $start, $len);
    // Get next chars unicode number and find type:
    $bc = 0;
    $cp = $this->utf8_ord($theWord, $bc);
    list($cType) = $this->charType($cp);
    // If string is a CJK sequence we follow this algorithm:
    /*
    DESCRIPTION OF (CJK) ALGORITHM

    Continuous letters and numbers make up words. Spaces and symbols
    separate letters and numbers into words. This is sufficient for
    all western text.

    CJK doesn't use spaces or separators to separate words, so the only
    way to really find out what constitutes a word would be to have a
    dictionary and advanced heuristics. Instead, we form pairs from
    consecutive characters, in such a way that searches will find only
    characters that appear more-or-less the right sequence. For example:

    ABCDE => AB BC CD DE

    This works okay since both the index and the search query is split
    in the same manner, and since the set of characters is huge so the
    extra matches are not significant.

    (Hint taken from ZOPEs chinese user group)

    [Kasper: As far as I can see this will only work well with or-searches!]
    */
    if ($cType == 'cjk') {
        // Find total string length:
        $strlen = $this->csObj->utf8_strlen($theWord);
        // Traverse string length and add words as pairs of two chars:
        for ($a = 0; $a < $strlen; $a++) {
            if ($strlen == 1 || $a < $strlen - 1) {
                $words[] = $this->csObj->utf8_substr($theWord, $a, 2);
            }
        }
    } else {
        // Normal "single-byte" chars:
        // Remove chars:
        foreach ($this->lexerConf['removeChars'] as $skipJoin) {
            $theWord = str_replace($this->csObj->UnumberToChar($skipJoin), '', $theWord);
        }
        // Add word:
        $words[] = $theWord;
    }
}
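The pair-splitting step described in the comment of Example 1 can be tried outside TYPO3. The following is a minimal sketch, not the TYPO3 implementation: the function name `cjkBigrams` is hypothetical, and it assumes the mbstring extension is available, using `mb_strlen` and `mb_substr` in place of `utf8_strlen` and `utf8_substr`.

```php
<?php
/**
 * Hypothetical standalone helper (not part of TYPO3): split a UTF-8 string
 * into overlapping pairs of characters, e.g. ABCDE => AB BC CD DE.
 */
function cjkBigrams(string $word): array
{
    // Count characters, not bytes.
    $length = mb_strlen($word, 'UTF-8');

    // A single character cannot form a pair, so it is kept as-is.
    if ($length === 1) {
        return [$word];
    }

    $pairs = [];
    // Stop one character before the end so every entry has two characters.
    for ($i = 0; $i < $length - 1; $i++) {
        $pairs[] = mb_substr($word, $i, 2, 'UTF-8');
    }
    return $pairs;
}

print_r(cjkBigrams('ABCDE'));
```

As in Example 1, an empty string yields no entries and a one-character string yields that character alone. Because the index and the search query are split the same way, a query only needs to share these pairs with the indexed text to match.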
Example 2: utf8_strlenForNonEmptyAsciiOnlyStringReturnsNumberOfCharacters
/**
 * @test
 */
public function utf8_strlenForNonEmptyAsciiOnlyStringReturnsNumberOfCharacters()
{
    $this->assertEquals(10, $this->subject->utf8_strlen('good omens'));
}
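The test in Example 2 only covers ASCII input, where the character count and the byte count are the same. The sketch below shows how the two differ for multi-byte text. It is an illustration only and is not taken from TYPO3: the function name `countUtf8Chars` is hypothetical, and it assumes the input is valid UTF-8, counting every byte that is not a continuation byte (bit pattern `10xxxxxx`).

```php
<?php
/**
 * Hypothetical helper (not TYPO3's implementation): count the characters
 * in a valid UTF-8 string by skipping continuation bytes.
 */
function countUtf8Chars(string $str): int
{
    $count = 0;
    $byteLength = strlen($str); // strlen() counts bytes

    for ($i = 0; $i < $byteLength; $i++) {
        // Continuation bytes look like 10xxxxxx; every other byte starts a character.
        if ((ord($str[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

echo countUtf8Chars('good omens'), "\n"; // 10 characters, 10 bytes
echo strlen('日本語'), "\n";             // 9 bytes
echo countUtf8Chars('日本語'), "\n";     // 3 characters
```

This is why Example 1 measures the CJK word with a character-aware length function rather than `strlen`: pairing by byte offsets would cut multi-byte characters in half.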