mb_strpos() used in a loop on a long string may become very slow even if you provide the $offset. Unlike strpos(), mb_strpos() has to skip the number of characters
every call specified by $offset to get the real byte position used internally. (Whereas strpos can just add the offset.)
If your encoding is UTF-8 and you try to find only single characters with ordinal <= 127 you may still use strpos(), substr(), ... This works cause every byte of a UTF-8 sequence is >= 128.
Greetz maz
mb_strpos
(PHP 4 >= 4.0.6, PHP 5)
mb_strpos — Find position of first occurrence of string in a string
Descrizione
Finds position of the first occurrence of a string in a string.
Performs a multi-byte safe strpos() operation based on number of characters. The first character's position is 0, the second character position is 1, and so on.
Elenco dei parametri
- haystack
-
The string being checked.
- needle
-
The position counted from the beginning of haystack .
- offset
-
The search offset. If it is not specified, 0 is used.
- encoding
-
Il parametro encoding è la codifica dei caratteri. Se è omesso, verrà utilizzata la codifica interna.
Valori restituiti
Returns the numeric position of the first occurrence of needle in the haystack string. If needle is not found, it returns FALSE.
mb_strpos
11-Mar-2008 05:42
05-Jul-2007 10:42
Hello,
Just replaced strpos() with mb_strpos() and now I am getting following error:
PHP Warning: mb_strpos() [<a href='function.mb-strpos'>function.mb-strpos</a>]: Empty delimiter
PHP version: 5.2.3
OS: Win XP Prof
Web Server: IIS
I checked your bugs and mentioned that mb_string functions have been fixed as of 5.2.0 but it does not seem to be the case (Bug #39400).
My code:
==============================================
$charOut = mb_substr($tmpStr, $tmpKey[0], 1);
$posOut = mb_strpos($charList, $charOut);
if ($posOut !== FALSE) {
// do something here
}
==============================================
05-Aug-2006 05:12
sorry, my previous post had an error. replace the 1000 with strlen($haystack) to handle strings longer than 1000 chars.
btw. This is an issue with the mbstring functions. you can't specify the $encoding without specifying a $length, thus this reduces the functionality of mb_substr compared to substr
04-Aug-2006 09:42
a sample mb_str_replace function:
function mb_str_replace($haystack, $search,$replace, $offset=0,$encoding='auto'){
$len_sch=mb_strlen($search,$encoding);
$len_rep=mb_strlen($replace,$encoding);
while (($offset=mb_strpos($haystack,$search,$offset,$encoding))!==false){
$haystack=mb_substr($haystack,0,$offset,$encoding)
.$replace
.mb_substr($haystack,$offset+$len_sch,1000,$encoding);
$offset=$offset+$len_rep;
if ($offset>mb_strlen($haystack,$encoding))break;
}
return $haystack;
}
04-Aug-2006 09:39
It appears that the $offset value is a character count not a byte count. (This may seem obvious but it isn't explicitly stated)
