mb_strlen

(PHP 4 >= 4.0.6, PHP 5, PHP 7, PHP 8)

mb_strlen — Получает длину строки

Описание

function mb_strlen(string $string, ?string $encoding = null): int

Функция получает длину строки (string).

Список параметров

string: Строка (string), длина которой измеряется.
encoding: Параметр encoding указывает кодировку символов. При пропуске параметра или передаче значения null функция интерпретирует символы в предустановленной кодировке модуля.

Возвращаемые значения

Функция возвращает количество символов в строке (string) аргумента string на основе кодировки символов, которую определили в параметре encoding. Функция воспринимает многобайтовый символ как 1 символ.

Ошибки

Функция выдаёт ошибку уровня E_WARNING, если не знает кодировку.

Список изменений

Версия	Описание
8.0.0	Параметр `encoding` теперь принимает значение `null`.

Смотрите также

mb_internal_encoding() - Устанавливает или получает внутреннюю кодировку символов файла скрипта
grapheme_strlen() - Получает количество графемных кластеров в строке
iconv_strlen() - Возвращает количество символов в строке
strlen() - Получает длину строки

Нашли ошибку?

Инструкция • Исправление • Сообщение об ошибке

＋Добавить

Примечания пользователей 5 notes

down

Yzmir Ramirez ¶

15 years ago

If you are unsure about what $encoding can be set to, here's a full list of all the encodings supported by this extension:

http://www.php.net/manual/en/mbstring.supported-encodings.php

down

drake127 ¶

18 years ago

Speed of mb_strlen varies a lot according to specified character set.

If you need length of string in bytes (strlen cannot be trusted anymore because of mbstring.func_overload) you should use <?php mb_strlen($string, '8bit'); ?>.
It's the fastest way (still a way slower than strlen, though) to determine byte length of string. Other single byte character sets (ASCII, ISO-8859-1, ...) are several times slower than 8bit.

down

koala at example dot com ¶

19 years ago

Just did a little benchmarking (1.000.000 times with lorem ipsum text) on the mbs functions

especially mb_strtolower and mb_strtoupper are really slow (up to 100 times slower compared to normal functions). Other functions are alike-ish, but sometimes up to 5 times slower.

just be cautious when using mb_ functions in high frequented scripts.

# test runs: 1000000
# benchmarking strlen vs. mb_strlen
# normal strlen: 3.6795361042023 ms, average: 3.6795361042023E-6 ms
# mb_strlen: 5.5934538841248 ms, average: 5.5934538841248E-6 ms
ok 1 - mb_strlen is slower than strlen
# mb_strlen is 1.52 slower than strlen
#
#
# benchmarking strpos vs. mb_strpos
# normal strpos: 5.5523281097412 ms, average: 5.5523281097412E-6 ms
# mb_strlen: 31.180974960327 ms, average: 3.1180974960327E-5 ms
ok 2 - mb_strlen is slower than strlen
# mb_strpos is 5.62 slower than strpos
#
#
# benchmarking substr vs. mb_substr
# normal substr: 3.4437320232391 ms, average: 3.4437320232391E-6 ms
# mb_strlen: 3.5374181270599 ms, average: 3.5374181270599E-6 ms
ok 3 - mb_strlen is slower than strlen
# mb_substr is 1.03 slower than substr
#
#
# benchmarking strtolower vs. mb_strtolower
# normal strtolower: 4.446839094162 ms, average: 4.446839094162E-6 ms
# mb_strlen: 193.44901108742 ms, average: 0.00019344901108742 ms
ok 4 - mb_strlen is slower than strlen
# mb_strtolower is 43.5 slower than strtolower
#
#
# benchmarking strtoupper vs. mb_strtoupper
# normal strtoupper: 3.0210740566254 ms, average: 3.0210740566254E-6 ms
# mb_strlen: 340.71775603294 ms, average: 0.00034071775603294 ms
ok 5 - mb_strlen is slower than strlen
# mb_strtoupper is 112.78 slower than strtoupper

down

Ben ¶

17 years ago

If you find yourself without the mb string functions and can't easily change it, a quick hack replacement for mb_strlen for utf8 characters is to use a a PCRE regex with utf8 turned on.

$strlen = preg_match_all("/.{1}/us",$utf8string,$dummy);

This is basically an ugly hack which counts all single character matches, and I'd expect it to be painfully slow on large strings.

down

-2

David Spector ¶

6 years ago

It may not be clear whether PHP actually supports utf-8, which is the current de facto standard character encoding for Web documents, which supports most human languages. The good news is: it does.

I wrote a test program which successfully reads in a utf-8 file (without BOM) and manipulates the characters using mb_substr, mb_strlen, and mb_strpos (mb_substr should normally be avoided, as it must always start its search at character position 0).

The results with a variety of Unicode test characters in utf-8 encoding, up to four bytes in length, were mostly correct, except that accent marks were always mistakenly treated as separate characters instead of being combined with the previous character; this problem can be worked around by programming, when necessary.

＋Добавить