

The behaviour of these functions is affected by settings in php.ini.

mbstring configuration options

Name                                 Default                            Changeable      Changelog
mbstring.language                    "neutral"                          PHP_INI_ALL     PHP_INI_PERDIR in PHP <= 5.2.6.
mbstring.detect_order                NULL                               PHP_INI_ALL
mbstring.http_input                  "pass"                             PHP_INI_ALL
mbstring.http_output                 "pass"                             PHP_INI_ALL
mbstring.internal_encoding           NULL                               PHP_INI_ALL
mbstring.script_encoding             NULL                               PHP_INI_ALL     Removed in PHP 5.4.0; use zend.script_encoding instead.
mbstring.substitute_character        NULL                               PHP_INI_ALL
mbstring.func_overload               "0"                                PHP_INI_SYSTEM  PHP_INI_PERDIR in PHP <= 5.2.6. Deprecated as of PHP 7.2.0; removed as of PHP 8.0.0.
mbstring.encoding_translation        "0"                                PHP_INI_PERDIR
mbstring.http_output_conv_mimetypes  "^(text/|application/xhtml\+xml)"  PHP_INI_ALL     Available as of PHP 5.3.0.
mbstring.strict_detection            "0"                                PHP_INI_ALL     Available as of PHP 5.1.2.

For further details and definitions of the PHP_INI_* modes, see Where a configuration setting may be set.


mbstring.language string

The default national language setting (NLS) used in mbstring. Note that this option automatically defines mbstring.internal_encoding, so mbstring.internal_encoding should be placed after mbstring.language in php.ini.

mbstring.encoding_translation bool

Enables the transparent character encoding filter for incoming HTTP queries, which detects the input encoding and converts it to the internal character encoding.

mbstring.internal_encoding string

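This feature has been DEPRECATED as of PHP 5.6.0. Relying on this feature is highly discouraged.

Defines the default internal character encoding.

Users of PHP 5.6 and later should leave this empty and set default_charset instead.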
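
As a rough illustration of the advice above, the following sketch prints the two values a script would see and then pins the internal encoding explicitly. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8; the first line of output depends on the local php.ini, so no particular value is claimed for it.

```php
<?php
// Both values depend on the local php.ini, so they are only printed here.
$defaultCharset   = ini_get('default_charset');
$internalEncoding = mb_internal_encoding();
printf("default_charset=%s, internal encoding=%s\n", $defaultCharset, $internalEncoding);

// A script can also set the internal encoding explicitly at run time.
mb_internal_encoding('UTF-8');

// Assumes this file is saved as UTF-8: three characters, nine bytes.
$text      = '日本語';
$charCount = mb_strlen($text); // uses the internal encoding set above
$byteCount = strlen($text);    // always counts bytes
printf("%d characters, %d bytes\n", $charCount, $byteCount);
```

The gap between the two counts is why the internal encoding matters: mb_ functions interpret the same bytes differently depending on it.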

mbstring.http_input string

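This feature has been DEPRECATED as of PHP 5.6.0. Relying on this feature is highly discouraged.

Defines the default HTTP input character encoding.

Users of PHP 5.6 and later should leave this empty and set default_charset instead.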

mbstring.http_output string

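This feature has been DEPRECATED as of PHP 5.6.0. Relying on this feature is highly discouraged.

Defines the default HTTP output character encoding.

Users of PHP 5.6 and later should leave this empty and set default_charset instead.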

mbstring.detect_order string

Defines the default character encoding detection order. See also mb_detect_order().
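
The run-time counterpart of this directive can be sketched as follows. This is a minimal example, assuming the mbstring extension is loaded; the order and sample strings are chosen purely for illustration.

```php
<?php
// Set the detection order at run time; this mirrors mbstring.detect_order.
mb_detect_order(['ASCII', 'UTF-8']);

$plain    = 'cafe';
$accented = "caf\u{E9}"; // U+00E9 (e with acute accent) encoded as UTF-8

// Pass the configured order explicitly and enable strict detection.
$plainEncoding    = mb_detect_encoding($plain, mb_detect_order(), true);
$accentedEncoding = mb_detect_encoding($accented, mb_detect_order(), true);

echo $plainEncoding, "\n";    // ASCII
echo $accentedEncoding, "\n"; // UTF-8
```

The second string contains bytes outside the ASCII range, so detection falls through to the next encoding in the list.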

mbstring.substitute_character string

Defines the character to substitute for characters with an invalid encoding. See mb_substitute_character() for supported values.
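
The effect of the different values can be sketched at run time with mb_substitute_character(). The example assumes the mbstring extension is loaded and converts a string containing one character that ASCII cannot represent; the exact text produced by the "long" mode is left unstated because it is printed rather than assumed.

```php
<?php
// "\u{E9}" is U+00E9 encoded as UTF-8; it has no ASCII equivalent.
$input = "abc\u{E9}";

// Substitute with a specific code point: 0x3F is "?".
mb_substitute_character(0x3F);
$withQuestionMark = mb_convert_encoding($input, 'ASCII', 'UTF-8');

// "none" drops characters that cannot be converted.
mb_substitute_character('none');
$dropped = mb_convert_encoding($input, 'ASCII', 'UTF-8');

// "long" emits a textual form of the code point instead.
mb_substitute_character('long');
$verbose = mb_convert_encoding($input, 'ASCII', 'UTF-8');

var_dump($withQuestionMark, $dropped, $verbose);
```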

mbstring.func_overload string

This feature has been DEPRECATED as of PHP 7.2.0, and REMOVED as of PHP 8.0.0. Relying on this feature is highly discouraged.

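Overloads a set of single-byte functions with their mbstring counterparts. See Function overloading for more information.

This setting can only be changed from the php.ini file.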

mbstring.http_output_conv_mimetypes string

mbstring.strict_detection bool


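According to the HTML 4.01 specification, Web browsers are allowed to submit a form in a character encoding different from the one used for the page. See mb_http_input() to detect the character encoding used by browsers.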

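Although popular browsers can make a reasonably accurate guess at the character encoding of a given HTML document, it is better to set the charset parameter explicitly, either in the Content-Type HTTP header via header() or through the default_charset ini setting.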
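
Setting the charset parameter through header() might look like the sketch below. The helper contentTypeHeader() is hypothetical and exists only for this illustration; header() and headers_sent() are the standard PHP functions.

```php
<?php
/**
 * Hypothetical helper: builds a Content-Type header line with a charset parameter.
 */
function contentTypeHeader(string $mimeType, string $charset): string
{
    return sprintf('Content-Type: %s; charset=%s', $mimeType, $charset);
}

$line = contentTypeHeader('text/html', 'UTF-8');

// header() must run before any output is sent, so guard the call.
if (!headers_sent()) {
    header($line);
}
```

Sending the header explicitly overrides whatever default_charset would otherwise contribute to the Content-Type header for that response.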

Example #1 php.ini setting examples

;; Set default language
mbstring.language        = Neutral; Set default language to Neutral (UTF-8) (default)
mbstring.language        = English; Set default language to English
mbstring.language        = Japanese; Set default language to Japanese

;; Set default internal encoding
;; Note: Make sure to use a character encoding that PHP can handle
mbstring.internal_encoding    = UTF-8  ; Set internal encoding to UTF-8

;; Enable HTTP input encoding translation
mbstring.encoding_translation = On

;; Set default HTTP input character encoding
;; Note: Scripts cannot change the http_input setting
mbstring.http_input           = pass    ; No conversion
mbstring.http_input           = auto    ; Set HTTP input to auto
                                ; "auto" is expanded according to mbstring.language
mbstring.http_input           = SJIS    ; Set HTTP input encoding to SJIS
mbstring.http_input           = UTF-8,SJIS,EUC-JP ; Specify order

;; Set default HTTP output character encoding
mbstring.http_output          = pass    ; No conversion
mbstring.http_output          = UTF-8   ; Set HTTP output encoding to UTF-8

;; Set default character encoding detection order
mbstring.detect_order         = auto    ; Set detect order to auto
mbstring.detect_order         = ASCII,JIS,UTF-8,SJIS,EUC-JP ; Specify order

;; Set default substitute character
mbstring.substitute_character = 12307   ; Specify Unicode value
mbstring.substitute_character = none    ; Do not print character
mbstring.substitute_character = long    ; Long example: U+3000, JIS+7E7E

Example #2 php.ini setting for EUC-JP users

;; Disable output buffering
output_buffering      = Off

;; Set HTTP header charset
default_charset       = EUC-JP

;; Set default language to Japanese
mbstring.language = Japanese

;; Enable HTTP input encoding translation
mbstring.encoding_translation = On

;; Set HTTP input encoding conversion to auto
mbstring.http_input   = auto

;; Convert HTTP output to EUC-JP
mbstring.http_output  = EUC-JP

;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP

;; Do not print invalid characters
mbstring.substitute_character = none

Example #3 php.ini setting for SJIS users

;; Enable output buffering
output_buffering     = On

;; Set mb_output_handler to enable output encoding conversion
output_handler       = mb_output_handler

;; Set HTTP header charset
default_charset      = Shift_JIS

;; Set default language to Japanese
mbstring.language = Japanese

;; Set HTTP input encoding conversion to auto
mbstring.http_input  = auto

;; Convert to SJIS
mbstring.http_output = SJIS

;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP

;; Do not print invalid characters
mbstring.substitute_character = none


User Contributed Notes 3 notes

Hayley Watson
2 years ago
String literals in the PHP script are encoded with the same encoding that the PHP file was saved with. This is not affected by default_charset or other .ini settings.

Scenario: The default_charset is KOI8-R, and there is a text file "input.txt" containing the string "Это текст для поиска." in KOI8-R encoding.

A PHP script is written:

// mb_internal_encoding('KOI8-R');

$string = 'текст';

$data = file_get_contents('input.txt');

echo mb_strpos($data, $string);

But unfortunately it was saved as UTF-8.

It doesn't work; mb_strpos() returns false because it can't find the UTF-8-encoded "текст" inside the KOI8-R-encoded "Это текст для поиска.".

Adjusting the default_charset had no effect. Not even fiddling with mb_internal_encoding could fix it, simply because the strings involved had *different* encodings and without actually changing one of them they just weren't going to match.

Either re-save the source file as KOI8-R to match the data file, or re-save the data file as UTF-8 to match the source code. Only then will the script properly echo '4'.
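
The mismatch can be reproduced without touching any files. The sketch below is not part of the original note: it assumes the script itself is saved as UTF-8, simulates the KOI8-R contents of input.txt with mb_convert_encoding(), and converts the data at run time as an alternative to re-saving either file.

```php
<?php
// Assumes this file is saved as UTF-8, so both literals below are UTF-8.
$needle   = 'текст';
$original = 'Это текст для поиска.';

// Simulate the contents of input.txt by producing KOI8-R bytes.
$data = mb_convert_encoding($original, 'KOI8-R', 'UTF-8');

// Mismatched encodings: the UTF-8 needle is not found in the KOI8-R bytes.
var_dump(mb_strpos($data, $needle, 0, 'UTF-8')); // bool(false)

// Convert the data to the encoding of the source file, then search again.
$converted = mb_convert_encoding($data, 'UTF-8', 'KOI8-R');
var_dump(mb_strpos($converted, $needle, 0, 'UTF-8')); // int(4)
```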
ASchmidt at Anamera dot net
2 years ago
The documentation is vague about WHAT precisely the valid "NLS" language strings are for "mbstring.language".

According to the values are "Japanese", "ja", "English", "en", or "uni" for UTF-8.
On the other hand, the sample on this current page omits "uni" but introduces "Neutral" as an undocumented option - which is also the default value:

var_dump( mb_language() );   // "neutral" (default if not set)
var_dump( mb_language( 'uni' ) );    // TRUE, valid language string
var_dump( mb_language() );    // "uni"
var_dump( mb_language( 'neutral' ) );    // TRUE, valid language string
var_dump( mb_language() );    // "neutral"
7 years ago
Note that you should at least set "mbstring.internal_encoding".

Just check as below:


echo mb_internal_encoding() . '<br />';


You might be surprised at unexpected values.


mbstring.language Japanese
;mbstring.internal_encoding (commented out showing "no value" in phpinfo() )

These two lines in "php.ini" are the same values as


in Win / Linux servers.

"mbstring.internal_encoding" defines the default encoding for "mb_" Functions such as "mb_strlen()".

It also defines the same for "mb_ereg_" Functions such as "mb_ereg()" when you don't set "mb_regex_encoding".