mb_convert_case

(PHP 4 >= 4.3.0, PHP 5, PHP 7, PHP 8)

mb_convert_case - Changes the case of characters in a string

Description

function mb_convert_case(string $string, int $mode, ?string $encoding = null): string

The function converts the case of the characters in a string in the way specified by the mode parameter.

Parameters

string: The string to convert.
mode: The conversion mode. It accepts one of the following constants: MB_CASE_UPPER, MB_CASE_LOWER, MB_CASE_TITLE, MB_CASE_FOLD, MB_CASE_UPPER_SIMPLE, MB_CASE_LOWER_SIMPLE, MB_CASE_TITLE_SIMPLE, or MB_CASE_FOLD_SIMPLE.
encoding: The character encoding. If the parameter is omitted or null, the function interprets the characters in the internal character encoding of the mbstring module.
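
As a brief illustration of how the full and simple modes differ, the sketch below converts a string containing the German letter ß. It assumes PHP 7.3 or later (where MB_CASE_FOLD and the *_SIMPLE modes are available) and UTF-8 input; the sample string and variable names are illustrative choices, not part of the manual.

```php
<?php

// Illustrative input (an assumption, not from the manual): "ß" has a
// multi-character full uppercase mapping ("SS") but no simple one.
$word = "Straße";

// Full mappings may change the length of the string.
$upperFull = mb_convert_case($word, MB_CASE_UPPER, "UTF-8");
// Simple mappings always map one character to exactly one character.
$upperSimple = mb_convert_case($word, MB_CASE_UPPER_SIMPLE, "UTF-8");
// Case folding is intended for caseless comparison rather than display.
$foldFull = mb_convert_case($word, MB_CASE_FOLD, "UTF-8");
$foldSimple = mb_convert_case($word, MB_CASE_FOLD_SIMPLE, "UTF-8");

echo $upperFull, PHP_EOL;   // STRASSE
echo $upperSimple, PHP_EOL; // STRAßE
echo $foldFull, PHP_EOL;    // strasse
echo $foldSimple, PHP_EOL;  // straße
```

The folded forms are usually the better choice when two strings only need to be compared without regard to case, since the result is not meant to be shown to users.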

Return Values

Returns the string, converted in the way specified by the mode parameter.

Changelog

Version	Description
8.3.0	Implemented conditional casing rules for the Greek letter sigma. They apply only to the `MB_CASE_LOWER` and `MB_CASE_TITLE` modes, not to `MB_CASE_LOWER_SIMPLE` and `MB_CASE_TITLE_SIMPLE`.
7.3.0	Added support for the following values of the `mode` parameter: `MB_CASE_FOLD`, `MB_CASE_UPPER_SIMPLE`, `MB_CASE_LOWER_SIMPLE`, `MB_CASE_TITLE_SIMPLE`, and `MB_CASE_FOLD_SIMPLE`.
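
The 8.3.0 entry can be sketched with a Greek word that ends in sigma. This is a minimal example under stated assumptions: PHP 7.3 or later, UTF-8 input, and a sample word chosen only for illustration. The result of the full mode depends on the PHP version, as noted in the comments.

```php
<?php

// Illustrative input (an assumption): a Greek word ending in capital sigma.
$word = "ΟΔΟΣ";

// Simple mode maps every Σ to σ, regardless of its position in the word.
$simple = mb_convert_case($word, MB_CASE_LOWER_SIMPLE, "UTF-8");
// Full mode: as of PHP 8.3.0 a word-final Σ is lowercased to ς instead.
$full = mb_convert_case($word, MB_CASE_LOWER, "UTF-8");

echo $simple, PHP_EOL; // οδοσ
echo $full, PHP_EOL;   // ends in ς on PHP 8.3.0 and later, in σ on earlier versions
```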

Examples

Example #1 Changing the case of characters in a string with mb_convert_case()

<?php

$str = "у мэри был маленький ягнёнок и она его очень любила";
$str = mb_convert_case($str, MB_CASE_UPPER, "UTF-8");
echo $str, PHP_EOL;
$str = mb_convert_case($str, MB_CASE_TITLE, "UTF-8");
echo $str, PHP_EOL;

?>

Example #2 Using mb_convert_case() to change the case of non-Latin text encoded in UTF-8

<?php

$str = "Τάχιστη αλώπηξ βαφής ψημένη γη, δρασκελίζει υπέρ νωθρού κυνός";
$str = mb_convert_case($str, MB_CASE_UPPER, "UTF-8");
echo $str, PHP_EOL;
$str = mb_convert_case($str, MB_CASE_TITLE, "UTF-8");
echo $str, PHP_EOL;

?>

Notes

Unlike the standard case-conversion functions such as strtolower() and strtoupper(), this function changes case on the basis of Unicode character properties. Its behavior is therefore unaffected by the system locale settings, and it can convert any character that has the Unicode 'alphabetic' property, such as a-umlaut (ä).

For more information about Unicode properties, see http://www.unicode.org/reports/tr21/.
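
The difference from the byte-oriented functions can be sketched as follows. The example assumes UTF-8 source text; the sample word is illustrative only, and no output is claimed for strtoupper() because its handling of non-ASCII bytes depends on the PHP version and locale.

```php
<?php

// Illustrative input (an assumption): a UTF-8 string starting with a-umlaut.
$text = "ärger";

// Unicode-aware conversion; it does not consult the system locale.
$upper = mb_convert_case($text, MB_CASE_UPPER, "UTF-8");
echo $upper, PHP_EOL; // ÄRGER

// Byte-oriented conversion: the multibyte "ä" is typically left
// unconverted, but the exact result depends on PHP version and locale.
echo strtoupper($text), PHP_EOL;
```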

See Also

mb_strtolower() - Make a string lowercase
mb_strtoupper() - Make a string uppercase
strtolower() - Make a string lowercase
strtoupper() - Make a string uppercase
ucfirst() - Make a string's first character uppercase
ucwords() - Uppercase the first character of each word in a string

User Contributed Notes 10 notes

alNzy

6 years ago

You can use this function to fix problems related to Turkish "ı", "I", "i", "İ" characters. This function also replaces the weird "i̇" character with regular "i" character ("i̇ => i").

function mb_convert_case_tr($str, $type, $encoding = "UTF-8")
{

  switch ($type) {
    case "u":
    case "upper":
    case MB_CASE_UPPER:
      $type = MB_CASE_UPPER;
      break;
    case "l":
    case "lower":
    case MB_CASE_LOWER:
      $type = MB_CASE_LOWER;
      break;
    case "t":
    case "title":
    case MB_CASE_TITLE:
      $type = MB_CASE_TITLE;
      break;
  }

  $str = str_replace("i", "İ", $str);
  $str = str_replace("I", "ı", $str);

  $str = mb_convert_case($str, $type, $encoding);
  $str = str_replace("i̇", "i", $str);

  return $str;
}

agash at freemail dot hu

17 years ago

As the previously posted version of this function doesn't handle UTF-8 characters, I simply tried replacing ucfirst with mb_convert_case, but then any previous case changes were lost while looping through the delimiters.
So I decided to apply mb_convert_case to the input string first (this also handles words in uppercase, which may be problematic when doing a case-sensitive search), and do the rest of the checking after that.

Because mb_convert_case capitalizes every word, I also added lowercase conversion for the exceptions but, for the reason mentioned above, I left ucfirst unchanged.

Now it works fine for UTF-8 strings as well, except for string delimiters followed by a UTF-8 character ("Mcádám" is unchanged, while "mcdunno's" is converted to "McDunno's" and "ökrös-TÓTH éDUa" is also put in the correct form).

I use it for checking user input on names and addresses, so the exceptions list contains some Hungarian words too.

<?php

function titleCase($string, $delimiters = array(" ", "-", ".", "'", "O'", "Mc"), $exceptions = array("út", "u", "s", "és", "utca", "tér", "krt", "körút", "sétány", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X", "XI", "XII", "XIII", "XIV", "XV", "XVI", "XVII", "XVIII", "XIX", "XX", "XXI", "XXII", "XXIII", "XXIV", "XXV", "XXVI", "XXVII", "XXVIII", "XXIX", "XXX" )) {
       /*
        * Exceptions in lower case are words you don't want converted
        * Exceptions all in upper case are any words you don't want converted to title case
        *   but should be converted to upper case, e.g.:
        *   king henry viii or king henry Viii should be King Henry VIII
        */
        $string = mb_convert_case($string, MB_CASE_TITLE, "UTF-8");

       foreach ($delimiters as $dlnr => $delimiter){
               $words = explode($delimiter, $string);
               $newwords = array();
               foreach ($words as $wordnr => $word){
               
                       if (in_array(mb_strtoupper($word, "UTF-8"), $exceptions)){
                               // check exceptions list for any words that should be in upper case
                               $word = mb_strtoupper($word, "UTF-8");
                       }
                       elseif (in_array(mb_strtolower($word, "UTF-8"), $exceptions)){
                               // check exceptions list for any words that should be in lower case
                               $word = mb_strtolower($word, "UTF-8");
                       }
                       
                       elseif (!in_array($word, $exceptions) ){
                               // convert to uppercase (non-utf8 only)
                             
                               $word = ucfirst($word);
                               
                       }
                       array_push($newwords, $word);
               }
               $string = join($delimiter, $newwords);
       }//foreach
       return $string;
} 

?>

Rasa Ravi at tantrajoga dot cz

21 years ago

For CZECH characters:
<?php
$text = mb_convert_case($text, MB_CASE_LOWER, "Windows-1251");
?>
The correct encoding, Windows-1250, is not accepted (see the list returned by mb_list_encodings()), but Windows-1251 will do the same 100%. The function strtolower() ignores Czech characters with diacritics.

info at yasarnet dot com

18 years ago

In my case, the following did the job of capitalizing a UTF-8 encoded string.

function capitalize($str, $encoding = 'UTF-8') {
    return mb_strtoupper(mb_substr($str, 0, 1, $encoding), $encoding) . mb_strtolower(mb_substr($str, 1, mb_strlen($str), $encoding), $encoding);
}

the at psychoticneurotic dot com

17 years ago

Building upon Justin's and Alex's work... 

This function allows you to specify which delimiter(s) to explode on (not just the default space). Now you can correctly capitalize Irish names and hyphenated words (if you want)!

<?php
function titleCase($string, $delimiters = array(" ", "-", "O'"), $exceptions = array("to", "a", "the", "of", "by", "and", "with", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X")) {
       /*
        * Exceptions in lower case are words you don't want converted
        * Exceptions all in upper case are any words you don't want converted to title case
        *   but should be converted to upper case, e.g.:
        *   king henry viii or king henry Viii should be King Henry VIII
        */
       foreach ($delimiters as $delimiter){
               $words = explode($delimiter, $string);
               $newwords = array();
               foreach ($words as $word){
                       if (in_array(strtoupper($word), $exceptions)){
                               // check exceptions list for any words that should be in upper case
                               $word = strtoupper($word);
                       } elseif (!in_array($word, $exceptions)){
                               // convert to uppercase
                               $word = ucfirst($word);
                       }
                       array_push($newwords, $word);
               }
               $string = join($delimiter, $newwords);
       }
       return $string;
}
?>

turabgarip at gmail dot com

2 years ago

As with other string functions, this function has a problem with the Turkish "i". There is a bug report from 2015 about the issue, but the PHP team says "language-specific conditional special case mappings is not implemented", although this actually breaks the logic of the function and renders it unusable for the purpose.

https://bugs.php.net/bug.php?id=70072

The problem arises from the letter "i" in Latin being a COMPLETELY different letter from "i" in Turkish. Turkish "ı" becomes "I" for capital; while Latin "I" capital is actually capital for "i" and not "ı".
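
To make the mismatch concrete, here is a minimal sketch of the default, language-neutral mappings that mb_convert_case() applies to UTF-8 input. The variable names are illustrative only, and the comments on what Turkish expects restate the paragraph above.

```php
<?php

// Default Unicode mappings for UTF-8 input, not Turkish rules:
$upperDotted  = mb_convert_case("i", MB_CASE_UPPER, "UTF-8"); // "I", Turkish expects "İ"
$lowerDotless = mb_convert_case("I", MB_CASE_LOWER, "UTF-8"); // "i", Turkish expects "ı"
$upperDotless = mb_convert_case("ı", MB_CASE_UPPER, "UTF-8"); // "I", same in Turkish

echo $upperDotted, $lowerDotless, $upperDotless, PHP_EOL; // IiI
```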

PHP takes this into consideration in some cases and ignores it in others, which causes unpredictable behavior. The result differs depending on whether the letter in question is in the middle or at the beginning of a word, whether multibyte characters are next to standard Latin characters or to other multibyte characters, and so on. These cases all behave differently, which is simply wrong.

Some user notes try to cover this, but not very effectively: some of them don't handle word boundaries and some produce non-standard characters. Here is what I tested and have been using for quite some time:

<?php

function mb_convert_case_i(string $string, int $mode = MB_CASE_TITLE, string $encoding = 'UTF-8'): string {
    // Turkish "i" is a special case
    $string = match($mode) {
        MB_CASE_UPPER, MB_CASE_UPPER_SIMPLE => str_replace(['i', 'ı'], ['İ', 'I'], $string),
        MB_CASE_LOWER, MB_CASE_LOWER_SIMPLE => str_replace(['İ', 'I'], ['i', 'ı'], $string),
        // PHP behaves differently when i and ı are at the beginning of the word
        MB_CASE_TITLE, MB_CASE_TITLE_SIMPLE => preg_replace(['/İ/u', '/I/u', '/\b(i)/u'], ['i', 'ı', 'İ'], $string),
        default => $string,
    };
    return mb_convert_case($string, $mode, $encoding);
}

?>

As you may have noticed, it uses the match expression, which requires PHP 8. For lower versions, you can replace it with an equivalent switch. I haven't tested it for case folding. If you need it, just add another arm to the match.

tavhane at gmail dot com

8 years ago

A simple approach for Turkish:

$str = mb_convert_case(str_replace(['i','I'], ['İ','ı'], $str), MB_CASE_TITLE,"UTF-8");

dave at wp dot pl

10 years ago

MB_CASE_TITLE doesn't change letters in quotation marks.

Example:
mb_convert_case('AAA "aaa"', MB_CASE_TITLE); 
// Result: Aaa "aaa"

Anonymous

4 years ago

$str = "Τάχιστη αλώπηξ βαφής ψημένη γη, δρασκελίζει υπέρ νωθρού κυνός";
$str = mb_convert_case($str, MB_CASE_UPPER, "UTF-8");
This conversion does not give the output shown in the manual's example, but this one instead:

$str = mb_convert_case($str, MB_CASE_UPPER, "UTF-8");
"ΤΆΧΙΣΤΗ ΑΛΏΠΗΞ ΒΑΦΉΣ ΨΗΜΈΝΗ ΓΗ, ΔΡΑΣΚΕΛΊΖΕΙ ΥΠΈΡ ΝΩΘΡΟΎ ΚΥΝΌΣ"

webenformasyon at gmail dot com

8 years ago

For Turkish, the I => i and i => I conversions are a problem. They must be I => ı and i => İ, so my simple solution is:

    public function title_case_turkish($str){

 
        $str = str_replace("i", "İ", $str);
        $str = str_replace("I", "ı", $str);

        $str = mb_convert_case($str, MB_CASE_TITLE,"UTF-8");

        return $str;

    }
