Last updated: Fri, 18 Jul 2008

mb_convert_encoding

(PHP 4 >= 4.0.6, PHP 5)

mb_convert_encoding — Convert character encoding

Description

string mb_convert_encoding ( string $str , string $to_encoding [, mixed $from_encoding ] )

Converts the character encoding of the string str to to_encoding, optionally from from_encoding.

Parameters

str

The string being encoded.

to_encoding

The type of encoding that str is being converted to.

from_encoding

The character encoding name(s) of str before conversion. It is either an array or a comma-separated list of encoding names; when several names are given, the source encoding is detected from among them. If from_encoding is not specified, the internal encoding will be used.

"auto" may be used, which expands to "ASCII,JIS,UTF-8,EUC-JP,SJIS".

Return Values

The encoded string.

Examples

Example #1 mb_convert_encoding() example

<?php
/* Convert internal character encoding to SJIS */
$str = mb_convert_encoding($str, "SJIS");

/* Convert EUC-JP to UTF-7 */
$str = mb_convert_encoding($str, "UTF-7", "EUC-JP");

/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");

/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");
?>
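
The from_encoding parameter is described as accepting either an array or a comma-separated list. A minimal sketch of the two forms might look like the following. The sample bytes and variable names are illustrative assumptions, and which candidate gets detected may depend on the PHP version, so the sketch only compares the two list forms with each other.

```php
<?php
// "café" encoded as ISO-8859-1 (illustrative sample, not from the manual).
$latin1 = "caf\xE9";

// Single, explicit source encoding: no detection involved.
$explicit = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Candidate list given as a comma-separated string...
$fromString = mb_convert_encoding($latin1, 'UTF-8', 'UTF-8, ISO-8859-1');

// ...and the same candidate list given as an array.
$fromArray = mb_convert_encoding($latin1, 'UTF-8', ['UTF-8', 'ISO-8859-1']);

echo bin2hex($explicit), "\n";        // 636166c3a9
var_dump($fromString === $fromArray); // both list forms give the same result
```

When the source encoding is known, naming it explicitly avoids relying on detection at all.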

User Contributed Notes
nospam at nihonbunka dot com
15-May-2008 06:51
rodrigo at bb2 dot co dot jp wrote that iconv works better than mb_convert_encoding. I find that when converting from UTF-8 to Shift_JIS,
$conv_str = mb_convert_encoding($str,$toCS,$fromCS);
works, while
$conv_str = iconv($fromCS,$toCS.'//IGNORE',$str);
removes tildes from $str.
katzlbtjunk at hotmail dot com
25-Jan-2008 04:36
Clean a string for use as a filename by simply replacing all unwanted characters with an underscore (converting to ASCII reduces the string to 7-bit). It removes slightly more characters than necessary. Hope it's useful.

$fileName = 'Test:!"$%&/()=ÖÄÜöäü<<';
echo strtr(mb_convert_encoding($fileName,'ASCII'),
    ' ,;:?*#!§$%&/(){}<>=`´|\\\'"',
    '____________________________');
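
A whitelist is another way to sketch the same idea: instead of listing unwanted characters, keep only a known-safe set. The function name safe_filename and the allowed character set below are assumptions for illustration, not part of the note above. What happens to non-ASCII characters depends on the configured substitute character (a question mark by default), which the whitelist then turns into an underscore.

```php
<?php
// Hypothetical helper: reduce a name to ASCII, then replace everything
// outside a small whitelist with an underscore.
function safe_filename(string $name, string $fromEncoding = 'UTF-8'): string
{
    // Characters that have no ASCII equivalent are substituted by mbstring.
    $ascii = mb_convert_encoding($name, 'ASCII', $fromEncoding);

    // Keep letters, digits, dot, underscore and hyphen; replace the rest.
    return preg_replace('/[^A-Za-z0-9._-]/', '_', $ascii);
}

echo safe_filename('my report: v2.txt'), "\n"; // my_report__v2.txt
```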
rodrigo at bb2 dot co dot jp
15-Jan-2008 03:47
For those who can't use mb_convert_encoding() to convert from one charset to another because of an older version of PHP, try iconv().

I had this problem converting to a Japanese charset:

$txt=mb_convert_encoding($txt,'SJIS',$this->encode);

And I could fix it by using this:

$txt = iconv('UTF-8', 'SJIS', $txt);

Maybe it's helpful for someone else! ;)
mightye at gmail dot com
13-Nov-2007 09:24
To petruzanauticoyahoo?com!ar

If you don't specify a source encoding, then it assumes the internal (default) encoding. ñ is a multi-byte character whose bytes in your configuration default (often ISO-8859-1) would actually mean Ã±. mb_convert_encoding() is upgrading those two characters to their multi-byte equivalents within UTF-8.

Try this instead:
<?php
print mb_convert_encoding( "ñ", "UTF-8", "UTF-8" );
?>
Of course, this call does no real work (for the most part; it can actually be used to strip characters which are not valid for UTF-8).
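
The double-encoding effect described in this note can be reproduced in a few lines. This is only a sketch: the bytes are written as escapes so the encoding of the source file does not matter, and the variable names are illustrative.

```php
<?php
// "ñ" as UTF-8 bytes.
$n = "\xC3\xB1";

// Wrong source encoding: each byte is read as one ISO-8859-1 character
// (Ã and ±) and re-encoded, so the result is double-encoded.
$double = mb_convert_encoding($n, 'UTF-8', 'ISO-8859-1');

// Correct source encoding: the bytes come back unchanged.
$same = mb_convert_encoding($n, 'UTF-8', 'UTF-8');

echo bin2hex($double), "\n"; // c383c2b1
echo bin2hex($same), "\n";   // c3b1
```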
volker at machon dot biz
24-Sep-2007 09:05
Hey guys. For everybody who's looking for a function that converts an ISO string to UTF-8 or a UTF-8 string to ISO, here's your solution:

public function encodeToUtf8($string) {
     return mb_convert_encoding($string, "UTF-8", mb_detect_encoding($string, "UTF-8, ISO-8859-1, ISO-8859-15", true));
}

public function encodeToIso($string) {
     return mb_convert_encoding($string, "ISO-8859-1", mb_detect_encoding($string, "UTF-8, ISO-8859-1, ISO-8859-15", true));
}

For me these functions are working fine. Give it a try
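
One thing to watch with detection-based helpers is that mb_detect_encoding() returns false when no candidate matches. A simpler sketch, assuming the input can only ever be UTF-8 or ISO-8859-1, checks for valid UTF-8 first and falls back otherwise. The function name to_utf8 is hypothetical.

```php
<?php
// Hypothetical helper: assumes input is either valid UTF-8 already
// or ISO-8859-1; nothing else is considered.
function to_utf8(string $string): string
{
    // Valid UTF-8 is returned untouched, which avoids double encoding.
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }
    // Otherwise treat every byte as one ISO-8859-1 character.
    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
}

echo bin2hex(to_utf8("caf\xE9")), "\n";     // 636166c3a9
echo bin2hex(to_utf8("caf\xC3\xA9")), "\n"; // 636166c3a9
```

The trade-off is that a short ISO-8859-1 string whose bytes happen to form valid UTF-8 will be left as-is.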
aofg
21-Aug-2007 06:49
When converting Japanese strings to ISO-2022-JP or JIS on PHP >= 5.2.1, you can use "ISO-2022-JP-MS" instead.
Kishu-Izon (platform-dependent) characters are converted correctly with this encoding, just as they are with eucJP-win or SJIS-win.
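
Since the encoding name depends on the PHP version, a cautious sketch could check what the installed mbstring offers before using it. The helper name pick_jis_encoding and the fallback choice are assumptions for illustration.

```php
<?php
// Hypothetical helper: prefer ISO-2022-JP-MS when this build of mbstring
// lists it, otherwise fall back to plain ISO-2022-JP.
function pick_jis_encoding(): string
{
    return in_array('ISO-2022-JP-MS', mb_list_encodings(), true)
        ? 'ISO-2022-JP-MS'
        : 'ISO-2022-JP';
}

$utf8 = "\u{65E5}\u{672C}\u{8A9E}"; // 日本語
$enc  = pick_jis_encoding();

// Convert to the chosen JIS variant and back again.
$jis  = mb_convert_encoding($utf8, $enc, 'UTF-8');
$back = mb_convert_encoding($jis, 'UTF-8', $enc);

var_dump($back === $utf8);
```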
David Hull
20-Dec-2006 10:52
As an alternative to Johannes's suggestion for converting strings from other character sets to a 7bit representation while not just deleting latin diacritics, you might try this:

<?php
$text = iconv($from_enc, 'US-ASCII//TRANSLIT', $text);
?>

The only disadvantage is that it does not convert "ä" to "ae", but it handles punctuation and other special characters better.
--
David
phpdoc at jeudi dot de
05-Sep-2006 06:46
I'd like to share some code to convert latin diacritics to their
traditional 7bit representation, like, for example,

- à,ç,é,î,... to a,c,e,i,...
- ß to ss
- ä,Ä,... to ae,Ae,...
- ë,... to e,...

(mb_convert "7bit" would simply delete any offending characters).

I might have missed on your country's typographic
conventions--correct me then.
<?php
/**
 * @args string $text line of encoded text
 *       string $from_enc (encoding type of $text, e.g. UTF-8, ISO-8859-1)
 *
 * @returns 7bit representation
 */
function to7bit($text,$from_enc) {
    $text = mb_convert_encoding($text,'HTML-ENTITIES',$from_enc);
    $text = preg_replace(
        array('/&szlig;/','/&(..)lig;/',
             '/&([aouAOU])uml;/','/&(.)[^;]*;/'),
        array('ss',"$1","$1".'e',"$1"),
        $text);
    return $text;
}
?>

Enjoy :-)
Johannes
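
A table-driven variant of the same idea, which does not go through HTML entities, could be sketched as follows. The function name to7bit_map is hypothetical, the input is assumed to be UTF-8, and the map deliberately covers only a handful of the characters listed above.

```php
<?php
// Hypothetical table-driven variant; extend the map as needed.
function to7bit_map(string $utf8Text): string
{
    static $map = [
        "\u{00DF}" => 'ss', // ß
        "\u{00E4}" => 'ae', // ä
        "\u{00F6}" => 'oe', // ö
        "\u{00FC}" => 'ue', // ü
        "\u{00C4}" => 'Ae', // Ä
        "\u{00D6}" => 'Oe', // Ö
        "\u{00DC}" => 'Ue', // Ü
        "\u{00E0}" => 'a',  // à
        "\u{00E7}" => 'c',  // ç
        "\u{00E9}" => 'e',  // é
        "\u{00EE}" => 'i',  // î
        "\u{00EB}" => 'e',  // ë
    ];
    // strtr() with an array replaces each key by its value in one pass.
    return strtr($utf8Text, $map);
}

echo to7bit_map("Stra\u{00DF}e"), "\n"; // Strasse
```

Characters missing from the map pass through unchanged, so this is not a full 7-bit guarantee.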
mac.com@nemo
08-Jul-2006 07:38
For those wanting to convert from $set to MacRoman, use iconv():

<?php
$string = iconv('UTF-8', 'macintosh', $string);
?>

('macintosh' is the IANA name for the MacRoman character set.)
Tom Class
11-Nov-2005 07:35
Why did you use the PHP HTML encode functions? mbstring has its own encoding which is (as far as I tested it) much more useful:

HTML-ENTITIES

Example:

$text = mb_convert_encoding($text, 'HTML-ENTITIES', "UTF-8");
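
Note that the 'HTML-ENTITIES' pseudo-encoding is deprecated as of PHP 8.2. One possible substitute, sketched here under the assumption that numeric entities are acceptable in place of named ones, is mb_encode_numericentity(). The conversion map below is an illustrative choice that leaves ASCII untouched.

```php
<?php
$text = "caf\u{00E9}"; // "café" as UTF-8

// Conversion map: [first code point, last code point, offset, mask].
// Everything from U+0080 upward becomes a numeric entity.
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

$encoded = mb_encode_numericentity($text, $map, 'UTF-8');

echo $encoded, "\n"; // caf&#233;
```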
Stephan van der Feest
09-Sep-2005 04:47
To add to the Flash conversion comment below, here's how I convert back from what I've stored in a database after converting from Flash HTML text field output, in order to load it back into a Flash HTML text field:

function htmltoflash($htmlstr)
{
  return str_replace("&lt;br /&gt;","\n",
    str_replace("<","&lt;",
      str_replace(">","&gt;",
        mb_convert_encoding(html_entity_decode($htmlstr),
        "UTF-8","ISO-8859-1"))));
}
Stephan van der Feest
09-Sep-2005 03:50
Here's a tip for anyone using Flash and PHP for storing HTML output submitted from a Flash text field in a database or whatever.

Flash submits its HTML special characters in UTF-8, so you can use the following function to convert those into HTML entity characters:

function utf8html($utf8str)
{
  return htmlentities(mb_convert_encoding($utf8str,"ISO-8859-1","UTF-8"));
}
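
On PHP 5.4 and later, htmlentities() no longer assumes ISO-8859-1 by default, so the detour through mb_convert_encoding() may no longer be needed. A minimal sketch that passes the charset explicitly follows. The function name utf8html_direct is hypothetical.

```php
<?php
// Hypothetical variant: tell htmlentities() that the input is UTF-8
// instead of converting the string to ISO-8859-1 first.
function utf8html_direct(string $utf8str): string
{
    return htmlentities($utf8str, ENT_QUOTES, 'UTF-8');
}

echo utf8html_direct("caf\u{00E9} <b>"), "\n"; // caf&eacute; &lt;b&gt;
```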
jamespilcher1 - hotmail
01-Feb-2004 07:55
be careful when converting from iso-8859-1 to utf-8.

even if you explicitly specify the character encoding of a page as iso-8859-1(via headers and strict xml defs), windows 2000 will ignore that and interpret it as whatever character set it has natively installed.

for example, i wrote char #128 into a page, with char encoding iso-8859-1, and it displayed in internet explorer (& mozilla) as a euro symbol.

it should have displayed a box, denoting that char #128 is undefined in iso-8859-1. The problem was it was displaying in "Windows: western europe" (my native character set).

this led to confusion when i tried to convert this euro to UTF-8 via mb_convert_encoding() 

IE displays UTF-8 correctly- and because PHP correctly converted #128 into a box in UTF-8, IE would show a box.

so all i saw was mb_convert_encoding() converting a euro symbol into a box. It took me a long time to figure out what was going on.
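
The difference described here can be seen by converting the same byte from both source encodings. This sketch assumes the "Windows: western europe" character set mentioned above corresponds to Windows-1252.

```php
<?php
$byte = "\x80"; // char #128

// Read as ISO-8859-1, byte 0x80 is the control character U+0080.
$asLatin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');

// Read as Windows-1252, the same byte is the euro sign U+20AC.
$asCp1252 = mb_convert_encoding($byte, 'UTF-8', 'Windows-1252');

echo bin2hex($asLatin1), "\n"; // c280
echo bin2hex($asCp1252), "\n"; // e282ac
```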
lanka at eurocom dot od dot ua
07-Feb-2003 08:03
Another example of recoding without the multibyte extension enabled.
(Russian koi->win; if the input is already in the win encoding, recode() returns the string unchanged.)

<?php
// Return values of detect_encoding():
// 0 - win
// 1 - koi
function detect_encoding($str) {
    $win = 0;
    $koi = 0;

    // Count the bytes that fall in the range typical of each encoding.
    for ($i = 0; $i < strlen($str); $i++) {
        if (ord($str[$i]) > 224 && ord($str[$i]) < 255) $win++;
        if (ord($str[$i]) > 192 && ord($str[$i]) < 223) $koi++;
    }

    if ($win < $koi) {
        return 1;
    } else return 0;
}

// recodes koi to win
function koi_to_win($string) {
    // koi -> win table, indexed by (byte value - 128)
    $kw = array(
        128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143,
        144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159,
        160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175,
        176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
        254, 224, 225, 246, 228, 229, 244, 227, 245, 232, 233, 234, 235, 236, 237, 238,
        239, 255, 240, 241, 242, 243, 230, 226, 252, 251, 231, 248, 253, 249, 247, 250,
        222, 192, 193, 214, 196, 197, 212, 195, 213, 200, 201, 202, 203, 204, 205, 206,
        207, 223, 208, 209, 210, 211, 198, 194, 220, 219, 199, 216, 221, 217, 215, 218);

    // win -> koi table (not used by koi_to_win(); kept for the reverse direction)
    $wk = array(
        128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143,
        144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159,
        160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175,
        176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
        225, 226, 247, 231, 228, 229, 246, 250, 233, 234, 235, 236, 237, 238, 239, 240,
        242, 243, 244, 245, 230, 232, 227, 254, 251, 253, 255, 249, 248, 252, 224, 241,
        193, 194, 215, 199, 196, 197, 214, 218, 201, 202, 203, 204, 205, 206, 207, 208,
        210, 211, 212, 213, 198, 200, 195, 222, 219, 221, 223, 217, 216, 220, 192, 209);

    // Remap every byte above 128 through the koi -> win table.
    $end = strlen($string);
    for ($pos = 0; $pos < $end; $pos++) {
        $c = ord($string[$pos]);
        if ($c > 128) {
            $string[$pos] = chr($kw[$c - 128]);
        }
    }

    return $string;
}

function recode($str) {
    $enc = detect_encoding($str);
    if ($enc == 1) {
        $str = koi_to_win($str);
    }

    return $str;
}
?>
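
Where the mbstring extension is available, the same recoding can be sketched in one call. This assumes that "koi" above means KOI8-R and "win" means Windows-1251. The sample bytes are illustrative.

```php
<?php
// Two Cyrillic letters ("аб") encoded as KOI8-R.
$koi = "\xC1\xC2";

// Let mbstring do the table lookup.
$win = mb_convert_encoding($koi, 'Windows-1251', 'KOI8-R');

echo bin2hex($win), "\n"; // e0e1
```

The result agrees with the lookup table in the note above, where index 193 - 128 maps to 224 (0xE0).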
