ConFoo 2025

Character sets

Ideally a proper character set will be set at the server level, and doing this is described within the » Character Set Configuration section of the MySQL Server manual. Alternatively, each MySQL API offers a method to set the character set at runtime.

Caution

The character set and character escaping

The character set should be understood and defined, as it has an affect on every action, and includes security implications. For example, the escaping mechanism (e.g., mysqli_real_escape_string() for mysqli and PDO::quote() for PDO_MySQL) will adhere to this setting. It is important to realize that these functions will not use the character set that is defined with a query, so for example the following will not have an effect on them:

Example #1 Problems with setting the character set with SQL

<?php

$mysqli
= new mysqli("localhost", "my_user", "my_password", "world");

// Will NOT affect $mysqli->real_escape_string();
$mysqli->query("SET NAMES utf8mb4");

// Will NOT affect $mysqli->real_escape_string();
$mysqli->query("SET CHARACTER SET utf8mb4");

// But, this will affect $mysqli->real_escape_string();
$mysqli->set_charset('utf8mb4');

// But, this will NOT affect it (UTF-8 vs utf8mb4) -- don't use dashes here
$mysqli->set_charset('UTF-8');
?>

Below are examples that demonstrate how to properly alter the character set at runtime using each API.

Note: Possible UTF-8 confusion

Because character set names in MySQL do not contain dashes, the string "utf8" is valid in MySQL to set the character set to UTF-8 (up to 3 byte UTF-8 Unicode Encoding). The string "UTF-8" is not valid, as using "UTF-8" will fail to change the character set and will throw an error.

Example #2 Setting the character set example: mysqli

<?php
$mysqli
= new mysqli("localhost", "my_user", "my_password", "world");

echo
'Initial character set: ' . $mysqli->character_set_name() . "\n";

if (!
$mysqli->set_charset('utf8mb4')) {
printf("Error loading character set utf8mb4: %s\n", $mysqli->error);
exit;
}

echo
'Your current character set is: ' . $mysqli->character_set_name() . "\n";
?>

Example #3 Setting the character set example: pdo_mysql

<?php
$pdo
= new PDO("mysql:host=localhost;dbname=world;charset=utf8mb4", 'my_user', 'my_pass');
?>
add a note

User Contributed Notes 2 notes

up
22
mkroese at eljakim dot nl
7 years ago
Please note that MySQL's utf8 encoding has a maximum of 3 bytes and is unable to encode *all* unicode characters.

If you need to encode characters beyond the BMP (Basic Multilingual Plane), like emoji or other special characters, you will need to use a different encoding like utf8mb4 or any other encoding supporting the higher planes. Mysql will discard any characters encoded in 4 bytes (or more).

See https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-utf8mb4.html for more information on the matter
up
-1
legrand dot jeremie at gmail dot com
1 year ago
After setting the charset, you should define the 'collation' too, to give information on how sorting results on requests. By default, it is 'utf8mb4_general_ci', which is a simplified set of sorting rules. For the official rules, edicted by Unicode, it should be 'utf8mb4_unicode_ci'.

For example:
\mysqli_set_charset($hdl, 'utf8mb4');
\mysqli_query($hdl, 'SET collation_connection = utf8mb4_unicode_520_ci');
To Top