mb_decode_numericentity

(PHP 4 >= 4.0.6, PHP 5, PHP 7, PHP 8)

mb_decode_numericentity - Decode HTML numeric string reference to character

Description

mb_decode_numericentity(string $string, array $map, ?string $encoding = null): string

Converts the numeric string references in string that fall within the specified block to characters.

Parameters

string

The string being decoded.

map

map is an array that specifies the code area to convert. It holds groups of four integers (start code, end code, offset, mask), one group per code range.

encoding

The encoding parameter is the character encoding. If it is omitted or null, the internal character encoding value will be used.

is_hex

This parameter is not used. It does not appear in the signature shown above.

Return Values

The converted string.

Changelog

Version   Description
8.0.0     encoding is nullable now.

Examples

Example #1 map example

<?php
// Layout of map: groups of four integers, one group per code range.
// $convmap = array(
//     int start_code1, int end_code1, int offset1, int mask1,
//     int start_code2, int end_code2, int offset2, int mask2,
//     ........
//     int start_codeN, int end_codeN, int offsetN, int maskN );
// Specify Unicode values for start_codeN and end_codeN.
// Add offsetN to the value and take the bitwise 'AND' with maskN,
// then convert the value to a numeric string reference.
?>
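As a rough illustration of the layout above, the sketch below decodes a string with a two-range map. The ranges and the sample string are chosen only for this example. It assumes that a reference whose value matches no range is left in the string unchanged.

```php
<?php
// Illustrative map with two groups of four integers (values chosen for this sketch only).
$convmap = [
    0x41, 0x5A, 0, 0xFF, // range 1: U+0041..U+005A (A-Z)
    0xC0, 0xFF, 0, 0xFF, // range 2: U+00C0..U+00FF
];

// &#65; falls in range 1 and &#233; in range 2; &#97; matches neither range.
$decoded = mb_decode_numericentity('&#65;&#97;&#233;', $convmap, 'UTF-8');

echo $decoded, "\n"; // A&#97;é
```

Listing several narrow ranges like this is one way to decode only a chosen subset of references while leaving the rest of the markup alone.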

Example #2 JavaScript string escape example

<?php
function escape_javascript_string($str) {
    // Lookup table indexed by code point: 1 = escape as \xHH, 0 = keep as-is
    $map = [
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,0,0, // 49
        0,0,0,0,0,0,0,0,1,1,
        1,1,1,1,1,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,
        0,1,1,1,1,1,1,0,0,0, // 99
        0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,
        0,0,0,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1, // 149
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1, // 199
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1, // 249
        1,1,1,1,1,1,1, // 255
    ];
    // Character encoding is UTF-8
    $mblen = mb_strlen($str, 'UTF-8');
    $utf32 = bin2hex(mb_convert_encoding($str, 'UTF-32', 'UTF-8'));
    for ($i = 0, $encoded = ''; $i < $mblen; $i++) {
        $u = substr($utf32, $i * 8, 8);  // 8 hex digits per UTF-32 code unit
        $v = base_convert($u, 16, 10);   // code point as a decimal string
        if ($v < 256 && $map[$v]) {
            $encoded .= '\\x' . substr($u, 6, 2);
        } elseif ($v == 0x2028) {        // U+2028 LINE SEPARATOR
            $encoded .= '\\u2028';
        } elseif ($v == 0x2029) {        // U+2029 PARAGRAPH SEPARATOR
            $encoded .= '\\u2029';
        } else {
            $encoded .= mb_convert_encoding(hex2bin($u), 'UTF-8', 'UTF-32');
        }
    }
    return $encoded;
}

// Test data
$convmap = [ 0x0, 0xffff, 0, 0xffff ];
$msg = '';
for ($i = 0; $i < 1000; $i++) {
    // chr() cannot generate correct UTF-8 data for values above 127; use mb_decode_numericentity().
    $msg .= mb_decode_numericentity('&#' . $i . ';', $convmap, 'UTF-8');
}

// var_dump($msg);
var_dump(escape_javascript_string($msg));

See Also

User Contributed Notes 4 notes

donovan at conduit it
17 years ago
Note that, at this time, mb_decode_numericentity() seems to work only with decimal entities and not with hexadecimal entities. Knowing this would have saved me a good hour of debugging.

For those who need to convert hex entities, try first converting them all to decimal entities with a combination of the preg_replace() and hexdec() functions.
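A minimal sketch of that workaround follows. It swaps in preg_replace_callback() for preg_replace(), since a callback is needed to run hexdec() on each match. The helper name hex_entities_to_decimal() is hypothetical and not part of mbstring.

```php
<?php
// Hypothetical helper (not part of mbstring): rewrites hexadecimal
// references such as &#x6D; as decimal references such as &#109;.
function hex_entities_to_decimal(string $text): string
{
    $result = preg_replace_callback(
        '/&#[xX]([0-9a-fA-F]+);/',
        static fn (array $m): string => '&#' . hexdec($m[1]) . ';',
        $text
    );
    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

$decimal = hex_entities_to_decimal('&#x6D;&#x42;');
echo $decimal, "\n"; // &#109;&#66;

// The decimal form can then be decoded as usual.
echo mb_decode_numericentity($decimal, [0x0, 0xFFFF, 0, 0xFFFF], 'UTF-8'), "\n"; // mB
```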
abderrahmanekaddour dot aissat at gmail dot com
1 year ago
<?php

// The following notes are based on my reading of the php mbr source code.
// First, as an optimisation, the string must contain "&";
// otherwise PHP does not attempt to decode it.
// For the map: int start_codeN, int end_codeN, int offsetN, int maskN
// The entity must be in the range [start_codeN, end_codeN]; if the entity is above or below it,
// mb_decode_numericentity() skips decoding and returns $string as it is.
// In recent versions of PHP, $map "must have a multiple of 4 elements".

$map = [0x0, 0xFFFF, 0, 0];
echo mb_decode_numericentity('&#109;', $map); // result: "m"
// With offsetN = 1 the result is "l": the offset is subtracted from the entity value before decoding.
$map_2 = [0x0, 0xFFFF, 60, 0];
echo mb_decode_numericentity('&#109;', $map_2); // decodes as &#49;, result: "1"

// Entity reference for checking the results: https://cs.stanford.edu/people/miles/iso8859.html#ISO

?>
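The matching rule this note describes can be modelled in plain PHP. The sketch below is an assumption built from the results reported in the note, not a copy of the actual implementation, and model_decode_value() is a hypothetical name.

```php
<?php
/**
 * Hypothetical helper (not part of mbstring): models how one numeric
 * reference value is matched against a map, following the note above.
 * Returns the decoded code point, or null when no range matches.
 */
function model_decode_value(int $value, array $map): ?int
{
    // Walk the map in groups of four: start, end, offset, mask.
    foreach (array_chunk($map, 4) as [$start, $end, $offset, $mask]) {
        $codePoint = $value - $offset; // the offset is removed first
        if ($codePoint >= $start && $codePoint <= $end) {
            return $codePoint; // assumption: the mask plays no part here
        }
    }
    return null; // outside every range: the reference would stay as-is
}

var_dump(model_decode_value(109, [0x0, 0xFFFF, 0, 0]));  // int(109), i.e. "m"
var_dump(model_decode_value(109, [0x0, 0xFFFF, 60, 0])); // int(49), i.e. "1"
var_dump(model_decode_value(109, [0x80, 0xFF, 0, 0]));   // NULL
```

The first two calls reproduce the "m" and "1" results from the note; the third shows a value that falls outside the range and would therefore be left undecoded.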
dev at glossword info
20 years ago
Just two great functions for daily use:

/* Converts numeric HTML entities into characters */
function my_numeric2character($t)
{
    // Range covers every Unicode code point (U+0000..U+10FFFF)
    $convmap = array(0x0, 0x10FFFF, 0, 0xFFFFFF);
    return mb_decode_numericentity($t, $convmap, 'UTF-8');
}
/* Converts characters into numeric HTML entities */
function my_character2numeric($t)
{
    // The mask must be wide enough to keep code points above 0xFFFF intact
    $convmap = array(0x0, 0x10FFFF, 0, 0xFFFFFF);
    return mb_encode_numericentity($t, $convmap, 'UTF-8');
}
print my_numeric2character('&#8217; &#7936; &#226;');
print my_character2numeric(' ? ? ');
fernandosilveira at yahoo dot com dot br
3 years ago
Be careful!
In addition to translating numeric entities into characters in the specified target encoding, this function converts every character of the input string to that encoding, even characters outside the range defined by the conversion map.