php[world] 2015 Schedule Announced

mb_ereg

(PHP 4 >= 4.2.0, PHP 5)

mb_eregRegular expression match with multibyte support

Beschreibung

int mb_ereg ( string $pattern , string $string [, array $regs ] )

Executes the regular expression match with multibyte support.

Parameter-Liste

pattern

The search pattern.

string

The search string.

regs

Contains a substring of the matched string.

Rückgabewerte

Executes the regular expression match with multibyte support, and returns 1 if matches are found. If the optional regs parameter was specified, the function returns the byte length of matched part, and the array regs will contain the substring of matched string. The function returns 1 if it matches with the empty string. If no matches are found or an error happens, FALSE will be returned.

Anmerkungen

Hinweis:

Die interne Kodierung oder die mit mb_regex_encoding() festgelegte Zeichenkodierung wird als Zeichenkodierung für diese Funktion genutzt.

Siehe auch

  • mb_regex_encoding() - Set/Get character encoding for multibyte regex
  • mb_eregi() - Regular expression match ignoring case with multibyte support

add a note add a note

User Contributed Notes 4 notes

up
1
Riikka K
11 months ago
While hardly mentioned anywhere, it may be useful to note that mb_ereg uses Oniguruma library internally. The syntax for the default mode (ruby) is described here:

http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt
up
1
pressler at hotmail dot de
2 years ago
Note that mb_ereg() does not support the \uFFFF unicode syntax but uses \x{FFFF} instead:

<?PHP

$text
= 'Peter is a boy.'; // english
$text = 'بيتر هو صبي.'; // arabic
//$text = 'פיטר הוא ילד.'; // hebrew

mb_regex_encoding('UTF-8');

if(
mb_ereg('[\x{0600}-\x{06FF}]', $text)) // arabic range
//if(mb_ereg('[\x{0590}-\x{05FF}]', $text)) // hebrew range
{
    echo
"Text has some arabic/hebrew characters.";
}
else
{
    echo
"Text doesnt have arabic/hebrew characters.";
}

?>
up
1
Jon
6 years ago
Hebrew regex tested on PHP 5, Ubuntu 8.04.
Seems to work fine without the mb_regex_encoding lines (commented out).
Didn't seem to work with \uxxxx (also commented out).

<?php
echo "Line ";
//mb_regex_encoding("ISO-8859-8");
//if(mb_ereg(".*([\u05d0-\u05ea]).*", $this->current_line))
if(mb_ereg(".*([א-ת]).*", $this->current_line))
{
    echo
"has";
}
else
{
    echo
"doesn't have";
}
echo
" Hebrew characters.<br>";   
//mb_regex_encoding("UTF-8");
?>
up
0
mb_ereg() seems unable to Use &#34;named sub
1 month ago
mb_ereg() seems unable to Use "named subpattern".
preg_match() seems a substitute only in UTF-8 encoding.

<?php

$text
= 'multi_byte_string';
$pattern = '.*(?<name>string).*';        // "?P" causes "mbregex compile err" in PHP 5.3.5

if(mb_ereg($pattern, $text, $matches)){
    echo
'<pre>'.print_r($matches, true).'</pre>';
}else{
    echo
'no match';
}

?>

This code ignores "?<name>" in $pattern and displays below.

Array
(
    [0] => multi_byte_string
    [1] => string
)

$pattern = '/.*(?<name>string).*/u';
if(preg_match($pattern, $text, $matches)){

instead of lines 2 & 3
displays below (in UTF-8 encoding).

Array
(
    [0] => multi_byte_string
    [name] => string
    [1] => string
)
To Top