
Dealing with XML errors

Handling XML errors when loading documents is straightforward. With the libxml functions you can suppress all XML errors while a document loads and then iterate over them afterwards.

The LibXMLError object, returned by libxml_get_errors(), has several properties, including the message, the line number and the column (position) of the error.

Example #1 Loading a broken XML string


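<?php
// Collect libxml errors internally instead of emitting PHP warnings
libxml_use_internal_errors(true);
$sxe = simplexml_load_string("<?xml version='1.0'><broken><xml></broken>");
if (!$sxe) {
    echo "Failed loading XML\n";
    foreach (libxml_get_errors() as $error) {
        echo "\t", $error->message;
    }
}
?>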


The above example will output:

Failed loading XML
    Blank needed here
    parsing XML declaration: '?>' expected
    Opening and ending tag mismatch: xml line 1 and broken
    Premature end of data in tag broken line 1
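
Besides the message, each LibXMLError also carries the level, code, line and column properties, which allow a more detailed report. The following is a minimal sketch, not part of the manual's example: the helper name format_xml_error() and the report layout are assumptions made for illustration.

```php
<?php
// Hypothetical helper: turns one LibXMLError into a single report line.
function format_xml_error(LibXMLError $error): string
{
    // Map libxml severity constants to readable labels.
    $levels = [
        LIBXML_ERR_WARNING => 'Warning',
        LIBXML_ERR_ERROR   => 'Error',
        LIBXML_ERR_FATAL   => 'Fatal error',
    ];
    $label = $levels[$error->level] ?? 'Unknown';

    return sprintf(
        '%s %d at line %d, column %d: %s',
        $label,
        $error->code,
        $error->line,
        $error->column,
        trim($error->message) // messages end with a newline
    );
}

libxml_use_internal_errors(true);
$sxe = simplexml_load_string("<?xml version='1.0'><broken><xml></broken>");

$reports = [];
if ($sxe === false) {
    foreach (libxml_get_errors() as $error) {
        $reports[] = format_xml_error($error);
    }
    // Empty the internal buffer so later parses start with a clean list.
    libxml_clear_errors();
}

echo implode("\n", $reports), "\n";
```

Calling libxml_clear_errors() after reading the list keeps errors from one document from showing up when the next one is checked.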


User Contributed Notes 4 notes

openbip at gmail dot com
14 years ago
Note that "if (! $sxe) {" may give you a false-negative if the XML document was empty (e.g. "<root />"). In that case, $sxe will be:

object(SimpleXMLElement)#1 (0) {
}

which will evaluate to false, even though nothing technically went wrong.

Consider instead: "if ($sxe === false) {"
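
The difference between the loose and the strict check can be seen side by side. This is a small sketch; the variable names and the two sample documents are made up for illustration.

```php
<?php
libxml_use_internal_errors(true);

$empty  = simplexml_load_string('<root />');            // well-formed, but no content
$broken = simplexml_load_string('<root><open></root>'); // mismatched tags

// Loose check: an empty element casts to false, so $empty is misreported as a failure.
$looseSaysFailed = [!$empty, !$broken];

// Strict check: only a real parse failure is identical to false.
$strictSaysFailed = [$empty === false, $broken === false];

var_dump($looseSaysFailed, $strictSaysFailed);
libxml_clear_errors();
```

Only the strict comparison tells the empty document apart from the malformed one.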

1337 at netapp dot com
8 years ago
If you need to process the content of your broken XML-doc you might find this interesting. It has blown past a few simple corruptions for me.

9 years ago
Now that the /e modifier is deprecated in preg_replace, you can use a negative lookahead to replace unescaped ampersands with &amp; without throwing warnings:

$str = preg_replace('/&(?![A-Za-z0-9#]{1,6};)/', '&amp;', $str);

You probably should have been doing this before /e was deprecated, actually.
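
A sketch of the negative-lookahead idea with a stricter notion of "already escaped". The function name and the entity pattern (a named reference such as &amp;, a decimal reference such as &#38;, or a hex reference such as &#x26;) are assumptions for illustration, not taken from the note.

```php
<?php
// Hypothetical helper: escapes only ampersands that do not start an entity reference.
function escape_ampersands_lookahead(string $str): string
{
    // The lookahead rejects "&" when it is followed by something shaped like
    // name; or #digits; or #xhex; so existing references are left alone.
    return preg_replace(
        '/&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)/',
        '&amp;',
        $str
    );
}

echo escape_ampersands_lookahead('Tom & Jerry &amp; friends &#38; AT&T'), "\n";
// Tom &amp; Jerry &amp; friends &#38; AT&amp;T
```

The pattern only checks the shape of a reference, not whether the entity name is actually defined, so a string like "&bogus;" would still pass through untouched and may fail later in the parser.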

Jacob Tabak
14 years ago
If you are trying to load an XML string with some escaped and some unescaped ampersands, you can pre-parse the string to escape the unescaped ampersands without modifying the already escaped ones:

$s = preg_replace('/&[^; ]{0,6}.?/e', "((substr('\\0',-1) == ';') ? '\\0' : '&amp;'.substr('\\0',1))", $s);
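
The /e modifier was removed in PHP 7, so the call above no longer runs on current versions. The same matching rule can be expressed with preg_replace_callback(). This is a sketch that keeps the note's pattern unchanged; the function name is an assumption for illustration.

```php
<?php
// Hypothetical helper: same rule as the /e version, written as a callback.
function escape_ampersands_callback(string $s): string
{
    return preg_replace_callback(
        '/&[^; ]{0,6}.?/',
        function (array $m): string {
            // A match ending in ";" is treated as an existing entity and kept.
            if (substr($m[0], -1) === ';') {
                return $m[0];
            }
            // Otherwise escape the leading "&" and keep the rest of the match.
            return '&amp;' . substr($m[0], 1);
        },
        $s
    );
}

echo escape_ampersands_callback('a & b &amp; c, AT&T'), "\n";
// a &amp; b &amp; c, AT&amp;T
```

Because the pattern is carried over as is, it also keeps its limits: references longer than six characters between "&" and ";" are not recognized as already escaped.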