The DOMXPath class

(PHP 5, PHP 7, PHP 8)

Introduction

Supports XPath 1.0 queries against HTML and XML documents.

Class synopsis

class DOMXPath {

/* Properties */

public readonly DOMDocument $document;

public bool $registerNodeNamespaces;

/* Methods */

public function __construct(DOMDocument $document, bool $registerNodeNS = true)

public function evaluate(string $expression, ?DOMNode $contextNode = null, bool $registerNodeNS = true): mixed

public function query(string $expression, ?DOMNode $contextNode = null, bool $registerNodeNS = true): mixed

public static function quote(string $str): string

public function registerNamespace(string $prefix, string $namespace): bool

public function registerPhpFunctionNS(string $namespaceURI, string $name, callable $callable): void

public function registerPhpFunctions(string|array|null $restrict = null): void

}
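
As a rough illustration of the synopsis above, the following sketch runs `query()` and `evaluate()` against a small inline document. The XML content and variable names are invented for this example; it is a minimal sketch, not a complete reference for either method.

```php
<?php
// Hypothetical sample document, used only for illustration.
$doc = new DOMDocument();
$doc->loadXML(
    '<library>'
    . '<book year="2001"><title>First</title></book>'
    . '<book year="2010"><title>Second</title></book>'
    . '</library>'
);

$xpath = new DOMXPath($doc);

// query() returns a DOMNodeList of matching nodes.
$titles = [];
foreach ($xpath->query('//book/title') as $titleNode) {
    $titles[] = $titleNode->textContent;
}

// evaluate() can also return a typed result such as a number or a string.
$bookCount = $xpath->evaluate('count(//book)');
$recentTitle = $xpath->evaluate('string(//book[@year > 2005]/title)');

// Passing a context node makes relative expressions start from that node.
$firstBook = $xpath->query('//book')->item(0);
$firstTitle = $xpath->evaluate('string(title)', $firstBook);
```

Using `evaluate()` for `count()` and `string()` avoids looping over a node list when only a scalar value is needed.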

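Properties

document: The document linked to this object.
registerNodeNamespaces: When set to true, the namespaces in the node are registered.

Changelog

Version	Description
8.4.0	DOMXPath objects can no longer be cloned; attempting to clone one throws an exception. In earlier versions, cloning a DOMXPath object returned an object that could not be used.
8.0.0	The `registerNodeNamespaces` property was added.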

DOMXPath::__construct: Creates a new DOMXPath object
DOMXPath::evaluate: Evaluates the given XPath expression and returns a typed result if possible
DOMXPath::query: Evaluates the given XPath expression
DOMXPath::quote: Quotes a string for use in an XPath expression
DOMXPath::registerNamespace: Registers a namespace with the DOMXPath object
DOMXPath::registerPhpFunctionNS: Registers a PHP function as a namespaced XPath function
DOMXPath::registerPhpFunctions: Registers PHP functions as XPath functions
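
To show how `registerNamespace()` and `registerPhpFunctions()` from the list above fit together, here is a small sketch that calls PHP's `strtoupper()` from inside an XPath expression. The sample XML is invented for illustration; the `php` prefix bound to `http://php.net/xpath` is the namespace the manual uses for PHP callbacks.

```php
<?php
// Hypothetical sample document, used only for illustration.
$doc = new DOMDocument();
$doc->loadXML('<books><book><title>php basics</title></book></books>');

$xpath = new DOMXPath($doc);

// Bind the "php" prefix so that php:functionString() can be used in expressions.
$xpath->registerNamespace('php', 'http://php.net/xpath');

// Restrict the callable PHP functions to strtoupper() only.
$xpath->registerPhpFunctions('strtoupper');

// php:functionString() passes the string value of the node to the PHP function.
$upper = $xpath->evaluate("php:functionString('strtoupper', //book/title)");
```

Passing an explicit function name (or an array of names) to `registerPhpFunctions()` keeps the set of PHP functions reachable from XPath small, which is usually safer than allowing all of them.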


User Contributed Notes 5 notes

Mark Omohundro, ajamyajax dot com

17 years ago

<?php
// To retrieve selected HTML data, try these DOMXPath examples:

// Build the file path from the server's document root.
$file = $_SERVER['DOCUMENT_ROOT'] . "/test.html";
$doc = new DOMDocument();
$doc->loadHTMLFile($file);

$xpath = new DOMXPath($doc);

// example 1: for everything with an id
//$elements = $xpath->query("//*[@id]");

// example 2: for node data in a selected id
//$elements = $xpath->query("/html/body/div[@id='yourTagIdHere']");

// example 3: same as above with wildcards
// (an absolute path is used because a relative "*/div" only matches
// div elements that are direct children of the root element)
$elements = $xpath->query("/*/*/div[@id='yourTagIdHere']");

// query() returns false when the expression is malformed.
if ($elements !== false) {
  foreach ($elements as $element) {
    echo "<br/>[". $element->nodeName. "]";

    $nodes = $element->childNodes;
    foreach ($nodes as $node) {
      echo $node->nodeValue. "\n";
    }
  }
}
?>
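
To try queries like these without a file on disk, the sketch below loads a small HTML string and compares a few expressions. The markup and the id values are invented for this example. It also shows that, evaluated from the document node, a relative `*/div` only matches `div` elements that are direct children of `<html>`.

```php
<?php
// Hypothetical HTML, used only for illustration.
$html = '<html><body><div id="main">Hello <b>world</b></div><p id="note">x</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Every element that carries an id attribute.
$withId = $xpath->query("//*[@id]")->length;

// Absolute path to the div, spelled out and with wildcards.
$absolute = $xpath->query("/html/body/div[@id='main']")->length;
$wildcard = $xpath->query("/*/*/div[@id='main']")->length;

// Relative path from the document node: looks for html/div, not html/body/div.
$relative = $xpath->query("*/div[@id='main']")->length;

// Text content of the matched div, including text of nested elements.
$text = $xpath->query("//div[@id='main']")->item(0)->textContent;
```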

TechNyquist

6 years ago

When working with XML (as a strict format), it can be very important to register a namespace on the XPath object for queries to work properly.

I was seeing query() always return empty node lists; it could not find anything. Only a broad "//*" returned anything, and it showed only the root element.

I then found that registering the namespace given in the "xmlns" attribute of the root element on the XPath object, and prefixing each element name in the query with it, made it work properly.

So for an XML like this (from a sitemap):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://example.com/index.php</loc>
        <lastmod>2005-01-01</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.5</priority>
    </url>
</urlset>

I needed the following XPath configuration:

<?php

    $doc = new DOMDocument;
    $doc->load("sitemap.xml");
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('ns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
    $nodes = $xpath->query('//ns:urlset/ns:url');

?>

Of course, the namespace URI could also be read dynamically from the root element instead of being hard-coded.
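
The effect described in this note can be reproduced with a short sketch. It loads the same sitemap structure from a string (the XML declaration is omitted for brevity) and compares an unprefixed query with a prefixed one; the prefix `ns` is arbitrary.

```php
<?php
$xml = <<<'XML'
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://example.com/index.php</loc>
        <lastmod>2005-01-01</lastmod>
    </url>
</urlset>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Without a prefix, "url" means an element in no namespace, so nothing matches.
$unprefixed = $xpath->query('//url')->length;

// After registering the namespace, the prefixed query finds the element.
$xpath->registerNamespace('ns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
$prefixed = $xpath->query('//ns:urlset/ns:url')->length;
$loc = $xpath->evaluate('string(//ns:url/ns:loc)');
```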

peter at softcoded dot com

9 years ago

You may not always know at runtime whether your file has
a namespace or not. This can make it difficult to create
XPath queries. Use the seriously underdocumented
"namespaceURI" property of the documentElement of a
DOMDocument to determine if there is a namespace.
Use code such as the following:

$doc = new DOMDocument();
$doc->load($file);
$xpath = new DOMXPath($doc);
$ns = $doc->documentElement->namespaceURI;
if($ns) {
  $xpath->registerNamespace("ns", $ns);
  $nodes = $xpath->query("//ns:em[@class='glossterm']");
} else {
  $nodes = $xpath->query("//em[@class='glossterm']");
}
//look at nodes here
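
A different technique, not the one used in the note above, is to match on `local-name()` so that one expression works whether or not the document declares a default namespace. The sketch below assumes that ignoring the namespace is acceptable for your data; the helper name `findGlossTerms` and the sample XML are hypothetical.

```php
<?php
// Hypothetical helper: returns the text of every <em class="glossterm">,
// regardless of which namespace (if any) the element belongs to.
function findGlossTerms(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    // local-name() compares the element name without any namespace part.
    $nodes = $xpath->query("//*[local-name()='em'][@class='glossterm']");

    $terms = [];
    foreach ($nodes as $node) {
        $terms[] = $node->textContent;
    }
    return $terms;
}

$plain = findGlossTerms('<doc><em class="glossterm">XPath</em></doc>');
$namespaced = findGlossTerms(
    '<doc xmlns="http://example.com/ns"><em class="glossterm">XPath</em></doc>'
);
```

The trade-off is that elements with the same local name from unrelated namespaces also match, so registering the namespace remains the more precise option.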

peter at softcoded dot com

9 years ago

Using XPath expressions can save a lot of programming
and allow you to home in on only the nodes you want.
Suppose you want to delete all empty <p> tags.
If you create a query using the following XPath expression,
you can find <p> tags that do not have any text
(other than spaces), any attributes,
any children or comments:

$expression = "//p[not(@*)  
   and not(*) 
   and not(./comment())
   and normalize-space(text())='']";
   
This expression will only find para tags that look like:

<p>[any number of spaces]</p>
<p></p>

Imagine the code you would have to add if you used
DOMDocument::getElementsByTagName("p") instead.
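
As a sketch of how the expression above might be used to actually delete the empty paragraphs, the following code runs it against a small invented fragment and removes every match. The sample markup is hypothetical and chosen to include paragraphs that should and should not match.

```php
<?php
// Hypothetical fragment: only the first two <p> elements count as empty.
$doc = new DOMDocument();
$doc->loadXML(
    '<div><p> </p><p></p><p class="x"></p><p>text</p>'
    . '<p><!-- note --></p><p><b/></p></div>'
);
$xpath = new DOMXPath($doc);

$expression = "//p[not(@*)
   and not(*)
   and not(./comment())
   and normalize-space(text())='']";

// Copy the matches into an array before modifying the tree.
$empty = iterator_to_array($xpath->query($expression));
foreach ($empty as $p) {
    $p->parentNode->removeChild($p);
}

$removed = count($empty);
$remaining = $doc->getElementsByTagName('p')->length;
```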

archimedix32783262 at mailinator dot com

11 years ago

Note that the DOM extension works with UTF-8 internally, so expressions passed to evaluate() must be UTF-8 strings regardless of the encoding declared by the XML document.

So even if you have a UTF-16 XML, you still query using UTF-8 strings.

If your code uses a different encoding, you can use iconv() to convert the expression to UTF-8 first.
