parse_url
(PHP 4, PHP 5, PHP 7, PHP 8)
parse_url — Parse a URL and return its components
Description
This function is not meant to validate
the given URL; it only breaks it up into the parts listed below. Partial and invalid
URLs are also accepted, and parse_url() tries its best to
parse them correctly.
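As a minimal sketch of what "partial URLs are accepted" means in practice, the snippet below feeds three inputs to parse_url(). The hostnames and paths are illustrative assumptions, not values from this page. Note that the schemeless string is not rejected: it is simply reported as a path.

```php
<?php
// Illustrative inputs: a full URL, a path-only reference,
// and a schemeless string that merely looks like a hostname.
$inputs = [
    'https://example.com/docs?page=2',
    '/docs?page=2',
    'www.example.com/docs',
];

// Parse each input; nothing here validates the strings.
$results = array_map('parse_url', $inputs);

foreach ($results as $i => $parts) {
    echo $inputs[$i], PHP_EOL;
    var_dump($parts);
}
?>
```

The third input yields only a path element, which is a reminder to check for the host key before relying on it.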
Caution
This function does not follow any established URI or URL standard.
It will return incorrect or nonsensical results for relative or malformed
URLs. Even for valid URLs the result may differ from that of a
different URL parser, since there are multiple different URL-related
standards that target different use cases and that differ in their
requirements.
Processing a URL with parsers following different URL standards is a
common source of security vulnerabilities. As an example, validating
a URL against an allow-list of acceptable hostnames with parser A
might be ineffective when the actual retrieval of the resource uses
parser B that extracts hostnames differently.
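The allow-list scenario can be sketched as follows. The helper name isAllowedHost() and the hostnames are hypothetical, chosen only for illustration. The check is only meaningful if the code that later retrieves the resource extracts the hostname the same way; this sketch does not guarantee that.

```php
<?php
// Hypothetical helper: check the host reported by parse_url()
// against an allow-list. This reflects parse_url()'s view of the
// URL only; another parser may extract a different host.
function isAllowedHost(string $url, array $allowedHosts): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        // No host found, or the URL could not be parsed at all.
        return false;
    }
    return in_array(strtolower($host), $allowedHosts, true);
}

$allowed = ['example.com']; // assumed allow-list
var_dump(isAllowedHost('https://example.com/x', $allowed));
var_dump(isAllowedHost('https://evil.test/', $allowed));
?>
```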
The Uri\Rfc3986\Uri and Uri\WhatWg\Url
classes strictly follow the RFC 3986 and WHATWG URL Standards respectively.
It is strongly recommended to use these classes for all newly written code
and to migrate existing uses of the parse_url() function
to these classes, unless the parse_url() behavior needs
to be preserved for compatibility reasons.
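A minimal migration sketch, assuming a PHP version that ships the Uri\Rfc3986\Uri class (to my understanding PHP 8.5 or later) and assuming its parse() and getHost() methods behave as described in the class documentation. The helper name hostViaRfc3986() is hypothetical. On older PHP versions the helper returns null rather than failing.

```php
<?php
// Hypothetical helper: read the host with the RFC 3986 parser
// when it is available, instead of using parse_url().
function hostViaRfc3986(string $input): ?string
{
    if (!class_exists(\Uri\Rfc3986\Uri::class)) {
        // Class not available in this PHP build.
        return null;
    }
    // parse() is assumed to return null for invalid input.
    $uri = \Uri\Rfc3986\Uri::parse($input);
    return $uri?->getHost();
}

var_dump(hostViaRfc3986('https://example.com/a'));
?>
```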
Return Values
On seriously malformed URLs, parse_url() may return
false.
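Because false and an array are both possible results, callers usually need an explicit check. The following is one possible sketch; the helper name parseUrlOrFail() is hypothetical, and the input is an example of a URL with an empty host, which parse_url() rejects.

```php
<?php
// Hypothetical helper: convert the false return value into an exception.
function parseUrlOrFail(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false) {
        throw new InvalidArgumentException("Unparseable URL: {$url}");
    }
    return $parts;
}

try {
    // Empty host after "http://": treated as seriously malformed.
    parseUrlOrFail('http:///example.com');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}
?>
```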
If the component parameter is omitted, an
associative array is returned. At least one element will be
present within the array. Potential keys within this array are:
- scheme - e.g. http
- host
- port
- user
- pass
- path
- query - after the question mark ?
- fragment - after the hashmark #
If the component parameter is specified,
parse_url() returns a string (or an
int, in the case of PHP_URL_PORT)
instead of an array. If the requested component doesn't exist
within the given URL, null will be returned.
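A short sketch of the component form, using an assumed example URL. It shows the int result for PHP_URL_PORT, the null result for a missing component, and one way to supply a default with the null coalescing operator (the default of 443 is an assumption made for this example).

```php
<?php
$url = 'https://example.com:8443/index.php'; // illustrative URL

$port     = parse_url($url, PHP_URL_PORT);     // int
$host     = parse_url($url, PHP_URL_HOST);     // string
$fragment = parse_url($url, PHP_URL_FRAGMENT); // null: no fragment present

// Fall back to an assumed default when the URL carries no port.
$effectivePort = parse_url('https://example.com/', PHP_URL_PORT) ?? 443;

var_dump($port, $host, $fragment, $effectivePort);
?>
```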
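As of PHP 8.0.0, parse_url() distinguishes absent and empty
queries and fragments:
http://example.com/foo → query = null, fragment = null
http://example.com/foo? → query = "", fragment = null
http://example.com/foo# → query = null, fragment = ""
http://example.com/foo?# → query = "", fragment = ""
Previously all cases resulted in query and fragment being null.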
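In application code this distinction can be used to tell a URL with no query from one with a bare question mark. The sketch below assumes PHP 8.0.0 or later; the helper name queryState() is hypothetical.

```php
<?php
// Hypothetical helper: classify the query part of a URL as
// 'absent', 'empty', or 'present' (relies on PHP 8.0.0+ behavior).
function queryState(string $url): string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if ($query === null || $query === false) {
        // No "?" in the URL, or the URL could not be parsed.
        return 'absent';
    }
    return $query === '' ? 'empty' : 'present';
}

foreach (['http://example.com/foo', 'http://example.com/foo?', 'http://example.com/foo?a=1'] as $url) {
    echo $url, ' => ', queryState($url), PHP_EOL;
}
?>
```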
Note that control characters (cf. ctype_cntrl()) in the
components are replaced with underscores (_).
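The replacement of control characters can be observed with a small sketch. The URL is an illustrative assumption; \x01 and \x1F are control characters embedded through double-quoted string escapes.

```php
<?php
// Illustrative URL with a control character in the path and in the query.
$url = "http://example.com/pa\x01th?na\x1Fme=1";

$parts = parse_url($url);

// Both control characters come back as underscores.
var_dump($parts['path']);
var_dump($parts['query']);
?>
```

Since the returned components may differ from the input in this way, they should not be assumed to round-trip to the original string.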
Examples
Example #1 A parse_url() example
<?php
$url = 'http://username:password@hostname:9090/path?arg=value#anchor';
var_dump(parse_url($url));
var_dump(parse_url($url, PHP_URL_SCHEME));
var_dump(parse_url($url, PHP_URL_USER));
var_dump(parse_url($url, PHP_URL_PASS));
var_dump(parse_url($url, PHP_URL_HOST));
var_dump(parse_url($url, PHP_URL_PORT));
var_dump(parse_url($url, PHP_URL_PATH));
var_dump(parse_url($url, PHP_URL_QUERY));
var_dump(parse_url($url, PHP_URL_FRAGMENT));
?>
The above example will output:
array(8) {
["scheme"]=>
string(4) "http"
["host"]=>
string(8) "hostname"
["port"]=>
int(9090)
["user"]=>
string(8) "username"
["pass"]=>
string(8) "password"
["path"]=>
string(5) "/path"
["query"]=>
string(9) "arg=value"
["fragment"]=>
string(6) "anchor"
}
string(4) "http"
string(8) "username"
string(8) "password"
string(8) "hostname"
int(9090)
string(5) "/path"
string(9) "arg=value"
string(6) "anchor"
Example #2 A parse_url() example with missing scheme
<?php
$url = '//www.example.com/path?googleguy=googley';
// Prior to 5.4.7 this would show the path as "//www.example.com/path"
var_dump(parse_url($url));
?>
The above example will output:
array(3) {
["host"]=>
string(15) "www.example.com"
["path"]=>
string(5) "/path"
["query"]=>
string(17) "googleguy=googley"
}
Notes
Note:
This function is intended specifically for the purpose of parsing URLs
and not URIs. However, to comply with PHP's backwards compatibility
requirements it makes an exception for the file:// scheme where triple
slashes (file:///...) are allowed. For any other scheme this is invalid.
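The file:// exception can be illustrated by parsing the same triple-slash form under two schemes. The path /etc/hosts is only an example value.

```php
<?php
// Triple slashes are accepted for the file scheme...
$file = parse_url('file:///etc/hosts');

// ...but the same shape under http has an empty host and is rejected.
$http = parse_url('http:///etc/hosts');

var_dump($file);
var_dump($http);
?>
```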