When using XMLHttpRequest or another AJAX technique to submit data to a PHP script using GET (or POST with the Content-Type header set to 'application/x-www-form-urlencoded'), you must urlencode your data before you send it. (In fact, if you don't urlencode POST data, MS Internet Explorer may pop up a "syntax error" dialog when you call XMLHttpRequest.send().) But you can't call PHP's urlencode() function in JavaScript, and no native JavaScript function produces exactly the same output as urlencode(). So here is a function to do the job fairly efficiently:
<?php /******
<script type="text/javascript" language="javascript1.6">
// PHP-compatible urlencode() for Javascript
function urlencode(s) {
s = encodeURIComponent(s);
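// encodeURIComponent leaves ! ' ( ) * and ~ unescaped; PHP's urlencode() escapes them
return s.replace(/[!'()*~]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16).toUpperCase();
}).replace(/%20/g,'+');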
}
// sample usage: suppose form has text input fields for
// country, postcode, and city with id='country' and so-on.
// We'll use GET to send values of country and postcode
// to "city_lookup.php" asynchronously, then update city
// field in form with the reply (from database lookup)
function lookup_city() {
var elm_country = document.getElementById('country');
var elm_zip = document.getElementById('postcode');
var elm_city = document.getElementById('city');
var qry = '?country=' + urlencode(elm_country.value) +
'&postcode=' + urlencode(elm_zip.value);
var xhr;
try {
xhr = new XMLHttpRequest(); // recent browsers
} catch (e) {
alert('No XMLHttpRequest!');
return;
}
xhr.open('GET',('city_lookup.php'+qry),true);
xhr.onreadystatechange = function(){
if ((xhr.readyState != 4) || (xhr.status != 200)) return;
elm_city.value = xhr.responseText;
}
xhr.send(null);
}
</script>
******/ ?>
urlencode
(PHP 4, PHP 5)
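urlencode - URL-encodes a string.
Description
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits, and spaces have been replaced with plus (+) signs. This is the same encoding used for posted data from a WWW form, that is, the application/x-www-form-urlencoded media type. It differs from RFC 1738 encoding (see rawurlencode()) in that, for historical reasons, spaces are encoded as plus (+) signs. This function is convenient for encoding a string to be used in the query part of a URL, for example to pass variables to the next page:
Example #1 urlencode() example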
<?php
echo '<a href="mycgi?foo=', urlencode($userinput), '">';
?>
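As a quick illustration of the difference described above, the following sketch encodes one sample string with both functions. The input string and variable names are made up for the example.

```php
<?php
// Sample input (hypothetical): spaces, an ampersand, a colon and a percent sign
$input = 'Tom & Jerry: 100%';

// urlencode(): form-style encoding, spaces become '+'
$plus = urlencode($input);

// rawurlencode(): RFC-style encoding, spaces become '%20'
$raw = rawurlencode($input);

echo $plus, "\n"; // Tom+%26+Jerry%3A+100%25
echo $raw, "\n";  // Tom%20%26%20Jerry%3A%20100%25
```

Both forms decode to the same string with urldecode(); only rawurldecode() would leave a literal '+' untouched.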
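Note: Be careful with variables that may match HTML entities. Things like &amp;amp, &amp;copy and &amp;pound are parsed by the browser, and the actual entity is used instead of the intended variable name. This is a problem the W3C has been pointing out for years. The reference is here: » http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2 PHP supports changing the argument separator to the W3C-suggested semicolon through the arg_separator .ini directive. Unfortunately, most user agents do not send form data in this semicolon-separated format. The most practical way around this is to use &amp;amp; instead of &amp; as the separator. You then don't need to change PHP's arg_separator. Leave it as &amp;, and simply encode your URLs using htmlentities(urlencode($data)).
Example #2 urlencode() and htmlentities() example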
<?php
echo '<a href="mycgi?foo=', htmlentities(urlencode($userinput)), '">';
?>
See also: urldecode(), htmlentities(), rawurldecode(), rawurlencode().
urlencode
23-Sep-2008 04:34
18-Sep-2008 10:31
Regarding issues with %2Fs for slashes in encoded URLs, you simply need to enable the AllowEncodedSlashes directive in Apache:
http://httpd.apache.org/docs/2.2/mod/core.html#allowencodedslashes
Hope that helps
12-Aug-2008 05:12
I'm running PHP version 5.0.5 and urlencode() doesn't seem to encode the "#" character, although the function's description says it encodes "all non-alphanumeric" characters. This was a particular problem for me when trying to open local files with a "#" in the filename as Firefox will interpret this as an anchor target (for better or worse). It seems a manual str_replace is required unless this was fixed in a future PHP version.
Example:
$str = str_replace("#", "%23", $str);
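For comparison, a small check of how "#" is handled. As far as I know, both urlencode() and rawurlencode() turn "#" into %23, so the manual str_replace should not be needed; the file name below is only an example.

```php
<?php
// Hypothetical local file name containing a '#'
$filename = 'report #3.txt';

// Form-style encoding: space becomes '+', '#' becomes %23
$query_form = urlencode($filename);

// Path-style encoding: space becomes %20, '#' becomes %23
$path_form = rawurlencode($filename);

echo $query_form, "\n"; // report+%233.txt
echo $path_form, "\n";  // report%20%233.txt
```

If a "#" still reaches the browser unencoded, it is worth checking whether the string was actually passed through the function before being placed in the link.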
15-Jun-2008 12:53
>> Hi muthuishere, I saw your excellent contribution but couldn't make it work, so I corrected some bits and pieces and ended up with the following:
<?php
function SmartUrlEncode($url){
    // Nothing to encode if there are no key=value pairs
    if (strpos($url, '=') === false):
        return $url;
    else:
        $startpos = strpos($url, "?");
        $tmpurl = substr($url, 0, $startpos+1);
        $qryStr = substr($url, $startpos+1);
        $qryvalues = explode("&", $qryStr);
        foreach ($qryvalues as $key => $value):
            // Split on the first '=' only, so values may contain '='
            $buffer = explode("=", $value, 2);
            if (isset($buffer[1])):
                $buffer[1] = urlencode($buffer[1]);
            endif;
            // Write the encoded pair back into the array
            $qryvalues[$key] = implode("=", $buffer);
        endforeach;
        $finalqrystr = implode("&", $qryvalues);
        $finalURL = $tmpurl . $finalqrystr;
        return $finalURL;
    endif;
}
?>
As you can see, it is very much yours, modified primarily to use '&' instead of '&amp;', and of course with an if test to see whether the input contains anything to encode. Thanks for a great function!
07-Jun-2008 09:50
Most of us may need a function for the case where you have an entire URL but want to encode only the query values, not the rest of the URL and not the parameter names.
The function below takes a URL as input and applies URL encoding only to the parameter values.
/******************************************
For example: www.google.com/search?q=alka rani&start=100
urlencode
==> www.google.com/search?q=alka+rani&start=100
if you change it to rawurlencode
==> www.google.com/search?q=alka%20rani&start=100
**************************************/
function SmartUrlEncode($url){
    // Find the position of the ? mark
    $startpos = stripos($url, "?");
    // Extract the URL part alone (up to and including the ?)
    $tmpurl = substr($url, 0, $startpos+1);
    // Extract the query string alone
    $qryStr = substr($url, $startpos+1);
    // Split the query string into name=value pairs
    $qryvalues = explode("&", $qryStr);
    foreach ($qryvalues as &$value)
    {
        // Split the pair into two parts, i.e. name | value
        $buffer = explode("=", $value);
        // Urlencode only the value; change it to rawurlencode if necessary
        $buffer[1] = urlencode($buffer[1]);
        // Join the parts back together in the array
        $value = implode("=", $buffer);
    }
    // Break the reference left over from the loop
    unset($value);
    // Build the output query string by joining all the pairs with &
    $finalqrystr = implode("&", $qryvalues);
    $finalURL = $tmpurl . $finalqrystr;
    return $finalURL;
}
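A shorter variant of the same idea can be sketched with the built-in parse_str() and http_build_query(). The function name encode_query_values is made up for this example, and the behaviour differs from the function above in one respect: parse_str() decodes its input first, so values that are already encoded are not encoded a second time. Note also that parse_str() rewrites dots and spaces in parameter names to underscores.

```php
<?php
// Hypothetical helper: re-encode only the query part of a URL.
function encode_query_values($url)
{
    $pos = strpos($url, '?');
    if ($pos === false) {
        return $url; // no query string, nothing to do
    }
    $base  = substr($url, 0, $pos);
    $query = substr($url, $pos + 1);

    // Split the query string into an array of decoded values
    parse_str($query, $params);

    // Rebuild it; values are encoded the same way urlencode() does it
    return $base . '?' . http_build_query($params, '', '&');
}

echo encode_query_values('www.google.com/search?q=alka rani&start=100'), "\n";
// www.google.com/search?q=alka+rani&start=100
```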
12-Apr-2008 03:20
Kerdster's function works like a charm. It has only a minor beauty flaw, in my humble opinion: it encodes every character, even the plain ASCII ones. This just doesn't look nice in the browser's address bar. ;-)
Inspired by Mkaganer's utf8_urldecode example in urldecode comments here's the enhanced code:
<?php
function utf16_urlencode ( $str ) {
    # convert characters >= 255 into decimal HTML entities (&#NNNN;)
    $convmap = array( 0xFF, 0x2FFFF, 0, 0xFFFF );
    $str = mb_encode_numericentity( $str, $convmap, "UTF-8");
    # escape HTML entities, so they are not urlencoded
    $str = preg_replace( '/&#([0-9]{3,5});/', 'mark\\1mark', $str );
    $str = urlencode($str);
    # now convert escaped entities into unicode url syntax;
    # the entity value is decimal, so format it as four hex digits
    $str = preg_replace_callback( '/mark([0-9]{3,5})mark/', function ($m) {
        return sprintf('%%u%04X', (int) $m[1]);
    }, $str );
    return $str;
}
?>
Probably the above code could be optimized further, comments are highly appreciated!
Thanks, Simon
28-Feb-2008 04:02
> php dot net at samokhvalov dot com
> 12-Dec-2006 09:49
Thanks for the idea!
Based on your function, I have written a simpler one to simulate the JS function escape(). It uses the mbstring functions instead of iconv.
<?php
function utf16urlencode($str)
{
$str = mb_convert_encoding($str, 'UTF-16', 'UTF-8');
$out = '';
for ($i = 0; $i < mb_strlen($str, 'UTF-16'); $i++)
{
$out .= '%u'.bin2hex(mb_substr($str, $i, 1, 'UTF-16'));
}
return $out;
}
?>
10-Feb-2008 01:34
Some people have difficulties with all the urlencode solutions above. So I decided to sidestep the problem by applying base64_encode several times (note that this only obscures the value; it adds no real security):
//- First page:
<?
$url='mypage.php';
?>
<a href="index.php?page=<? echo encode($url,5); ?>">My page</a>
//- Second page:
<?
$mypage=$_GET['page'];
$mypage=decode($mypage,5);
echo file_get_contents($mypage);
?>
* file_get_contents() will not run your PHP scripts; it only returns their contents.
-----------
function encode($ss,$ntime){
for($i=0;$i<$ntime;$i++){
$ss=base64_encode($ss);
}
return $ss;
}
function decode($ss,$ntime){
for($i=0;$i<$ntime;$i++){
$ss=base64_decode($ss);
}
return $ss;
}
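One thing to watch with this approach: standard base64 output can contain "+", "/" and "=", which themselves have a meaning in URLs. A common workaround is the "URL-safe" base64 alphabet. The sketch below is one way to do it; the function names base64url_encode and base64url_decode are made up for the example.

```php
<?php
// Hypothetical helpers for URL-safe base64 ('-' and '_' instead of '+' and '/').
function base64url_encode($data)
{
    // Swap the two problematic characters and drop the '=' padding
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64url_decode($data)
{
    // Restore the padding so the length is a multiple of 4 again
    $padded = str_pad($data, strlen($data) + (4 - strlen($data) % 4) % 4, '=');
    return base64_decode(strtr($padded, '-_', '+/'));
}

$token = base64url_encode('mypage.php');
echo $token, "\n";
echo base64url_decode($token), "\n"; // mypage.php
```

A single pass is enough for transport; repeating the encoding only makes the string longer.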
07-Feb-2008 04:49
As a reply to: mmj48.com
Your method of replacing just the slash would be BAD practice... UNLESS it was used STRICTLY on the PATH part of the URL.
You must account not only for the query string, but also for the scheme, user, password, and fragment characters (:, /, &, #, etc.).
However, these may change depending on the environment (mainly referring to the &, the query variable separator).
Escaping each of these would also be bad practice, and impractical. Rather, build a class or tool which will generate your URLs and apply the escapes. You could also use the PHP routine parse_url() for some interesting results.
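As a rough sketch of such a tool, the function below assembles a URL from separate parts and encodes each part with the function suited to it. The name build_url and its parameters are invented for this illustration; it is not an existing PHP function.

```php
<?php
// Hypothetical URL builder: each component is encoded on its own.
function build_url($base, array $segments, array $params = array(), $fragment = null)
{
    $url = rtrim($base, '/');

    // Path segments: rawurlencode(), so a '/' inside a segment becomes %2F
    foreach ($segments as $segment) {
        $url .= '/' . rawurlencode($segment);
    }

    // Query string: http_build_query() encodes names and values
    if ($params) {
        $url .= '?' . http_build_query($params, '', '&');
    }

    // Fragment: encoded separately, after the '#'
    if ($fragment !== null) {
        $url .= '#' . rawurlencode($fragment);
    }

    return $url;
}

echo build_url('http://example.com/', array('my files', 'a/b.txt'), array('q' => 'x&y'), 'top'), "\n";
// http://example.com/my%20files/a%2Fb.txt?q=x%26y#top
```

Because the delimiters (/, ?, &, #) are added by the builder and never come from the data, there is no need to decide afterwards which characters to un-escape.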
05-Sep-2007 02:30
Reply to 'peter at mailinator dot com'
If you are having problems using urldecode in PHP following the escape() function in Javascript, try to do a decodeURI() before the escape(). This fixed it for me at least.
Thomas
07-Aug-2007 08:28
What I use instead:
<?php
function escape($url)
{
return str_replace("%2F", "/", urlencode($url));
}
?>
04-Aug-2007 05:04
As "Benjamin dot Bruno at web dot de" wrote earlier, you can have problems passing strings with special characters to Flash. Benjamin wrote that:
<?php
function flash_encode ($input)
{
return rawurlencode(utf8_encode($input));
}
?>
... could solve the problem. Unfortunately, Flash still has problems reading some quotation marks, but this version:
<?php
function flash_encode($string)
{
$string = rawurlencode(utf8_encode($string));
$string = str_replace("%C2%96", "-", $string);
$string = str_replace("%C2%91", "%27", $string);
$string = str_replace("%C2%92", "%27", $string);
$string = str_replace("%C2%82", "%27", $string);
$string = str_replace("%C2%93", "%22", $string);
$string = str_replace("%C2%94", "%22", $string);
$string = str_replace("%C2%84", "%22", $string);
$string = str_replace("%C2%8B", "%C2%AB", $string);
$string = str_replace("%C2%9B", "%C2%BB", $string);
return $string;
}
?>
... should solve this problem.
26-Jul-2007 06:29
I had difficulties with all above solutions. So I applied a dirty simple solution by using:
base64_encode($param)
and
base64_decode($param)
The string's length is a bit longer but no more problem with encoding.
05-Mar-2007 09:04
quote: "Apache's mod_rewrite and mod_proxy are unable to handle urlencoded URLs properly - http://issues.apache.org/bugzilla/show_bug.cgi?id=34602"
The simplest solution is to use urlencode twice!
echo urlencode(urlencode($var));
Apache's mod_rewrite will handle it like a normal string using urlencode once.
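To make the effect of the double encoding concrete, here is a small sketch with a made-up value. After the first pass the slash is %2F; after the second pass the percent sign itself is encoded, so no raw %2F is left for Apache to trip over.

```php
<?php
// Hypothetical value containing a slash and a space
$var = 'docs/a b';

$once  = urlencode($var);  // 'docs%2Fa+b'
$twice = urlencode($once); // 'docs%252Fa%2Bb'

echo $once, "\n";
echo $twice, "\n";

// One decoding step (for example by the web server) gives back the singly encoded form
echo urldecode($twice), "\n"; // docs%2Fa+b
```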
20-Feb-2007 01:23
kL's example is very buggy: the encode function calls itself in a loop, and the replacement works in both directions.
Why do you replace every %27 with ' in the same string in which you replace every ' with %27?
Let's say I have the string: Hello %27World%27. It's a nice day.
I get: Hello 'World'. It%27s a nice day.
In other words, that solution is pretty useless.
Solution:
Just replace ' with %27 when encoding.
Just replace %27 with ' when decoding. Or just use urldecode().
14-Feb-2007 05:32
Another thing to keep in mind is that urlencode() is not Unicode-aware: it encodes the bytes of the string, whatever their character encoding.
For example, urlencoding enquête from an UTF-8 project will produce enqu%C3%AAte.
However, urlencode(utf8_decode('enquête')) produces enqu%EAte, like expected.
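The byte-level behaviour can be seen directly by writing the two encodings of "ê" as explicit byte escapes, which keeps the sketch independent of the encoding of the source file. The variable names are only for illustration.

```php
<?php
// 'enquête' in UTF-8: ê is the two bytes C3 AA
$utf8 = "enqu\xC3\xAAte";

// 'enquête' in ISO-8859-1: ê is the single byte EA
$latin1 = "enqu\xEAte";

echo urlencode($utf8), "\n";   // enqu%C3%AAte
echo urlencode($latin1), "\n"; // enqu%EAte
```

Which of the two is "expected" depends on the encoding the receiving side assumes, so both ends need to agree on one.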
12-Dec-2006 05:18
Addition to the previous note:
to make it work on *nix systems (where big-endian byte order in UTF-16 is being used, in contrast to WIN32) add following lines right after the second iconv():
if (strtoupper(substr(PHP_OS, 0, 3)) !== 'WIN') {
$b = $a;
$a[1] = $b[0];
$a[0] = $b[1];
}
12-Dec-2006 10:49
In the AJAX era you might need to use UCS-2 (UTF-16) URL encoding (characters represented in the form '%uXXXX', e.g. '%u043e' for the Russian 'o'). But PHP is weak at working with multibyte encoded strings, so you cannot simply use urlencode() on a string in UCS-2. Here is a simple function serving this purpose.
Note that this function takes a UTF-8 encoded string as input and then, for internal purposes, uses some 1-byte encoding (cp1251 in my case). If you already have the string in some 1-byte encoding, you may remove the first iconv() and modify the second one, and thus slightly simplify the function.
function utf16urlencode($str)
{
$str = iconv("utf-8", "cp1251", $str);
$res = "";
for ($i = 0; $i < strlen($str); $i++) {
$res .= "%u";
$a = iconv("cp1251", "ucs-2", $str[$i]);
for ($j = 0; $j < strlen($a); $j++) {
$n = dechex(ord($a[$j]));
if (strlen($n) == 1) {
$n = "0$n";
}
$res .= $n;
}
}
return $res;
}
03-Oct-2006 09:54
If you need to prepare strings with special characters (like German Umlaut characters) in order to import them into flash files via GET, try using utf8_encode and rawurlencode sequentially, like this:
<?php
function flash_encode ($input) {
return rawurlencode(utf8_encode($input));
}
?>
Thus, you can avoid having to use encodeURI in JavaScript, which is only available in JavaScript 1.5.
06-Sep-2006 05:13
Apache's mod_rewrite and mod_proxy are unable to handle urlencoded URLs properly - http://issues.apache.org/bugzilla/show_bug.cgi?id=34602
If you need to use any of these modules and handle paths that contain %2F or %3A (and a few other encoded special URL characters), you'll have to use a different encoding scheme.
My solution is to replace "%" with "'".
<?php
function urlencode($u)
{
return str_replace(array("'",'%'),array('%27',"'"),urlencode($u));
}
function urldecode($u)
{
return urldecode(strtr($u,"'",'%'));
}
?>
06-Aug-2006 02:09
I think this was mentioned earlier, but it was confusing. I had some problems with urlencode eating my '/', so I did a simple str_replace like the following:
$url = urlencode($img);
$img2 = "$url";
$img2 = str_replace('%2F', '/', $img2);
$img2 = str_replace('+' , '%20' , $img2);
You don't need to replace the '+', but I just feel comfortable with my %20, although it may present a problem if whatever you're using the str_replace for has a '+' in it where it shouldn't be.
But that fixed my problem. All the other encoders, like htmlentities and rawurlencode, just ate my /'s.
21-Jul-2006 11:11
Be careful when using this function together with the JavaScript escape() function.
In JavaScript, when you try to encode UTF-8 data with escape(), you will get a strangely encoded string which you won't be able to decode with PHP's url(de)encode functions.
I found a website which has a very good tool for this problem: http://www.webtoolkit.info/
It has components which deal with URL encoding and decoding.
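If the data has already been sent through escape(), one option is to decode its %uXXXX and %XX sequences on the PHP side. The sketch below assumes the usual behaviour of escape(): characters below 256 arrive as %XX (a Latin-1 code point) and other characters as %uXXXX. The function name decode_js_escape is made up, the output is UTF-8, and surrogate pairs (characters outside the Basic Multilingual Plane) are not handled.

```php
<?php
// Hypothetical decoder for strings produced by JavaScript's escape().
function decode_js_escape($s)
{
    return preg_replace_callback(
        '/%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})/',
        function ($m) {
            // Group 1 is set for %uXXXX, group 2 for %XX
            $cp = hexdec($m[1] !== '' ? $m[1] : $m[2]);

            // Convert the code point to UTF-8 bytes by hand
            if ($cp < 0x80) {
                return chr($cp);
            }
            if ($cp < 0x800) {
                return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
            }
            return chr(0xE0 | ($cp >> 12))
                 . chr(0x80 | (($cp >> 6) & 0x3F))
                 . chr(0x80 | ($cp & 0x3F));
        },
        $s
    );
}

echo decode_js_escape('%u043F%u0440 caf%E9%20au lait'), "\n";
```

Where the JavaScript side can be changed, sending the data with encodeURIComponent() instead avoids the problem, since PHP's urldecode() understands that format.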
28-Jan-2006 11:58
<?php // urlencode + urldecode for Linux/Unix servers
//==================================================
// This small script converts all encoded strings for
// Linux/Unix servers. For IIS it has to be the other way
// around. And remember: in a proper URL
// there shouldn't be the 'dirty letter' %C3.
//==================================================
function int2hex($intega){
$Ziffer = "0123456789ABCDEF";
return $Ziffer[(int)(($intega%256)/16)].$Ziffer[$intega%16];
}
function url_decode($text){
if(strpos($text,"%C3") === false)
for($i=129;$i<255;$i++){
$in = "%".int2hex($i);
$out = "%C3%".int2hex($i-64);
$text = str_replace($in,$out,$text);
}
return urldecode($text);
}
function url_encode($text){
$text = urlencode($text);
if(strpos($text,"%C3") === false)
for($i=129;$i<255;$i++){
$in = "%".int2hex($i);
$out = "%C3%".int2hex($i-64);
$text = str_replace($in,$out,$text);
}
return $text;
}//==================================================
?>
10-Jan-2006 03:30
This very simple function builds a valid parameter part of a URL. To me it looks like several of the other versions here encode wrongly, as they do not convert the & separating the variables into &amp;amp;.
$vars=array('name' => 'tore','action' => 'sell&buy');
echo MakeRequestUrl($vars);
/* Makes an valid html request url by parsing the params array
* @param $params The parameters to be converted into URL with key as name.
*/
function MakeRequestUrl($params)
{
$querystring=null;
foreach ($params as $name => $value)
{
$querystring=$name.'='.urlencode($value).'&'.$querystring;
}
// Cut the last '&'
$querystring=substr($querystring,0,strlen($querystring)-1);
return htmlentities($querystring);
}
Will output: action=sell%26buy&amp;name=tore
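The same result can be sketched with the built-in http_build_query(), which encodes the values and joins the pairs, followed by htmlspecialchars() for embedding in HTML. The parameters keep their original order here, unlike in the function above, which prepends each pair.

```php
<?php
$vars = array('name' => 'tore', 'action' => 'sell&buy');

// Build the raw query string; the separator is passed explicitly
$query = http_build_query($vars, '', '&');

// Escape it for use inside an HTML attribute
$html = htmlspecialchars($query, ENT_QUOTES, 'UTF-8');

echo $query, "\n"; // name=tore&action=sell%26buy
echo $html, "\n";  // name=tore&amp;action=sell%26buy
```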
22-Nov-2005 07:00
I rewrote inus at flowingcreativity dot net function to generate an encoded url string from the POST, or GET array. It handles properly POST/GET array vars.
function _HTTPRequestToString($arr_request, $var_name, $separator='&') {
$ret = "";
if (is_array($arr_request)) {
foreach ($arr_request as $key => $value) {
if (is_array($value)) {
if ($var_name) {
$ret .= $this->_HTTPRequestToString($value, "{$var_name}[{$key}]", $separator);
} else {
$ret .= $this->_HTTPRequestToString($value, "{$key}", $separator);
}
} else {
if ($var_name) {
$ret .= "{$var_name}[{$key}]=".urlencode($value)."&";
} else {
$ret .= "{$key}=".urlencode($value)."&";
}
}
}
}
if (!$var_name) {
$ret = substr($ret,0,-1);
}
return $ret;
}
31-Oct-2005 08:54
Just remember that according to W3C standards, you must rawurlencode() the link that's provided at the end of a mailto.
i.e.
<a href="mailto:jdoe@some.where.com?Subject=Simple testing(s)&bcc=jane@some.where.com">Mail Me</a>
Needs to be escaped (which rawurlencode() does for us).
The colon is OK after "mailto", as is the "@" after the e-mail name.
However, the rest of the URL needs to be encoded, replacing the following:
'?' => %3F
'=' => %3D
' ' => %20
'(' => %28
')' => %29
'&' => %26
'@' => %40 (note this one is in 'jane@some.where.com')
I tried to post the note with the correct text (that is, with the characters replaced), but it said that there was a line that was too long, and so wouldn't let me add the note.
As a secondary note, I noticed that the auto-conversion routines at this site itself stopped the link at the space after "Simple testing(s)" in the first entry shown above.
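A possible way to generate such a link is sketched below. The helper name mailto_href is invented for the example. It makes one assumption that differs from the table above: the "?", "=" and "&" that separate the header fields are kept literal, as is usual for mailto URLs, and only the field values are passed through rawurlencode(). The result is then HTML-escaped for use in an href attribute.

```php
<?php
// Hypothetical helper: build an HTML-safe mailto: link target.
function mailto_href($to, array $headers = array())
{
    $parts = array();
    foreach ($headers as $name => $value) {
        // Encode names and values, but not the '=' between them
        $parts[] = rawurlencode($name) . '=' . rawurlencode($value);
    }

    $url = 'mailto:' . $to;
    if ($parts) {
        $url .= '?' . implode('&', $parts);
    }

    // Escape the finished URL for the HTML attribute ('&' becomes '&amp;')
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

$href = mailto_href('jdoe@some.where.com', array(
    'Subject' => 'Simple testing(s)',
    'bcc'     => 'jane@some.where.com',
));
echo '<a href="', $href, '">Mail Me</a>', "\n";
```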
02-Sep-2005 01:27
Constructing hyperlinks safely HOW-TO:
<?php
$path_component = 'machine/generated/part';
$url_parameter1 = 'this is a string';
$url_parameter2 = 'special/weird "$characters"';
$url = 'http://example.com/lab/cgi/test/'. rawurlencode($path_component) . '?param1=' . urlencode($url_parameter1) . '&param2=' . urlencode($url_parameter2);
$link_label = "Click here & you'll be <happy>";
echo '<a href="', htmlspecialchars($url), '">', htmlspecialchars($link_label), '</a>';
?>
This example covers all the encodings you need to apply in order to create URLs safely without problems with any special characters. It is stunning how many people make mistakes with this.
In short:
- Use urlencode for all GET parameters (things that come after each "=").
- Use rawurlencode for parts that come before "?".
- Use htmlspecialchars for HTML tag parameters and HTML text content.
26-Aug-2005 01:14
Do not let the browser auto-encode an invalid URL. Not all browsers perform the same encoding. To keep it cross-browser, do it server-side.
12-Aug-2005 10:24
Contrary to the above example, you do not have to encode URLs in hrefs with this function. The browser does it automatically, so you just have to encode them with htmlentities() ;)
04-Jul-2005 09:14
I just came across the need for a function that exports an array into a query string. Being able to use urlencode($theArray) would be nice, but here's what I came up with:
<?php
function urlencode_array(
$var, // the array value
$varName, // variable name to be used in the query string
$separator = '&' // what separating character to use in the query string
) {
$toImplode = array();
foreach ($var as $key => $value) {
if (is_array($value)) {
$toImplode[] = urlencode_array($value, "{$varName}[{$key}]", $separator);
} else {
$toImplode[] = "{$varName}[{$key}]=".urlencode($value);
}
}
return implode($separator, $toImplode);
}
?>
This function supports n-dimensional arrays (it encodes recursively).
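For reference, the built-in http_build_query() (available since PHP 5) covers the same use case, including nested arrays. The sketch below uses made-up data and checks the result by parsing it back with parse_str(). Unlike the function above, it also encodes the square brackets in the parameter names.

```php
<?php
// Hypothetical nested data to send as a query string
$data = array(
    'filter' => array(
        'color' => 'red & blue',
        'size'  => array('s', 'm'),
    ),
);

// Encode recursively; the separator is passed explicitly
$query = http_build_query($data, '', '&');
echo $query, "\n";

// Parsing the string again should give back the same structure
parse_str($query, $roundTrip);
var_dump($roundTrip == $data);
```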
15-Apr-2005 01:48
I was testing my input sanitation with some strange character entities. Ones like and were passed correctly and were in their raw form when I passed them through without any filtering.
However, some weird things happen when dealing with characters like (these are HTML entities): ‼ ▐ ┐and Θ have weird things going on.
If you try to pass one in Internet Explorer, IE will *disable* the submit button. Firefox, however, does something weirder: it will convert it to its HTML entity. It will display properly, but only when you don't convert entities.
The point? Be careful with decorative characters.
PS: If you try copy/pasting one of these characters to a TXT file, it will translate to a ?.
17-Feb-2005 03:49
The information on this page is misleading in that you might think the ampersand (&) only needs to be escaped as &amp;amp; when there is ambiguity with an existing character entity. This is false; the W3C page linked to from here clarifies that ampersands must ALWAYS be escaped.
The following:
<a href='/script.php?variable1=value1&variable2=value2'>Link</a>
is INVALID HTML. It needs to be written as:
<a href='/script.php?variable1=value1&amp;variable2=value2'>Link</a>
in order for the link to go to:
/script.php?variable1=value1&variable2=value2
I applaud the W3C's recommendation to use semicolons (';') instead of the ampersands, but it doesn't really change the fact that you still need to HTML-escape the value of all your HTML tag attributes. The following:
<span title='Rose & Mary'>Some text</span>
is also INVALID HTML. It needs to be escaped as:
<span title='Rose &amp; Mary'>Some text</span>
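In PHP, this escaping is what htmlspecialchars() does. A minimal sketch, using the values from the examples above and single-quoted attributes (hence ENT_QUOTES):

```php
<?php
$href  = '/script.php?variable1=value1&variable2=value2';
$title = 'Rose & Mary';

// Escape each attribute value separately before putting it into the markup
$link = "<a href='" . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . "'>Link</a>";
$span = "<span title='" . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . "'>Some text</span>";

echo $link, "\n";
echo $span, "\n";
```

The browser undoes this escaping when it reads the attribute, so the link still goes to the URL with plain ampersands.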
04-Nov-2004 01:35
---[ Editor's Note ]---
You can also use rawurlencode() here, and skip the functions provided in this note.
---[ /Editor's Note ]---
For handling slashes in redirections, (see comment from cameron at enprises dot com), try this :
function myurlencode ( $TheVal )
{
return urlencode (str_replace("/","%2f",$TheVal));
}
function myurldecode ( $TheVal )
{
return str_replace("%2f","/",urldecode ($TheVal));
}
This is effectively a double urlencode for slashes and single urlencode for everything else. So, it is more "standardised" than his suggestion of using a + sign, and more readable (and search engine indexable) than a full double encode/decode.
17-Sep-2004 01:51
Be careful when encoding strings that came from simplexml in PHP 5. If you try to urlencode a simplexml object, the script tanks.
I got around the problem by using a cast.
$newValue = urlencode( (string) $oldValue );
09-Sep-2004 01:00
If you want to pass a url with parameters as a value IN a url AND through a javascript function, such as...
<a href="javascript:openWin('page.php?url=index.php?id=4&pg=2');">
...pass the url value through the PHP urlencode() function twice, like this...
<?php
$url = "index.php?id=4&pg=2";
$url = urlencode(urlencode($url));
echo "<a href=\"javascript:openWin('page.php?url=$url');\">";
?>
On the page being opened by the javascript function (page.php), you only need to urldecode() once, because when javascript 'touches' the url that passes through it, it decodes the url once itself. So, just decode it once more in your PHP script to fully undo the double-encoding...
<?php
$url = urldecode($_GET['url']);
?>
If you don't do this, you'll find that the result url value in the target script is missing all the var=values following the ? question mark...
index.php?id=4
07-Oct-2002 06:53
Just a simple comment, really, but if you need to encode apostrophes, you should be using rawurlencode as opposed to just urlencode.
Naturally, I figured that out the hard way.
