downloads | documentation | faq | getting help | mailing lists | licenses | wiki | reporting bugs | php.net sites | conferences | my php.net

search for in the

stripcslashes> <strcspn
[edit] Last updated: Fri, 26 Apr 2013

view this page in

strip_tags

(PHP 4, PHP 5)

strip_tagsStrip HTML and PHP tags from a string

Description

string strip_tags ( string $str [, string $allowable_tags ] )

This function tries to return a string with all NUL bytes, HTML and PHP tags stripped from a given str. It uses the same tag stripping state machine as the fgetss() function.

Parameters

str

The input string.

allowable_tags

You can use the optional second parameter to specify tags which should not be stripped.

Note:

HTML comments and PHP tags are also stripped. This is hardcoded and can not be changed with allowable_tags.

Note:

This parameter should not contain whitespace. strip_tags() sees a tag as a case-insensitive string between < and the first whitespace or >. It means that strip_tags("<br/>", "<br>") returns an empty string.

Return Values

Returns the stripped string.

Changelog

Version Description
5.0.0 strip_tags() is now binary safe.
4.3.0 HTML comments are now always stripped.

Examples

Example #1 strip_tags() example

<?php
$text 
'<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo 
strip_tags($text);
echo 
"\n";

// Allow <p> and <a>
echo strip_tags($text'<p><a>');
?>

The above example will output:

Test paragraph. Other text
<p>Test paragraph.</p> <a href="#fragment">Other text</a>

Notes

Warning

Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more text/data than expected.

Warning

This function does not modify any attributes on the tags that you allow using allowable_tags, including the style and onmouseover attributes that a mischievous user may abuse when posting text that will be shown to other users.

Note:

Tag names within the input HTML that are greater than 1023 bytes in length will be treated as though they are invalid, regardless of the allowable_tags parameter.

See Also



stripcslashes> <strcspn
[edit] Last updated: Fri, 26 Apr 2013
 
add a note add a note User Contributed Notes strip_tags - [19 notes]
up
9
CEO at CarPool2Camp dot org
4 years ago
Note the different outputs from different versions of the same tag:

<?php // striptags.php
$data = '<br>Each<br/>New<br />Line';
$new  = strip_tags($data, '<br>');
var_dump($new);  // OUTPUTS string(21) "<br>EachNew<br />Line"

<?php // striptags.php
$data = '<br>Each<br/>New<br />Line';
$new  = strip_tags($data, '<br/>');
var_dump($new); // OUTPUTS string(16) "Each<br/>NewLine"

<?php // striptags.php
$data = '<br>Each<br/>New<br />Line';
$new  = strip_tags($data, '<br />');
var_dump($new); // OUTPUTS string(11) "EachNewLine"
?>
up
5
admin at automapit dot com
6 years ago
<?php
function html2txt($document){
$search = array('@<script[^>]*?>.*?</script>@si'// Strip out javascript
              
'@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags
              
'@<style[^>]*?>.*?</style>@siU',    // Strip style tags properly
              
'@<![\s\S]*?--[ \t\n\r]*>@'         // Strip multi-line comments including CDATA
);
$text = preg_replace($search, '', $document);
return
$text;
}
?>

This function turns HTML into text... strips tags, comments spanning multiple lines including CDATA, and anything else that gets in it's way.

It's a frankenstein function I made from bits picked up on my travels through the web, thanks to the many who have unwittingly contributed!
up
4
tom at cowin dot us
2 years ago
With most web based user input of more than a line of text, it seems I get 90% 'paste from Word'. I've developed this fn over time to try to strip all of this cruft out. A few things I do here are application specific, but if it helps you - great, if you can improve on it or have a better way - please - post it...

<?php

   
function strip_word_html($text, $allowed_tags = '<b><i><sup><sub><em><strong><u><br>')
    {
       
mb_regex_encoding('UTF-8');
       
//replace MS special characters first
       
$search = array('/&lsquo;/u', '/&rsquo;/u', '/&ldquo;/u', '/&rdquo;/u', '/&mdash;/u');
       
$replace = array('\'', '\'', '"', '"', '-');
       
$text = preg_replace($search, $replace, $text);
       
//make sure _all_ html entities are converted to the plain ascii equivalents - it appears
        //in some MS headers, some html entities are encoded and some aren't
       
$text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
       
//try to strip out any C style comments first, since these, embedded in html comments, seem to
        //prevent strip_tags from removing html comments (MS Word introduced combination)
       
if(mb_stripos($text, '/*') !== FALSE){
           
$text = mb_eregi_replace('#/\*.*?\*/#s', '', $text, 'm');
        }
       
//introduce a space into any arithmetic expressions that could be caught by strip_tags so that they won't be
        //'<1' becomes '< 1'(note: somewhat application specific)
       
$text = preg_replace(array('/<([0-9]+)/'), array('< $1'), $text);
       
$text = strip_tags($text, $allowed_tags);
       
//eliminate extraneous whitespace from start and end of line, or anywhere there are two or more spaces, convert it to one
       
$text = preg_replace(array('/^\s\s+/', '/\s\s+$/', '/\s\s+/u'), array('', '', ' '), $text);
       
//strip out inline css and simplify style tags
       
$search = array('#<(strong|b)[^>]*>(.*?)</(strong|b)>#isu', '#<(em|i)[^>]*>(.*?)</(em|i)>#isu', '#<u[^>]*>(.*?)</u>#isu');
       
$replace = array('<b>$2</b>', '<i>$2</i>', '<u>$1</u>');
       
$text = preg_replace($search, $replace, $text);
       
//on some of the ?newer MS Word exports, where you get conditionals of the form 'if gte mso 9', etc., it appears
        //that whatever is in one of the html comments prevents strip_tags from eradicating the html comment that contains
        //some MS Style Definitions - this last bit gets rid of any leftover comments */
       
$num_matches = preg_match_all("/\<!--/u", $text, $matches);
        if(
$num_matches){
             
$text = preg_replace('/\<!--(.)*--\>/isu', '', $text);
        }
        return
$text;
    }
?>
up
4
mariusz.tarnaski at wp dot pl
4 years ago
Hi. I made a function that removes the HTML tags along with their contents:

Function:
<?php
function strip_tags_content($text, $tags = '', $invert = FALSE) {

 
preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
 
$tags = array_unique($tags[1]);
   
  if(
is_array($tags) AND count($tags) > 0) {
    if(
$invert == FALSE) {
      return
preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      return
preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif(
$invert == FALSE) {
    return
preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return
$text;
}
?>

Sample text:
$text = '<b>sample</b> text with <div>tags</div>';

Result for strip_tags($text):
sample text with tags

Result for strip_tags_content($text):
 text with

Result for strip_tags_content($text, '<b>'):
<b>sample</b> text with

Result for strip_tags_content($text, '<b>', TRUE);
 text with <div>tags</div>

I hope that someone is useful :)
up
1
cesar at nixar dot org
7 years ago
Here is a recursive function for strip_tags like the one showed in the stripslashes manual page.

<?php
function strip_tags_deep($value)
{
  return
is_array($value) ?
   
array_map('strip_tags_deep', $value) :
   
strip_tags($value);
}

// Example
$array = array('<b>Foo</b>', '<i>Bar</i>', array('<b>Foo</b>', '<i>Bar</i>'));
$array = strip_tags_deep($array);

// Output
print_r($array);
?>
up
3
bzplan at web dot de
6 months ago
a HTML code like this:

<?php
$html
= '
<div>
<p style="color:blue;">color is blue</p><p>size is <span style="font-size:200%;">huge</span></p>
<p>material is wood</p>
</div>
'
;
?>

with <?php $str = strip_tags($html); ?>
... the result is:

$str = 'color is bluesize is huge
material is wood';

notice: the words 'blue' and 'size' grow together :(
and line-breaks are still in new string $str

if you need a space between the words (and without line-break)
use my function: <?php $str = rip_tags($html); ?>
... the result is:

$str = 'color is blue size is huge material is wood';

the function:

<?php
// --------------------------------------------------------------

function rip_tags($string) {
   
   
// ----- remove HTML TAGs -----
   
$string = preg_replace ('/<[^>]*>/', ' ', $string);
   
   
// ----- remove control characters -----
   
$string = str_replace("\r", '', $string);    // --- replace with empty space
   
$string = str_replace("\n", ' ', $string);   // --- replace with space
   
$string = str_replace("\t", ' ', $string);   // --- replace with space
   
    // ----- remove multiple spaces -----
   
$string = trim(preg_replace('/ {2,}/', ' ', $string));
   
    return
$string;

}

// --------------------------------------------------------------
?>

the KEY is the regex pattern: '/<[^>]*>/'
instead of strip_tags()
... then remove control characters and multiple spaces
:)
up
0
kai at froghh dot de
4 years ago
a function that decides if < is a start of a tag or a lower than / lower than + equal:

<?php
function lt_replace($str){
    return
preg_replace("/<([^[:alpha:]])/", '&lt;\\1', $str);
}
?>

It's to be used before strip_slashes.
up
-1
brettz9 AAT yah
4 years ago
Works on shortened <?...?> syntax and thus also will remove XML processing instructions.
up
-2
cyex at hotmail dot com
2 years ago
I thought someone else might find this useful... a simple way to strip BBCode:

<?php

$bbcode_str
= "Here is some [b]bold text[/b] and some [color=#FF0000]red text[/color]!";

$plain_text = strip_tags(str_replace(array('[',']'), array('<','>'), $bbcode_str));

//Outputs: Here is some bold text, and some red text!

?>
up
-2
jausions at php dot net
6 years ago
To sanitize any user input, you should also consider PEAR's HTML_Safe package.

http://pear.php.net/package/HTML_Safe
up
-2
salavert at~ akelos
7 years ago
<?php
      
/**
    * Works like PHP function strip_tags, but it only removes selected tags.
    * Example:
    *     strip_selected_tags('<b>Person:</b> <strong>Salavert</strong>', 'strong') => <b>Person:</b> Salavert
    */

   
function strip_selected_tags($text, $tags = array())
    {
       
$args = func_get_args();
       
$text = array_shift($args);
       
$tags = func_num_args() > 2 ? array_diff($args,array($text))  : (array)$tags;
        foreach (
$tags as $tag){
            if(
preg_match_all('/<'.$tag.'[^>]*>(.*)<\/'.$tag.'>/iU', $text, $found)){
               
$text = str_replace($found[0],$found[1],$text);
          }
        }

        return
$text;
    }

?>

Hope you find it useful,

Jose Salavert
up
-2
Sam
3 months ago
Note that js comments are also removed by strip_tags as stripping the tags from the following sample will return an empty string

<script type="text/javascript">
   // Add custom parameters here.
</script>
up
-2
hongong at webafrica dot org dot za
4 years ago
An easy way to clean a string of all CDATA encapsulation.

<?php
function strip_cdata($string)
{
   
preg_match_all('/<!\[cdata\[(.*?)\]\]>/is', $string, $matches);
    return
str_replace($matches[0], $matches[1], $string);
}
?>

Example: echo strip_cdata('<![CDATA[Text]]>');
Returns: Text
up
-2
southsentry at yahoo dot com
4 years ago
I was looking for a simple way to ban html from review posts, and the like. I have seen a few classes to do it. This line, while it doesn't strip the post, effectively blocks people from posting html in review and other forms.

<?php
if (strlen(strip_tags($review)) < strlen($review)) {
    return
false;
}
?>

If you want to further get by the tricksters that use & for html links, include this:

<?php
if (strlen(strip_tags($review)) < strlen($review)) {
        return
false;
} elseif (
strpos($review, "&") !== false) {
        return
5;
}
?>

I hope this helps someone out!
up
-2
chrisj at thecyberpunk dot com
11 years ago
strip_tags has doesn't recognize that css within the style tags are not document text. To fix this do something similar to the following:

$htmlstring = preg_replace("'<style[^>]*>.*</style>'siU",'',$htmlstring);
up
-3
Liam Morland
4 years ago
Here is a suggestion for getting rid of attributes: After you run your HTML through strip_tags(), use the DOM interface to parse the HTML. Recursively walk through the DOM tree and remove any unwanted attributes. Serialize the DOM back to the HTML string.

Don't make the default permit mistake: Make a list of the attributes you want to ALLOW and remove any others, rather than removing a specific list, which may be missing something important.
up
-3
Anonymous User
8 years ago
Be aware that tags constitute visual whitespace, so stripping may leave the resulting text looking misjoined.

For example,

"<strong>This is a bit of text</strong><p />Followed by this bit"

are seperable paragraphs on a visual plane, but if simply stripped of tags will result in

"This is a bit of textFollowed by this bit"

which may not be what you want, e.g. if you are creating an excerpt for an RSS description field.

The workaround is to force whitespace prior to stripping, using something like this:

<?php
      $text
= getTheText();
     
$text = preg_replace('/</',' <',$text);
     
$text = preg_replace('/>/','> ',$text);
     
$desc = html_entity_decode(strip_tags($text));
     
$desc = preg_replace('/[\n\r\t]/',' ',$desc);
     
$desc = preg_replace('/  /',' ',$desc);
?>
up
-5
Abdul Al-hasany
2 years ago
As noted in the documentation strip_tags would strip php and comments tags even if they are add to $allowable_tags.

Here is a little workaround for this issue:
<?php
function stripTags($text, $tags)
{
 
 
// replace php and comments tags so they do not get stripped 
 
$text = preg_replace("@<\?@", "#?#", $text);
 
$text = preg_replace("@<!--@", "#!--#", $text);
 
 
// strip tags normally
 
$text = strip_tags($text, $tags);
 
 
// return php and comments tags to their origial form
 
$text = preg_replace("@#\?#@", "<?", $text);
 
$text = preg_replace("@#!--#@", "<!--", $text);
 
  return
$text;
}
?>

The function would replace the tags to hashes so strip_tags would not identify them as normal tags, and then when strip_tags does its job the tags are modified back to their original form.
up
-8
Kalle Sommer Nielsen
5 years ago
This adds alot of missing javascript events on the strip_tags_attributes() function from below entries.

Props to MSDN for lots of them ;)

<?php
function strip_tags_attributes($sSource, $aAllowedTags = array(), $aDisabledAttributes = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavaible', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragdrop', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterupdate', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmoveout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload'))
    {
        if (empty(
$aDisabledAttributes)) return strip_tags($sSource, implode('', $aAllowedTags));

        return
preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . implode('|', $aDisabledAttributes) . ")[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource, implode('', $aAllowedTags)));
    }
?>

 
show source | credits | stats | sitemap | contact | advertising | mirror sites