Strip all HTML tags with Perl like PHP’s strip_tags() does
The Perl regular expression (regexp/regex) equivalent to PHP’s strip_tags() is:
while ($string =~ s/<\S[^<>]*(?:>|$)//gs) {};
Please note that it also denotes an opening “<” (followed by a non-whitespace character) as a tag and strips all characters behind, even it is not closed by a “>”. This is the same behavior as PHP’s strip_tags().
Update: This regexp is only satisfying my test against PHP 4.x, but 5.x is pretty smarter when it comes to edge cases. It will be a challenge to build a Perl equivalent as all the different approaches in CPAN also fail the test.
