Hinnerk Altenburg

Web Developer in Hamburg, Germany

Archive for the ‘Perl’ tag

Perl: Handle malformed UTF-8 strings with Encode::encode

without comments

After finding the error message “Malformed UTF-8 character (fatal)” in my log files, I wanted to handle it properly, neither letting the process die nor throwing away the whole string.
After some research on Google I came up with the following solution:
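
The idea can be sketched as follows. This is a minimal sketch under assumptions, not the original listing: it assumes the input arrives as raw bytes, and the helper names sanitize_utf8 and is_valid_utf8 are hypothetical. With Encode's default CHECK value (Encode::FB_DEFAULT), each malformed sequence is replaced by the substitution character U+FFFD instead of raising a fatal error, so the rest of the string survives.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode raw bytes as UTF-8 and replace malformed
# sequences with U+FFFD instead of dying.
sub sanitize_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # work on a copy so the caller's data stays untouched
    return decode('UTF-8', $copy, Encode::FB_DEFAULT);
}

# Hypothetical helper: report whether the bytes are well-formed UTF-8.
# FB_CROAK makes decode() die on malformed input; eval catches that.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its input when CHECK is set
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

my $raw   = "abc\xE4def";          # a lone \xE4 byte is not valid UTF-8
my $clean = sanitize_utf8($raw);   # the text around the bad byte is kept

binmode(STDOUT, ':encoding(UTF-8)');
print "valid before: ", is_valid_utf8($raw), "\n";
print "sanitized:    $clean\n";
```

Checking with is_valid_utf8 first is optional. It is useful when malformed input should also be logged before it is repaired.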

See also:
http://perldoc.perl.org/Encode.html#Handling-Malformed-Data
http://www.perlmonks.org/?node_id=839519

Written by Hinnerk

August 31st, 2011 at 4:58 pm

Posted in English


Set a custom HTTP User-Agent in Perl with WWW::Mechanize

with 2 comments

This is how you can dynamically set a custom HTTP User-Agent for your Perl requests, either to fake a device or browser for testing purposes or to get a device-specific version of a website.
WWW::Mechanize supports setting a custom user agent in the constructor; after that, it offers only a choice of six predefined basic user agents via $mech->agent_alias().

The following code demonstrates how to dynamically change the user-agent on a Mechanize object.
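
A minimal sketch of how this could look, assuming the agent() accessor that WWW::Mechanize inherits from LWP::UserAgent. The user-agent strings, the %user_agents table and the set_user_agent helper are hypothetical and only serve as illustration.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical user-agent strings, for illustration only.
my %user_agents = (
    phone  => 'ExamplePhoneBrowser/1.0 (hypothetical mobile device)',
    tablet => 'ExampleTabletBrowser/2.0 (hypothetical tablet)',
);

# Set an initial user agent in the constructor.
my $mech = WWW::Mechanize->new( agent => 'ExampleCrawler/0.1' );

# Hypothetical helper: switch the user agent on an existing object.
sub set_user_agent {
    my ($mech, $key) = @_;
    my $ua = $user_agents{$key} or die "Unknown user agent key: $key";
    $mech->agent($ua);    # agent() is inherited from LWP::UserAgent
    return $ua;
}

for my $key (sort keys %user_agents) {
    set_user_agent($mech, $key);
    print "Now sending: ", $mech->agent, "\n";
    # A request made here, e.g. $mech->get($url), would carry this User-Agent header.
}
```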

Written by Hinnerk

June 29th, 2011 at 9:21 pm

Posted in English


Strip all HTML tags with Perl like PHP’s strip_tags() does

with 4 comments

The Perl regular expression (regexp/regex) equivalent to PHP’s strip_tags() is:

while ($string =~ s/<\S[^<>]*(?:>|$)//gs) {};

Please note that it also treats an opening “<” followed by a non-whitespace character as a tag and strips all characters after it, even if it is not closed by a “>”. This is the same behavior as PHP’s strip_tags().
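
A short usage sketch of the regexp above, wrapped in a function so its behavior is easier to see. The name strip_tags_simple and the sample strings are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the regexp shown above.
sub strip_tags_simple {
    my ($string) = @_;
    # Repeat until no substitution happens any more: removing one tag
    # can expose another match that was blocked by a later "<".
    1 while $string =~ s/<\S[^<>]*(?:>|$)//gs;
    return $string;
}

print strip_tags_simple('<b>Hello</b> <i>World</i>'), "\n";  # Hello World
print strip_tags_simple('1 < 2 and <em>more</em>'), "\n";    # 1 < 2 and more
print strip_tags_simple('Click <a href="x"'), "\n";          # "Click " (unclosed tag stripped)
```

The second call shows that a “<” followed by whitespace is left alone, and the third shows the unclosed-tag behavior described above.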

Update: This regexp only passes my tests against PHP 4.x; PHP 5.x is considerably smarter when it comes to edge cases. Building a Perl equivalent will be a challenge, as all the different approaches on CPAN also fail the test.

Update 2010-07-07: I’m currently porting strip_tags() from the C source code of PHP 5.3.2 to a CPAN Module. Stay tuned.

Update 2011-05-25: Today I finally uploaded my Perl port to CPAN: http://search.cpan.org/~hinnerk/HTML-StripTags-1.00/
The new home of this module is http://www.hinnerk-altenburg.de/perl-strip_tags/

Written by Hinnerk

December 23rd, 2009 at 2:30 pm

Posted in English


PerlIDS article published in the German Perl magazine $foo

without comments

My four-page article on the Perl CPAN module CGI::IDS, a website intrusion detection system, has been published in the current issue 1/2009 of the German Perl magazine $foo.
In it I give an overview of how PerlIDS works and how it can be used for early detection of cross-site scripting, SQL injection and similar attacks on web applications.

Written by Hinnerk

February 3rd, 2009 at 6:27 pm

Posted in Deutsch


OpenSource Perl Website Intrusion Detection System PerlIDS (CGI::IDS) released

with one comment

Today, we at epublica have officially released my work of the past few months: a Perl port of PHPIDS, a tool for detecting Cross-Site-Scripting (XSS), Cross-Site-Request-Forgery (CSRF), SQL-Injections (SQLI), Local-File-Inclusions (LFI) etc. in website requests.
The tool is released on CPAN.org as the Perl module CGI::IDS (“PerlIDS”) under the open-source GNU Lesser General Public License (LGPL).


Written by Hinnerk

November 6th, 2008 at 1:36 pm

Posted in English


My New Jobs since May 2008

without comments

Since May, I have been employed by epublica GmbH, Hamburg, doing Perl development mainly for the XING Web platform. Have a look at their brand new office in the heart of the city, upstairs from XING.

I am also working as a freelancer for the TYPO3 agency EXINIT GmbH & Co. KG, Hamburg, doing TYPO3 extension development in PHP.

Written by Hinnerk

June 25th, 2008 at 12:10 am