
barbie (2653)

Leader of [] and a CPAN author. Co-organised YAPC::Europe in 2006 and the 2009 QA Hackathon, responsible for the YAPC Conference Surveys and the QA Hackathon websites. Also the current caretaker for the CPAN Testers websites and data stores.

If you really want to find out more, buy me a Guinness ;)

Memoirs of a Roadie
CPAN Testers Reports
YAPC Conference Surveys
QA Hackathon

Journal of barbie (2653)

Monday August 10, 2009
08:23 AM

Another Data::FormValidator Filter

[ #39438 ]

At YAPC::Europe 2009 last week, I launched the Conference Survey during the final keynote, and almost immediately people began submitting their responses. I'll be posting more about the surveys later in the week, but this post concerns itself with a specific technical aspect.

Smylers, being a rather clever fellow, likes to find the edge cases. He found one such edge case in the survey submissions, and although it wasn't a vulnerability, it was potentially presenting a misleading error to users. The problem arose from the use of what are usually referred to as Microsoft "smart" characters. These are characters that don't map cleanly onto standard character sets such as ISO-8859-1 and Unicode, because Windows-1252 places them in a byte range (0x80 to 0x9F) that those standards reserve for control characters (see Wikipedia for more details).
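To make the byte range issue concrete, here is a small sketch using the core Encode module. It is only an illustration and not code from the survey system: it decodes the same single byte first as Windows-1252 and then as ISO-8859-1, and prints the code point each interpretation yields.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Byte 0x93 is an opening curly double quote in Windows-1252,
# but falls in the C1 control range when read as ISO-8859-1.
my $byte = "\x93";

my $as_cp1252 = decode('cp1252',     $byte);
my $as_latin1 = decode('iso-8859-1', $byte);

printf "cp1252:     U+%04X\n", ord $as_cp1252;   # U+201C
printf "iso-8859-1: U+%04X\n", ord $as_latin1;   # U+0093
```

The same byte becomes a printable quote under one decoding and a control character under the other, which is why a validator that rejects control characters can end up rejecting "smart" punctuation.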

Smylers had entered an en-dash character and some double quote characters from a Windows machine, and had attempted to submit one of the talk feedback forms. The result was a rather confusing error. The reason was that the backend of the survey system had deleted the field containing the smart characters, because they fell in a range not accepted as string characters by the validation code, and had flagged it as an input error. The solution was to add a filter to the Data::FormValidator profile and translate the characters into something more sensible, before validating the input string. Which is what I did.
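As a rough sketch of the idea, and not the actual code of the new module, a filter of this kind might look like the following. The function name demoronise_sketch, the mapping table and the field name comments are all made up for illustration. Data::FormValidator accepts a code reference in a profile's filters list, so the sketch builds a profile of that shape, but applies the filter by hand so that it runs with core Perl only.

```perl
use strict;
use warnings;

# Hypothetical mapping, not the module's real table. Each character is
# listed both as its Unicode code point and as the CP1252 byte value it
# has when the input arrives undecoded.
my %PLAIN = (
    "\x{2018}" => "'",  "\x{91}" => "'",   # left single quote
    "\x{2019}" => "'",  "\x{92}" => "'",   # right single quote
    "\x{201C}" => '"',  "\x{93}" => '"',   # left double quote
    "\x{201D}" => '"',  "\x{94}" => '"',   # right double quote
    "\x{2013}" => '-',  "\x{96}" => '-',   # en dash
    "\x{2014}" => '--', "\x{97}" => '--',  # em dash
);

# Replace each known "smart" character with its plain ASCII stand-in.
sub demoronise_sketch {
    my ($value) = @_;
    return $value unless defined $value;
    $value =~ s/([\x{91}-\x{94}\x{96}\x{97}\x{2018}\x{2019}\x{201C}\x{201D}\x{2013}\x{2014}])/$PLAIN{$1}/g;
    return $value;
}

# Shape of a profile using the sketch as a filter. With the module
# installed this would be passed to Data::FormValidator->check(\%input,
# \%profile), which runs filters before the constraints.
my %profile = (
    required => [qw(comments)],
    filters  => [ \&demoronise_sketch ],
);

# Apply the filter by hand to show the effect.
my %input = ( comments => "\x{201C}Great talk\x{201D} \x{2013} thanks" );
my %clean = map { $_ => demoronise_sketch($input{$_}) } keys %input;
print "$clean{comments}\n";   # "Great talk" - thanks
```

Because the translation happens before validation, the field reaches the constraints as plain ASCII and is no longer rejected for characters the user never knew were a problem.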

As a result, Data-FormValidator-Filters-Demoroniser is now winging its way to CPAN. The code has been in the backend system for some time, just not in the right place to pre-validate input strings. As it turned out, it was much easier to abstract it and create a new module than to rewrite some of the internal code.

My thanks to Smylers for initially spotting and reporting the bug, the guys behind Data::FormValidator for making it so easy to add the filter, and Dave Wheeler for already implementing many of the translations via his Encode::ZapCP1252 module.

  • I mean at the time of converting byte stream into characters.
  • Actually they were standard Unicode characters. And I wasn't actually trying to find edge cases; I was just aiming for nice typography and stumbled upon the bug by accident!

    For the record, I'd like it to be known I wasn't anywhere near Windows! I was actually using Ubuntu Linux running Gnome. Keyboard preferences lets you define a 'compose' key (I chose Caps Lock, cos that isn't something I ever use) then you can type sequences like Compose --- to get an em dash, or Compose "< to get opening curly q