All the Perl that's Practical to Extract and Report

  • decodes the non-standard (and, according to RFC 3986, invalid) percent escape into a UTF-8 octet string, but it does not decode it into a Perl Unicode string. I think the current behavior is desirable, since the data can contain arbitrary octets in any encoding.


  • > %u131

    What sort of encoding is that? I mean, I can see it's the Unicode codepoint preceded by %u, but which standard backs this? I've never encountered this before.

    Here's my take on it:

    use CGI qw();
    use Encode qw(decode_utf8);

    my $input  = 'a=%C4%B1';
    my $expect = "\x{131}";
    my $got    = decode_utf8(CGI->new($input)->param('a'));
    # as per best practice

    use Devel::Peek qw(Dump); Dump $expect; Dump $got;

    print $expect eq $got
      ? "ok $] $

    • It usually comes from broken JavaScript applications that use escape() instead of encodeURI():

      escape("\u263A") -> %u263A
      encodeURI("\u263A") -> %E2%98%BA
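
      To make the difference concrete, here is a minimal sketch that runs both functions and then rewrites the legacy form as standard UTF-8 percent-escapes. The helper normalizeLegacyEscapes is hypothetical (it is not part of any library), and it assumes four-hex-digit escapes of the kind escape() emits, so a short form such as %u131 would pass through unchanged.

```javascript
// Hypothetical helper (not from any library): rewrite legacy %uXXXX
// escapes as standard UTF-8 percent-escapes.
function normalizeLegacyEscapes(input) {
  // Match whole runs of %uXXXX so that surrogate pairs such as
  // %uD83D%uDE00 are decoded together as one character.
  return input.replace(/(?:%u[0-9a-fA-F]{4})+/g, (run) => {
    const codeUnits = run
      .split("%u")
      .filter((hex) => hex !== "")
      .map((hex) => parseInt(hex, 16));
    // encodeURIComponent throws URIError on a lone surrogate,
    // so malformed input is rejected rather than silently mangled.
    return encodeURIComponent(String.fromCharCode(...codeUnits));
  });
}

const legacy = escape("\u263A");       // deprecated global, emits %uXXXX
const standard = encodeURI("\u263A");  // UTF-8 based percent-encoding
const repaired = normalizeLegacyEscapes("a=" + legacy);

console.log(legacy);    // %u263A
console.log(standard);  // %E2%98%BA
console.log(repaired);  // a=%E2%98%BA
```

      Fixing the client to call encodeURI() (or encodeURIComponent() for a single parameter value) is probably the cleaner route; a rewrite step like this only helps when the sender cannot be changed.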


  • Did you try 'use CGI qw/ :utf8 /;'? That seems to work the way you want with CGI 3.49 (at least it does on my box).