
All the Perl that's Practical to Extract and Report

  • > %u131

    What sort of encoding is that? I mean, I can see it's the Unicode codepoint preceded by %u, but which standard backs this? I've never encountered this before.

    Here's my take on it:

    use CGI qw();
    use Encode qw(decode_utf8);

    my $input  = 'a=%C4%B1';
    my $expect = "\x{131}";
    my $got    = decode_utf8(CGI->new($input)->param('a'));
    # as per best practice http://search.cpan.org/perldoc?CGI#-utf8

    use Devel::Peek qw(Dump); Dump $expect; Dump $got;

    print $expect eq $got
      ? "ok $] $CGI::VERSION"
      : "not ok $] $CGI::VERSION";

    __DATA__
    SV = PV(0x88bc40) at 0x8c12f8
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK,UTF8)
      PV = 0x8aaad0 "\304\261"\0 [UTF8 "\x{131}"]
      CUR = 2
      LEN = 8
    SV = PV(0xac9e60) at 0x8c13e8
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK,UTF8)
      PV = 0xad5740 "\304\261"\0 [UTF8 "\x{131}"]
      CUR = 2
      LEN = 8
    ok 5.010001 3.48

    • It usually comes from broken JavaScript applications that use escape() instead of encodeURI():


      escape("\u263A") -> %u263A
      encodeURI("\u263A") -> %E2%98%BA
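
      For the character from the question (U+0131), the difference can be sketched roughly as follows. This is only an illustration: decodePercentU is a hypothetical helper name, not something from a library, and it handles nothing beyond the %uXXXX form.

```javascript
// Sketch only: decodePercentU is a hypothetical helper, not part of any library.
const dotlessI = "\u0131"; // U+0131, the character from the question

// escape() is a legacy function: code units above 0xFF become %uXXXX.
const legacy = escape(dotlessI);

// encodeURI() percent-encodes the UTF-8 bytes of the character.
const standard = encodeURI(dotlessI);

// Turn each %uXXXX sequence back into the UTF-16 code unit it stands for.
// Plain %XX sequences are left untouched.
function decodePercentU(input) {
  return input.replace(/%u([0-9A-Fa-f]{4})/g, (_match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

console.log(legacy);   // %u0131
console.log(standard); // %C4%B1
console.log(decodePercentU(legacy) === dotlessI); // true
```

      The encodeURI() result, %C4%B1, is the same byte sequence used as input in the Perl test above, which is why that one decodes cleanly as UTF-8.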

      --
      chansen