All the Perl that's Practical to Extract and Report

  • I've been surprised at how consistently my name turns up on official documents here in France. The spelling Rafaël is completely abnormal, of course, and no one ever spells it correctly (even I can't be bothered to spell it correctly most of the time), but on my passport it's right.

    I remember when I registered François last year, I got asked quite precise questions about the spelling: a dash or no dash between Garcia and Suarez? They care about that kind of stuff.

    • Bah apparently the use.perl comment boxes are not friends with my browser :/

      Those were, in order:

      00EB LATIN SMALL LETTER E WITH DIAERESIS

      00E7 LATIN SMALL LETTER C WITH CEDILLA

      • To spell Rafaël and François properly you need to entity-encode the, uh, extravagant characters: use.perl is a Latin-1 Only Zone. Quelle bêtise…

        • Hehe, I would say ASCII-only. Rafael's accents perfectly fit in the latin-1 charset.

          • I'm suspecting browser character set headers on the form submission, because I can paste a literal ć in no problem. It looks like his browser sent UTF-8, but either described it as ISO-8859-1, or didn't say, resulting in the far end treating it as ISO-8859-1.

            Ho ho ho. When that ć comes back to me on preview, the HTML source has turned into &amp;#263;.
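A rough sketch of the two effects suspected above, in Python rather than Perl and purely for illustration. It does not reproduce what use.perl or any particular browser actually does. It only shows what happens when UTF-8 bytes are read as ISO-8859-1, and what a numeric character reference fallback looks like for a character that Latin-1 cannot hold.

```python
# Illustration only: the strings and variable names here are made up
# for the example, not taken from the site's code.

name = "Rafa\u00ebl"  # "Rafaël", with U+00EB LATIN SMALL LETTER E WITH DIAERESIS

# Case 1: the browser sends UTF-8, the server assumes ISO-8859-1.
utf8_bytes = name.encode("utf-8")           # ë becomes the two bytes C3 AB
misread = utf8_bytes.decode("iso-8859-1")   # each byte is read as one character
print(misread)                              # RafaÃ«l

# Case 2: a character outside Latin-1 (U+0107, c with acute) is submitted
# on a page that declares ISO-8859-1. One possible fallback is a numeric
# character reference instead of raw bytes.
fallback = "\u0107".encode("iso-8859-1", errors="xmlcharrefreplace")
print(fallback)                             # b'&#263;'
```

The first case matches the mangled accents reported earlier in the thread; the second matches a literal ć surviving the round trip as an entity.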

            Which reminds me. Currently, does pod2text use man as an intermediate step when generating its output?

            • The initial problem is that the use.perl.org pages declare iso-8859-1
              as their charset, so form data also has to be sent as iso-8859-1. Maybe
              a browser shouldn't accept any non-latin1 characters when entering or
              pasting data into form fields, but at least gecko-based browsers
              don't do this. To do something with non-latin1 characters,
              gecko-based browsers on Unix systems seem to use this heuristic:

              * codepoints below 256 are fine

              * if there are codepoints in the 0x80-0x9f range of win1252, then they