NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

  • Any default that isn't UTF-8 is wrong. Yes, really.
    • Uhm, that makes no sense. UTF-8 is an encoding, Unicode is a charset, and neither is a collation. F.ex., Ü will sort to different places depending on whether the collation is English or German (and may sort differently again in one of the Scandinavian languages, or in Turkish, or what have you). Whether you represent this character in Latin-1 or Unicode (it happens to be the same codepoint in both charsets) and whether you encode the Unicode codepoint using UTF-8 or another encoding all has nothing to do w
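
    To make the charset/encoding/collation split concrete, here is a minimal Python sketch (Python is only an illustration here; nothing in the thread uses it). It shows that Ü has one codepoint but different byte sequences under different encodings, and that sorting by raw codepoint is not a language-aware collation. The word list is made up for the example.

    ```python
    # Ü is the same codepoint (0xDC) in Latin-1 and in Unicode.
    ch = "\u00dc"
    codepoint = ord(ch)

    # The same codepoint, serialised by three different encodings.
    latin1_bytes = ch.encode("latin-1")     # one byte
    utf8_bytes = ch.encode("utf-8")         # two bytes
    utf16_bytes = ch.encode("utf-16-be")    # two bytes, different values

    # Sorting by codepoint is not a collation: Ü lands after Z,
    # which matches neither English nor German dictionary order.
    words = ["Zebra", "\u00dcber", "Apfel"]   # hypothetical sample data
    codepoint_order = sorted(words)

    print(hex(codepoint), latin1_bytes, utf8_bytes, utf16_bytes)
    print(codepoint_order)
    ```

    A real collation needs locale data on top of this (for instance via the standard `locale` module or an ICU binding), and the result then depends on which language's rules you ask for.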

  • Arguably we have at least three accented characters - the o with diaeresis, o with circumflex and e with acute. At least, they occur in Modern English words in the OED even if in practice most people omit them. But we undeniably have two non-ASCII letters, each of which occurs both capitalised and not - the ae and oe which I dare not type here because your browser will almost certainly get them wrong.

    AE occurs in words like encyclopaedia and aesc, and OE in, for example, the proper name OEdipus and a few s

    • And even then, we have words like résumé which also have accents, depending on who is spelling them.

      • Yes, that's e with acute. Some would argue that furrin words adopted into English lose the accents though. For example, when we stole théâtre from the French it became theatre and café is normally cafe.
    • Don’t forget naïve. Æsthetics and co. work just fine in browsers, btw.

      • It's worth noting that the funny dots in ï and ö in English are diaeresis marks, not umlauts. I wonder if they're spelt differently in Unicode ...
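
        On that last question: Unicode does not encode the two separately. Both the English diaeresis and the German umlaut use the same characters, and the official names say DIAERESIS either way. A small Python sketch to check this with the standard `unicodedata` module (the choice of sample letters is mine):

        ```python
        import unicodedata

        # Official Unicode names of the precomposed letters.
        names = {ch: unicodedata.name(ch) for ch in ["\u00ef", "\u00f6", "\u00fc"]}

        # Canonical decomposition (NFD) splits ï into a base letter plus a combining mark.
        decomposed = unicodedata.normalize("NFD", "\u00ef")
        mark_name = unicodedata.name(decomposed[1])

        for ch, name in names.items():
            print(ch, name)
        print([hex(ord(c)) for c in decomposed], mark_name)
        ```

        So the ï in naïve and the ü in über carry the same combining mark, U+0308; telling a diaeresis from an umlaut is left to the language, not the codepoint.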