Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • by mirod (2686) on 2002.02.15 12:27 (#4574)

    We all want Unicode to work, and there is no question that it is the Right Thing (tm) to do, being open, allowing other cultures to join us and use their own writing scheme and all.

    The sad truth is that it is actually a huge pain in the ass to implement for most coders, at least in the US and especially in Europe, and I would be really interested to know if it makes things really easier for Asian coders.

    Plus Unicode is usually being forced upon us by XML, which is never a nice thing when you are already fighting with a new technologu and have deadlines to meet ;--(

    • Is it worse in Europe specifically because UTF-8 and 8-bit Latin-1 are incompatible?
      • Yes, XML parsers not only tend to die a swift but painful death when they encounter a Latin-1 (or 2 or more) character, even in a CDATA section, but also, at least XML::Parser converts everything to UTF-8, even if the rest of the environment is entirely Latin-n. This is extremely annoying as it adds an extra level of complexity to all applications, and forces people to care about encodings when really they don't want to.

    • The problem with the Asian languages is most of them already have a perfectly serviceable local standard. Big5 (traditional and simplified) for Chinese and Shift-JIS (amongst others) for Japanese. Korean and Vietnamese also have standards that work just fine.

      Unicode's in some ways more of a change for them than for us--while ASCII maps to Unicode (especially the utf8 encoding) with no change, the same can not be said for the asian languages. For them Unicode's more than just an annoyance, it's something t