All the Perl that's Practical to Extract and Report

use Perl

acme (189)

http://www.astray.com/

Leon Brocard (aka acme) is an orange-loving Perl eurohacker with many varied contributions to the Perl community, including the GraphViz module on the CPAN. YAPC::Europe was all his fault. He is still looking for a Perl Monger group he can start which begins with the letter 'D'.

Journal of acme (189)

Thursday July 22, 2004
09:12 AM

Unicode

[ #19994 ]
See, I like the idea of Unicode. However, at the Italian Perl Workshop almost every talk had a little bit about how to get Unicode working in a module / project / database. And there are various CPAN modules with Unicode issues.

I think we've basically screwed up Unicode. We shouldn't have to talk about it, it should just work. And BOMs being optional is madness...
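To make the BOM complaint concrete, here is a rough Python sketch of what a reader has to do precisely because the BOM is optional: check for one, and be ready for the answer to be "no idea". The helper name `sniff_bom` is made up for illustration; only the `codecs.BOM_*` constants come from the standard library.

```python
import codecs

# Known BOM byte sequences. The UTF-32 entries come first because the
# UTF-32-LE BOM starts with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Hypothetical helper: return (encoding, bom_length) if data starts
    with a known BOM, otherwise (None, 0)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # No BOM present: the bytes alone do not say which encoding was used.
    return None, 0
```

When the result is `(None, 0)` the caller is back to guessing or to out-of-band information such as a header or a config setting. Even a positive match is only a strong hint: UTF-16-LE text that begins with a BOM followed by U+0000 looks identical to a UTF-32-LE BOM.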

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • The problem with Unicode is the bloody encodings. Do you want utf-8, utf-16, or iso-latin-57.3? And do you want fries with that? Given a chunk of text, how do you *know* what the encoding is? Guessing isn't good enough.

    It gets worse - there have been N different versions of Unicode now, capable of encoding different numbers of characters. E.g., in Java 1.0, the version of Unicode it supported only permitted 65536 different characters (a character was 16 bits), but Unicode now defines code points up to U+10FFFF, which need 21 bits and no longer fit in a single 16-bit unit.
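Both complaints are easy to reproduce. The Python sketch below is only an illustration, and the helper names `decodings` and `utf16_code_units` are invented for it. The first shows the same bytes decoding "successfully" under several encodings, which is why guessing is unreliable. The second counts how many 16-bit units a character needs in UTF-16, which is where a 16-bit `char` stops being enough.

```python
def decodings(raw, encodings=("utf-8", "latin-1", "utf-16-le")):
    """Hypothetical helper: try each encoding on the same bytes and
    collect the decoded text, or None where decoding fails."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


def utf16_code_units(ch):
    """Hypothetical helper: number of 16-bit code units that a string
    occupies when encoded as UTF-16 (no BOM)."""
    return len(ch.encode("utf-16-be")) // 2


# The two bytes that UTF-8 uses for U+00E9 (e with acute accent).
raw = "\u00e9".encode("utf-8")
guesses = decodings(raw)

# U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF, so UTF-16 stores it
# as a surrogate pair rather than as a single 16-bit unit.
clef_units = utf16_code_units("\U0001D11E")
bmp_units = utf16_code_units("\u00e9")
```

All three decodes succeed without an error, and each gives different text, so a decoder that merely "doesn't crash" has not proved it picked the right encoding.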

    Di