
Elian (119)

Elian
  (email not shown publicly)
http://www.sidhe.org/~dan/
AOL IM: DanSugalski

Perl 6/Parrot internals ex-design team lead.

Journal of Elian (119)

Sunday July 29, 2001
04:28 PM

Ding, Dong, the conference is dead!

[ #531 ]
Or over, at least.

The biggest problem with rolling out a design is all the stuff you didn't know about. I had about five conversations at the conference that went pretty much like:

Me: Here's my design for the guts of perl 6.
Them: Oh. Have you considered X? (For some random value of X)
Me: D'oh!

I am profoundly glad that we didn't start implementing the core six months ago, like I was really itching to do. (Yes, I really was, though almost nobody believes that.) I'm not sure if all those conversations will result in things being done differently--I really do like the design we have now--but I've a bunch of research to do to find out if there are things that are better. That would really suck to have to do if we'd done six months of implementation.

Oh, and in case anyone cares, character encoding sucks in ways that are so deep and profound it's mind-boggling. Think Unicode'll solve the problem? Wrong! I only wish it did.

Even once you convert your data to Unicode, if you can even do that without losing information, that's only the first step. How do you sort and compare it? Is a string with all Japanese text bigger or smaller than one with all Arabic text? (And don't even get me started on comparing Japanese strings with Chinese ones, since the two can potentially use the identical Unicode characters, only they may mean something else entirely, and certainly sort differently.) Then there's mixed-language strings, which are even worse for those languages that use overlapping sets of characters. (How do you tell whether a phrase embedded in a Chinese string is Japanese or, in some cases, Korean? That's important for purely mechanical things like sorting and comparison.)
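To make the sorting complaint concrete, here is a minimal Python sketch (Python is my choice for illustration, and the sample words are made up). It only shows what raw code point comparison does, which is roughly what you get once everything is Unicode and nothing else is known about the string.

```python
import unicodedata

# Plain string comparison goes by code point, not by any language's rules.
words = ["zebra", "\u00c4pfel", "apple", "Zucker"]
print(sorted(words))  # ['Zucker', 'apple', 'zebra', 'Äpfel']

# Arabic alef (U+0627) vs. Hiragana a (U+3042): the Arabic string compares
# "smaller" only because its block sits lower in the code space.
print("\u0627" < "\u3042")  # True

# U+76F4 appears in both Chinese and Japanese text. The character itself
# carries no language information at all.
print(unicodedata.name("\u76f4"))  # CJK UNIFIED IDEOGRAPH-76F4
```

None of those answers is wrong as far as the code points go. They just don't match what any particular language's readers would expect.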

Feh. String data sucks. I think I said that, though.

That means that, unfortunately, we need to tag strings not only with an encoding type (like UTF-8 or UTF-32) but also with a locale. (Not to be confused with locales, which are a completely separate field of suckiness. Go ask Jarkko about that.) Heck, we really need a way to properly tag substrings with locales, so we can tell if a chunk of the string is Japanese and not Chinese. Or even French and not Spanish. (Don't think you get off easy ignoring the world outside the US and Western Europe--different countries sort and compare accented/umlauted/tilded characters differently.)
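A rough sketch of what that tagging might look like, in Python. Everything here is hypothetical: `TaggedString`, `sort_tagged`, the `language` field name, and the two key functions are illustrative names, not anything from the actual Perl 6 design. The German and Swedish rules are deliberately crude stand-ins for real collation tables.

```python
from dataclasses import dataclass
import unicodedata

@dataclass(frozen=True)
class TaggedString:
    text: str
    encoding: str   # e.g. "utf-8" (hypothetical tag)
    language: str   # e.g. "de", "sv" (hypothetical tag)

def german_key(s):
    # Crude approximation: treat a-umlaut like a, and so on, by dropping accents.
    decomposed = unicodedata.normalize("NFD", s.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"

def swedish_key(s):
    # Crude approximation: a-ring, a-umlaut, o-umlaut are letters after z.
    s = unicodedata.normalize("NFC", s.casefold())
    n = len(SWEDISH_ALPHABET)
    return [SWEDISH_ALPHABET.index(c) if c in SWEDISH_ALPHABET else n + ord(c)
            for c in s]

COLLATION_KEYS = {"de": german_key, "sv": swedish_key}

def sort_tagged(strings):
    if not strings:
        return []
    languages = {s.language for s in strings}
    if len(languages) != 1:
        # No single right answer for mixed languages, so refuse to guess.
        raise ValueError("cannot sort strings tagged with different languages")
    key = COLLATION_KEYS[languages.pop()]
    return sorted(strings, key=lambda s: key(s.text))

words = ["zon", "\u00e4rlig", "apa"]
german = [TaggedString(w, "utf-8", "de") for w in words]
swedish = [TaggedString(w, "utf-8", "sv") for w in words]
print([s.text for s in sort_tagged(german)])   # ['apa', 'ärlig', 'zon']
print([s.text for s in sort_tagged(swedish)])  # ['apa', 'zon', 'ärlig']

# A mixed-language string could be kept as a sequence of tagged runs.
mixed = [TaggedString("\u6771\u4eac", "utf-8", "ja"),
         TaggedString("\u5317\u4eac", "utf-8", "zh")]
```

Same three words, same code points, two different orders depending on the tag. Substring tagging as a list of runs is only one possible representation, and this sketch says nothing about how you'd compare across runs.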

String data definitely sucks.

On the other hand, life's not entirely bad. (Nat's gift of .NET books from O'Reilly aside... :) Akira's out on DVD. Yay! (I am Pioneer's mind-slave) It's a sweet transfer, and it looks and sounds really nice. I even got lucky and found one of the two-disc collector sets that's got a zillion production sketches and other nifty bits on the second disc. I don't much care that it's in the tin, but it is a keen little gimmie on top of it all. I can even tell the subtitling's better than on my aging VHS version. (I'm one of the few folks in the US to have the subbed and not dubbed tape, apparently) My Japanese isn't actually good enough to tell what they're saying, but it's good enough to tell that what was on the tape's subtitles wasn't quite it. (I learn languages oddly. Go figure)

Oh, right, before I forget. Everyone should head over to the YAS site and see about donating to them for sponsorship stuff. They're the folks responsible for sponsoring Damian this year. (And maybe next year, I'm not sure.) While I'm not entirely sure his wife thanks us for it (we, as a community, definitely owe her big time), we as a community have benefited enormously. Even if it's only $14 apiece (which is 2 lattes at Starbucks, or half of Serial Experiments Lain Vol 1), every little bit helps, and it does add up quickly.

And on a final note, if you happen by and see that this is my background, it's probably a clue that it's not a great day... :) (I do rather like Aimo's art, though probably not the pieces you'd expect)