The biggest problem with rolling out a design is all the stuff you didn't know about. I had about five conversations at the conference that went pretty much like:
Me: Here's my design for the guts of perl 6.
Them: Oh. Have you considered X? (For some random value of X)
I am profoundly glad that we didn't start implementing the core six months ago, like I was really itching to do. (Yes, I really was, though almost nobody believes that) I'm not sure if all those conversations will result in things being done differently--I really do like the design we have now--but I've a bunch of research to do to find out if there are things that are better. That would really suck to have to do if we'd done six months of implementation.
Oh, and in case anyone cares, character encoding sucks in ways that are so deep and profound it's mind-boggling. Think Unicode'll solve the problem? Wrong! I only wish it did.
Even once you convert your data to Unicode, if you can even do that without losing information, that's only the first step. How do you sort and compare it? Is a string with all Japanese text bigger or smaller than one with all Arabic text? (And don't even get me started on comparing Japanese strings with Chinese ones, since the two can potentially use the identical Unicode characters, only they may mean something else entirely, and certainly sort differently) Then there's mixed-language strings, which are even worse for those languages that use overlapping sets of characters. (How do you tell that a phrase embedded in a Chinese string is Japanese or, in some cases, Korean? That's important for purely mechanical things like sorting and comparison)
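To make the comparison problem concrete, here's a minimal Python sketch (Python is my pick for illustration, nothing to do with the Perl 6 design). It only shows what a naive code-point comparison does; the sample strings and variable names are made up for the demo.

```python
# Plain string comparison in Python goes by code point and nothing smarter.
japanese = "\u65e5\u672c\u8a9e"                      # "Japanese language" in Japanese
arabic = "\u0627\u0644\u0639\u0631\u0628\u064a\u0629"  # "Arabic" in Arabic

# The Arabic block sits at U+0600..U+06FF and the CJK ideographs start at
# U+4E00, so the Arabic string always sorts first. That is an accident of
# how the code charts are laid out, not a linguistic fact.
codepoint_order = sorted([japanese, arabic])

# Han unification: this character is a single code point whether the
# surrounding text is Chinese or Japanese, so the code point alone cannot
# tell you which language's sorting rules should apply.
han = "\u76f4"
han_code_point = hex(ord(han))
```

So the answer to "bigger or smaller?" comes out the same every time, but only because of where the blocks happen to live in the code charts.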
Feh. String data sucks. I think I said that, though.
That means that, unfortunately, we need to tag strings not only with an encoding type (like UTF-8 or UTF-32) but also with a language. (Not to be confused with locales, which are a completely separate field of suckiness. Go ask Jarkko about that) Heck, we really need a way to properly tag substrings with languages, so we can tell if a chunk of the string is Japanese and not Chinese. Or even French and not Spanish. (Don't think you get off easy ignoring the world outside the US and Western Europe--different countries sort and compare accented/umlauted/tilded characters differently)
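Here's a rough sketch of what that tagging might look like, in Python purely for illustration. Everything in it is hypothetical: `TaggedString`, `Span`, `KEYS`, and the two toy sort keys are names I made up, and the Swedish and German rules are cut down to the bare minimum needed to show the same two words sorting in opposite orders. It is not the actual Perl 6 design.

```python
from dataclasses import dataclass, field

# Toy Swedish alphabet: a-ring, a-umlaut and o-umlaut are separate letters
# that come after z. (Assumption: lowercase demo words only.)
SWEDISH = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"

def swedish_key(text):
    # Position in the Swedish alphabet; unknown characters get -1.
    return [SWEDISH.find(c) for c in text.lower()]

def german_key(text):
    # Toy German dictionary order: umlauted vowels sort like the plain vowel.
    return text.lower().translate(str.maketrans("\u00e4\u00f6\u00fc", "aou"))

KEYS = {"sv": swedish_key, "de": german_key}

@dataclass
class Span:
    start: int       # index into the decoded text, inclusive
    end: int         # exclusive
    language: str

@dataclass
class TaggedString:
    data: bytes
    encoding: str                               # e.g. "utf-8" or "utf-32"
    language: str                               # default for the whole string
    spans: list = field(default_factory=list)   # per-substring overrides

    def text(self):
        return self.data.decode(self.encoding)

    def language_at(self, index):
        for span in self.spans:
            if span.start <= index < span.end:
                return span.language
        return self.language

def sort_tagged(strings, language):
    # Sort by the rules of one language, whatever encoding each string uses.
    return sorted(strings, key=lambda s: KEYS[language](s.text()))

words = [
    TaggedString("zebra".encode("utf-8"), "utf-8", "sv"),
    TaggedString("\u00e4pple".encode("utf-32"), "utf-32", "sv"),
]
german_order = [s.text() for s in sort_tagged(words, "de")]
swedish_order = [s.text() for s in sort_tagged(words, "sv")]

# A Chinese string with a Japanese phrase embedded at positions 3..5.
mixed = TaggedString("中文 日本語".encode("utf-8"), "utf-8", "zh",
                     [Span(3, 6, "ja")])
```

With the German key the a-umlaut word comes before "zebra"; with the Swedish key it comes after. Same bytes, different answer, which is exactly why the tag has to travel with the string.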
String data definitely sucks.
On the other hand, life's not entirely bad. (Nat's gift of
Oh, right, before I forget. Everyone should head over to the YAS site and see about donating to them for sponsorship stuff. They're the folks responsible for sponsoring Damian this year. (And maybe next year, I'm not sure) While I'm not entirely sure his wife thanks us for it (We, as a community, definitely owe her big time), we as a community have benefited enormously. Even if it's only $14 apiece (which is 2 lattes at Starbucks, or half of Serial Experiments Lain Vol 1), every little bit helps, and it does add up quickly.
And on a final note, if you happen by and see that this
is my background, it's probably a clue that it's not a great day...