Based on the voting from attendees, we've decided the 2nd round of accepted talks. We now have 53 talks and they all look so interesting! Go check the list on the schedule page.
We'll announce the program next week, with "Personalized Schedule" functionality built on top of Act, hopefully during this weekend's hackathon!
YAPC::Asia 2008 organizers would like to thank Eric Cholet, the author of Act, for the great conference organizing software that powers most YAPCs and Perl Workshops.
To show our appreciation the hacker's way, I'm flying to Paris, France next weekend (April 25-28), funded by YAPC::Asia's possible profit, to work on Act feature enhancements.
We plan to work on these things because we want them for YAPC::Asia:
* OpenID provider support
* Better Japanese names display (i18n)
* Embed videos and slides (YouTube, Google Video, Slideshare etc.) in talks
* Personal Scheduling (Who is attending which talks) like Sched.org or icalico
* Online check-in API (Who actually showed up when)
* Promotional code / coupon for discounted payments
We (at least, I) are prioritizing these because the trip is funded by YAPC::Asia, but if there's anything you think is missing from Act, I'd love to hear about it. Remote participation (#act on irc.perl.org during the weekend) would be welcome too!
YAPC::Asia 2008 website got a redesign, along with the announcement of sponsors and the initial set of talks (currently 33 talks and more to come!).
We have Larry Wall and Michael Schwern as keynote speakers this year. Tickets will go on sale on Tuesday, March 25th, local time. It's a YAPC::Asia tradition that all 300 tickets sell out within a week, so don't miss out.
(Editorial: Don't frontpage this post, editors. I'm writing it down here to summarize my thoughts, hoping to get feedback from my trusted readers and NOT flame wars or another giant thread of utf-8 flag woes.)
I can finally say, as of just lately, that I fully grok Unicode, the UTF-8 flag and all that stuff in Perl. Here is an analysis of how Perl programmers understand Unicode and the UTF-8 flag.
(This post might need more code to demonstrate and visualize what I'm talking about, but I'll leave that as homework for readers, or at least as something for me to do before YAPC::Asia if there's demand for this talk.)
Level 1. "Take that annoying flag off, dude!"
They, typically web application developers, assume all data is encoded in utf-8. If they encounter some wacky garbled characters (a.k.a. Mojibake in Japanese), which they think is a perl bug, they just make an ad-hoc call of:
Encode::_utf8_off($stuff)
to take the utf-8 flag off and make sure all data is still in utf-8 by avoiding any possible latin-1-utf8 auto upgrades.
This is Level 1. Unfortunately, this works okay, assuming their data is actually encoded only in utf-8 (the database is utf-8, web pages are displayed in utf-8, the data sent from browsers is utf-8, etc.). Their app is still broken when they call things like length(), substr() or regular expressions, because the strings are not UTF-8 flagged and those functions don't work with Unicode semantics.
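A minimal sketch of that breakage, using only the core Encode module. The sample text (two Japanese characters written as byte escapes) and the variable names are my own illustration:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Six UTF-8 bytes that encode two characters (U+65E5 U+672C), written as
# escapes so the example does not depend on the source file's encoding.
my $bytes = "\xE6\x97\xA5\xE6\x9C\xAC";
my $chars = decode('UTF-8', $bytes);

my $byte_len = length $bytes;   # counts bytes
my $char_len = length $chars;   # counts characters

# A Unicode property only matches on the decoded string.
my $bytes_match = $bytes =~ /^\p{Han}{2}$/ ? 1 : 0;
my $chars_match = $chars =~ /^\p{Han}{2}$/ ? 1 : 0;

print "length: bytes=$byte_len chars=$char_len\n";
print "regex:  bytes=$bytes_match chars=$chars_match\n";
```

The byte string reports a length of 6 and fails the regex, while the decoded string reports 2 and matches.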
They can optionally use "use encoding 'utf-8'" or the CPAN module encoding::warnings to avoid auto-upgrades altogether or to catch such mistakes, or use Unicode::RecursiveDowngrade to turn off the UTF-8 flag on complex data structures.
Level 2. "Unicode strings have UTF-8 flags. That's easy"
They make extensive use of the Encode module's encode() and decode() to make sure all data in their app is UTF-8 flagged. Their app works really nicely with Unicode semantics.
They sometimes need to deal with UTF-8 bytes in addition to UTF-8 flagged strings. In that case, they use hacky modules such as ForceUTF8, or do things like
utf8::encode($_) if utf8::is_utf8($_)
to assume that "Unicode strings should be UTF-8 flagged, and those without the flag are assumed to be UTF-8 bytes."
This is Level 2. This is a straight upgrade from Level 1 and fixes some issues of Level 1 (string functions not working in Unicode semantics, etc.), but it's still too UTF-8 centric. They ignore why perl5 treats strings this way, and still hate SV Auto-upgrade.
To be honest, I was thinking this way until, like, early 2007. There are a couple of my modules on CPAN that accept both UTF-8 flagged strings and UTF-8 bytes, because I thought it'd be handy. But that actually breaks latin-1 strings if they're not UTF-8 flagged, which is rare in UTF-8 centric web application development, but could still happen.
I gradually changed my mind when I talked with Marc Lehmann about how JSON::Syck's Unicode support is broken, and when I read the tutorial by Juerd Waalboer and attended his Perl Unicode tutorial talk at YAPC::EU.
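Here is a small sketch of how that guess goes wrong. The helper guess_chars() is hypothetical and only mimics the "no flag means UTF-8 bytes" assumption; it isn't taken from any of the modules mentioned:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical Level 2 helper: anything without the flag is treated as UTF-8 bytes.
sub guess_chars {
    my ($s) = @_;
    return utf8::is_utf8($s) ? $s : decode('UTF-8', $s);
}

# A perfectly valid 4-character string; perl stores it as latin-1, without the flag.
my $cafe = "caf\xE9";

# The lone 0xE9 is not valid UTF-8, so decode() replaces it with U+FFFD.
my $out = guess_chars($cafe);

# The same characters, upgraded internally, pass through untouched.
my $same = $cafe;
utf8::upgrade($same);
my $out2 = guess_chars($same);

print join(' ', map { sprintf 'U+%04X', ord } split //, $out),  "\n";
print join(' ', map { sprintf 'U+%04X', ord } split //, $out2), "\n";
```

Two strings that compare equal with eq give different results, purely because of how perl happens to store them internally.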
Level 3. "Don't bother with the UTF-8 flag"
They stop guessing whether a variable is UTF-8 flagged or not. All they need to know is whether a string is bytes or characters, which they can tell from how the scalar variable was generated.
If it's bytes, use decode() to get Unicode strings. If it's characters, don't worry about whether it's UTF-8 flagged or not: if it's not flagged, it'll be auto-upgraded by Perl when needed, so you don't need to know the internal representation.
So it's like a step back from Level 2. "Get back to basics, and think about why Perl 5 does this latin-1 to utf-8 auto-upgrade."
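A minimal sketch of why the flag can be ignored for character strings. The sample strings are my own:

```perl
use strict;
use warnings;

my $latin = "caf\xE9";            # characters, stored without the flag
my $kanji = "\x{65E5}\x{672C}";   # characters, stored with the flag

# Mixing the two just works: perl upgrades the latin-1 part as needed.
my $joined = "$latin $kanji";

# Upgrading by hand changes only the internal representation, not the string.
my $upgraded = $latin;
utf8::upgrade($upgraded);

print "joined length: ", length($joined), "\n";
print "fourth char:   ", sprintf('U+%04X', ord substr($joined, 3, 1)), "\n";
print "still equal:   ", ($upgraded eq $latin ? "yes" : "no"), "\n";
```

The joined string has 7 characters and its fourth character is still U+00E9, so the auto-upgrade preserved the latin-1 character rather than mangling it.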
If your function or module needs to accept strings that might be either characters or bytes, just provide 2 different functions, or an explicit flag to set. Don't auto-decode bytes as utf-8, because that breaks latin-1 characters if they're not UTF-8 flagged. Of course the caller of the module can call utf8::upgrade() to make sure, but that's just a pain and an anti-perl5 way.
There's still a remaining problem with CPAN modules, though. Some modules return character strings on some occasions and bytes on others. For instance, $c->req->param($foo) would return a UTF-8 flagged string if Catalyst::Plugin::Unicode is loaded and bytes otherwise. And using utf8::is_utf8($_) here might cause bugs like the ones described before.
Well, in the C::P::Unicode example, actually not. Using C::P::Unicode guarantees that parameters are all UTF-8 flagged, even if they contain only latin-1 range characters. Not using the plugin guarantees the parameters are not flagged at all. So it's a different story.
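One possible shape for such an API, as a sketch. The function names initials() and initials_from_bytes() are hypothetical, made up for this example:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical API: the caller states what it is passing in,
# so the code never has to inspect the UTF-8 flag.

# Takes characters (already decoded text).
sub initials {
    my ($chars) = @_;
    return join '', map { substr($_, 0, 1) } split ' ', $chars;
}

# Takes bytes plus the name of their encoding, decodes, then delegates.
sub initials_from_bytes {
    my ($bytes, $encoding) = @_;
    return initials(decode($encoding, $bytes));
}

my $from_chars  = initials("\xE9lan vital");                             # characters, no flag
my $from_utf8   = initials_from_bytes("\xC3\xA9lan vital", 'UTF-8');      # UTF-8 bytes
my $from_latin1 = initials_from_bytes("\xE9lan vital", 'ISO-8859-1');     # latin-1 bytes

print "all equal\n" if $from_chars eq $from_utf8 and $from_utf8 eq $from_latin1;
```

All three calls give the same two-character result, because the caller told the module what it was passing instead of letting the module guess.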
(To be continued...)
Wondering what talk I should submit to OSCON (and other YAPCs this year too!).
The obvious choice is Web::Scraper, since I haven't given this talk outside of Europe and Japan, and I can make lots of updates before summer when I give the actual talk. (We call it CDD -- Conference Driven Development.)
Any suggestions?
URI-Find is a great module to extract URIs from arbitrary text, but unfortunately it doesn't work with the non-ASCII URLs that we often encounter when chatting with Safari users, such as: http://ja.wikipedia.org/wiki/メインページ
The reason why Safari users sometimes do this is that Safari shows the URI-decoded path in its location bar.
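To illustrate the two forms of the same path, here is a rough sketch using only core Perl. The percent-escaping regex is a simplified stand-in that I wrote for this example, not what Safari or URI::Find actually does:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The path of the URL above, written as code points: /wiki/メインページ
my $path = "/wiki/\x{30E1}\x{30A4}\x{30F3}\x{30DA}\x{30FC}\x{30B8}";

# What goes over the wire: UTF-8 bytes, with non-ASCII bytes percent-escaped.
(my $escaped = encode('UTF-8', $path))
    =~ s{([^A-Za-z0-9\-._~/])}{sprintf '%%%02X', ord $1}ge;

# What the location bar shows: unescape the bytes, then decode them as UTF-8.
(my $unescaped = $escaped) =~ s{%([0-9A-Fa-f]{2})}{chr hex $1}ge;
my $shown = decode('UTF-8', $unescaped);

print "$escaped\n";
print "round trip ok\n" if $shown eq $path;
```

When a user copies the decoded form into a chat, the text contains raw non-ASCII characters rather than %XX escapes, which is the case the plain URI::Find misses.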
I hacked up and uploaded a URI::Find extension (subclass), URI::Find::UTF8, which can be a drop-in replacement for URI::Find, to extract URLs like this.
We have a subversion repository too, if you want to take a look, or if you find a bug and want to patch the code.
UPDATE: The module was originally written using constant overloading, but that is a dangerous and gross hack, so I changed it to use the autobox framework instead (wondering why I didn't try that at first!). I updated the post accordingly.
Rails has ActiveSupport, which adds funky methods to Ruby core objects to do fancy things like 2.months.ago to get a Time duration object, etc.
I found it pretty interesting and wondered if it's doable in Perl. Yes it is, using the autobox framework, which I hope is going to be in core in perl 5.12, or using constant overloading like bigint.pm does.
So here you are: autobox::DateTime::Duration, on CPAN and in the SVN repository if you can't wait for the CPAN mirrors to update. With this you can say:
use autobox;
use autobox::DateTime::Duration;
print 1->day->ago, "\n"; # 2008-01-14T23:25:53
print 2->minutes->from_now, "\n"; # 2008-01-15T23:28:20
and use all the methods implemented in ActiveSupport::CoreExt::Numeric::Time, including the crazy fortnight method. Since the result is a standard DateTime::Duration object, you can also say this to save some typing:
my $now = DateTime->now;
my $dur = 3->hours + 2->minutes;
$now->add_duration($dur);
This might be a fun birthday gift for DateTime's 5th birthday.
My friend Toru Hisai, who has joined us at Shibuya.pm tech meetings in Tokyo a lot, has recently moved to Honolulu, Hawaii, and he's now trying to start a local Perl user group there: Honolulu.pm. Hawaii.pm appears to have been around for a really long time, but it turns out the website is way outdated and the contact address on the site bounces, so I suggested that he start his own.
This might be a significant step for us towards YAPC::Hawaii? hint, hint.