
miyagawa (1653)

http://bulknews.vox.com/

Journal of miyagawa (1653)

Friday October 13, 2006
04:22 AM

ccTLD - iso3166 - Timezones

[ #31310 ]

I'm looking for a way (i.e. a Perl module) to convert 1) a ccTLD to an ISO 3166 country code, and then 2) an ISO 3166 country code to the timezones available in that country.

The use cases will be:

a) In a web application, if a user has already set his/her country, display a list of the timezones in that country.

Example: if you set your country to 'Russia', the select box for "your timezone:" will automatically be limited (or default) to the timezones in Russia. It could obviously be AJAXified for a better user experience. clkao told me that Google Calendar does this very well.

See http://blog.bulknews.net/tmp/datetime-country.cgi for the live demo of this use case. Note that this is a demo app and choosing "United Kingdom" or "United States" will give you a funny language, since it loops over languages to get the country code, hence the later one overrides. Eh.
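As a rough sketch of use case a), the select box could be generated from a country-to-timezones table. The %zones_for table and the timezone_select helper are hypothetical names used only for illustration, and the zones listed for Russia are a sample rather than the complete list.

```perl
use strict;
use warnings;

# Hypothetical sample table: ISO 3166 country code => Olson timezone names.
# A real table would be generated from the Olson zone.tab file.
my %zones_for = (
    JP => ['Asia/Tokyo'],
    RU => ['Europe/Moscow', 'Asia/Novosibirsk', 'Asia/Vladivostok'],
);

# Build a <select> limited to the zones of the given country.
# Returns an empty string when the country is unknown, so the caller
# can fall back to showing the full list of zones.
sub timezone_select {
    my ($country) = @_;
    my $zones = $zones_for{ uc $country } or return '';
    my @options = map { qq{  <option value="$_">$_</option>} } @$zones;
    return join "\n", '<select name="timezone">', @options, '</select>';
}

print timezone_select('ru'), "\n";
```

When the country has a single zone, the same markup doubles as the "default" case, since the select box has only one option to choose from.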

b) I'd like to determine the TZs from the web domain name when the retrieved datetime is a floating datetime.

Example: If you parse the feed from http://example.jp/feed.xml and its datetime is floating (without TZ set, which is a valid W3CDTF format), you can "guess" its timezone as JST ("Asia/Tokyo") since its domain name is .jp.

I realize this doesn't work well for big countries like Russia, China or the US that have multiple TZs. The guess will also sometimes be wrong, since some web 2.0 services use funky TLDs like ".tv", ".us" or ".to" to get human-readable domain names, in which case the intended TZs differ from those of the ccTLD's country. I'd use the guessing technique only as a fallback when the retrieved datetime is floating, which might be rare anyway.

For 1) ccTLD to ISO 3166, it looks like ccTLDs and ISO 3166 country codes are equivalent, with only a few exceptions. So I wrote the following code to find the exceptions:
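The fallback for use case b) could look roughly like the sketch below. It only returns a guess when the ccTLD maps to exactly one timezone, so multi-zone countries yield no guess at all. The %zones_for_tld table and the guess_timezone helper are hypothetical, the table is a tiny sample, and the URL is parsed with a simple regex rather than a proper URI module.

```perl
use strict;
use warnings;

# Hypothetical sample table: ccTLD => Olson timezone names in that country.
my %zones_for_tld = (
    jp => ['Asia/Tokyo'],
    ru => ['Europe/Moscow', 'Asia/Vladivostok'],    # sample only, not complete
);

# Return a guessed timezone name for a URL, or undef when there is no safe
# guess (no ccTLD, unknown ccTLD, or a country with several zones).
sub guess_timezone {
    my ($url) = @_;
    # Pull the host out of scheme://[userinfo@]host[:port]/...
    my ($host) = $url =~ m{^[a-z][a-z0-9+.-]*://(?:[^/@]*@)?([^/:?#]+)}i
        or return;
    # Only two-letter TLDs are treated as country codes.
    my ($tld) = lc($host) =~ /\.([a-z]{2})\z/ or return;
    my $zones = $zones_for_tld{$tld} or return;
    return @$zones == 1 ? $zones->[0] : undef;
}

my $tz = guess_timezone('http://example.jp/feed.xml');
print defined $tz ? "guessed $tz\n" : "no guess\n";
```

The result would then be applied only to datetimes that are still floating after parsing. A vanity domain like ".tv" or ".to" would still produce a wrong guess if its country were in the table.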

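use strict;
use warnings;
use Net::Domain::TLD 'tlds';
use Locale::Country;

# In scalar context, tlds('cc') returns a hashref of ccTLD => description.
my $tlds = tlds 'cc';

# Warn about every ccTLD that Locale::Country does not know as a country code.
for my $tld (tlds 'cc') {
    my $label = $tlds->{$tld};
    code2country($tld) or warn "$tld ($label)\n";
}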

to get the following result:

gg (Guernsey)
cd (Congo, Democratic Republic of the)
su (Soviet Union)
eu (European Union)
ac (Ascension Island)
uk (United Kingdom)
tp (East Timor)
je (Jersey)
im (Isle of Man)

This matches the domains and country codes listed at http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2. It looks like CD, GG, JE and IM are now official ISO 3166 country codes (and Locale::Country hasn't been updated since July 2002!), SU and TP have been split up or replaced by new codes, and EU and UK are special cases for unions rather than regular country codes (the ISO 3166 code for the United Kingdom is GB).
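Given that list, the ccTLD to ISO 3166 step might be sketched as a small exception table in front of an identity mapping. The cctld_to_iso3166 helper is a hypothetical name, and the targets chosen for uk, tp and ac are assumptions that should be checked against the current ISO 3166-1 list. The function does not verify that an ordinary two-letter TLD is actually an assigned country code.

```perl
use strict;
use warnings;

# ccTLDs from the list above that need special handling. The targets are
# assumptions to be double-checked against ISO 3166-1:
#   uk => gb  (the ISO code for the United Kingdom is GB)
#   tp => tl  (East Timor is now Timor-Leste, TL)
#   ac => sh  (assumption: Ascension Island grouped with Saint Helena)
# su and eu map to undef because no single country corresponds to them.
# gg, cd, je and im pass through unchanged, as they are now ISO codes.
my %exception = (uk => 'gb', tp => 'tl', ac => 'sh', su => undef, eu => undef);

# Return a lowercase ISO 3166 alpha-2 code for a ccTLD, or undef when the
# input is not a two-letter TLD or has no single matching country.
sub cctld_to_iso3166 {
    my ($tld) = @_;
    $tld = lc $tld;
    $tld =~ s/^\.//;                       # accept both "jp" and ".jp"
    return unless $tld =~ /\A[a-z]{2}\z/;
    return exists $exception{$tld} ? $exception{$tld} : $tld;
}

for my $tld (qw(jp .uk tp eu com)) {
    my $cc = cctld_to_iso3166($tld);
    printf "%-4s => %s\n", $tld, defined $cc ? $cc : '(none)';
}
```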

For 2) ISO 3166 country code to timezones, the Olson DB has exactly that dataset in its zone.tab file, which is also what http://en.wikipedia.org/wiki/List_of_tz_zones_by_country shows. I looked for a method to get that information in DateTime::TimeZone but there wasn't one. Still, it would be easy to create a Perl module for it based on that data.
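A minimal sketch of reading zone.tab into a country-to-zones table follows. zone.tab is tab-separated (country code, coordinates, zone name, optional comment), with '#' starting a comment line. The parse_zone_tab name is hypothetical, and the rows below are an illustrative sample in zone.tab layout read from an in-memory filehandle; a real run would open the zone.tab file that ships with the Olson database.

```perl
use strict;
use warnings;

# Parse zone.tab-formatted lines into { country code => [zone names] },
# keeping the zones in the order they appear in the file.
sub parse_zone_tab {
    my ($fh) = @_;
    my %zones_for;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;      # skip comments and blank lines
        my ($cc, undef, $zone) = split /\t/, $line;
        next unless defined $zone;           # skip malformed rows
        push @{ $zones_for{$cc} }, $zone;
    }
    return \%zones_for;
}

# Illustrative sample rows, not the real file contents.
my $sample = join '', map { "$_\n" } (
    "# country-code\tcoordinates\tTZ\tcomments",
    "JP\t+353916+1394441\tAsia/Tokyo",
    "RU\t+5545+03735\tEurope/Moscow\tMoscow+00 - west Russia",
    "RU\t+4310+13156\tAsia/Vladivostok\tMoscow+07 - Amur River",
);
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $zones_for = parse_zone_tab($fh);
close $fh;

print "$_: @{ $zones_for->{$_} }\n" for sort keys %$zones_for;
```

The resulting hashref is keyed by uppercase ISO 3166 codes, so a lowercase ccTLD needs to be uppercased before the lookup.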

Using IP::Country could definitely help a bit. IP::Country resolves the hostname to an IP address, then looks up the country to which that IP is assigned. So this could be useful too for use case b), getting the timezone from the web page's domain. I'm not entirely sure what to do if the two don't match, e.g. if sixapart.jp is hosted in the US but the domain implies Japan. Hmm.

Any suggestions would be appreciated.
