
jozef (8299)


Journal of jozef (8299)

Sunday July 19, 2009
09:24 AM

i18n of wikipedia links

[ #39311 ]

The day before yesterday (Friday) we went out for some beer (a social meeting) and talked a lot. ;-)

Some time ago I asked potyl to do some French translations for me. The translations also contained Wikipedia links, so that they could point to the French Wikipedia (instead of the English one used in the English texts). On Friday potyl told me that link i18n is a task for a program, not for a human. Of course it IS MACHINE WORK! I should have known, but sometimes "people" don't see the obvious. ;-)

update wikipedia links script

I wrote that script on the train on my way back to Vienna, and it took no more than an hour. It works from English to any other language.

Basically it scrapes Wikipedia pages. For my ~70 links that should be fine, but before you do the same, read "Why not just retrieve data from wikipedia.org at runtime?" - robots are rate limited to 1 request per second. Wikipedia also offers the untransformed raw database format, or the database dumps, for the users with the "most interest".
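To make the idea concrete, here is a minimal sketch in Python. It is not the script linked above. It swaps page scraping for the MediaWiki API (`action=query` with `prop=langlinks`), which returns the title of an article in another language. The names `EN_LINK`, `fetch_langlink` and `localize_links` and the User-Agent string are made up for this example. The pause between lookups follows the 1 request per second limit mentioned above.

```python
import json
import re
import time
import urllib.parse
import urllib.request

# Matches English Wikipedia article links and captures the article title.
# Assumption: a link ends at whitespace, a quote, an angle bracket, '#' or '?'.
EN_LINK = re.compile(r'https?://en\.wikipedia\.org/wiki/([^\s"\'<>#?]+)')


def fetch_langlink(title, lang):
    """Return the title of the `lang` edition of an English article,
    or None when the article has no version in that language."""
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllang": lang,      # only return the link for this language
        "redirects": 1,      # follow redirects on the English side
        "format": "json",
    })
    request = urllib.request.Request(
        "https://en.wikipedia.org/w/api.php?" + params,
        # Hypothetical User-Agent; replace with something that identifies you.
        headers={"User-Agent": "link-i18n-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    for page in data.get("query", {}).get("pages", {}).values():
        for link in page.get("langlinks", []):
            return link["*"]  # the default JSON format keeps the title under "*"
    return None


def localize_links(text, lang, lookup=fetch_langlink, delay=1.0):
    """Rewrite English Wikipedia links in `text` to their `lang` versions.

    Links without a translation are left untouched. Each distinct title is
    looked up once, with `delay` seconds between lookups.
    """
    cache = {}

    def replace(match):
        title = urllib.parse.unquote(match.group(1))
        if title not in cache:
            if cache and delay:
                time.sleep(delay)  # stay at or below 1 request per second
            cache[title] = lookup(title, lang)
        local_title = cache[title]
        if local_title is None:
            return match.group(0)
        path = urllib.parse.quote(local_title.replace(" ", "_"), safe="/:()")
        # The rewritten link always uses https, whatever the original used.
        return f"https://{lang}.wikipedia.org/wiki/{path}"

    return EN_LINK.sub(replace, text)
```

Calling `localize_links(text, "fr")` on the text of a translation would rewrite every English link that has a French counterpart. The `lookup` parameter is there so the rewriting can be tried with a stub function and without network access. One caveat: the regular expression is naive and will swallow punctuation that directly follows a link in running prose.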
