Journal of jplindstrom (594)

Saturday May 11, 2002
11:30 AM

Unicode transliteration

[ #4855 ]

I'm processing the Open Directory (dmoz) RDF dump, but couldn't find a decent way to handle the Unicode characters in it.

After a week of fruitless searching on Google and CPAN, I finally found a way to transliterate UTF-8 text to Latin-1:

http://groups.google.com/groups?selm=note-18266%40php.net

My earlier home-grown version is slightly more complete when it comes to, e.g., Romanian characters. It's 100% manual, though... (I log missing characters and add them by looking them up on the dmoz.org web site :)
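For illustration only, here is a minimal sketch of the "manual table plus log of missing characters" idea, written in Python rather than Perl. It is not the code from the linked post, nor the home-grown version mentioned above. The names MANUAL_MAP and to_latin1 and the table entries are hypothetical, and the Unicode-decomposition fallback is an assumption added to keep the table short.

```python
import unicodedata

# Hypothetical hand-maintained table for characters that have no useful
# Unicode decomposition. Entries here are examples, not a complete list.
MANUAL_MAP = {
    "\u0141": "L",   # L with stroke
    "\u0142": "l",   # l with stroke
    "\u0111": "d",   # d with stroke
    "\u0153": "oe",  # oe ligature
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
}


def to_latin1(text, missing=None):
    """Return text with every character replaced by a Latin-1 equivalent.

    Characters that cannot be mapped become "?" and, if a set is passed
    as `missing`, are recorded there so they can be added to MANUAL_MAP.
    """
    out = []
    for ch in text:
        if ord(ch) < 256:
            # Already representable in Latin-1.
            out.append(ch)
            continue
        if ch in MANUAL_MAP:
            out.append(MANUAL_MAP[ch])
            continue
        # Assumed fallback: decompose and drop combining marks,
        # e.g. s with comma below -> "s" + combining comma -> "s".
        decomposed = unicodedata.normalize("NFKD", ch)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        if base and all(ord(c) < 256 for c in base):
            out.append(base)
        else:
            if missing is not None:
                missing.add(ch)
            out.append("?")
    return "".join(out)
```

Passing a set as `missing` mirrors the logging step: after a run over the dump, whatever ends up in that set is the list of characters still to be added to the table by hand.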
