
All the Perl that's Practical to Extract and Report


Journal of jplindstrom (594)

Tuesday June 28, 2005
07:26 AM

Wiki Replication - Unicode problems solved

[ #25402 ]

While replicating a waaay old MoinMoin wiki to the latest version (there are something like ten versions in between, so I doubt it's feasible to upgrade step by step), I found that some pages didn't get stored right.

That was strange, because this was an existing wiki replicator that worked fine in the past.

After some detective work I figured out that all the failing pages contained åäö or £ or something like that, and I remembered reading that they've moved to UTF-8 for everything in the new MoinMoin version. Ahh!

So in the POST, I tried to somehow declare the charset as UTF-8 in the content type, like this:

my $req = POST($url,
    Content_Type => 'application/x-www-form-urlencoded; charset=utf-8',
    Content => [
        action      => "savepage",
        datestamp   => $timestamp,
        savetext    => $markup,
        comment     => "WiMi by $author",
        button_save => "Save Changes",
    ],
);

That didn't work, obviously: the header only changes the label, while the string itself is still in Latin-1, so its high-byte values were misinterpreted by the MoinMoin Python code.
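To make the mismatch concrete, here is a minimal sketch (not part of the replicator) that checks whether a lone Latin-1 byte is acceptable as UTF-8. The variable names are made up for illustration; only the core Encode module is used.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "£" the way a Latin-1 page stores it: the single byte A3.
my $latin1_bytes = "\xA3";

# Strict decoding with FB_CROAK dies on malformed input.
# Work on a copy, because decode may modify its argument when a CHECK is given.
my $copy = $latin1_bytes;
my $ok   = eval { decode("UTF-8", $copy, Encode::FB_CROAK()); 1 };
print $ok ? "valid UTF-8\n" : "not valid UTF-8\n";

# The same character properly encoded as UTF-8 takes two bytes.
my $utf8_bytes = encode("UTF-8", "\xA3");
print unpack("H*", $utf8_bytes), "\n";   # hex dump of the encoded bytes
```

A receiver that expects UTF-8 has no way to make sense of the bare A3 byte, whatever the Content-Type header claims.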

Next step: perldoc utf8, which pointed to the Encode module. This little thing worked, even without the extra header, once the encoded string was sent as savetext instead of $markup:

use Encode qw(encode);
my $utf8Markup = encode("utf8", $markup);
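Put together, the fix might look roughly like the sketch below. It assumes POST comes from HTTP::Request::Common (the post doesn't say, but the call matches), and the URL, timestamp, author and markup values are placeholders. The request is only built and printed, not sent.

```perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Request::Common qw(POST);   # assumption: source of POST()

# Hypothetical stand-ins for the replicator's real data.
my $url       = "http://wiki.example/SomePage";
my $timestamp = 0;
my $author    = "someone";
my $markup    = "Sm\xF6rg\xE5sbord costs \xA35";   # Latin-1: ö, å, £

# Convert the Latin-1 string to UTF-8 bytes before it gets form-encoded.
my $utf8Markup = encode("utf8", $markup);

my $req = POST($url,
    Content => [
        action      => "savepage",
        datestamp   => $timestamp,
        savetext    => $utf8Markup,
        comment     => "WiMi by $author",
        button_save => "Save Changes",
    ],
);

# Inspect the form-encoded body instead of sending the request.
print $req->content, "\n";
```

In the printed body each non-ASCII character shows up as two percent-escaped bytes (ö becomes %C3%B6) rather than one (%F6), which is what a UTF-8 wiki expects. If $author can contain such characters, the comment field would presumably need the same treatment.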

Et voilà!
