All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Sounds like another job for The Demoroniser [fourmilab.ch]... Or an equivalent tool.

    --
    -- "It's not magic, it's work..."
    • Actually in this case, the email isn't HTML (I don't read HTML email), it's plain text.

      It's probably a programmer somewhere who thinks that iso-8859-1 and Windows-1252 are the same thing.
      • Fair point. You can probably assume that the main problem is when Windows sends out a Windows encoding, say cp-1252, but lies and calls it iso-8859-1.

        One possible solution is to scan any iso-8859-1 files, looking for any diagnostic control-code points (Demoroniser has some suggestions for that), and then either ask a recoding program to convert the file to a civilised scheme or set the Content-Type correctly.
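
        A minimal sketch of that scan-and-recode idea, in Python rather than Perl. It assumes the "diagnostic" bytes are the range 0x80-0x9F, which are control codes in iso-8859-1 but mostly printable punctuation in Windows-1252. The function names are made up for illustration, and this is not how Demoroniser itself works.

```python
# Bytes 0x80-0x9F are C1 control codes in ISO-8859-1 but mostly printable
# characters (curly quotes, dashes, the trademark sign) in Windows-1252.
# Finding them in text labelled ISO-8859-1 suggests the label is wrong.
C1_RANGE = range(0x80, 0xA0)

LATIN1_NAMES = {"iso-8859-1", "latin-1", "latin1"}


def looks_like_cp1252(raw: bytes) -> bool:
    """Return True if raw contains any byte in the 0x80-0x9F range."""
    return any(b in C1_RANGE for b in raw)


def recode_to_utf8(raw: bytes, declared: str = "iso-8859-1") -> bytes:
    """Decode raw using its declared charset, or as cp1252 if the
    declared charset is Latin-1 but the bytes say otherwise, and
    return the text re-encoded as UTF-8."""
    if declared.lower() in LATIN1_NAMES and looks_like_cp1252(raw):
        # A few bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are undefined in
        # cp1252; errors="replace" turns them into U+FFFD instead of
        # raising an exception.
        text = raw.decode("cp1252", errors="replace")
    else:
        text = raw.decode(declared)
    return text.encode("utf-8")
```

        For example, recode_to_utf8(b"Acme\x99") yields the UTF-8 bytes for "Acme™", because 0x99 is the trademark sign in Windows-1252. Text without any 0x80-0x9F bytes is decoded as labelled.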

        I have to deal with this all the time: users think that Windows is "correct", then they output a document, incorrectly tagged, and when they don't see a trademark symbol (™) or micron symbol (µ) on the web site, it's my fault! Yesterday several hours were wasted because of the lies that Windows tells...

        --
        -- "It's not magic, it's work..."