Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • Will bite you in the end. Trust me. I've just come from a project where we've used hack upon hack upon hack upon hack to ensure that entities get preserved in one state or another. But the trouble is that you've effectively got several layers of character encoding. In our case, we ended up with stuff in the database which contained & et al. Well, in some tables we did. In others we had UTF-8. And the search engine saw character references and turned them into latin-1. Sometimes. So you neve

    • Actually, it sounds like you are dealing with a different problem.

      We own the data that we are serving up through this web app. So it's fully normalized by the time it's parsed in this pipeline. The problem is more about keeping the entities that are in there from being converted into UTF-8.