
All the Perl that's Practical to Extract and Report

  • As I stated on Perlmonks today ;--): Did you try tidy? http://tidy.sourceforge.net/
    --
    mirod
    • by Ovid (2709) on 2002.12.19 20:46 (#15562) Homepage Journal

      I didn't see your reply on Perlmonks! In any event, I'm using HTML::TokeParser::Simple to get around the problems with bad HTML, and it's worked fine. My basic method is to parse my sample HTML into a bunch of tokens that I store in an array. Then I do the same thing with the target HTML and if, at any point, I find a matching token stream, I do the replacement. So far, it's worked out much better than I thought.
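
      For anyone who wants to see the token-stream idea spelled out, here is a minimal sketch. It is not the code from the comment above: it uses Python's standard `html.parser` in place of HTML::TokeParser::Simple. The names `Tokenizer`, `tokenize`, and `replace_fragment` are hypothetical, and the matching rules (attribute order ignored, whitespace in text collapsed) are assumptions about what "matching" should mean.

```python
from html.parser import HTMLParser


class Tokenizer(HTMLParser):
    """Collect (key, raw) pairs: key is used for matching, raw for output."""

    def __init__(self):
        super().__init__(convert_charrefs=False)  # keep entities as written
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; a frozenset makes
        # attribute order irrelevant when comparing keys (an assumption).
        key = ("start", tag, frozenset(attrs))
        self.tokens.append((key, self.get_starttag_text()))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are kept as a single token.
        key = ("startend", tag, frozenset(attrs))
        self.tokens.append((key, self.get_starttag_text()))

    def handle_endtag(self, tag):
        self.tokens.append((("end", tag), f"</{tag}>"))

    def handle_data(self, data):
        # Collapse whitespace so formatting differences do not block a match.
        self.tokens.append((("text", " ".join(data.split())), data))

    def handle_entityref(self, name):
        self.tokens.append((("entity", name), f"&{name};"))

    def handle_charref(self, name):
        self.tokens.append((("charref", name), f"&#{name};"))

    def handle_comment(self, data):
        self.tokens.append((("comment", data), f"<!--{data}-->"))

    def handle_decl(self, decl):
        self.tokens.append((("decl", decl), f"<!{decl}>"))


def tokenize(html):
    """Return the list of (key, raw) tokens for an HTML string."""
    parser = Tokenizer()
    parser.feed(html)
    parser.close()
    return parser.tokens


def replace_fragment(target_html, sample_html, replacement):
    """Swap every run of target tokens that matches the sample's tokens."""
    target = tokenize(target_html)
    sample_keys = [key for key, _ in tokenize(sample_html)]
    n = len(sample_keys)
    out, i = [], 0
    while i < len(target):
        window = [key for key, _ in target[i:i + n]]
        if n and window == sample_keys:
            out.append(replacement)  # matched: emit the replacement once
            i += n
        else:
            out.append(target[i][1])  # no match: emit the token as parsed
            i += 1
    return "".join(out)
```

      A call would look like `replace_fragment(page, '<a href="/old">Old link</a>', '<a href="/new">New link</a>')`. Comparing parsed tokens rather than raw strings is what lets the sample match even when the target differs in tag case, attribute order, or spacing.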