All the Perl that's Practical to Extract and Report

use Perl

tomhukins (4457)

  (email not shown publicly)

Perl, Web, Database developer, and Milton Keynes Perl Monger.

Journal of tomhukins (4457)

Friday November 04, 2005
09:09 AM

HTML::Tidy Revelation

[ #27445 ]

In the WWW::Mechanize talk I gave at this year's YAPC::Europe and NPW, I describe how I have passed invalid HTML through the command line tool tidy before passing it to XML::LibXML to process.

I mention that I don't use HTML::Tidy because it doesn't actually clean the HTML; it just checks for warnings. At least, that's what I thought.

Robbie, who I work with, has just shown me some code where he calls the clean method to do this. In my defence, the documentation confused me by saying this method returns true, whereas it actually returns the cleaned content, which happens to evaluate to true. I should get into the habit of reading documentation on AnnoCPAN, which mentions this.
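
For anyone who wants to see the difference, here is a minimal sketch of the kind of call described above. It is not Robbie's code: the sample markup and variable names are made up for illustration, and it assumes HTML::Tidy is installed and built with its default configuration.

```perl
use strict;
use warnings;
use HTML::Tidy;

# Hypothetical sample input: unclosed <p> and <b>, no doctype.
my $dirty = '<html><title>Demo</title><body><p>Hello <b>world</body></html>';

my $tidy = HTML::Tidy->new;

# clean() returns the repaired document as a string. A non-empty string is
# true in boolean context, which is presumably why the documentation said
# "returns true".
my $clean = $tidy->clean($dirty);

# The warnings tidy raised while cleaning are still available afterwards.
print STDERR $_->as_string, "\n" for $tidy->messages;

print $clean;
```

The string in $clean can then be handed to a parser in place of the output of the command line tool. Whether it is well-formed enough for an XML parser depends on the tidy options in effect, so treat that step as something to verify against your own input.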

I hope I haven't encouraged too many people to use a separate process to do something a CPAN module already does. The module's name makes its purpose clear enough.


  • I should get into the habit of reading documentation on AnnoCPAN, which mentions this

    Surely this is a documentation bug and should have been reported via RT?
    Why should module users have to check yet another documentation source?

    Not that I have anything against AnnoCPAN in principle but this seems a good example of using it incorrectly.

    • My initial reaction was to report this with RT, but I didn't want to duplicate existing information in case CPAN authors feel overloaded.

      It's a tough call, but your response prompted me to go with my gut feeling and send a patch in Bug #15573.

      Thanks for the suggestion, essuu.

      • I'd always prefer if users err on the side of too many reports than too few. Your specific situation with a problem might be different than everyone else's anyway.