

  • See the parse_html_* methods in XML::LibXML.
    • I wasn't sufficiently clear in my original message. I was trying the parse_html_* methods in XML::LibXML and they were whining about broken HTML in the two pages I was playing with. So I said "screw it" and went back to parsing those with HTML::* modules.


      • Doh. HTML parsers that can't parse broken HTML aren't that useful :)

        Have you tried HTML::TreeBuilder with Class::XPath?
        • I haven't, but boy that's really cute. I was wondering the other day whether there were more general XPath modules available. You know, with a little optimization (the ability to search a tree once but have multiple possible XPath expressions and associated actions to run at each step), you could use XPath as the basis for your optimizer--write XPath expressions for the things to optimize.

          Ah yes, I've known about XPath for three days. Why wouldn't I assume I've had an original thought :-)
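
The "search the tree once, run several XPath expressions with attached actions" idea above can be sketched roughly as follows. This is only an illustration, not how Class::XPath or XML::LibXML work: it is written in Python with the standard-library `xml.etree.ElementTree`, it assumes well-formed input (so it does nothing for the broken-HTML problem), and it supports only a tiny path subset (`/a/b` anchored at the root, `//a/b` matching anywhere). The names `compile_pattern`, `matches`, and `walk_once` are made up for this example.

```python
import xml.etree.ElementTree as ET


def compile_pattern(pattern):
    """Turn '/a/b' or '//a/b' into (anchored, steps).

    Only plain tag steps are supported; this is a small stand-in
    for real XPath, not an implementation of it.
    """
    anchored = not pattern.startswith("//")
    steps = tuple(step for step in pattern.split("/") if step)
    return anchored, steps


def matches(compiled, path):
    """Check one compiled pattern against a node's root-to-node tag path."""
    anchored, steps = compiled
    if anchored:
        return path == steps
    # Unanchored: the pattern must match the tail of the path.
    return len(path) >= len(steps) and path[-len(steps):] == steps


def walk_once(root, rules):
    """Visit every element exactly once, in document order.

    `rules` is a list of (pattern, action) pairs; every matching
    action is called on the node during that single traversal.
    Returns the number of elements visited.
    """
    compiled = [(compile_pattern(p), action) for p, action in rules]
    visited = 0
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        visited += 1
        for pattern, action in compiled:
            if matches(pattern, path):
                action(node)
        # Push children reversed so they pop in document order.
        for child in reversed(list(node)):
            stack.append((child, path + (child.tag,)))
    return visited


doc = ET.fromstring(
    "<html><body><p>one</p>"
    "<div><p>two</p><a href='x'>link</a></div>"
    "</body></html>"
)

paragraphs, links = [], []
visited = walk_once(doc, [
    ("//p", lambda n: paragraphs.append(n.text)),
    ("/html/body/div/a", lambda n: links.append(n.get("href"))),
])
```

Both rules fire during the same traversal, so adding more expressions adds per-node matching work but no extra passes over the tree. An optimizer built this way would swap the list-appending lambdas for rewrite actions.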