Monday August 21, 2006
I've been looking for a simple HTML-to-text converter.
HTML::FormatText does most of what I want, but it also does more than I want. Rendering HR tags as a horizontal "-----" line is one example. I don't like that.
HTML::Element has an as_text() method, which is very close to what I want. But apparently it doesn't do the right thing with the img@alt attribute (<img src="foo.jpg" alt="Bar" /> is dumped as an empty string, not "Bar"), and "Foo<br />Bar" is dumped as "FooBar", not "Foo Bar".
I chatted with Yuval (nothingmuch) in #catalyst, and I would like to write a simple Visitor module that does this by walking an HTML::TreeBuilder-generated tree.
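To make the idea concrete, here is a rough sketch of what such a tree walk might look like. It is not the planned module, just a plain recursive function over the HTML::TreeBuilder tree; the names html_to_text and node_text are made up for illustration, and skipping script and style content is my own assumption (it mirrors what as_text() already does).

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: parse an HTML string and return its plain text.
sub html_to_text {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $text = node_text($tree);
    $tree->delete;              # free the tree explicitly

    $text =~ s/\s+/ /g;         # collapse runs of whitespace
    $text =~ s/^\s+|\s+$//g;    # trim both ends
    return $text;
}

# Hypothetical helper: visit one node and return the text below it.
sub node_text {
    my ($node) = @_;

    # Text segments are stored as plain strings, not objects.
    return $node unless ref $node;

    my $tag = $node->tag;

    # Assumption: script and style content is not wanted in the output.
    return '' if $tag eq 'script' || $tag eq 'style';

    # Use the alt attribute as the text of an image.
    if ($tag eq 'img') {
        my $alt = $node->attr('alt');
        return defined $alt ? $alt : '';
    }

    # Treat a line break as a word separator.
    return ' ' if $tag eq 'br';

    # Everything else: concatenate the text of the children.
    return join '', map { node_text($_) } $node->content_list;
}

print html_to_text('Foo<br />Bar'), "\n";
print html_to_text('<img src="foo.jpg" alt="Bar" />'), "\n";
```

A real Visitor module would presumably let callers plug in their own per-tag handlers instead of hardcoding img and br like this.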
If that sounds like a duplicate of someone else's work, let me know.