All the Perl that's Practical to Extract and Report
Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
good stuff! (Score:1)
i am just starting some scraping work and W::S works great so far. the doc is lacking though, so the examples you posted in past journals helped! i have a few questions though:
process "h3.ens>a", where the ens seems to be doing wildcard matching: any class name that contains ens.
"Parsing of undecoded UTF-8 will give garbage when decoding entities": HTML::Parser mentioned encoding the data before it gets parsed, but i have no clue how to do that.
what does the result keyword in the DSL do? i took it out of the DSL and it still works fine.
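For the UTF-8 warning quoted above, a minimal sketch of decoding a byte string into characters with the core Encode module before it reaches a parser. The sample markup and variable names are made up for illustration, and it assumes the input really is UTF-8; how the decoded string is then handed to W::S is not shown here.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: raw bytes of a page, as read from a socket or from a
# file opened without an encoding layer. "\xC3\xA9" is the UTF-8 encoding
# of e-acute.
my $bytes = "<p>caf\xC3\xA9</p>";

# Turn the UTF-8 bytes into a Perl character string. This is the step to
# do before the data gets parsed (assumption: the source is UTF-8).
my $chars = decode('UTF-8', $bytes);

# The two-byte sequence has become a single character.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
```

If the page is in some other encoding, the first argument to decode would have to name that encoding instead.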
Re: (Score:2)
Re: (Score:1)
.ens>a: does that match any class name that contains the string 'ens'? what is the syntax for exact matching on a classname then?
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
No, actually, “.ens > a” matches an “a” element that is a direct child of an element of any name with class “ens”, whereas “a[class~="ens"]” wants to see the class on the “a” element itself. The partial-match version would actually be “*[class~="ens"] > a”.
Re: (Score:2)
Re: (Score:1)
er. my bad. i thought class="listing first" is one class name. it is 'listing' and 'first'. great module, thanks!
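The distinction this thread settles on can be sketched in plain Perl. This is only an illustration of the matching rules, not how W::S or any selector engine is implemented, and has_class and class_contains are hypothetical helper names. A class selector such as .ens (or [class~="ens"]) compares against whole whitespace-separated class names, while a substring test such as [class*="ens"] would also hit a name like 'lens'.

```perl
use strict;
use warnings;

# Word match, the rule behind ".ens" and [class~="ens"]: split the class
# attribute on whitespace and require one name to equal $name exactly.
sub has_class {
    my ($class_attr, $name) = @_;
    my @hits = grep { $_ eq $name } split ' ', $class_attr;
    return scalar @hits;
}

# Substring match, the rule behind [class*="ens"].
sub class_contains {
    my ($class_attr, $fragment) = @_;
    return index($class_attr, $fragment) >= 0;
}

# Sample class attributes (made up, except 'listing first' from the thread).
for my $attr ('listing first', 'ens', 'lens wide') {
    printf "%-15s word match on 'ens': %s, substring match on 'ens': %s\n",
        "'$attr'",
        (has_class($attr, 'ens')      ? 'yes' : 'no'),
        (class_contains($attr, 'ens') ? 'yes' : 'no');
}
```

Under the word rule, 'listing first' carries the two classes 'listing' and 'first', and 'lens wide' does not carry the class 'ens' even though the string contains it.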