
All the Perl that's Practical to Extract and Report


miyagawa (1653)

http://bulknews.vox.com/
AOL IM: bulknews

Journal of miyagawa (1653)

Saturday September 23, 2006
02:30 AM

CSS selector in Perl

[ #31090 ]

The Ruby library scrAPI looks promising. It lets you write scraper code using CSS selectors, like:

scraper = Scraper.define do
  process 'span.title > a:first-child', :title => :text, :url => '@href'
  process 'ul.list-circle > li:first-child > a', :category => :text
  result :title, :url, :category
end
 
require 'open-uri'  # lets open() fetch a URL; url is assumed to be set already
html = open(url).read
scraper.scrape(html)

In Plagger's EntryFullText module and the like, we use regular expressions and/or XPath to extract this kind of information, and I think adding CSS selectors would be neat too.
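To make the CSS-versus-XPath comparison concrete, here is a rough sketch of how a small subset of CSS selectors could be turned into the XPath we already write by hand. This is not how scrAPI or Plagger works; `css_to_xpath` and `simple_selector_to_xpath` are hypothetical helpers made up for illustration, and they only understand tag names, `.class`, `#id`, `:first-child`, and the descendant and child (`>`) combinators.

```ruby
# Hypothetical helpers (not part of scrAPI or Plagger): translate a small
# subset of CSS selectors into XPath expressions.

# Convert one compound selector such as "span.title" or "a:first-child".
def simple_selector_to_xpath(token)
  m = /\A([a-zA-Z][\w-]*|\*)?((?:[.#][\w-]+|:first-child)*)\z/.match(token)
  raise ArgumentError, "unsupported selector: #{token}" unless m

  tag = m[1] || '*'
  parts = m[2].scan(/[.#][\w-]+|:first-child/)

  # :first-child means "first element child of its parent, and it is a <tag>".
  step = if parts.delete(':first-child')
           tag == '*' ? '*[1]' : "*[1][self::#{tag}]"
         else
           tag
         end

  predicates = parts.map do |part|
    name = part[1..-1]
    if part.start_with?('.')
      # Match one class name inside a space-separated class attribute.
      "[contains(concat(' ', normalize-space(@class), ' '), ' #{name} ')]"
    else
      "[@id='#{name}']"
    end
  end

  step + predicates.join
end

# Convert a full selector; whitespace is the descendant combinator (//)
# and ">" is the child combinator (/).
def css_to_xpath(selector)
  tokens = selector.strip.gsub(/\s*>\s*/, ' > ').split(/\s+/)
  axis = '//'
  steps = []
  tokens.each do |token|
    if token == '>'
      axis = '/'
    else
      steps << axis + simple_selector_to_xpath(token)
      axis = '//'
    end
  end
  steps.join
end

puts css_to_xpath('span.title > a:first-child')
puts css_to_xpath('ul.list-circle > li:first-child > a')
```

The two selectors from the scrAPI example come out as noticeably longer XPath expressions, mostly because of the class-matching predicate, which is a big part of why the CSS form is nicer to write by hand.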

Is there already a Perl module on CPAN that does something similar? I searched for one but couldn't find any. CSS.pm doesn't do this kind of thing.
