All the Perl that's Practical to Extract and Report

use Perl
Journal of miyagawa (1653)

Saturday September 23, 2006
03:30 AM

CSS selector in Perl

[ #31090 ]

The Ruby library scrAPI looks promising. It lets you write scraper code using CSS selectors, like this:

scraper = Scraper.define do
  process 'span.title > a:first-child', :title => :text, :url => '@href'
  process 'ul.list-circle > li:first-child > a', :category => :text
  result :title, :url, :category
end
 
html = open(url).read
scraper.scrape(html)

In Plagger's EntryFullText module and the like, we use regular expressions and/or XPath to extract this kind of information, and I think adding CSS selector support would be neat too.
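
To illustrate how the two approaches relate, the selectors above can be rewritten as XPath fairly mechanically. What follows is only a minimal sketch in Ruby, not how scrAPI works internally: css_to_xpath is a hypothetical helper, and it assumes a tiny subset of CSS (tag names, .class, the > child combinator, and :first-child). Anything else raises an error.

```ruby
# Hypothetical helper (not part of scrAPI or any library): translates a small
# CSS subset into an XPath 1.0 expression.
# Supported: tag, tag.class, '>' child combinator, ':first-child'.
def css_to_xpath(selector)
  steps = selector.strip.split(/\s*>\s*/).map do |part|
    m = part.match(/\A([a-z][a-z0-9]*)((?:\.[\w-]+)*)(:first-child)?\z/i)
    raise ArgumentError, "unsupported selector part: #{part}" unless m

    tag, classes, first_child = m[1], m[2], m[3]

    # 'a:first-child' means "the first child element, and it is an <a>".
    step = first_child ? "*[1][self::#{tag}]" : tag

    # Match one class token inside a space-separated class attribute.
    classes.scan(/\.([\w-]+)/).flatten.each do |name|
      step += "[contains(concat(' ', normalize-space(@class), ' '), ' #{name} ')]"
    end
    step
  end

  "//" + steps.join("/")
end

puts css_to_xpath('ul.list-circle > li:first-child > a')
# prints: //ul[contains(concat(' ', normalize-space(@class), ' '), ' list-circle ')]/*[1][self::li]/a
```

The XPath is noticeably longer than the selector it came from, which is a big part of why CSS selectors look attractive for scraper code.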

Is there already a Perl module on CPAN that does something similar? I searched but couldn't find one. CSS.pm doesn't do this kind of thing.
