XPath is just great for screen scraping, especially when combined with libxml2's xmllint tool, which turns HTML into well-formed XML.
Here's the current temperature in London:
$ xmllint --html --format 'http://www.bbc.co.uk/weather/5day.shtml?world=0008' |
xpath 'normalize-space(string((//tr[starts-with(normalize-space(.), "Temperature")])[2]))'
(The above finds every <tr> whose text content starts with "Temperature" (there are two on that page), takes the second of those (which holds the current temperature), and then runs normalize-space on its string value, which basically strips all the tags and collapses the whitespace.)
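To try the expression without depending on the live page, here is a minimal sketch that runs it against a small local file. The sample HTML is invented for illustration and is not the BBC markup, and the variable names are arbitrary. It also assumes an xmllint recent enough to have the --xpath option, so that no separate xpath tool is needed.

```shell
#!/bin/sh
# Hypothetical sample page: two rows start with "Temperature",
# and the second one stands in for the current reading.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html><body><table>
<tr><td>Temperature: </td><td>max <b>12</b> C</td></tr>
<tr><td>Wind: </td><td>SW 10 mph</td></tr>
<tr><td>Temperature: </td><td>now <b>9</b> C</td></tr>
</table></body></html>
EOF

# (...)[2] selects the second matching <tr> in document order;
# normalize-space(string(...)) flattens it to plain text.
current=$(xmllint --html --xpath \
  'normalize-space(string((//tr[starts-with(normalize-space(.), "Temperature")])[2]))' \
  "$sample")

echo "$current"   # prints: Temperature: now 9 C
rm -f "$sample"
```

Note that the [2] has to sit outside the parentheses: //tr[...][2] would instead ask for every matching <tr> that is the second such child of its parent, which is a different thing.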
I personally think using XPath for screen scraping is a bit easier than the other ways of doing it, and possibly safer too. Plus, you can quite nicely apply this technique to all sorts of useful systems.