

Journal of TorgoX (1933)

Friday April 26, 2002
04:13 PM

/Perl and LWP/ book

[ #4486 ]
You can now preorder my book Perl & LWP at Amazon and at BN. Pay no attention to both sites' miscapitalization of the title.

I think there'll be a sample chapter put up some time or other, but in the meantime, here's a descriptive ToC from the Preface:

Chapter 1, Preparing for LWP, covers in general terms what LWP does, the alternatives to using LWP, and when you shouldn't use LWP.

Chapter 2, HTTP and LWP::Simple, explains how the web works and the easy-to-use yet limited functions for accessing it.

Chapter 3, The LWP Classes for HTTP, covers the more powerful interface to the web.

Chapter 4, The LWP Classes for URLs, shows how to parse URLs with the URI class, and how to convert between relative and absolute URLs.

Chapter 5, Forms, describes how to submit GET and POST forms.

Chapter 6, HTML Processing with Regular Expressions, shows how to extract information from HTML using regular expressions.

Chapter 7, HTML Processing with Tokens, provides an alternative approach to extracting data from HTML using the HTML::TokeParser module.

Chapter 8, Tokenizing Walkthrough, is a case study of data extraction using tokens.

Chapter 9, HTML Processing with Trees, shows how to extract data from HTML using the HTML::TreeBuilder module.

Chapter 10, Modifying HTML with Trees, covers the use of HTML::TreeBuilder to modify HTML files.

Chapter 11, Cookies, Authentication, and Advanced Requests, deals with the tougher parts of requests.

Chapter 12, Spiders, explores the technological issues involved in automating the download of more than one page from a site.

Appendix A, LWP Modules, is a complete list of the LWP modules.

Appendix B, HTTP Status Codes, is a list of HTTP codes, what they mean, and whether LWP considers them error or success.

Appendix C, Common MIME Types, contains the most common MIME types and what they mean.

Appendix D, Common Language Tags, is a list of the most common language tags and what they mean (e.g., "zh-cn" means Mainland Chinese, while "sv" is Swedish).

Appendix E, Common Content Encodings, is a list of the most common character encodings ("character sets") and the tags that identify them.

Appendix F, ASCII Table, is a table to help you make sense of the most common Unicode characters. It shows the character, its numeric code (in decimal, octal, and hex), and any HTML escapes there may be for it.

Appendix G, User's View of Object-Oriented Modules, is an introduction to the use of Perl's object-oriented programming features.

I would love it if people here 1) ordered the book, and 2) wrote reviews about how happy they are with it, as compared to not having it at all (which is the only alternative, as this is the only book on LWP out there). There is an unwelcome trend lately toward only malcontents bothering to write Amazon reviews of tech books. ("And on p 43 he misspells 'Bette Midler' as 'Bette Middler' so ZERO STARS THIS BOOK SUCKS!") Hopefully more positive and germane reviews can be written for my luscious book!

  • I was supposed to have tech review comments, but due to a big project and me arseing things up I missed the deadline.


    I've finally got round to reading it though, and it is very good. In fact I'm going to order a copy, as it is a lot better than the LWP docs and a lot easier on the eyes (I hate reading electronic docs). It also has HTML::TreeBuilder stuff, which is good as sometimes I feel like I'm the only one who uses it :)

    It has inspired me to go and poke through some of my code and update it to use