NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Any chance of seeing this perl code?
    I could learn a great amount from it, I'm sure...
    • Sure! I suppose I should have done more hackery to automatically determine the credentials() arguments from the URL, but I couldn't be buggered :-)

      #!/usr/bin/perl -w

      use LWP;
      use HTML::TableContentParser;
      use Getopt::Std;
      use strict;

      # username and password for ORA intranet
      my ($USERNAME, $PASSWORD) = ('CHANGE', 'ME');

      # where to store files.  change this!
      my $DIR = ($^O eq "darwin") ? '/Users/gnat/Ora/Paperwork/edcal'
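
On the point about working out the credentials() arguments from the URL: a minimal sketch, assuming the URI module is installed. The helper name netloc_from_url and the example.com address are hypothetical and not part of the script above. It only covers the first argument (the "host:port" string); the realm cannot be read from the URL, since the server sends it in its WWW-Authenticate header.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;

# Hypothetical helper: build the "host:port" string that
# LWP::UserAgent's credentials() takes as its first argument.
sub netloc_from_url {
    my ($url) = @_;
    my $uri = URI->new($url);
    # host_port() appends the scheme's default port when the URL has none
    return $uri->host_port;
}

# Placeholder URL, not the real intranet address
my $netloc = netloc_from_url('http://intranet.example.com/edcal/');
print "$netloc\n";

# With an LWP::UserAgent in $ua and the realm known, the call would be:
#   $ua->credentials($netloc, $realm, $USERNAME, $PASSWORD);
```

The realm still has to be supplied by hand, or read from a first 401 response, so this removes only part of the hard-coding.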

  • Recipe? (Score:2, Insightful)

    This would make a nice Cookbook recipe...

    • D'oh, good point. I can't believe I didn't think of that. Thanks, applied. :-)


      • There's another module HTML::TableExtract to parse HTML tables. I have used this, and it is pretty nice. I haven't looked at HTML::TableContentParser, so can't really compare, yet.

        Also, look at WWW::Mechanize, which is really awesome for scraping web content. There is also WWW::Mechanize::Shell, for writing quick scripts to do this kind of stuff.

        Just some more info for you to chew on while you write that cookbook entry.
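
For comparison, a minimal sketch of the HTML::TableExtract approach mentioned above, assuming the module is installed. The table and its values are made-up sample data, and the flat (non-nested) headings are an assumption chosen to keep the example simple.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Made-up sample table with simple, non-nested column headings
my $html = <<'HTML';
<table>
  <tr><th>State</th><th>Population</th></tr>
  <tr><td>Ohio</td><td>100</td></tr>
  <tr><td>Utah</td><td>200</td></tr>
</table>
HTML

# Select tables by the text of their column headings
my $te = HTML::TableExtract->new(headers => ['State', 'Population']);
$te->parse($html);

# Collect the data rows; the heading row itself is not returned
my @rows;
for my $table ($te->tables) {
    for my $row ($table->rows) {
        push @rows, [ map { defined $_ ? $_ : '' } @$row ];
    }
}

print join("\t", @$_), "\n" for @rows;
```

Matching on heading text means the script does not depend on where the table sits in the page, which is the main appeal for scraping.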


        • I spent a long time looking for data with column headings for HTML::TableExtract to work on. I finally found some census data, but after half an hour of trying, I couldn't make H::TE grok the nested table headings. I finally gave up and just documented HTML::TableContentParser. Sorry!


  • Excel to XML (Score:3, Insightful)

    by darobin (1316) on 2003.04.29 10:41 (#19573)

    If the Excel file is usable, then you might want to try XML::SAXDriver::Excel at some point (or for a similar problem involving surviving in an office with M$ users).


    -- Robin Berjon