I have received two revisions (and may ultimately receive more) of a Word document specification (and I use that term loosely). The main part of this document I'm concerned with is a series of tables. I am extracting these tables into an Excel spreadsheet by cutting and pasting. I then process the spreadsheet with a custom Perl program that spits out YAML, SQL DDL, and a couple of other important goodies.
Obviously I'd like to eliminate the cut-and-paste part of this process. Besides being something I just don't want to do, it is error-prone, slow, and difficult to replicate consistently.
Does anyone know of a way I can automate this extraction process? I'm willing to consider any language, if necessary, though of course I prefer Perl. I'm also willing to consider intermediate formats, such as converting to OpenOffice, AbiWord, or whatever. (My Excel spreadsheet is already an intermediate format.) I'd like any such conversions to be automatable as well, but even if I had to convert manually and then extract, it would still shrink the human-driven, error-prone, unrepeatable part of this process by at least an order of magnitude.
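To make the goal concrete, here is a rough sketch of the kind of extraction step I have in mind. It is only an illustration under assumptions: it presumes the specification is available as (or can be converted to) a .docx file, where the body lives in `word/document.xml` inside a zip archive. It is written in Python purely because it needs nothing beyond the standard library; the function name `extract_tables` and the CSV output are hypothetical choices, and merged cells and other layout subtleties are ignored.

```python
import csv
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_tables(docx):
    """Return every table in a .docx as a list of rows of cell strings.

    `docx` may be a path or a binary file-like object.
    Assumption: a plain grid is good enough; merged cells are not expanded.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for tbl in root.iter(W + "tbl"):           # all tables, including nested ones
        rows = []
        for tr in tbl.findall(W + "tr"):       # direct child rows only
            cells = []
            for tc in tr.findall(W + "tc"):    # direct child cells only
                # A cell holds paragraphs; each paragraph holds text runs.
                paragraphs = [
                    "".join(t.text or "" for t in p.iter(W + "t"))
                    for p in tc.findall(W + "p")
                ]
                cells.append("\n".join(paragraphs))
            rows.append(cells)
        tables.append(rows)
    return tables


if __name__ == "__main__" and len(sys.argv) > 1:
    # Dump each table as CSV, separated by a blank line.
    writer = csv.writer(sys.stdout)
    for table in extract_tables(sys.argv[1]):
        writer.writerows(table)
        sys.stdout.write("\n")
```

Output along these lines (one CSV block per table) could replace the hand-built spreadsheet as the input to my existing Perl program, though anything equivalent in Perl would suit me better.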
Incidentally, I have reason to believe that the