Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

joedoc (3603)

joedoc
  (email not shown publicly)
http://attaboy.tommydoc.net/
AOL IM: joedoc21 (Add Buddy, Send Message)

Journal of joedoc (3603)

Friday November 08, 2002
05:53 AM

IIS web long file names, and other new officers...

[ #8838 ]

We have two web servers at my job. Well, actually five, but only two matter right now.

One of the two is our main server, our image to the community and the world. My design, runs on a home-made rackmount box running Linux, Apache, Perl and all good things OSS.

The other is the box from which we serve our product, meteorological images and data. The product is either produced or stored on that system. That the product is served from it is really secondary to some folks around here, but when I suggest moving the product to the Linux system, I hear shudders. The box is an NT system. Those shudders are fear of *nix by the people who "manage" that system. So, I have to live with it.

We have a new executive officer. He's young, energetic, and a pain in my ass. He arrived right before the new commanding officer, who's a bit older, more laid back, and away a lot.

The new XO is asking for a lot of things, including something he keeps referring to as "web metrics."

{He likes to toss around phrases like "paradigm shift" and "think out of the box" and "we have to many legacy systems." These three things translate, as you all know, to "doing stuff differently", "changing everything because nothing works" and "we have too much old crap."}

I asked him what he means by "web metrics." Turns out he wants web statistics, which is what I figured, but I wanted to make him say the right words. Simple. I grab the log files from the main server, run analog, toss the resulting page on the web site, viola.

But, wait. We needs stats for the [i]products[/i] on the other server...that's our bread-and-butter. OK, no problem. Grab a log file from the NT box, run analog...oops.

Here's where I find out more about how Microsoft's way of doing things is fundamentally flawed in every way. The log files on the NT box (runs IIS version 4.0) go back three, four years. Daily logs. Gigs of them. One problem: when IIS was set up, they used the default settings for the logs, which don't put the date/time with each log entry (only at the top of each file). This makes analog choke, rendering four years of log files pretty useless. There is a utility that will add the date, but who has the time...maybe I'll write a script.

After explaining this to them, I reconfigured the NT box to do weekly logs in W3C Extended format. I gave it a couple of weeks to build up some data, then began working on the scripts to roll the logs to the intranet server and run analog on them.

{Now, I know some of you are wondering why I would go to all the trouble of building new scripts, from scratch, to do all this, when there's probably some very fine material out on the net that's already been written. Remember, I'm still pretty new at a lot of this, so building this setup will, hopefully, make me a better perl programmer. Which is the point, right?}

Now, here's Microsoft/IIS screwup number two. My plan is to have all the log entries on the Apache system go to a single log file. On Sunday night, my perl procedure will move the active long file, restsrt Apache to recreate new files, pull the rolled file to the intranet box, run analog, add to stats page, etc., etc. Not complicated.

However, there's a problem with the way IIS does weekly files. First, it names the file based on the month and the week, and you can't configure it to just give it a generic name, as you can with Apache. The second problem is their concept of a "weekly" log file. If the calendar month ends in the middle of a week, IIS closes the active log file and opens a new one. Apparently, they assume that if the month changes in mid-week, the first week of the new month must only have days from that month included. {Don't they understand the concept of flexibility?}

So, if you automate the process of grabbing those weekly files, you have to account for November beginning on Wednesday, then search for two files, cat them together, then run analog.

Naturally, perl has enough magic to permit me to figure out a way around all this. For right now, the simple way is to just run daily logs on the IIS server, have the perl script run nightly to get the daily file to the intranet server, and cat it to a weekly file there. Then, I can analog it on Sunday morning along with the Apache file.

Perhaps this is another example of evil plotting by Microsoft for world domination, blah, blah, blah, but I still have to work with their stuff. And I really can't imagine the IIS developers sitting around saying "let's make this log file thing really nasty for joedoc and every other OSS freak out there."

Besides, this is why they pay us the big bucks, right?