All the Perl that's Practical to Extract and Report

btilly (5037)

Journal of btilly (5037)

Saturday May 24, 2008
12:02 AM

Thank you CPAN

[ #36507 ]

Everyone's favorite reason to use Perl just came through for me again. Let me give some background.

At $work I am in charge of reporting. Our business has something called events. Different events are treated differently, and sometimes we make money from them and sometimes we don't. Sometimes we make more money from them, and sometimes less. One of the things we want reports for is to figure out what factors make events make more money. (Obviously because we want to make more events make money for us!)

Now I have a fairly flexible report that we'll call revenue per event, because that is its name. It allows us to see revenue per event at various ages, broken out by various combinations of factors, such as whether passwords are needed to log in, whether gift certificates were added, whether specific promotions were run, that kind of thing. This is a very useful report. We can see that, for instance, running a promotion brings in money (duh), and we can tell about how much more money events with that promotion make.

But we have a problem. You see, we have a pretty good idea what factors make events make more money. (Unfortunately we don't control how the event is set up; we're handling them as a service for our direct customers.) So when people take our advice they do several good things at once. How can we tell how good each individual thing they are doing is?

Hmm... let's see. Sounds like we need to do some sort of multi-variable linear regression. Why don't we look on CPAN and find Statistics::Regression and see if that works? Oh look, it does! I've used it in the past. But look, it added a method called standarderrors; what is that? (Run some tests, make a hypothesis, email the author, get confirmation.) Goody, we can not only find a linear regression, but we can also get estimates of how much random noise in the data might be throwing off the coefficients! I had been bothered by the fact that people tend to take the numbers I produce as gospel, with no eye to whether there was any statistical validity to the numbers.
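To make the idea concrete, here is a minimal sketch of a multi-variable linear regression that also reports a standard error for each coefficient. It is written in Python with only the standard library and is not the Statistics::Regression API. The factor names, the data, and the helper functions `invert` and `ols` are all made up for illustration.

```python
# Ordinary least squares with coefficient standard errors.
# Everything here (names, data) is hypothetical, for illustration only.

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity, then reduce the left half to it.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; factors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(rows, ys):
    """Fit ys ~ rows and return (coefficients, standard_errors)."""
    n, k = len(rows), len(rows[0])
    if n <= k:
        raise ValueError("need more observations than coefficients")
    # Normal equations: coef = (X'X)^-1 X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    # Residual variance, with n - k degrees of freedom.
    resid = [y - sum(b * x for b, x in zip(coef, r)) for r, y in zip(rows, ys)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    # Standard error of coefficient i is sqrt(sigma2 * [(X'X)^-1]_ii).
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]
    return coef, se


names = ["const", "promotion", "gift_certificates"]
# Each row is [1 for the constant term, promotion run?, gift certificates added?]
rows = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0],
        [1, 0, 1], [1, 0, 1], [1, 1, 1], [1, 1, 1]]
revenue = [95, 105, 145, 155, 115, 125, 165, 175]

coef, se = ols(rows, revenue)
for name, b, s in zip(names, coef, se):
    print(f"{name}: {b:.2f} +/- {s:.2f}")
```

With this made-up data the fit recovers 100 for the constant, 50 for the promotion and 20 for gift certificates, with standard errors of roughly 3.9, 4.5 and 4.5. The standard errors are the useful part: a coefficient that is small compared to its standard error may be nothing more than noise.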

Hrm, do I trust the module? I take a legitimate pride in knowing a fair amount of math, but I know I don't know how to do this. Well, look at the source code and... holy crap! OK, for me to learn this to my satisfaction would take a long time. What programmer contributed it and do I trust it? Rummage around and do some research... oh, he's a professor at Brown. He teaches courses on multi-variable statistics. I think I can trust that he knows his stuff! :-)

Getting my report to do multi-variable regression and display it the way I wanted still wasn't easy. But at least it was hard for programming reasons (some day I need to write something explaining why it may be important to put a condition in an ON clause rather than a WHERE clause - I spent an hour tracking down the resulting bug), and not because I couldn't figure out the math.
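The post doesn't say which ON versus WHERE bug this was, so the following is only one common way the difference bites, sketched with Python's built-in sqlite3 module. The `events` and `promotions` tables and their contents are hypothetical. In a LEFT JOIN, a condition on the right-hand table keeps unmatched rows when it sits in the ON clause, but silently drops them when it sits in the WHERE clause.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE promotions (event_id INTEGER, kind TEXT);
    INSERT INTO events VALUES (1, 'A');
    INSERT INTO events VALUES (2, 'B');
    INSERT INTO events VALUES (3, 'C');
    INSERT INTO promotions VALUES (1, 'email');
    INSERT INTO promotions VALUES (2, 'banner');
""")

# Condition in ON: every event appears, with NULL where no email promotion ran.
on_rows = conn.execute("""
    SELECT e.name, p.kind
    FROM events e
    LEFT JOIN promotions p ON p.event_id = e.id AND p.kind = 'email'
    ORDER BY e.name
""").fetchall()

# Condition in WHERE: the filter runs after the join, so rows where p.kind
# is NULL or not 'email' are discarded and the LEFT JOIN acts like an inner join.
where_rows = conn.execute("""
    SELECT e.name, p.kind
    FROM events e
    LEFT JOIN promotions p ON p.event_id = e.id
    WHERE p.kind = 'email'
    ORDER BY e.name
""").fetchall()

print(on_rows)     # all three events
print(where_rows)  # only the event that had an email promotion
conn.close()
```

For a report that wants one row per event whether or not a given promotion ran, only the first query is right. The second quietly loses every event without that promotion, which would skew any per-event average.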

Where else but CPAN would you expect to find something like that?
