NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

## All the Perl that's Practical to Extract and Report

#### grinder (1100)

grinder
(email not shown publicly)
http://www.landgren.net/perl/
Yahoo! ID: perlgrinder (Add User, Send Message)

Editor of p5p summaries, member of the p5p peanut gallery.

### Journal of grinder (1100)

Friday October 19, 2007
06:20 AM

### Another fun (?) SQL puzzle

[ #34718 ]

So, I have a database with a particularly nasty design decision. We have people who belong to cost centres, usually only one, but sometimes with a prorata on two cost centres. The programmer responsable for creating the table denormalised things, so rather than having

EMP1  UNIT1 50%
EMP1  UNIT2 50%
EMP2  UNIT3 100%
EMP3  UNIT1 100%

we have something that looks like

EMP1   UNIT1   UNIT2    50   50
EMP2   UNIT3   null    100    0
EMP3   UNIT1   null    100    0

That is, both cost centre ids in the same table, the second one usually null. It is a given that there will never be more than two. This table is of course an utter bitch to work with. Turns out we can cheat a bit, by only keeping track of the rate of the first centre, the second is just 100-first (which also helps cut down round-offs). Let us create a table to play with:

create table t1 (
id_person varchar(10),
rate      number(5,2),
unit1     varchar(3),
unit2     varchar(3),
val1      number(10),
val2      number(10)
);
insert into t1 values ('alice',   1, 'U1', null,   10,   20);
insert into t1 values ('bob',     1, 'U2', null,    4,    8);
insert into t1 values ('carol', 0.5, 'U1', 'U2',  300,  600);
insert into t1 values ('david', 0.2, 'U1', 'U3', 6000, 8000);

Now I want the sum the values val1 and val2 by unit, keeping in mind that for 'david', VAL1 6000 * 0.2 = 1200 is summed to U1, and the difference, 4800, to U3. Similarly, for VAL2, 1600 to U1 and 6400 to U3. In other words, I want the following result set:

U1 1360 1920
U2  154  308
U3 4800 6400

Now the only way that I can see is to:

1. Sum the value proratas for all people on their first cost centre. If they have one centre, they get the full hit, otherwise it's prorata'ed between it and the second.
2. Union the above with the people having two centres, by subtracting the prorata of the first centre from the total value to arrive at the second prorata (to minimise roundoff errors, otherwise the sum of both centres will lose a cent compared with the initial sum from time to time).
3. Treat all that as a derived table, and sum the results.

This gives the following:

select
S.UNIT  UNIT
,sum(V1) V1_TOT
,sum(V2) V2_TOT
from (
select
unit1            UNIT
,sum(val1 * rate) V1
,sum(val2 * rate) V2
from
t1
group by
unit1
union select
unit2                   UNIT
,sum(val1 - val1 * rate) V1
,sum(val2 - val2 * rate) V2
from
t1
where
unit2 is not null
group by
unit2
) S
group by S.UNIT
order by S.UNIT

That's pretty ugly. Is there a better way?

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

Full
Abbreviated
Hidden