
grantm (164)
http://www.mclean.net.nz/
Just a simple [cpan.org] guy, hacking Perl for fun and profit since way back in the last millennium. You may find me hanging around in the monastery [perlmonks.org].

What am I working on right now? Probably the Sprog project [sourceforge.net].

GnuPG key Fingerprint:
6CA8 2022 5006 70E9 2D66
AE3F 1AF1 A20A 4CC0 0851

Journal of grantm (164)

Wednesday September 27, 2006
08:32 PM

Embrace Change

[ #31142 ]

This recent journal entry links to an article which asserts that changing data structures in a relational database is hard.

Pah!

As autarch points out, regardless of how you store your data, a change to the structure of the data will require a change to the code that uses it.

People are often resistant to change, even when the change is a fix to something that's broken. Often this is simply due to inertia. Sometimes, especially in the IT world, it's due to a fear of breaking other stuff.

In my $work_project, we aim to develop for the current requirements without trying to second-guess future ones. In fact, just about everything I can think of that we put in because we thought we'd need it later turned out to be a mistake. We all make mistakes; hopefully we learn from them. And if we embrace change rather than fearing it, we can fix those mistakes as they become apparent.

Our database schema is moderately complex: our PostgreSQL database has nearly 200 inter-related tables and takes up around 60GB of storage. As we add new features to the application, data structures and relationships are added and updated. Often these updates have a ripple effect, where existing bits of code need to change as well. Do we fear change? No, we embrace it. Our development workflow and infrastructure have been built up over time to support frequent change.

Regression tests are obviously a critical part of how we work. We have almost 2000 test scripts which we use like this:

  • A developer will add tests and code and will make sure the relevant subset of tests pass before committing new code
  • A dedicated test server is constantly running the full test suite (takes about 30 mins) and tagging the code as 'good' when the tests pass
  • Known 'good' code is checked out automatically onto the staging servers periodically so that the client can do user acceptance testing etc
  • Once we have signoff, a 'release' is pushed out to the production servers
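
The 'tag as good' step can be as simple as recording the last passing revision somewhere the staging checkout script can find it. Here's an illustrative sketch (not our actual infrastructure; the paths, the Subversion layout and the marker-file convention are all simplified for the example):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Illustrative run-and-tag loop for the test server: update the
    # checkout, run the full suite, and record the revision as 'good'
    # if every test passes.
    my $checkout  = '/var/ci/checkout';
    my $good_file = '/var/ci/LAST_GOOD_REVISION';

    system('svn', 'update', $checkout) == 0
        or die "svn update failed: $?";

    chdir $checkout or die "chdir $checkout: $!";

    my $rev = `svnversion .`;
    chomp $rev;

    # prove exits non-zero if any test script fails.
    if (system('prove', '-r', 't') == 0) {
        # The staging checkout script reads this file to decide
        # which revision to deploy.
        open my $fh, '>', $good_file or die "open $good_file: $!";
        print {$fh} "$rev\n";
        close $fh or die "close $good_file: $!";
    }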

Another critical element of our project workflow is our 'apply_patches.pl' script. A 'patch' in this context is a .sql file that can be fed into psql to update the database schema. A file naming scheme is used to ensure that patches get applied in a consistent order and to allow us to know what 'patch level' any system is up to.
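
A simplified version of the script is linked in the comments below, but the core loop is short enough to sketch here. This illustrative version assumes patches named like NNNN_description.sql and a patchlevel table recording what has been applied (details simplified, not our real schema):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Illustrative sketch of the apply_patches.pl idea.
    my $patch_dir = 'db/patches';
    my $db        = 'myapp';

    # Ask the database what patch level it is currently at.
    my $level = `psql -At -d $db -c "SELECT coalesce(max(level), -1) FROM patchlevel"`;
    chomp $level;

    # The NNNN_ prefix means a lexical sort gives the apply order.
    opendir my $dh, $patch_dir or die "opendir $patch_dir: $!";
    my @patches = sort grep { /^\d{4}_.+\.sql$/ } readdir $dh;
    closedir $dh;

    for my $patch (@patches) {
        my ($patch_level) = $patch =~ /^(\d{4})/;
        next if $patch_level <= $level;    # already applied

        print "Applying $patch\n";

        # ON_ERROR_STOP makes psql exit non-zero if any statement fails.
        system('psql', '-d', $db, '-v', 'ON_ERROR_STOP=1',
               '-f', "$patch_dir/$patch") == 0
            or die "patch $patch failed: $?";

        # Record the new level so later runs skip this patch.
        system('psql', '-d', $db, '-c',
               "INSERT INTO patchlevel (level) VALUES ($patch_level)") == 0
            or die "recording level for $patch failed: $?";
    }

Because every environment runs the same loop, the naming scheme alone is enough to tell whether any given system is up to date.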

The patches are applied automatically in the development environment when we run regression tests. They are also applied automatically when 'good' code is checked out in the staging environment. Our production migration process is scripted and it also applies patches automatically. By the time a patch has made its way through developer testing and staging deployment, the chances of it going wrong in a production migration are fairly slim. And because it's all automated, we can be confident that code changes and schema changes will remain in sync.

In the three years since the project went live, nearly 450 schema patches have been deployed. Obviously many of those are trivial inserts/updates/deletes to tables of 'constants' (or 'lookup' data), but patches which add/drop/alter tables number in the hundreds.

Fear of making a mistake can slow development. Providing an infrastructure which supports rapidly fixing mistakes has significantly improved our productivity.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • I have been trying to reach full automation in this for myself, but being a small shop I keep falling back to various manual updates. Occasionally I also seem to need more processing during the upgrade than a simple set of SQL statements. (It might of course be due to my lack of SQL expertise.)

    It would be great if you could show us some of your code or point us to the tools you are using.

    • There's a simplified version of our apply_patches script here [pm.org].

      It's built around the psql utility that comes with PostgreSQL. We could have used DBI of course, but the psql macro support (the \set command) has been very useful. It also allows us to debug a patch by piping it into psql directly.

      The default behaviour is to query the installed patchlevel, search for patches of a 'higher' level and install any that are found.

      The script uses command-line options to specify the database host, database name, and login.
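
      As a purely hypothetical illustration of what \set buys us (this isn't one of our real patches), a patch file can parameterise values that differ between environments:

        -- 0123_add_status_lookup.sql (hypothetical example)
        -- \set defines a psql variable; :owner_role interpolates it.
        \set owner_role myapp_owner

        CREATE TABLE status_lookup (
            id    integer PRIMARY KEY,
            label text NOT NULL
        );

        ALTER TABLE status_lookup OWNER TO :owner_role;

      Because it's plain psql input, the same file can be debugged by feeding it straight to psql -f.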

  • Fear of making a mistake can slow development.

    Indeed, I've seen the things people do to prevent making mistakes slow down development and ossify systems into unchangeable messes way too often. (Look at the languages people design to prevent bad developers from making mistakes, or change control requests to prevent customers from giving feedback after development has started.)

  • I used to work for a company that had similar concerns. Its database had over 1000 tables, most of them with 3 triggers on them, and hundreds of stored procedures. Dozens of database modification scripts were submitted each week.

    When I started to work there, it was chaos. Developers submitted scripts, and no one was really testing how well the scripts worked, usually leading to very frustrated field engineers (those applying the patches to the customers).

    15 months later, no developer could even submit a d