NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

  • by ggoebel (893) <> on 2006.06.27 11:22 (#48426)
    I worked for many years in an environment where we faced the same issue. However, we had the benefit of a *real* RDBMS with transactions. I apologize for the emphasis. I don't see how you can solve your problem without a bulletproof locking mechanism... or serializing your schema modifications through one.

    Suggestion #1
    To prevent simultaneous attempts to modify a schema, set up an RDBMS that can pass the ACID test (e.g., MySQL with InnoDB or BDB table types). Stage updates on it and serialize the schema updates through it.
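    The post assumes Perl and MySQL; purely as an illustration, here is a minimal sketch in Python with SQLite (whose DDL happens to be transactional). `BEGIN IMMEDIATE` takes the database write lock up front, so concurrent schema changes serialize instead of interleaving. The function name and structure are mine, not from the post:

```python
import sqlite3

def apply_schema_change(db_path, ddl_statements):
    """Apply a batch of schema changes as one serialized unit.

    BEGIN IMMEDIATE acquires the write lock immediately, so a second
    process attempting the same thing blocks (up to `timeout`) rather
    than interleaving its changes with ours.
    """
    conn = sqlite3.connect(db_path, timeout=5.0)
    conn.isolation_level = None  # manual transaction control
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
        for ddl in ddl_statements:
            conn.execute(ddl)
        conn.execute("COMMIT")
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()
```

    With a server RDBMS you would get the same effect from an explicit table or advisory lock held for the duration of the transaction.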

    Suggestion #2
    Don't rely on developers remembering to update source control. Someone will forget, or decide they have a special case... or something. Then you'll be unable to reproduce what happened to the schema between two given points in time. Force all schema modifications through source control, and control access to the ACID-compliant RDBMS where you stage updates.

    I'm not saying there won't be special cases. But special cases should probably require special approval. At one point, I was tasked with figuring out how to update schemas outside the approved update process in order to fix the update mechanism itself; on another occasion, to fix the reporting mechanism built into our update mechanism.
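    One common way to force schema changes through source control is to keep numbered migration files in the repository and have the update mechanism apply them in order, recording what has run. A rough sketch (file layout and names are hypothetical, not from the post):

```python
import sqlite3
from pathlib import Path

def apply_migrations(conn, migrations_dir):
    """Apply version-controlled migration files (e.g. 0001_init.sql)
    in filename order, recording each in a bookkeeping table so that
    reruns skip files that were already applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT name FROM applied_migrations")}
    for path in sorted(Path(migrations_dir).glob("*.sql")):
        if path.name in done:
            continue  # already applied on a previous run
        conn.executescript(path.read_text())
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)",
                     (path.name,))
    conn.commit()
```

    Since the only path to the staging database runs through files that are committed to the repository, the history between any two points in time is reproducible.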

    Suggestion #3
    Your "databases" won't be able to roll back a failed update that has been partially applied. Consider breaking updates into two stages. In the first stage, make only non-destructive changes; e.g., instead of dropping the customer.current column, rename it. In the second stage, verify that your initial changes were successful, make your destructive updates, and then update config.version (or roll them back). Use your ACID-compliant RDBMS to track the state/stage of updates.

    On failure of stage one, there's no guarantee that your rollback will succeed. Likewise, on success of stage one, there's no guarantee that the destructive update will succeed. But in either case, someone should be able to go in and clean things up manually.

    Note: it'd be nice if a failed update triggered an alert of some kind (email) to the people who'll be responsible for fixing it.
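    The two-stage idea, sketched against the customer.current example with SQLite standing in for the real database (function names and the `current_retired` name are mine):

```python
import sqlite3

def column_names(conn, table):
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

def stage_one(conn):
    # Non-destructive: rename rather than drop, so this stage is reversible.
    conn.execute('ALTER TABLE customer RENAME COLUMN "current" TO current_retired')
    conn.commit()

def stage_two(conn, target_version):
    # Verify stage one before doing anything destructive.
    cols = column_names(conn, "customer")
    if "current_retired" not in cols or "current" in cols:
        raise RuntimeError("stage one not applied cleanly; aborting destructive stage")
    # Destructive: rebuild the table without the retired column
    # (portable even to SQLite versions that lack DROP COLUMN).
    conn.executescript("""
        CREATE TABLE customer_new (id INTEGER);
        INSERT INTO customer_new (id) SELECT id FROM customer;
        DROP TABLE customer;
        ALTER TABLE customer_new RENAME TO customer;
    """)
    conn.execute("UPDATE config SET version = ?", (target_version,))
    conn.commit()
```

    If stage two's verification fails, nothing has been destroyed yet and someone can still clean up by hand, which is the whole point of the split.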

    Suggestion #4
    Occasionally, something will go wrong. It could be something simple and obvious that affects all updated servers immediately. Or it could be something subtle, like encountering constraint-based issues on a subset of the upgraded servers a week later.

    It would be best if you could roll updates out in stages, to successively larger groups of affected machines, over an extended period of time. This would require an approval mechanism of some sort. Where I worked we had both versioned and time-delayed approvals: application and database schema updates were rolled out based on version approvals, while OS and AV updates and patches were rolled out based on time delays. Database data updates should have been time-delayed... but in practice were applied immediately on the next pull for updates.

    We used a pull model where servers would phone home to update themselves on a staggered schedule, which gave us some guarantee that the update servers wouldn't get hammered too hard. Whether you use push or pull, you'll want to give some consideration to the load on whatever machines push or pull the updates.
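    One simple way to get both the rollout waves and the staggered phone-home schedule is to derive them deterministically from each machine's hostname, so no central scheduler is needed. A sketch under that assumption (the post doesn't describe how their schedule was computed):

```python
import hashlib

def update_slot(hostname, window_minutes=240, waves=4):
    """Map a hostname to (rollout wave, minute offset within the update
    window). Hashing makes the assignment deterministic and roughly
    uniform, so machines spread their pulls across the window and the
    update servers aren't hammered all at once."""
    digest = hashlib.sha256(hostname.encode()).digest()
    wave = digest[0] % waves                          # which rollout group
    offset = int.from_bytes(digest[1:3], "big") % window_minutes
    return wave, offset
```

    Approving an update for wave 0 first, then widening it to later waves over days, gives the subtle week-later failures a chance to surface on a small group of machines.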

    If you go with an ACID-compliant RDBMS to centrally manage updates, consider building in a reporting mechanism too. We would send back info on things like drive utilization, service failures, etc. Later, I was able to add an event monitor and alert notification service on a database that aggregated reporting data from our update servers. It would send email to the people responsible for responding to or servicing the specified events. It was also used by other departments to determine which build versions machines in the field were running.
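    A minimal sketch of that reporting idea: each phone-home writes a row to a central table, and a monitor queries it for hosts that need attention (schema, names, and the 90% threshold are all illustrative assumptions, not details from the post):

```python
import sqlite3

def init_reporting(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS reports (
        host TEXT, build_version TEXT, disk_used_pct REAL,
        reported_at TEXT DEFAULT CURRENT_TIMESTAMP)""")

def record_report(conn, host, build_version, disk_used_pct):
    """Called with the data a server sends back when it phones home."""
    conn.execute(
        "INSERT INTO reports (host, build_version, disk_used_pct) VALUES (?, ?, ?)",
        (host, build_version, disk_used_pct))
    conn.commit()

def hosts_needing_attention(conn, disk_threshold_pct=90.0):
    # An event monitor would email whoever is responsible for these hosts.
    return [row[0] for row in conn.execute(
        "SELECT DISTINCT host FROM reports WHERE disk_used_pct >= ?",
        (disk_threshold_pct,))]
```

    The same table doubles as the "what build is that field machine running" lookup the other departments used.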

    Suggestion #5
    Consider keeping the schema modifications in a single file and wrapping each versioned update with a check that $target_version >= $current_version.

    While you're at it, consider wrapping individual schema changes with introspective code. For the example you mentioned, I'd check that the customer.current column exists before attempting to drop it... failing verbosely if it doesn't. You'd be surprised at how useful such checks are when performing a series of version updates, or a single version update with numerous schema changes.

    This approach has the advantage of keeping all schema-related code in one place, where it is easy to find and, hopefully, grok. You don't have to figure out which updates to send for a particular version update, and you should be able to reapply updates without error. In the example you gave, without such guards a second attempt to run your update to drop the customer.current column would result in an error.
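    The version guard plus introspective check could look roughly like this; the post's snippet is Perl, so this Python/SQLite rendering (and the DROP COLUMN, which needs SQLite >= 3.35) is my own illustration:

```python
import sqlite3

def column_exists(conn, table, column):
    return any(row[1] == column
               for row in conn.execute(f"PRAGMA table_info({table})"))

def run_updates(conn, current_version, target_version):
    """All schema updates live in one file; each is wrapped in a version
    guard (so replays are harmless no-ops) and an existence check (so a
    missing column fails verbosely instead of mysteriously)."""
    if target_version >= 2 and current_version < 2:
        if not column_exists(conn, "customer", "current"):
            raise RuntimeError("expected customer.current to exist before dropping it")
        conn.execute('ALTER TABLE customer DROP COLUMN "current"')  # SQLite >= 3.35
        current_version = 2
    # ...further guarded updates for versions 3, 4, ... would follow here.
    conn.commit()
    return current_version
```

    Rerunning the whole file is safe: the version guard skips the already-applied update, and if something is genuinely out of shape the existence check tells you exactly what it expected to find.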

    That's all I can think of at the moment... Good luck!