Re: The Right Way to manage schemas in SCM systems

From: "Aaron Bono" <postgresql(at)aranya(dot)com>
To: "Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: The Right Way to manage schemas in SCM systems
Date: 2006-08-15 19:18:13
Message-ID: bf05e51c0608151218t6ebab4a4jd5f157eeecf8849c@mail.gmail.com
Lists: pgsql-sql

On 11 Aug 2006 08:05:34 -0700, Andrew Hammond <
andrew(dot)george(dot)hammond(at)gmail(dot)com> wrote:
>
> I've been trying to figure out a good way to manage schema change
> control for a while now. Since I have a development background, I
> really want to find some way to check the schema into a SCM system like
> CVS (for example). Just using a pg_dump doesn't work very well because
> there's no guarantee of consistent ordering. One of the things I find
> most useful with SCM systems is the ability to quickly spot changes. A
> re-ordering within a single file makes for a lot of noise in the
> deltas.
>
> So far, the best idea I've come up with is the "file tree dump"
> approach: get a list of every object in the database and then dump it
> into a suitably named file. Finally, create a master file which
> consists of only include lines. There are a couple of problems I can
> see with this.
>
> 1) How to calculate the dependency graph (or worse, dealing with RI
> loops) to determine the right order to load things in isn't stupidly
> obvious.
> 2) If I just script this in bash, without making any changes to
> pg_dump, it doesn't happen in a single transaction.
> 3) No clever solution for relative vs absolute filenames (unless all
> this goes into a tar file, but that format already exists and doesn't
> solve my problem).
>
> So my question is, does anyone have a better way of managing schemas in
> SCM systems? And if not, is there a perceived value in the community
> for a pg_dump --format scm_friendly option? (In which case I'll take
> this thread over to hackers)

The approach you are taking here is difficult to implement but makes life
easy for developers who like to make changes to the database whenever
needed. It can be great for rapid development, but I have chosen to take a
different approach - one that requires a little more discipline up front but
is better for controlling your testing and production environments.

For every project, I first create DDL scripts for my database. I have a
code builder that does this for me, but you could also just create the
database and use pg_dump to get your initial DDL.
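
For example (the database name and output path here are just placeholders),
a schema-only dump along these lines gives you a DDL file you can check in:

    # dump the structure only, without data or ownership commands
    pg_dump --schema-only --no-owner myapp > myapp_ddl.sql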

Next I save this DDL in CVS. I break each schema into a separate script.
If you are worried about the order, do as pg_dump does - create the tables
first, then add the constraints like foreign keys after the structure is
there.
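
As a small made-up illustration of that ordering, the table definitions go
first and the foreign keys come last:

    -- tables first, with no cross-table references yet
    CREATE TABLE customer (
        customer_id integer PRIMARY KEY,
        name        text NOT NULL
    );

    CREATE TABLE customer_order (
        order_id    integer PRIMARY KEY,
        customer_id integer NOT NULL
    );

    -- constraints last, once every referenced table exists
    ALTER TABLE customer_order
        ADD CONSTRAINT customer_order_customer_fk
        FOREIGN KEY (customer_id) REFERENCES customer (customer_id);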

This is great for the initial setup. To deal with change, I have a manual
process in place and use an auditing process to check that everything is
done correctly. Here is how it works:

Development:
1. Copy the production database into the development database (we do this at
least once a week).
2. Make the change in development - the nice thing here is I use EMS Manager,
and whenever I make a change it shows me the SQL required to apply it.
3. Copy the SQL for the change (from EMS Manager) and put it into a
migration script that will be executed in production when we deploy the
application - order is very important here (a small sketch of such a script
follows this list).
4. Make the change to the initial DDL and check that change into CVS. Our
code builder helps out a lot with this when the changes are large.
5. Update my application code and test with the new changes in development.
Again, our code builder does a lot of this for us.
6. Drop the development database, refresh it from production, run the
migration script and test the new code - if all goes well it is ready for
production.
7. Deploy to production. We never allow developers to directly make changes
to production. Only our administrators have that capability.
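
As a made-up example of such a migration script (table and column names are
invented for illustration), the statements are written in exactly the order
they have to run in production:

    -- hypothetical migration for one release
    BEGIN;

    -- the new column must exist before the index that uses it
    ALTER TABLE customer ADD COLUMN email text;
    CREATE INDEX customer_email_idx ON customer (email);

    -- backfill existing rows, then tighten the constraint
    UPDATE customer SET email = '' WHERE email IS NULL;
    ALTER TABLE customer ALTER COLUMN email SET NOT NULL;

    COMMIT;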

Audit:
1. Get a copy of production and put it into development.
2. Run the DDL from CVS to build a separate audit database.
3. Run a database diff between the two - we use DB Comparer (same company as
EMS Manager); a rough command-line alternative is sketched after this list.
4. Reconcile any differences and fold them back into the DDL.
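
If you do not have a diff tool like DB Comparer, a crude approximation (not
what we use, and noisier because pg_dump does not guarantee a consistent
ordering, as you pointed out) is to diff schema-only dumps of the two
databases; the database names below are placeholders:

    # compare the structure of the production copy and the audit database
    pg_dump --schema-only prod_copy > /tmp/prod_copy.sql
    pg_dump --schema-only audit_db  > /tmp/audit_db.sql
    diff -u /tmp/prod_copy.sql /tmp/audit_db.sql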

We also, from time to time, test the application against a database built
from the DDL rather than against a copy of production.

Currently we are working on ways to automate this mostly manual process.
The nice thing is, the tools we use allow us to do the work fairly quickly.
I like a little manual work in the whole process, though, as it keeps the
developers better attuned to the database structure and the changes that are
being made to it.

If you are just looking for a way to back up the structure in CVS every day,
you might consider writing a script (Perl would be an excellent choice for
this) that reads the pg_dump output and splits it into separate files for
each schema/table.
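
A very rough sketch of such a splitter is below. It leans on the
"-- Name: ...; Type: ...; Schema: ..." comment headers that a plain-text
pg_dump writes before each object, so treat it as a starting point to adapt
rather than a finished tool:

    #!/usr/bin/perl
    # Split a plain-text "pg_dump --schema-only" (read from stdin) into one
    # file per object, named <schema>/<type>_<name>.sql, keying off pg_dump's
    # comment headers. Anything before the first recognized header goes to
    # misc.sql.
    use strict;
    use warnings;
    use File::Path qw(mkpath);

    my $out;
    open($out, '>', 'misc.sql') or die "misc.sql: $!";

    while (my $line = <STDIN>) {
        if ($line =~ /^-- Name: (.+?); Type: (.+?); Schema: (.+?);/) {
            my ($name, $type, $schema) = ($1, $2, $3);
            s/\W+/_/g for ($name, $type, $schema);   # filesystem-safe names
            mkpath($schema);
            open($out, '>', "$schema/${type}_$name.sql")
                or die "$schema/${type}_$name.sql: $!";
        }
        print {$out} $line;
    }

You would run it against a nightly dump, something like (the script name is
whatever you call it):

    pg_dump --schema-only myapp | perl split_ddl.pl

Since each object always lands in the same file, the day-to-day deltas in
CVS stay small.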

==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================
