Re: [SQL] (Ab)Using schemas and inheritance

Lists: pgsql-generalpgsql-sql
From: Jorge Godoy <jgodoy(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: (Ab)Using schemas and inheritance
Date: 2006-05-23 20:16:54
Message-ID: 200605231716.58784.jgodoy@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql


Hi!

I'm modelling an application that will have data -- financial data, human
resources, etc. -- for several hundred (even thousands) of companies. This
is for an accounting office.

I could put some kind of "company_id" column in all of my tables to separate
data in a more standard way, I could create a separate schema for each client
and then create all needed tables in there (could I? I'd be abusing schemas
here and this is part of my doubt) and, finally, I could create a "base"
schema, define all my standard tables and create an individual schema for
each client where I'd inherit from those base.tables.

This would allow me to separate all information with a "SET search_path TO
company" without having to make the restriction "by hand" (on "company_id",
for example). It would also allow me to view some complete statistics
grouping all clients by SELECTing data from the base schema. We're testing
views and functions to see how they behave with inherited tables and changes
on "search_path", and it looks like we can do that for, at least, a small
number of schemas.

Of course, this has implications on permissions as well, so there will be a
large number of groups -- probably at least one per schema + some common
groups -- and roles as well...

Is this a good idea? Would this be too bad, performance-wise, if I had
thousands of schemas to use like that? Any advice on better approaches? Any
expected problems?

TIA,
--
Jorge Godoy <jgodoy(at)gmail(dot)com>


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Jorge Godoy <jgodoy(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-23 23:26:09
Message-ID: 20060523232609.GP64371@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Moving to -general, where it's more likely that others will have input.

On Tue, May 23, 2006 at 05:16:54PM -0300, Jorge Godoy wrote:
> I'm modelling an application that will have data -- financial data, human
> resources, etc. -- for several hundred (even thousands) of companies. This
> is for an accounting office.
>
> I could put some kind of "company_id" column in all of my tables to separate
> data in a more standard way, I could create a separate schema for each client
> and then create all needed tables in there (could I? I'd be abusing schemas
> here and this is part of my doubt) and, finally, I could create a "base"
> schema, define all my standard tables and create an individual schema for
> each client where I'd inherit from those base.tables.
>
> This would allow me to separate all information with a "SET search_path TO
> company" without having to make the restriction "by hand" (on "company_id",
> for example). It would also allow me to view some complete statistics
> grouping all clients by SELECTing data from the base schema. We're testing
> views and functions to see how they behave with inherited tables and changes
> on "search_path", and it looks like we can do that for, at least, a small
> number of schemas.
>
> Of course, this has implications on permissions as well, so there will be a
> large number of groups -- probably at least one per schema + some common
> groups -- and roles as well...
>
>
> Is this a good idea? Would this be too bad, performance-wise, if I had
> thousands of schemas to use like that? Any advice on better approaches? Any
> expected problems?

One issue is that you'll probably be breaking new ground here a bit; I
suspect there's very few people that are using more than a handful of
schemas. Shouldn't pose any issues, but you never know; although any
issues you do run into should only be performance problems.

Another consideration is that the free space map doesn't care too much
for tracking space info on tons of small tables.

Perhaps the biggest issue is: what happens when you need to do DDL? If
you have 1000 schemas that should be identical, you'll need to perform
any DDL 1000 times.

But as you point out, there's some interesting advantages to using
schemas like this.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: "Tony Wasson" <ajwasson(at)gmail(dot)com>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: "Jorge Godoy" <jgodoy(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 00:54:09
Message-ID: 6d8daee30605231754s58364cd1o79a6520f203a1734@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On 5/23/06, Jim C. Nasby <jnasby(at)pervasive(dot)com> wrote:
> > Is this a good idea? Would this be too bad, performance-wise, if I had
> > thousands of schemas to use like that? Any advice on better approaches? Any
> > expected problems?
>
> One issue is that you'll probably be breaking new ground here a bit; I
> suspect there's very few people that are using more than a handful of
> schemas. Shouldn't pose any issues, but you never know; although any
> issues you do run into should only be performance problems.

$.02 about lots of schemas.

I worked with an application that had 500 schemas and that worked very
well. However, as the number of schemas exceeeded 8000 the query speed
started to degrade. Running \d with a single schema in your search
path took a few seconds with that many schemas. Queries that were
running in 100ms were now taking about 600ms.

In short, postgresql will support lots of schemas.


From: Jorge Godoy <jgodoy(at)gmail(dot)com>
To: "Tony Wasson" <ajwasson(at)gmail(dot)com>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 01:09:45
Message-ID: 200605232209.48130.jgodoy@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Em Terça 23 Maio 2006 21:54, Tony Wasson escreveu:
>
> $.02 about lots of schemas.
>
> I worked with an application that had 500 schemas and that worked very
> well. However, as the number of schemas exceeeded 8000 the query speed
> started to degrade. Running \d with a single schema in your search
> path took a few seconds with that many schemas. Queries that were
> running in 100ms were now taking about 600ms.
>
> In short, postgresql will support lots of schemas.

Hi Tony.

Thanks. But then, it will degrade a lot when I reach "thousands" of schemas
or it will degrade if I have too many schemas on my search path? It was a
little ambiguous from your answer above.

Be seeing you,
--
Jorge Godoy <jgodoy(at)gmail(dot)com>


From: Jorge Godoy <jgodoy(at)gmail(dot)com>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 01:37:03
Message-ID: 200605232237.04670.jgodoy@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Em Terça 23 Maio 2006 20:26, Jim C. Nasby escreveu:
> Moving to -general, where it's more likely that others will have input.

Thanks Jim. This wasn't clear to me when I subscribed to the other mailing
list.

> One issue is that you'll probably be breaking new ground here a bit; I
> suspect there's very few people that are using more than a handful of
> schemas. Shouldn't pose any issues, but you never know; although any
> issues you do run into should only be performance problems.

If these come when I have, e.g., the 8k schemas Tony mentioned, then I could,
definitely, partition my database or use some other technic to divide the
amount of schemas I have on each server. (I hope we have that kind of
problem because it will mean a very successful application :-))

> Another consideration is that the free space map doesn't care too much
> for tracking space info on tons of small tables.

Hmmmm...

> Perhaps the biggest issue is: what happens when you need to do DDL? If
> you have 1000 schemas that should be identical, you'll need to perform
> any DDL 1000 times.

Not really. As I said, I'm planning on using inheritance also. We have
tested that modifications to the parent table reflects on child tables that
inherited from it. So, if I add some new column then I'll have it added to
all 1k schemas.

Anyway, if we have to make some change that isn't inherited by child tables we
can write some script or function to help looping through all tables and
schemas.

> But as you point out, there's some interesting advantages to using
> schemas like this.

I see it as a very interesting possibility, specially when combined with
inheritance.

--
Jorge Godoy <jgodoy(at)gmail(dot)com>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Tony Wasson" <ajwasson(at)gmail(dot)com>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Jorge Godoy" <jgodoy(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 03:08:58
Message-ID: 25040.1148440138@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

"Tony Wasson" <ajwasson(at)gmail(dot)com> writes:
> $.02 about lots of schemas.

> I worked with an application that had 500 schemas and that worked very
> well. However, as the number of schemas exceeeded 8000 the query speed
> started to degrade. Running \d with a single schema in your search
> path took a few seconds with that many schemas. Queries that were
> running in 100ms were now taking about 600ms.

Note that this probably says more about \d than about anything else.
I can believe that having a large number of schemas listed in your
search path would suck, but there's not a good reason for lots of
schemas unrelated to your session to cause you any great pain.
(\d does some searching that might be impacted by lots of schemas,
but that doesn't say ordinary queries would be.)

If you've got a counterexample please file a bug with details ...

regards, tom lane


From: Alban Hertroys <alban(at)magproductions(dot)nl>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Jorge Godoy <jgodoy(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 09:09:38
Message-ID: 447422D2.4060403@magproductions.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Jim C. Nasby wrote:
> Moving to -general, where it's more likely that others will have input.
> Perhaps the biggest issue is: what happens when you need to do DDL? If
> you have 1000 schemas that should be identical, you'll need to perform
> any DDL 1000 times.

What about using updatable views instead of inheritence? You'd need your
company_id back, but adding new companies or modifying table definitions
could be a lot easier (as long as you don't need to update all of your
views...).

--
Alban Hertroys
alban(at)magproductions(dot)nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //


From: Jorge Godoy <jgodoy(at)gmail(dot)com>
To: Alban Hertroys <alban(at)magproductions(dot)nl>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 13:26:05
Message-ID: 200605241026.07042.jgodoy@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Em Quarta 24 Maio 2006 06:09, Alban Hertroys escreveu:
>
> What about using updatable views instead of inheritence? You'd need your
> company_id back, but adding new companies or modifying table definitions
> could be a lot easier (as long as you don't need to update all of your
> views...).

Hi Alban. Besides that update problem in my views -- that would force me to
update the view and associated rules -- how would it behave with regards to
performance? From what I got reading
http://www.varlena.com/GeneralBits/82.php all data would be on the same
table, so if I have millions of records -- what is easy if I'll be recording
each individual transaction -- either buying or selling -- for each company
and I have 1000 companies, including hotels and other companies where we can
have hundreds of entries per day...

Some things are really important here:

- performance for operations on an individual company --- it has to be
as fast as possible because this might be used by my client's clients.

- safety: if it becomes available externally, then one client could never
see other client's data. This is a main concern. If used only internally
it is desirable that only authorized employees view each client since all
of them doesn't need to access all clients even if working directly with
accounting.

- easy to code on application side: other systems will be plugged to this
database. The more we can do to avoid other people mistakes, the better.

- easy to maintain database: if it is too painful, hacks will come and this
is not a good plan before starting the project...

Thinking about the first item -- performance -- and by reading the manual, we
got to table inheritance (data partitioning made easy ;-)). Thinking about
the second item -- safety -- and including search_paths we got to several
schemas + access permissions. The third and fourth items were a consequence
of the first two decisions.

We haven't benchmarked anything yet, but from what is in the docs, this looked
like a good approach.

Updateable views give me the part of second and also the third item but it
seems to be missing on the first and last items... On the other hand, if it
solves problems with views and functions that I said I was having on the
other thread then might become interesting... But performance would still be
a problem with millions of records (by law we're required to keep at least 5
years of docs online for some docs, for other it is required to have the full
company history... so getting to dozens of millions in 5 years is not all
that hard...).

Am I right or completely wrong? :-)

--
Jorge Godoy <jgodoy(at)gmail(dot)com>


From: Alban Hertroys <alban(at)magproductions(dot)nl>
To: Jorge Godoy <jgodoy(at)gmail(dot)com>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 14:48:10
Message-ID: 4474722A.3060205@magproductions.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Jorge Godoy wrote:
> Em Quarta 24 Maio 2006 06:09, Alban Hertroys escreveu:
>
>>What about using updatable views instead of inheritence? You'd need your
>>company_id back, but adding new companies or modifying table definitions
>>could be a lot easier (as long as you don't need to update all of your
>>views...).
>
>
> Hi Alban. Besides that update problem in my views -- that would force me to
> update the view and associated rules -- how would it behave with regards to
> performance? From what I got reading
> http://www.varlena.com/GeneralBits/82.php all data would be on the same
> table, so if I have millions of records -- what is easy if I'll be recording
> each individual transaction -- either buying or selling -- for each company
> and I have 1000 companies, including hotels and other companies where we can
> have hundreds of entries per day...
>
> Some things are really important here:
>
> - performance for operations on an individual company --- it has to be
> as fast as possible because this might be used by my client's clients.

I suppose inherited tables would be faster. Both at read and write
operations.

However, I think that partitioning/clustering the table over an index on
company_id would help a lot. Also, considering you need to keep a lot of
old data around, you may be able to split your data in archived and
recent data, keeping your indices small (partial indices maybe).

> - safety: if it becomes available externally, then one client could never
> see other client's data. This is a main concern. If used only internally
> it is desirable that only authorized employees view each client since all
> of them doesn't need to access all clients even if working directly with
> accounting.

Safety would be about equal to your inheritence solution. You can GRANT
rights on the views, that can be different from the access rights on the
base tables. The views must evidently have access to your base tables.

> - easy to code on application side: other systems will be plugged to this
> database. The more we can do to avoid other people mistakes, the better.

Views might be a bit easier there, as you can add derived data to your
results (like first name, infix and surname concatenated, dates in a
format specific for a companys' locale, etc.).

> - easy to maintain database: if it is too painful, hacks will come and this
> is not a good plan before starting the project...

This was my main reason to suggest views. For select you could probably
get away with SELECT * FROM table WHERE company_id = x; (Though SELECT *
is being frownde upon).

The update/insert/delete parts are also all rather similar. You may be
able to generate REPLACE VIEW statements if the base table layout
changes. You'd probably need to DROP and re-CREATE them, but being views
your data is not at risc. Nothing physically changes in your tables.
Hmm... Almost forgot about the RULEs that make them updatable... Those'd
need updating too, of course...

With inherited tables you would need a way to dump and restore their
contents, or do a whole lot of ALTER TABLE statements. You'd be
physically altering the tables that contain your data; there's more than
zero risc involved. You can't rollback DDL statements...

OTOH, with inherited tables you can have company-tables that differ from
the base tables for a specific company; with views that'd be... difficult.

> We haven't benchmarked anything yet, but from what is in the docs, this looked
> like a good approach.

Neither did I benchmark updatable views; haven't had much chance to use
them yet. No, I don't have a lot of experience with them, but I know
some people on this list must have.

> Updateable views give me the part of second and also the third item but it
> seems to be missing on the first and last items... On the other hand, if it
> solves problems with views and functions that I said I was having on the
> other thread then might become interesting... But performance would still be
> a problem with millions of records (by law we're required to keep at least 5
> years of docs online for some docs, for other it is required to have the full
> company history... so getting to dozens of millions in 5 years is not all
> that hard...).

In my experience, having the right indexes on your tables/views helps a
LOT. Millions of records need not be a problem if you know what you will
be querying for.

> Am I right or completely wrong? :-)

Have you considered downtime if a schema change is necessary? What would
cause the most; tables or views?
I'm pretty sure VIEWs will be faster if you need to DROP/CREATE the
tables, as you'll have to move your data around not to lose it... Of
course, you'd need to do the same for your base tables, that contain
_all_ your data in the updatable view case...

Well, only more things to consider choosing between, I'm afraid.

Regards,
--
Alban Hertroys
alban(at)magproductions(dot)nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //


From: Jorge Godoy <jgodoy(at)gmail(dot)com>
To: Alban Hertroys <alban(at)magproductions(dot)nl>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 15:21:52
Message-ID: 200605241221.55060.jgodoy@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Em Quarta 24 Maio 2006 11:48, Alban Hertroys escreveu:
> Jorge Godoy wrote:
> > Some things are really important here:
> >
> > - performance for operations on an individual company --- it has to be
> > as fast as possible because this might be used by my client's clients.
>
> I suppose inherited tables would be faster. Both at read and write
> operations.
>
> However, I think that partitioning/clustering the table over an index on
> company_id would help a lot. Also, considering you need to keep a lot of
> old data around, you may be able to split your data in archived and
> recent data, keeping your indices small (partial indices maybe).

But this could be easily done with two commands (besides creating indices
again):

ALTER TABLE x RENAME TO x_year;
CREATE TABLE x () INHERITS FROM base.x;

No need to change the rest...

> > - safety: if it becomes available externally, then one client could
> > never see other client's data. This is a main concern. If used only
> > internally it is desirable that only authorized employees view each
> > client since all of them doesn't need to access all clients even if
> > working directly with accounting.
>
> Safety would be about equal to your inheritence solution. You can GRANT
> rights on the views, that can be different from the access rights on the
> base tables. The views must evidently have access to your base tables.

Of course. This way, to keep the view working, I'd have to use SELECT INTO
when archiving data, right? IIRC, views would still refer to the old table
if I rename them (this is a problem on my solution as well if I can't use a
common view and make it respect the search_path, as I described on the other
message).

> > - easy to code on application side: other systems will be plugged to
> > this database. The more we can do to avoid other people mistakes, the
> > better.
>
> Views might be a bit easier there, as you can add derived data to your
> results (like first name, infix and surname concatenated, dates in a
> format specific for a companys' locale, etc.).

I'll have the full data on each schema, so this is also possible.

> > - easy to maintain database: if it is too painful, hacks will come and
> > this is not a good plan before starting the project...
>
> This was my main reason to suggest views. For select you could probably
> get away with SELECT * FROM table WHERE company_id = x; (Though SELECT *
> is being frownde upon).

Instead of doing SELECT * FROM schema.table; (no filter here)...

> The update/insert/delete parts are also all rather similar. You may be
> able to generate REPLACE VIEW statements if the base table layout
> changes. You'd probably need to DROP and re-CREATE them, but being views
> your data is not at risc. Nothing physically changes in your tables.
> Hmm... Almost forgot about the RULEs that make them updatable... Those'd
> need updating too, of course...

Is it at risk with inherited tables?

> With inherited tables you would need a way to dump and restore their
> contents, or do a whole lot of ALTER TABLE statements. You'd be
> physically altering the tables that contain your data; there's more than
> zero risc involved. You can't rollback DDL statements...

Why? If I add a column to the base table then this new column appears on all
inherited tables. And as I can't rollback DDL statements, I won't be risking
breaking things when I change the view ;-)

> OTOH, with inherited tables you can have company-tables that differ from
> the base tables for a specific company; with views that'd be... difficult.

I'd have to have an exclusive view and use auxiliar tables if I wanted that
with views... For now, from what we've discussed with the client, this won't
be the case. But it is something that we should think about for future
expansions... If there aren't too many exceptions, then we can deal with
that somewhat cleanly on both sides with the approach of the auxiliar table
and exclusive view, but if there are too many exceptions adapting the table
might be better (both are "hard" if there are too many exceptions...). I
haven't gotten into this planning level yet.

> > We haven't benchmarked anything yet, but from what is in the docs, this
> > looked like a good approach.
>
> Neither did I benchmark updatable views; haven't had much chance to use
> them yet. No, I don't have a lot of experience with them, but I know
> some people on this list must have.

I hope they give us some advice :-)

> > Updateable views give me the part of second and also the third item but
> > it seems to be missing on the first and last items... On the other hand,
> > if it solves problems with views and functions that I said I was having
> > on the other thread then might become interesting... But performance
> > would still be a problem with millions of records (by law we're required
> > to keep at least 5 years of docs online for some docs, for other it is
> > required to have the full company history... so getting to dozens of
> > millions in 5 years is not all that hard...).
>
> In my experience, having the right indexes on your tables/views helps a
> LOT. Millions of records need not be a problem if you know what you will
> be querying for.

You're right about indexes.

> > Am I right or completely wrong? :-)
>
> Have you considered downtime if a schema change is necessary? What would
> cause the most; tables or views?

What do you mean by a schema change? If we go this route, then this will have
to be a core part of the database and application design. Changes here will
be critical, so there shouldn't be changes or they should be done
incrementally. Bigger changes will require a maintenance routine that will
probably take the system offline for some time (from minutes to hours).

We're studying what to do to be able to plan things like that.

> I'm pretty sure VIEWs will be faster if you need to DROP/CREATE the
> tables, as you'll have to move your data around not to lose it... Of
> course, you'd need to do the same for your base tables, that contain
> _all_ your data in the updatable view case...

As I said, I was thinking along the lines of renaming tables to archive them
when needed (weekly, monthly, quarterly, yearly) and inheriting from "base"
schema again. This looks like a very fast operation...

Do you believe it will be problematic?

> Well, only more things to consider choosing between, I'm afraid.

Don't be. Those are very helpful as well. One can't remember everything, so
it is nice to have someone criticizing your project and design. I'm very
thankful.

I just few that we're starting to walk on not-stable grounds here... This is
my main concern. Maybe a more traditional approach as you suggest should be
the best approach...

Then, we'd have a "company_id" to separate data per company, create updateable
views for each company and see how we can write our functions to work with
either the views or the tables (what will be better and easier to expand as
new companies are added)...

--
Jorge Godoy <jgodoy(at)gmail(dot)com>


From: Alban Hertroys <alban(at)magproductions(dot)nl>
To: Jorge Godoy <jgodoy(at)gmail(dot)com>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 16:06:43
Message-ID: 44748493.5090107@magproductions.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Jorge Godoy wrote:
> Em Quarta 24 Maio 2006 11:48, Alban Hertroys escreveu:
>
>>Jorge Godoy wrote:
>>
>>>Some things are really important here:
>>>
>>> - performance for operations on an individual company --- it has to be
>>> as fast as possible because this might be used by my client's clients.
>>
>>I suppose inherited tables would be faster. Both at read and write
>>operations.
>>
>>However, I think that partitioning/clustering the table over an index on
>>company_id would help a lot. Also, considering you need to keep a lot of
>>old data around, you may be able to split your data in archived and
>>recent data, keeping your indices small (partial indices maybe).
>
>
> But this could be easily done with two commands (besides creating indices
> again):
>
> ALTER TABLE x RENAME TO x_year;
> CREATE TABLE x () INHERITS FROM base.x;
>
> No need to change the rest...

CREATE TABLE sample ();
ALTER TABLE sample RENAME TO sample_2004;
CREATE TABLE sample () INHERITS FROM sample_2004;
ALTER TABLE sample RENAME TO sample_2005;
CREATE TABLE sample () INHERITS FROM sample_2005;

You really mean that yearly doubling inheritance? I've been told
(admittedly by people I don't give much credit) that inheriting tables
works by joining the underlying tables, which could seriously affect
performance after a few years of doing this.

Did you try this? It seems to be important.

Considering this case with views, you could:

CREATE INDEX sample_2004_idx ON sample (object_id) WHERE date BETWEEN
'2004-01-01'::date AND '2004-12-31'::date;

CREATE INDEX sample_2005_idx ON sample (object_id) WHERE date BETWEEN
'2005-01-01'::date AND '2005-12-31'::date;

etc.

You could extend that to also take company_id into account and put the
indices on the base tables instead. I figure that takes less maintenance.

Mind you, to make use of those indices, you need date somewhere in your
where-clauses.

In your inheritence case, if you have tables covering this year only
(that of course inherit from the complete data set somehow), you could
use the ONLY statement in your FROM clauses. You wouldn't need to bother
with date.

>>> - safety: if it becomes available externally, then one client could
>>>never see other client's data. This is a main concern. If used only
>>>internally it is desirable that only authorized employees view each
>>>client since all of them doesn't need to access all clients even if
>>>working directly with accounting.
>>
>>Safety would be about equal to your inheritence solution. You can GRANT
>>rights on the views, that can be different from the access rights on the
>>base tables. The views must evidently have access to your base tables.
>
> Of course. This way, to keep the view working, I'd have to use SELECT INTO
> when archiving data, right? IIRC, views would still refer to the old table
> if I rename them (this is a problem on my solution as well if I can't use a
> common view and make it respect the search_path, as I described on the other
> message).

By archiving you mean moving yearly data to disk via a temporary table
or some such? In that case, yes, probably.

For yearly archiving, dumping one of those inherited tables (containing
only data for a specific year, after all) could work. Never tried that
before.

>>> - easy to code on application side: other systems will be plugged to
>>>this database. The more we can do to avoid other people mistakes, the
>>>better.
>>
>>Views might be a bit easier there, as you can add derived data to your
>>results (like first name, infix and surname concatenated, dates in a
>>format specific for a companys' locale, etc.).
>
> I'll have the full data on each schema, so this is also possible.

Okay, no client data processing required, then. Nice.

>>> - easy to maintain database: if it is too painful, hacks will come and
>>>this is not a good plan before starting the project...
>>
>>This was my main reason to suggest views. For select you could probably
>>get away with SELECT * FROM table WHERE company_id = x; (Though SELECT *
>>is being frownde upon).
>
> Instead of doing SELECT * FROM schema.table; (no filter here)...

Indeed.

Hmm... thinking about this, I realize that in your case selecting all
data in a year could be faster. The planner would certainly choose a
sequential scan plan, while with all data in a single table (with views
over them) may trigger an index scan, which would probably be slower.

Depends on whether you're likely to do that, of course.

>>The update/insert/delete parts are also all rather similar. You may be
>>able to generate REPLACE VIEW statements if the base table layout
>>changes. You'd probably need to DROP and re-CREATE them, but being views
>>your data is not at risc. Nothing physically changes in your tables.
>>Hmm... Almost forgot about the RULEs that make them updatable... Those'd
>>need updating too, of course...
>
> Is it at risk with inherited tables?

Well, the data is inside those tables, where it's not with views. If you
accidentaly DROP a column in a table, the data is gone. With a view you
just don't see it anymore, but it's still there.

>>With inherited tables you would need a way to dump and restore their
>>contents, or do a whole lot of ALTER TABLE statements. You'd be
>>physically altering the tables that contain your data; there's more than
>>zero risc involved. You can't rollback DDL statements...
>
> Why? If I add a column to the base table then this new column appears on all
> inherited tables. And as I can't rollback DDL statements, I won't be risking
> breaking things when I change the view ;-)

Oh right, I forgot about that. But what happens if you want to change a
columns data type for example? I'd add a new column with the new type,
update it with the data in the column I want to change, drop the
original column and rename it. What happens if you do that when tables
inherit the one you're working on?

Also note that foreign keys to an inherited table don't work properly.
You get foreign key violations, because the constraint is looking at the
wrong table, IIRC.

>>OTOH, with inherited tables you can have company-tables that differ from
>>the base tables for a specific company; with views that'd be... difficult.
>
> I'd have to have an exclusive view and use auxiliar tables if I wanted that
> with views... For now, from what we've discussed with the client, this won't
> be the case. But it is something that we should think about for future
> expansions... If there aren't too many exceptions, then we can deal with

Sure, they always say that ;)

> that somewhat cleanly on both sides with the approach of the auxiliar table
> and exclusive view, but if there are too many exceptions adapting the table
> might be better (both are "hard" if there are too many exceptions...). I
> haven't gotten into this planning level yet.

You may be able to achieve some customization by inheriting from the
companies inherited tables again, adding a few columns. It's still a
pain, of course, as the client code will need to be customized too to
take advantage of the different table definitions.

>>>Am I right or completely wrong? :-)
>>
>>Have you considered downtime if a schema change is necessary? What would
>>cause the most; tables or views?
>
> What do you mean by a schema change? If we go this route, then this will have
> to be a core part of the database and application design. Changes here will
> be critical, so there shouldn't be changes or they should be done
> incrementally. Bigger changes will require a maintenance routine that will
> probably take the system offline for some time (from minutes to hours).
>
> We're studying what to do to be able to plan things like that.

Well, I know customers... They always want something to be different in
the end.

>>I'm pretty sure VIEWs will be faster if you need to DROP/CREATE the
>>tables, as you'll have to move your data around not to lose it... Of
>>course, you'd need to do the same for your base tables, that contain
>>_all_ your data in the updatable view case...
>
> As I said, I was thinking along the lines of renaming tables to archive them
> when needed (weekly, monthly, quarterly, yearly) and inheriting from "base"
> schema again. This looks like a very fast operation...
>
> Do you believe it will be problematic?

What I believe it to be isn't really significant, neither of us has much
experience using table inheritence or updatable views for things like
these, it seems.

One thing that comes to mind now is the timetravel contrib package.
That's another thing I plan to look into some time, but it may suit your
needs.

--
Alban Hertroys
alban(at)magproductions(dot)nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //


From: Jorge Godoy <jgodoy(at)gmail(dot)com>
To: Alban Hertroys <alban(at)magproductions(dot)nl>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 17:10:44
Message-ID: 200605241410.46202.jgodoy@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Em Quarta 24 Maio 2006 13:06, Alban Hertroys escreveu:
> Jorge Godoy wrote:
> >
> > But this could be easily done with two commands (besides creating indices
> > again):
> >
> > ALTER TABLE x RENAME TO x_year;
> > CREATE TABLE x () INHERITS FROM base.x;
> >
> > No need to change the rest...
>
> CREATE TABLE sample ();
> ALTER TABLE sample RENAME TO sample_2004;
> CREATE TABLE sample () INHERITS FROM sample_2004;
> ALTER TABLE sample RENAME TO sample_2005;
> CREATE TABLE sample () INHERITS FROM sample_2005;
>
> You really mean that yearly doubling inheritance? I've been told
> (admittedly by people I don't give much credit) that inheriting tables
> works by joining the underlying tables, which could seriously affect
> performance after a few years of doing this.

That's not what I showed above. What I meant was:

CREATE TABLE base_schema.sample ();
CREATE TABLE client1_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client1_schema.sample RENAME TO client1_schema.sample_2004;
CREATE TABLE client1_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client1_schema.sample RENAME TO client1_schema.sample_2005;
CREATE TABLE client1_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client1_schema.sample RENAME TO client1_schema.sample_2006;
CREATE TABLE client1_schema.sample () INHERITS FROM base_schema.sample;
...
CREATE TABLE client2_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client2_schema.sample RENAME TO client1_schema.sample_2004;
CREATE TABLE client2_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client2_schema.sample RENAME TO client1_schema.sample_2005;
CREATE TABLE client2_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client2_schema.sample RENAME TO client1_schema.sample_2006;
CREATE TABLE client2_schema.sample () INHERITS FROM base_schema.sample;
...
CREATE TABLE client3_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client3_schema.sample RENAME TO client1_schema.sample_2004;
CREATE TABLE client3_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client3_schema.sample RENAME TO client1_schema.sample_2005;
CREATE TABLE client3_schema.sample () INHERITS FROM base_schema.sample;
ALTER TABLE client3_schema.sample RENAME TO client1_schema.sample_2006;
CREATE TABLE client3_schema.sample () INHERITS FROM base_schema.sample;
...

So I'm always inheriting from the same base and I'm renaming the inherited
tables, not the parent table.

> Did you try this? It seems to be important.

I've done what I said, not what you said... Does it look better this way?

> Considering this case with views, you could:
>
> CREATE INDEX sample_2004_idx ON sample (object_id) WHERE date BETWEEN
> '2004-01-01'::date AND '2004-12-31'::date;
>
> CREATE INDEX sample_2005_idx ON sample (object_id) WHERE date BETWEEN
> '2005-01-01'::date AND '2005-12-31'::date;
>
> etc.
>
> You could extend that to also take company_id into account and put the
> indices on the base tables instead. I figure that takes less maintenance.

From what I did to this suggestion or from what you thought I was doing to
this suggestion?

> Mind you, to make use of those indices, you need date somewhere in your
> where-clauses.

While with table partitioning like I was saying I would not use this. (Of
course I'll have date indices to retrieve information but not to split
data...)

> In your inheritence case, if you have tables covering this year only
> (that of course inherit from the complete data set somehow), you could
> use the ONLY statement in your FROM clauses. You wouldn't need to bother
> with date.

If I SELECT from the schema it will only retrieve data from the schema. If I
select from the parent table then I'll have all data available -- and for the
parent table I might need some date index.

> > Of course. This way, to keep the view working, I'd have to use SELECT
> > INTO when archiving data, right? IIRC, views would still refer to the
> > old table if I rename them (this is a problem on my solution as well if I
> > can't use a common view and make it respect the search_path, as I
> > described on the other message).
>
> By archiving you mean moving yearly data to disk via a temporary table
> or some such? In that case, yes, probably.

Yes, that's what I mean.

> For yearly archiving, dumping one of those inherited tables (containing
> only data for a specific year, after all) could work. Never tried that
> before.

If I'm removing this from the database, then I can dump it. If I rename it
data will still be available for processing in the parent table while new
data is inserted in the new table. When the time that I need to keep those
records there expire, I can dump the table to some backup and remove it from
the database, clearing space and eliminating unneeded tables.

> Hmm... thinking about this, I realize that in your case selecting all
> data in a year could be faster. The planner would certainly choose a
> sequential scan plan, while with all data in a single table (with views
> over them) may trigger an index scan, which would probably be slower.
>
> Depends on whether you're likely to do that, of course.

Probably not often, but if we have benefits here, I believe that those might
also appear on small selects.

> Well, the data is inside those tables, where it's not with views. If you
> accidentaly DROP a column in a table, the data is gone. With a view you
> just don't see it anymore, but it's still there.

Indeed, but with inherited tables I have to DROP ... CASCADE;, what might
trigger some advice to whoever is maintaining the database. But you're right
on this. But I'm at the same risk dropping some column from the tables the
views are derived from...

> Oh right, I forgot about that. But what happens if you want to change a
> columns data type for example? I'd add a new column with the new type,
> update it with the data in the column I want to change, drop the
> original column and rename it. What happens if you do that when tables
> inherit the one you're working on?

I'd "ALTER TABLE sample ALTER COLUMN column TYPE ...". I haven't tested this
with regards to inheritance...

> Also note that foreign keys to an inherited table don't work properly.
> You get foreign key violations, because the constraint is looking at the
> wrong table, IIRC.

I'll have to check if my partner checked that. I haven't. Thanks for teh
reminder.

> > I'd have to have an exclusive view and use auxiliar tables if I wanted
> > that with views... For now, from what we've discussed with the client,
> > this won't be the case. But it is something that we should think about
> > for future expansions... If there aren't too many exceptions, then we
> > can deal with
>
> Sure, they always say that ;)

Yeah... And they never know what they want as well... :-(

> > that somewhat cleanly on both sides with the approach of the auxiliar
> > table and exclusive view, but if there are too many exceptions adapting
> > the table might be better (both are "hard" if there are too many
> > exceptions...). I haven't gotten into this planning level yet.
>
> You may be able to achieve some customization by inheriting from the
> companies inherited tables again, adding a few columns. It's still a
> pain, of course, as the client code will need to be customized too to
> take advantage of the different table definitions.

Inheriting from inherited tables wasn't in my plan. But as I said, I haven't
gotten to this level of details yet. :-(

> > What do you mean by a schema change? If we go this route, then this will
> > have to be a core part of the database and application design. Changes
> > here will be critical, so there shouldn't be changes or they should be
> > done incrementally. Bigger changes will require a maintenance routine
> > that will probably take the system offline for some time (from minutes to
> > hours).
> >
> > We're studying what to do to be able to plan things like that.
>
> Well, I know customers... They always want something to be different in
> the end.

hehe... Yes, they do. But I believe that they'd want that for all of their
customer.

> What I believe it to be isn't really significant, neither of us has much
> experience using table inheritence or updatable views for things like
> these, it seems.

Both newbies, indeed. ;-)

> One thing that comes to mind now is the timetravel contrib package.
> That's another thing I plan to look into some time, but it may suit your
> needs.

I'll take a look into it. I confess I don't even know what it does, but I'll
check :-)

Thanks, Alban!

--
Jorge Godoy <jgodoy(at)gmail(dot)com>


From: Richard Broersma Jr <rabroersma(at)yahoo(dot)com>
To: Jorge Godoy <jgodoy(at)gmail(dot)com>, Alban Hertroys <alban(at)magproductions(dot)nl>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 17:48:55
Message-ID: 20060524174855.47650.qmail@web31807.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

> If I SELECT from the schema it will only retrieve data from the schema. If I
> select from the parent table then I'll have all data available -- and for the
> parent table I might need some date index.

From what I've read I am not sure that an Index created on the base/parent table will help when
preforming select queries on data across all of the children. If I interpret the "Caveats"
correctly I would assume that the index will only apply to data that is inserted directly into the
base/parent table.

From the Manual:

5.8.1. Caveats

...

A serious limitation of the inheritance feature is that indexes (including unique constraints) and
foreign key constraints only apply to single tables, not to their inheritance children. This is
true on both the referencing and referenced sides of a foreign key constraint.

Regards,

Richard Broersma Jr.


From: Jorge Godoy <jgodoy(at)gmail(dot)com>
To: Richard Broersma Jr <rabroersma(at)yahoo(dot)com>
Cc: Alban Hertroys <alban(at)magproductions(dot)nl>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-24 17:56:43
Message-ID: 200605241456.45645.jgodoy@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Em Quarta 24 Maio 2006 14:48, Richard Broersma Jr escreveu:
>
> From what I've read I am not sure that an Index created on the base/parent
> table will help when preforming select queries on data across all of the
> children. If I interpret the "Caveats" correctly I would assume that the
> index will only apply to data that is inserted directly into the
> base/parent table.

It won't. What I was saying is that I don't need an index just for this
reason because data partitioning would be intrinsic to the inheritance
solution. I was thinking as described in
http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html
(please, see 5.9.2, specially).

With his solution I'd need an index to have this partitioning implemented and
I'd suffer without it. With inheritance I just need this index if I'm
actually operating with dates, not to just retrieve the right data for a
given year (whatever that means ;-)).

> From the Manual:
>
> 5.8.1. Caveats
>
> ...
>
> A serious limitation of the inheritance feature is that indexes (including
> unique constraints) and foreign key constraints only apply to single
> tables, not to their inheritance children. This is true on both the
> referencing and referenced sides of a foreign key constraint.

This is why, in one of the messages, I've said that I'd have to create indices
for children tables. But hey, it is nice that you're also helping us
here :-) Thanks! :-)

--
Jorge Godoy <jgodoy(at)gmail(dot)com>


From: Alban Hertroys <alban(at)magproductions(dot)nl>
To: Jorge Godoy <jgodoy(at)gmail(dot)com>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-26 09:25:37
Message-ID: 4476C991.5060707@magproductions.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Jorge Godoy wrote:
> Em Quarta 24 Maio 2006 13:06, Alban Hertroys escreveu:
>
>>Jorge Godoy wrote:

> That's not what I showed above. What I meant was:
>
> CREATE TABLE base_schema.sample ();
> CREATE TABLE client1_schema.sample () INHERITS FROM base_schema.sample;
> ALTER TABLE client1_schema.sample RENAME TO client1_schema.sample_2004;
> CREATE TABLE client1_schema.sample () INHERITS FROM base_schema.sample;
> ALTER TABLE client1_schema.sample RENAME TO client1_schema.sample_2005;
> ...
>
> So I'm always inheriting from the same base and I'm renaming the inherited
> tables, not the parent table.
>
>>Did you try this? It seems to be important.
>
> I've done what I said, not what you said... Does it look better this way?

It does; only 1 level of inheritence for every table, AOT growing
inheritence.
There are some differences in behaviour too:

- Your method shows only data from one year in each table, so you'll
need to query a different table to get historical information (probably
the base tables, but that also includes data from other companies).

- The "growing inheritence" method would show old data in the newly
created tables, unless you use the ONLY operator.

Your method will have better performance for the general case (querying
current data), I think.

>>Considering this case with views, you could:
>>
>>CREATE INDEX sample_2004_idx ON sample (object_id) WHERE date BETWEEN
>>'2004-01-01'::date AND '2004-12-31'::date;
>>
>>CREATE INDEX sample_2005_idx ON sample (object_id) WHERE date BETWEEN
>>'2005-01-01'::date AND '2005-12-31'::date;
>>
>>etc.
>>
>>You could extend that to also take company_id into account and put the
>>indices on the base tables instead. I figure that takes less maintenance.
>
> From what I did to this suggestion or from what you thought I was doing to
> this suggestion?

Neither; That's about the updatable views case.

>>For yearly archiving, dumping one of those inherited tables (containing
>>only data for a specific year, after all) could work. Never tried that
>>before.
>
> If I'm removing this from the database, then I can dump it. If I rename it
> data will still be available for processing in the parent table while new
> data is inserted in the new table. When the time that I need to keep those
> records there expire, I can dump the table to some backup and remove it from
> the database, clearing space and eliminating unneeded tables.

Can't you just dump the client and year specific inherited table
(client1_schema.sample_2004 for example)? That way you could just use
pg_dump, provided it understands inheritence of course.

>>Hmm... thinking about this, I realize that in your case selecting all
>>data in a year could be faster. The planner would certainly choose a
>>sequential scan plan, while with all data in a single table (with views
>>over them) may trigger an index scan, which would probably be slower.
>>
>>Depends on whether you're likely to do that, of course.
>
> Probably not often, but if we have benefits here, I believe that those might
> also appear on small selects.

Only if a sequential scan is desirable over an index scan, meaning
you're selecting almost all rows in your table.

The difference between an index scan over all data and an index scan
over year-specific data should be pretty small.

>>Well, the data is inside those tables, where it's not with views. If you
>>accidentaly DROP a column in a table, the data is gone. With a view you
>>just don't see it anymore, but it's still there.
>
> Indeed, but with inherited tables I have to DROP ... CASCADE;, what might
> trigger some advice to whoever is maintaining the database. But you're right
> on this. But I'm at the same risk dropping some column from the tables the
> views are derived from...

True, but with updatable views you have only one set of tables to take
into account, instead of a set for every customer company. You can
automate that to some extent, but it is more error prone - it involves
human beings, after all ;)

>>One thing that comes to mind now is the timetravel contrib package.
>>That's another thing I plan to look into some time, but it may suit your
>>needs.
>
> I'll take a look into it. I confess I don't even know what it does, but I'll
> check :-)
>
>
> Thanks, Alban!

You're welcome.

--
Alban Hertroys
alban(at)magproductions(dot)nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Alban Hertroys" <alban(at)magproductions(dot)nl>
Cc: "Jorge Godoy" <jgodoy(at)gmail(dot)com>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: [SQL] (Ab)Using schemas and inheritance
Date: 2006-05-26 14:53:38
Message-ID: b42b73150605260753x4969e11dua10c7dad562f3411@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

to all:

maybe I can be of some help here. I think schemas are definately the
way to go. I've used them quite a bit, previously with an accounting
application which used schemas to separate company. Right now I am in
development of a major project which will use them even more
extensively, following an evalution of other methods of isloating data
for scalability reasons.

The mammoth table approach (company id in every table) is very clean
and the best approach from a academic perspective but unfortunately
you can run into scalability problems if your tables are very large.
An ideal SQL database would automagically do everything for you
without having to worry about it (no such database exists AFAIK).

The major downsides of using schemas in an application can be
mitigated...for example clever use of dynamic sql functions and
scripts can go a long way in helping with updates that have to run
across mutiple schemas.

The benifits of schemas are pretty nice, you get the performance
advantages of partitioning and you can manipulate the search path for
easy switching...just make sure to force a reconnect to get all those
stored procedures to relearn all the tables they use if you keep your
functions in a shared area (usually a good idea).

Table partitioning and inheritance look good on the surface but there
are a number of 'gotchas' that can hold you back from using them
effectively. Foreign keys are a mess, and compelex queries can be
problematic...the new constraint exclusion feature helps, but there is
a way to go IMO.

by the way, the \d query in psql maps to a fairly complex sql
statement, you can beat it by going right to the pg catalogs if
performance is an issue, or materialize your catalogs to an indexed
table. If you have a lot of tables (1000s) you will definately want
to prepare everything, meaning use views, procedures, and prepared
statements liberally. I would also take careful note of Jim's FSM
comments and keep a careful eye on that.

Merlin