logical column position

From: Neil Conway <neilc(at)samurai(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Dave Cramer <davec(at)fastcrypt(dot)com>
Subject: logical column position
Date: 2003-11-17 16:48:01
Message-ID: 87wu9zndq6.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'd like to add a new column to pg_attribute that specifies the
attribute's "logical position" within its relation. The idea here is
to separate the logical order of the columns in a relation from the
on-disk storage of the relation's tuples. This allows us to easily &
quickly change column order, add an additional column before or after
an existing column, etc.

At present, attnum basically does three things: identifies an column
within a relation, indicates which columns are system columns, and
defines the order of a relation's columns. I'd like to move this last
functionality into a separate pg_attribute column named "attpos" (or
"attlogicalpos"):

- when the table is created, attnum == attpos. System columns
have attpos < 0, as with attnum. At no point will two
columns of the same relation have the same attpos.

- when returning output to the client and no column ordering
is implied by the query (e.g. "SELECT * ..."), we sort the
columns in ascending attpos order.

- when storing a tuple on disk, we don't consider attpos

- if we want to change the order of the column's in a
relation, we can do so merely by updating pg_attribute; no
changes to the on-disk storage of the relation should be
necessary

A few notes:

(a) ISTM this should also apply to COPY TO and COPY FROM if the user
didn't supply a column list. Is this reasonable? It would break
dumps of the table's contents, but then again, dumps aren't
guaranteed to remain valid over arbitrary changes to the table's
meta-data.

(b) Using the above scheme that attnum == attpos initially, there
won't be any gaps in the sequence of attpos values. That means
that if, for example, we want to move the column in position 50
to position 1, we'll need to change the position's of all the
columns in positions [1..49] (and suffer the resulting MVCC
bloat in pg_attribute). Changing the column order is hardly a
performance critical operation, so that might be acceptable.

If we want to avoid this, one easy (but arguably unclean) way to
do so would be to make the initial value of attpos == attnum *
1000, and make attpos an int4 rather than an int2. Then, we can
do most column reordering operations with only a single
pg_attribute update -- in the worst-case that enough
re-orderings are done that we overflow the 999 "padding"
positions, we can just fall-back to doing multiple pg_attribute
updates. Is this worth doing, and/or is there a better way to
achieve the same effect?

(c) Do I need to consider inheritance?

Comments are welcome.

-Neil

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Neil Conway 2003-11-17 16:49:25 logical column position
Previous Message Adam Witney 2003-11-17 16:29:05 7.4 make failure on OSX