preliminary: logical column order

Lists: pgsql-patches
From: Neil Conway <neilc(at)samurai(dot)com>
To: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Cc: Dave Cramer <davec(at)fastcrypt(dot)com>
Subject: preliminary: logical column order
Date: 2003-11-21 08:09:02
Message-ID: 87oev66t41.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

This patch provides a preliminary implementation of the logical column
ordering functionality recently proposed on -hackers. This patch has a
number of known issues, so please don't apply it.

Current functionality implemented:

- Added attpos to the system catalogs & bumped the catalog
version: initdb required
- COPY TO/FROM obey logical column ordering
- When expanding "SELECT * ...", we make sure to obey logical
column order (see below, however)
- If INSERT is given an empty column list, it will produce the
necessary implicit column list in sorted by attpos
- (unrelated) improve a bunch of comments in pg_class.h,
remove some dead code from pg_attribute.h, refactor some
code in access/common/printtup.c

Remaining work to do:

- Change the name of the pg_attribute attribute to something
other than 'attpos', per the discussion on -hackers. I don't
believe we've settled on the right name yet.
- The code for actually sorting the columns in attpos-order is
duplicated a few times -- this was just done for the sake of
convenience, I'm going to clean this up and stick it in a
single, shared location in the new patch.
- The INSERT change causes the alter_table regression tests to
fail (the regression.diff) is attached. This appears to be
the logical column ordering interacting with inheritance,
but I haven't yet taken an in-depth look at it.
- When processing a "SELECT *", for example, the actual data
columns are returned in the right order, but the
RowDescription messages sent by libpq are not (i.e. they are
sent in attnum-order, not attpos).

In SendRowDescriptionMessage() and related printtup code, I
took a look at trying to sort the attributes we're about to
send RowDescription messages for by their 'attpos', but the
TupleDesc that we have for the return type doesn't included
a filled-in attpos.

I spent a while trying to see how feasible it would be to
fill in the result-type-TupleDesc with an attpos where
available, but couldn't figure out an easy way to do
this. Are there any suggestions on how this should be best
done?
- I haven't yet implemented ordering-by-attpos for the output
columns of a join, or the alias->column matching in the FROM
alias list (per Tom's email to -hackers). I saw a couple
places where I might be able to do this, but I'm quite sure
what the correct spot is. Tom, do you have any suggestions?

Any comments would be gratefully appreciated.

-Neil

Attachment Content-Type Size
att-logical-pos-13.patch text/x-patch 67.9 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>, Dave Cramer <davec(at)fastcrypt(dot)com>
Subject: Re: preliminary: logical column order
Date: 2003-11-21 08:20:59
Message-ID: 10406.1069402859@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Neil Conway <neilc(at)samurai(dot)com> writes:
> - The code for actually sorting the columns in attpos-order is
> duplicated a few times -- this was just done for the sake of
> convenience, I'm going to clean this up and stick it in a
> single, shared location in the new patch.

Bruce and I were chatting about that on the phone today. I think it
might be useful for TupleDescs to doubly index their contained attribute
rows --- that is, keep the existing array-indexed-by-attnum, but add
another pointer array indexed by attpos, containing only nondeleted
columns. This would be easy to build, and it'd eliminate
searching/sorting for places that had access to a TupleDesc.

> - When processing a "SELECT *", for example, the actual data
> columns are returned in the right order, but the
> RowDescription messages sent by libpq are not (i.e. they are
> sent in attnum-order, not attpos).

Easy to fix given above proposal ... although actually I am not sure why
this would occur. printtup and friends should always get a constructed
TupDesc that has no notion of deleted or renumbered columns. This may
be a symptom of a more fundamental error somewhere.

regards, tom lane


From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>, Dave Cramer <davec(at)fastcrypt(dot)com>
Subject: Re: preliminary: logical column order
Date: 2003-11-24 07:47:29
Message-ID: 5kd3svcfcgfl7lu9r5a7l7opon7bp8cqvu@email.aon.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Fri, 21 Nov 2003 03:09:02 -0500, Neil Conway <neilc(at)samurai(dot)com> wrote:
>attachment; filename=weird_regression.diffs

This was caused by a small oversight in ALTER TABLE ... ADD COLUMN:

diff -ruN ../base/src/backend/commands/tablecmds.c src/backend/commands/tablecmds.c
--- ../base/src/backend/commands/tablecmds.c 2003-10-14 00:47:15.000000000 +0200
+++ src/backend/commands/tablecmds.c 2003-11-23 16:51:37.000000000 +0100
@@ -1786,6 +1786,7 @@
attribute->attcacheoff = -1;
attribute->atttypmod = colDef->typename->typmod;
attribute->attnum = i;
+ attribute->attpos = i;
attribute->attbyval = tform->typbyval;
attribute->attndims = attndims;
attribute->attisset = (bool) (tform->typtype == 'c');

Servus
Manfred


From: Neil Conway <neilc(at)samurai(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>, Dave Cramer <davec(at)fastcrypt(dot)com>
Subject: Re: preliminary: logical column order
Date: 2003-11-25 20:33:51
Message-ID: 874qwsw5lc.fsf@mailbox.samurai.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Bruce and I were chatting about that on the phone today. I think it
> might be useful for TupleDescs to doubly index their contained
> attribute rows

Ok, I implemented this. I made it so that the properly sorted
attribute array is constructed lazily -- my reasoning was that (a)
relatively few locations in the code actually use the sorted attribute
array, so there is no point allocating and initializing it every time
a TupleDesc is used (b) the array could potentially be 1,500 pointers
long -- not huge, but not tiny either, so it is worth a little effort
to avoid unnecessarily alloc'ing and sorting it. Access to the sorted
array is (only) done via a new function, GetAttrByLogicalPosition()

> Easy to fix given above proposal ... although actually I am not sure
> why this would occur. printtup and friends should always get a
> constructed TupDesc that has no notion of deleted or renumbered
> columns.

It seems to happen because:

- SendRowDescription() in printtup.c notes that "the TupleDesc has
been manufactured by ExecTypeFromTL() or some similar function;
it does not contain a full set of fields."

- ExecTypeFromTL() allocates an empty TupleDesc, and then fills in
its individual entries with data from a TargetList -- the
Form_pg_attribute for the attributes involved is never consulted,
so the default attlogpos inserted by TupleDescInitEntry() is
used: attlogpos == attnum

I'm unsure of the best way to fix this so that the TupleDesc that is
handed to printtup & friends contains the information we require. Any
suggestions?

A new version of the patch is attached. Changes:

- replace sorting code with a lazily-constructed sorted array of
pointers to attribute data in TupleDesc
- include Manfred's suggested fix for the alter table regression
test
- (unrelated) add a comment to rewrite/rewriteDefine.c noting the
intent of the code and known problems
- (unrelated) merge my other patch for refactoring
CreateTupleDescCopy() into this patch: I needed to modify this
function in this patch, so I didn't want to deal with patching
complexities.

-Neil

Attachment Content-Type Size
att-logical-pos-21.patch text/x-patch 94.0 KB

From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To:
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: preliminary: logical column order
Date: 2003-11-30 16:33:45
Message-ID: 3FCA1BE9.3030508@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

I wonder if it wouldn't be easier to reorder the TupDesc->attrs[] array
according to an attphysid when filling the TupDesc structure, right
after a column was dropped/recreated (before any indexes/constraints are
recreated), so attnum remains, while storage changes.

Example:

before:
attnum attphysid attname attisdropped

1 1 foo f
2 2 bar f

after drop/recreate col:
1 3 foo f
2 2 bar f
3 1 foo_del t

resulting in an attrs array
attrs[0] describing physical col 3
attrs[1] describing physical col 2
attrs[2] describing physical col 1

Regards,
Andreas