Re: ALTER TABLE modifications

Lists: pgsql-hackerspgsql-patches
From: Rod Taylor <pg(at)rbt(dot)ca>
To: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: ALTER TABLE modifications
Date: 2003-11-08 16:55:06
Message-ID: 1068310505.87879.91.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

A general re-organization of Alter Table. Node wise, it is a
AlterTableStmt with a list of AlterTableCmds. The Cmds are the
individual actions to be completed (Add constraint, drop constraint, add
column, etc.)

Processing is done in 2 phases. The first phase updates the system
catalogs and creates a work queue for the table scan. The second phase
is to conduct the actual table scan evaluating all constraints and other
per tuple processing simultaneously, as required. This has no effect on
single step operations, but has a large benefit for combinational logic
where multiple table scans would otherwise be required.

Steps for the table scan include expression processing (default or
transform expressions), not null constraint process, other constraints.

The code has been simplified a little. Simple permission and recursion
checks are conducted in the common area and an enum for the command type
is used rather than a Char.

ALTER TABLE ADD COLUMN col DEFAULT 3 NOT NULL;
Allow creation of a column with a default (filled) and not null
constraint. This is internally subdivided into 3 actions much
like the below syntax.

ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);
The above combinational syntax is commented out in gram.y. The
support framework is used in both the above and below items, but
arbitrary statements probably have some issues -- I've not
tested enough to determine.

If it is useful, it will be submitted at a later date.

ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
Currently migrates indexes, check constraints, defaults, and the
column definition to the new type with optional transform. If
the tranform is not supplied, a standard assignment cast is
attempted.

One issue is that it detects dependencies way too late in the
game (after the TRANFORM has been applied). I tried mucking up
performDeletion to have a RESTRICT that did not throw the error
at the end (or physically remove the items) but that didn't work
out. Any bright ideas on how to achieve the NOTICES from a
failed DROP COLUMN earlier? Otherwise, I can put in single
object detection at that point.

It does not migrate foreign keys yet (should be easy to add) and
will be forwarded as an independent patch at a later time.
Currently foreign keys are rejected like views, functions, etc.

Comments appreciated.
--
Rod Taylor <pg [at] rbt [dot] ca>

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Attachment Content-Type Size
altertable.patch.gz application/x-gzip 35.8 KB

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Rod Taylor <pg(at)rbt(dot)ca>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-12 04:06:47
Message-ID: 200311120406.hAC46lO06481@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Is this to be applied to CVS HEAD?

---------------------------------------------------------------------------

Rod Taylor wrote:
> A general re-organization of Alter Table. Node wise, it is a
> AlterTableStmt with a list of AlterTableCmds. The Cmds are the
> individual actions to be completed (Add constraint, drop constraint, add
> column, etc.)
>
> Processing is done in 2 phases. The first phase updates the system
> catalogs and creates a work queue for the table scan. The second phase
> is to conduct the actual table scan evaluating all constraints and other
> per tuple processing simultaneously, as required. This has no effect on
> single step operations, but has a large benefit for combinational logic
> where multiple table scans would otherwise be required.
>
> Steps for the table scan include expression processing (default or
> transform expressions), not null constraint process, other constraints.
>
>
> The code has been simplified a little. Simple permission and recursion
> checks are conducted in the common area and an enum for the command type
> is used rather than a Char.
>
>
> ALTER TABLE ADD COLUMN col DEFAULT 3 NOT NULL;
> Allow creation of a column with a default (filled) and not null
> constraint. This is internally subdivided into 3 actions much
> like the below syntax.
>
> ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);
> The above combinational syntax is commented out in gram.y. The
> support framework is used in both the above and below items, but
> arbitrary statements probably have some issues -- I've not
> tested enough to determine.
>
> If it is useful, it will be submitted at a later date.
>
> ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
> Currently migrates indexes, check constraints, defaults, and the
> column definition to the new type with optional transform. If
> the tranform is not supplied, a standard assignment cast is
> attempted.
>
> One issue is that it detects dependencies way too late in the
> game (after the TRANFORM has been applied). I tried mucking up
> performDeletion to have a RESTRICT that did not throw the error
> at the end (or physically remove the items) but that didn't work
> out. Any bright ideas on how to achieve the NOTICES from a
> failed DROP COLUMN earlier? Otherwise, I can put in single
> object detection at that point.
>
> It does not migrate foreign keys yet (should be easy to add) and
> will be forwarded as an independent patch at a later time.
> Currently foreign keys are rejected like views, functions, etc.
>
>
> Comments appreciated.
> --
> Rod Taylor <pg [at] rbt [dot] ca>
>
> Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-12 04:46:34
Message-ID: 5677.1068612394@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Is this to be applied to CVS HEAD?

It sounded like large portions were still at the request-for-comment
stage...

regards, tom lane


From: Rod Taylor <pg(at)rbt(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-12 13:51:19
Message-ID: 1068645078.28107.51.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Tue, 2003-11-11 at 23:46, Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Is this to be applied to CVS HEAD?
>
> It sounded like large portions were still at the request-for-comment
> stage...

It can be applied to -HEAD without breaking anything or backtracking in
feature set (that I can see) but there is substantial change in the way
things work so a pre-review is warranted and I'm sure changes will be
requested.

If I don't receive comments for corrections by the weekend, I'll finish
off one of the missing components (foreign keys) and submit a
replacement.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Rod Taylor <pg(at)rbt(dot)ca>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-12 19:02:20
Message-ID: Pine.LNX.4.44.0311121959560.32472-100000@peter.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Rod Taylor writes:

> ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);
> The above combinational syntax is commented out in gram.y. The
> support framework is used in both the above and below items, but
> arbitrary statements probably have some issues -- I've not
> tested enough to determine.
>
> If it is useful, it will be submitted at a later date.

I think it's perfectly fine to write two separate ALTER TABLE statements.
No need to introduce this nonstandard syntax.

> ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
> Currently migrates indexes, check constraints, defaults, and the
> column definition to the new type with optional transform. If
> the tranform is not supplied, a standard assignment cast is
> attempted.

Please don't use the term "transform". It is used by the SQL standard for
other purposes. What kind of object would you put in place of the "..."
anyway? A function? What syntax do other databases use?

--
Peter Eisentraut peter_e(at)gmx(dot)net


From: Rod Taylor <pg(at)rbt(dot)ca>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-12 19:23:25
Message-ID: 1068665004.30452.32.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, 2003-11-12 at 14:02, Peter Eisentraut wrote:
> Rod Taylor writes:
>
> > ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);

> I think it's perfectly fine to write two separate ALTER TABLE statements.
> No need to introduce this nonstandard syntax.

Yes, it is certainly fine to do so, but much faster to do the above.

The command shown executes nearly 40% faster than 2 independent
statements in a single transaction -- the difference is even more
significant with additional sub-commands.

> > ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
> > Currently migrates indexes, check constraints, defaults, and the
> > column definition to the new type with optional transform. If
> > the tranform is not supplied, a standard assignment cast is
> > attempted.
>
> Please don't use the term "transform". It is used by the SQL standard for
> other purposes. What kind of object would you put in place of the "..."
> anyway? A function? What syntax do other databases use?

I've not found another database which allows this syntax. The suggestion
of TRANSFORM was Toms and was a result of using an assignment cast by
default. Do you have a better term I can use?

http://groups.google.ca/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&frame=right&th=266b02a270a164aa&seekm=1064805960.60248.24.camel%40jester#link4

The ... is an A_Expr which does not accept (among other things)
subselects. CASE statements, equations, etc. work fine.

CREATE TABLE tab (col int2);
-- integer to boolean
ALTER TABLE tab ALTER col TYPE boolean
TRANSFORM CASE WHEN col >= 1 THEN true ELSE false END;

-- or say Bytes to MBytes (original column is int8)
ALTER TABLE tab ALTER col TYPE integer TRANSFORM col / (1024 * 1024);


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-13 11:28:19
Message-ID: 1068722899.4751.16.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Peter Eisentraut kirjutas K, 12.11.2003 kell 21:02:
> Rod Taylor writes:
>
> > ALTER TABLE tab ADD COLUMN col DEFAULT 3, ADD CHECK (anothercol > 3);
> > The above combinational syntax is commented out in gram.y. The
> > support framework is used in both the above and below items, but
> > arbitrary statements probably have some issues -- I've not
> > tested enough to determine.
> >
> > If it is useful, it will be submitted at a later date.
>
> I think it's perfectly fine to write two separate ALTER TABLE statements.

I guess the difference is that each pass (i.e. ALTER TABLE) needs to do
another scan and copy of the table. Putting them in one statement allows
all the alterations to be done in one pass.

> No need to introduce this nonstandard syntax.
>
> > ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
> > Currently migrates indexes, check constraints, defaults, and the
> > column definition to the new type with optional transform. If
> > the tranform is not supplied, a standard assignment cast is
> > attempted.
>
> Please don't use the term "transform". It is used by the SQL standard for
> other purposes.

Is the "other" use conflicting with this syntax ?

I think we have preferred reusing existing keywords to adding new ones
in the past.

-----------------
Hannu


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Rod Taylor <pg(at)rbt(dot)ca>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-13 14:18:30
Message-ID: Pine.LNX.4.44.0311131514220.16787-100000@peter.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Rod Taylor writes:

> Yes, it is certainly fine to do so, but much faster to do the above.

Are table schema changes performance-sensitive operations, and are they
usually done in bulk? I doubt it.

> I've not found another database which allows this syntax. The suggestion
> of TRANSFORM was Toms and was a result of using an assignment cast by
> default. Do you have a better term I can use?

I'm not sure I buy this whole concept in the first place. If there is
no cast between type A and type B, then surely changing a table column
from A to B is nonsensical.

> -- or say Bytes to MBytes (original column is int8)
> ALTER TABLE tab ALTER col TYPE integer TRANSFORM col / (1024 * 1024);

You can do this using a plain column type change plus an UPDATE. I'd
prefer to keep these operations independent.

--
Peter Eisentraut peter_e(at)gmx(dot)net


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-13 14:20:05
Message-ID: Pine.LNX.4.44.0311131519000.16787-100000@peter.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Hannu Krosing writes:

> > Please don't use the term "transform". It is used by the SQL standard for
> > other purposes.
>
> Is the "other" use conflicting with this syntax ?
>
> I think we have preferred reusing existing keywords to adding new ones
> in the past.

Maybe (although I don't agree). but we've never intentionally done
terminology overloading.

--
Peter Eisentraut peter_e(at)gmx(dot)net


From: Rod Taylor <pg(at)rbt(dot)ca>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-13 14:59:41
Message-ID: 1068735581.32543.43.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, 2003-11-13 at 09:18, Peter Eisentraut wrote:
> Rod Taylor writes:
>
> > Yes, it is certainly fine to do so, but much faster to do the above.
>
> Are table schema changes performance-sensitive operations, and are they
> usually done in bulk? I doubt it.

Perhaps not for you. But I would disagree with both of those points for
a majority of companies.

If the system has high uptime requirements, the timeframe to accomplish
maintenance decreases substantially. For example, I have two 1.5 hour
maintenance windows per year for maintenance. This is a requirement
given by our clients, nearly all of whom insist on it. Telcos and other
enterprise level environments with 24 / 7 usage patterns have different
demands.

So.. Adding 2 new check constraints to a table with 500million entries
takes way way too much time in my eyes.

Toms request (when I initially brought up fixing ADD COLUMN) was that it
would process the DEFAULT and other new constraints in a single pass.
The above syntax simply uses the framework put in place for ADD COLUMN.

I'll not push it if it does not function with complex combinations,
which has not been tested, as such is not included in this patch.

> > -- or say Bytes to MBytes (original column is int8)
> > ALTER TABLE tab ALTER col TYPE integer TRANSFORM col / (1024 * 1024);
>
> You can do this using a plain column type change plus an UPDATE. I'd
> prefer to keep these operations independent.

Yes, if you can do the update first.

Ok.. How about boolean to int2 as the user discovered they require a
third or fouth state? Or vice-versa. An MSSQL, MySQL, etc. migration
would leave a bunch of int2's (since they do not support boolean) in the
PostgreSQL database which have a boolean meaning.

rbt=# select '1'::int4::boolean
rbt-# ;
ERROR: cannot cast type integer to boolean
rbt=# select '1'::boolean;
bool
------
t
(1 row)

Are you going to tell these people to write down the Default, add a new
column, copy the data, remove the old column, re-add the default, and
rename the new column to the old column name?

This is the exact reason many folks have asked for this ALTER form to be
added.

Considering there are already 2 sites out there running with this patch
(I have been asked for advance copies) I do not believe the
functionality is overkill in any respect.

Can you please suggest a better term to use in place of TRANSFORM?
Perhaps UPDATE WITH?


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Rod Taylor <pg(at)rbt(dot)ca>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-13 16:35:51
Message-ID: 1068741350.2710.1.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Rod Taylor kirjutas N, 13.11.2003 kell 16:59:

>
> Can you please suggest a better term to use in place of TRANSFORM?
> Perhaps UPDATE WITH?

or perhaps USING, based loosely on our use of USING in CREATE INDEX ?

----------
Hannu


From: Rod Taylor <pg(at)rbt(dot)ca>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-13 21:42:30
Message-ID: 1068759749.32543.66.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

-- moved to -hackers

On Thu, 2003-11-13 at 11:35, Hannu Krosing wrote:
> Rod Taylor kirjutas N, 13.11.2003 kell 16:59:
> >
> > Can you please suggest a better term to use in place of TRANSFORM?
> > Perhaps UPDATE WITH?
>
> or perhaps USING, based loosely on our use of USING in CREATE INDEX ?

USING sounds reasonable to me. Peter?

To those not on -patches, Peter is unhappy with the TRANSFORM clause in:
ALTER TABLE tab ALTER COLUMN col TRANSFORM expression;

The reasoning is that TRANSFORM in the spec means something different
than our use. We're attempting to come up with a replacement term.


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 13:59:05
Message-ID: 1068818345.1620.143.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Rod,

I tried the current patch on a RC2 release, and I noticed one
undesirable side affect.

Modifying a column moves it to the end. In high availability situations
this would not be desirable, I would imagine it would break lots of
code.

Dave
On Thu, 2003-11-13 at 11:35, Hannu Krosing wrote:
> Rod Taylor kirjutas N, 13.11.2003 kell 16:59:
>
> >
> > Can you please suggest a better term to use in place of TRANSFORM?
> > Perhaps UPDATE WITH?
>
> or perhaps USING, based loosely on our use of USING in CREATE INDEX ?
>
> ----------
> Hannu
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>
>


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Dave Cramer <pg(at)fastcrypt(dot)com>
Cc: Hannu Krosing <hannu(at)tm(dot)ee>, Rod Taylor <pg(at)rbt(dot)ca>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 14:17:32
Message-ID: 20031114141732.GB23845@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Fri, Nov 14, 2003 at 08:59:05AM -0500, Dave Cramer wrote:

> I tried the current patch on a RC2 release, and I noticed one
> undesirable side affect.
>
> Modifying a column moves it to the end. In high availability situations
> this would not be desirable, I would imagine it would break lots of
> code.

This is expected. Doing otherwise would incur into a much bigger
performance hit.

Anyway, IMHO no code should use SELECT * in any case, which is the only
scenario where one would expect physical column order to matter, isn't
it?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La primera ley de las demostraciones en vivo es: no trate de usar el sistema.
Escriba un guión que no toque nada para no causar daños." (Jakob Nielsen)


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Dave Cramer <pg(at)fastcrypt(dot)com>, Rod Taylor <pg(at)rbt(dot)ca>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 14:23:17
Message-ID: 1068819797.2540.2.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Alvaro Herrera kirjutas R, 14.11.2003 kell 16:17:
> On Fri, Nov 14, 2003 at 08:59:05AM -0500, Dave Cramer wrote:
>
> > I tried the current patch on a RC2 release, and I noticed one
> > undesirable side affect.
> >
> > Modifying a column moves it to the end. In high availability situations
> > this would not be desirable, I would imagine it would break lots of
> > code.
>
> This is expected. Doing otherwise would incur into a much bigger
> performance hit.

Not neccessarily, but it would make the patch much bigger ;)

IIRC there was discussion about splitting colum numbers into physical
and logical numbers at the time when DROP COLUMN was done.

> Anyway, IMHO no code should use SELECT * in any case, which is the only
> scenario where one would expect physical column order to matter, isn't
> it?

and this could also break when just changing the column type.

------------
Hannu


From: Rod Taylor <pg(at)rbt(dot)ca>
To: pg(at)fastcrypt(dot)com
Cc: Hannu Krosing <hannu(at)tm(dot)ee>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 14:32:57
Message-ID: 1068820376.35839.15.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Fri, 2003-11-14 at 08:59, Dave Cramer wrote:
> Rod,
>
> I tried the current patch on a RC2 release, and I noticed one
> undesirable side affect.
>
> Modifying a column moves it to the end. In high availability situations
> this would not be desirable, I would imagine it would break lots of
> code.

Yes, I've done that to myself a few times.

The method is rename old column, add new column, move data across, move
or reform dependencies, drop old column.

Adding the new column puts it to the end. In order to avoid the
repositioning we would need some kind of a position abstraction from the
physical storage to what the user sees. It's on the TODO list, but not a
part of this patch.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Rod Taylor <pg(at)rbt(dot)ca>
Cc: pg(at)fastcrypt(dot)com, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 14:57:46
Message-ID: Pine.LNX.4.44.0311141556360.4276-100000@peter.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Rod Taylor writes:

> The method is rename old column, add new column, move data across, move
> or reform dependencies, drop old column.

I can do this by hand. If we have an explicit command to do it, then it
needs to preserve the table schema. Else, this feature would be mostly
useless and a certain source of complaints.

--
Peter Eisentraut peter_e(at)gmx(dot)net


From: Rod Taylor <pg(at)rbt(dot)ca>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pg(at)fastcrypt(dot)com, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 15:48:59
Message-ID: 1068824938.35839.68.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Fri, 2003-11-14 at 09:57, Peter Eisentraut wrote:
> Rod Taylor writes:
>
> > The method is rename old column, add new column, move data across, move
> > or reform dependencies, drop old column.
>
> I can do this by hand. If we have an explicit command to do it, then it
> needs to preserve the table schema. Else, this feature would be mostly
> useless and a certain source of complaints.

The method was agreed to on -hackers prior to any code having been
written. Unless I'm mistaken, the method was suggested to me by other on
-hackers.

Can you give me an alternative on how to approach this problem without
relying on another TODO item labelled ALTER TABLE .. POSITION?

The point of the command isn't to accomplish anything magical, simply to
make it easier.

begin;
Rename old column
Add new column
Copy data
Recreate default
Recreate NOT NULL constraint
Recreate index 1
Recreate index 2 (multi-key)
Recreate check constraint 1
Recreate check constraint 2
Recreate check constraint 3
Recreate foreign key 1
remove old column
commit;
vacuum full table (to get rid of 2x problem since the user was unable to
do a fileswap)

The above is by hand, the below is with this command:

begin;
Alter table .. type ..
commit;

If it becomes possible to change the order without having to goto an
exceptional amount of work (that POSITION item) then it would only be
obvious to attempt to preserve the position.


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 15:59:31
Message-ID: 1068825571.1625.197.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

OK,

Here is another approach, that would retain column order. It will
require that the table be locked while this proceeds, but I think this
is a good idea anyway.

lock table
create newtable as select c1, c2, c3::newtype
modify pg_class to point to the new filename
modify existing pg_attribute for the column in question
recreate indexes that exist on the column
unlock table

Dave

On Fri, 2003-11-14 at 09:57, Peter Eisentraut wrote:
> Rod Taylor writes:
>
> > The method is rename old column, add new column, move data across, move
> > or reform dependencies, drop old column.
>
> I can do this by hand. If we have an explicit command to do it, then it
> needs to preserve the table schema. Else, this feature would be mostly
> useless and a certain source of complaints.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Rod Taylor <pg(at)rbt(dot)ca>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pg(at)fastcrypt(dot)com, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 16:38:20
Message-ID: 26628.1068827900@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Rod Taylor <pg(at)rbt(dot)ca> writes:
> On Fri, 2003-11-14 at 09:57, Peter Eisentraut wrote:
>> I can do this by hand. If we have an explicit command to do it, then it
>> needs to preserve the table schema. Else, this feature would be mostly
>> useless and a certain source of complaints.

> The method was agreed to on -hackers prior to any code having been
> written.

I believe the consensus was that automating what you could do by hand
is still a step forward.

It clearly would be better if we could relabel the logical column
position after finishing the whole process, but I agree with Rod that
that is an independent patch. Combining them into one mega-patch
doesn't sound like good engineering.

I guess the real question here is whether we would want to revert this
capability if a patch to adjust logical column orderings doesn't appear
before 7.5. My vote would be "no", but apparently Peter's is "yes".
Any other opinions?

regards, tom lane


From: Rod Taylor <pg(at)rbt(dot)ca>
To: pg(at)fastcrypt(dot)com
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 16:44:08
Message-ID: 1068828247.35839.80.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> lock table
> create newtable as select c1, c2, c3::newtype
> modify pg_class to point to the new filename
> modify existing pg_attribute for the column in question
> recreate indexes that exist on the column
> unlock table

I actually tried this to start with an ran into several dead-ends in
trying to rebuild constraints, defaults, etc. In order to do this I
think you would need to create a second pg_class entry and do a full
table swap.

By far, the easiest method to preserve order without writing a ton of
additional code is to do something that is on the TODO list already,
separate the visible position from physical position.

Doing the above has lots of added benefits, where spending a ton of time
on TYPE change has very few benefits.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, <pg(at)fastcrypt(dot)com>, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 16:51:44
Message-ID: Pine.LNX.4.44.0311141745130.4276-100000@peter.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane writes:

> I believe the consensus was that automating what you could do by hand
> is still a step forward.

I don't recall that, but if so, I would like to revisit that consensus.

AFAICT, this patch does not buy us anything at all. It's just a different
spelling of existing functionality. We have never done that before. It
just makes the system harder to maintain and use. All commands should be
reasonably independent, or at least add some functionality of their own.

> It clearly would be better if we could relabel the logical column
> position after finishing the whole process, but I agree with Rod that
> that is an independent patch. Combining them into one mega-patch
> doesn't sound like good engineering.

Good engineering would be if the logical column number patch comes first.
We cannot possibly leave this patch as is. People expect in-place column
changes. Things will break left and right, users will complain all over
the place if we offer a way to change a column, but yeah, by the way it
changes the structure of the table as well. We've had these kinds of good
idea/right direction/better than nothing approaches in areas like DROP
COLUMN and CLUSTER already, and they were no good. Except in this case,
"better than nothing" doesn't even apply, because there is already
something.

--
Peter Eisentraut peter_e(at)gmx(dot)net


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <pg(at)rbt(dot)ca>, pg(at)fastcrypt(dot)com, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 17:28:45
Message-ID: 1068830924.2540.41.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Peter Eisentraut kirjutas R, 14.11.2003 kell 18:51:
> Tom Lane writes:
>
> > I believe the consensus was that automating what you could do by hand
> > is still a step forward.
>
> I don't recall that, but if so, I would like to revisit that consensus.
>
> AFAICT, this patch does not buy us anything at all. It's just a different
> spelling of existing functionality. We have never done that before.

what about DROP COLUMN - this is also just a different spelling for

SELECT INTO, migrate all constraints, DROP OLD TABLE, RENAME.

> It just makes the system harder to maintain and use. All commands should be
> reasonably independent, or at least add some functionality of their own.
>
> > It clearly would be better if we could relabel the logical column
> > position after finishing the whole process, but I agree with Rod that
> > that is an independent patch. Combining them into one mega-patch
> > doesn't sound like good engineering.
>
> Good engineering would be if the logical column number patch comes first.

Agreed, it would be nice.

> We cannot possibly leave this patch as is. People expect in-place column
> changes.

Does SQL spec even require that SELECT * always returns columns in the
same order ? I don't think that relational model assigns any 'order' to
columns.

BTW, SELECT * is just a different spelling of existing functionality ;)

> Things will break left and right, users will complain all over
> the place if we offer a way to change a column, but yeah, by the way it
> changes the structure of the table as well.

ALTER TABLE ADD/DROP COLUMN would also break SELECT *.

as would type change in many cases (query expects int but gets string)

> We've had these kinds of good
> idea/right direction/better than nothing approaches in areas like DROP
> COLUMN and CLUSTER already, and they were no good. Except in this case,
> "better than nothing" doesn't even apply, because there is already
> something.

We have always had DUMP -> sed -> LOAD too.

This patch is *MUCH* better than nothing for performance. Instead of
being N times SEQSCAN (+ intervening VACUUM FULLs to reclaim space), it
can do it all in one pass. If that pass takes 1.5 hours, we have a very
big win here.

-----------------
Hannu


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Rod Taylor <pg(at)rbt(dot)ca>, pg(at)fastcrypt(dot)com, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-14 18:42:59
Message-ID: 3051.1068835379@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Hannu Krosing <hannu(at)tm(dot)ee> writes:
> Peter Eisentraut kirjutas R, 14.11.2003 kell 18:51:
>> I don't recall that, but if so, I would like to revisit that consensus.

> [ Hannu disagrees ]

Please take this thread to pgsql-hackers; if the issue is going to be
contentious then we should try to get a wider spectrum of opinion than
just people who follow -patches.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <pg(at)rbt(dot)ca>, <pg(at)fastcrypt(dot)com>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] ALTER TABLE modifications
Date: 2003-11-14 19:01:13
Message-ID: Pine.LNX.4.44.0311141956540.5327-100000@peter.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Hannu Krosing writes:

> > AFAICT, this patch does not buy us anything at all. It's just a different
> > spelling of existing functionality. We have never done that before.
>
> what about DROP COLUMN - this is also just a different spelling for
>
> SELECT INTO, migrate all constraints, DROP OLD TABLE, RENAME.

No, because DROP COLUMN preserves dependent objects.

> > We cannot possibly leave this patch as is. People expect in-place column
> > changes.
>
> Does SQL spec even require that SELECT * always returns columns in the
> same order ?

Yes:

b) Otherwise, the <select list> "*" is equivalent to a <value
expression> sequence in which each <value expression>
is a column reference that references a column of T and
each column of T is referenced exactly once. The columns
are referenced in the ascending sequence of their ordinal
position within T.

> I don't think that relational model assigns any 'order' to columns.

Correct, but SQL is not the relational model or vice versa.

> BTW, SELECT * is just a different spelling of existing functionality ;)

No, there is no other way to get a complete list of columns. (Hard-coding
does not count.)

--
Peter Eisentraut peter_e(at)gmx(dot)net


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Dave Cramer <pg(at)fastcrypt(dot)com>, Hannu Krosing <hannu(at)tm(dot)ee>, Rod Taylor <pg(at)rbt(dot)ca>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-15 03:21:18
Message-ID: 3FB59BAE.8090507@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> This is expected. Doing otherwise would incur into a much bigger
> performance hit.
>
> Anyway, IMHO no code should use SELECT * in any case, which is the only
> scenario where one would expect physical column order to matter, isn't
> it?

Well, we can always bring back the old idea of a attlognum which is the
logical order of the columns. Then we use that to expand *

Chris


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pg(at)fastcrypt(dot)com, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-15 03:38:18
Message-ID: 3FB59FAA.1050104@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


> I guess the real question here is whether we would want to revert this
> capability if a patch to adjust logical column orderings doesn't appear
> before 7.5. My vote would be "no", but apparently Peter's is "yes".
> Any other opinions?

The fact that it deals with the nightmare of dropping and recreating
indexes and fk's makes it worth it to me.

How about functions and views and rules that depend on it though?

Chris


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Rod Taylor <pg(at)rbt(dot)ca>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: ALTER TABLE modifications
Date: 2003-11-16 14:10:00
Message-ID: 1068991799.17671.22.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Rod Taylor kirjutas L, 08.11.2003 kell 18:55:
> A general re-organization of Alter Table. Node wise, it is a
> AlterTableStmt with a list of AlterTableCmds. The Cmds are the
> individual actions to be completed (Add constraint, drop constraint, add
> column, etc.)
>
> Processing is done in 2 phases. The first phase updates the system
> catalogs and creates a work queue for the table scan. The second phase
> is to conduct the actual table scan evaluating all constraints and other
> per tuple processing simultaneously, as required. This has no effect on
> single step operations, but has a large benefit for combinational logic
> where multiple table scans would otherwise be required.

...

> ALTER TABLE tab ALTER COLUMN col TYPE text TRANSFORM ...;
> Currently migrates indexes, check constraints, defaults, and the
> column definition to the new type with optional transform. If
> the tranform is not supplied, a standard assignment cast is
> attempted.

Do you have special cases for type changes which don't need data
transforms.

I mean things like changing VARCHAR(10) to VARCHAR(20), dropping the NOT
NULL constraint or changing CHECK A < 3 to CHECK A < 4.

All these could be done with no data migration or extra checking.

So how much of it should PG attemt to detect automatically and should
there be NOSCAN option when progremmer knows better
(changing CHECK ABS(A) < 3 into CHECK 9 > (A*A) )

----------------
Hannu


From: Rod Taylor <pg(at)rbt(dot)ca>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] ALTER TABLE modifications
Date: 2003-11-16 14:37:13
Message-ID: 1068993432.87852.9.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

-- moving to -hackers

> Do you have special cases for type changes which don't need data
> transforms.

> I mean things like changing VARCHAR(10) to VARCHAR(20), dropping the NOT
> NULL constraint or changing CHECK A < 3 to CHECK A < 4.

There are basically 3 types of change.

The first is simple, a removal. These do not kick off scans of any form.
You simply do not add any work to the post-processing queue.

The second requests a table scan. This is intended for check constraint,
etc. additions.

The third is a rewrite of the records which is done with via a file
swap. All type changes, column + default additions, etc. are done this
way. It will simultaneously take care of the check constraint, etc.
requests as well.

> All these could be done with no data migration or extra checking.

I'll leave it for someone else to add in the queue bypasses for simple
items. I've otherwise attempted to maintain the current processes (new
check constraint always scans, etc.).

Queued foreign keys are going to need to work a little differently since
they are cross table checks. I'm currently deferring them until
everything else has been accomplished (affects alter type only at the
moment).