Re: column ordering, was Re: [PATCHES] Enums patch v2

Lists: pgsql-hackerspgsql-patches
From: Tom Dunstan <pgsql(at)tomd(dot)cc>
To: pgsql-patches(at)postgresql(dot)org
Subject: Enums patch v2
Date: 2006-12-19 02:34:36
Message-ID: 45874FBC.9000804@tomd.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Hi all

Here is an updated version of the enums patch. It has been brought up to
date and applies against current CVS HEAD. The original email is at [1],
and describes the implementation.

This version contains sgml documentation, and contains the missing
bounds checks and error codes noted in the last email.

As mentioned before, the only part that I'm not super keen on is the
hack required to pass the type oid in to the text-to-enum cast function,
since normally those take type mods but not type oids. I wasn't sure how
else to get a cast working though, short of allowing another type of
cast function which accepts type oids as well. That seemed overkill for
just one case, though, and was getting a bit beyond the realms of what I
wanted to get done with this patch.

Anyway, it's reasonably feature complete and should be appropriately
documented now, so feedback gratefully accepted.

Cheers

Tom

[1] http://archives.postgresql.org/pgsql-patches/2006-09/msg00000.php

Attachment Content-Type Size
enums.patch.gz application/x-gzip 23.8 KB

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2006-12-19 08:09:47
Message-ID: 45879E4B.5050407@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Dunstan wrote:
> Here is an updated version of the enums patch. It has been brought up to
> date and applies against current CVS HEAD. The original email is at [1],
> and describes the implementation.

I'm sorry I missed the original discussions, but I have to ask: Why do
we want enums in core? The only potential advantage I can see over using
a look-up table and FK references is performance. And I'd rather spend
time improving the performance of FK checks than add extra machinery to
do the same thing in a different way.

Ignoring my general dislike of enums, I have a few issues with the patch
as it is:

1. What's the point of having comparison operators for enums? For most
use cases, there's no natural ordering of enum values.

2. The comparison routine compares oids, right? If the oids wrap around
when the enum values are created, the ordering isn't what the user expects.

3. 4 bytes per value is wasteful if you're storing simple status codes
etc. Especially if you're doing this for performance, let's do no harm
by wasting space. One byte seems enough for the typical use cases. I'd
even argue that having a high upper limit on the number of enum values
encourages misuse of the feature.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2006-12-19 08:23:23
Message-ID: 17858.1166516603@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
> Ignoring my general dislike of enums, I have a few issues with the patch
> as it is:

> 1. What's the point of having comparison operators for enums? For most
> use cases, there's no natural ordering of enum values.

If you would like to be able to index enum columns, or even GROUP BY one,
you need those; whether the ordering is arbitrary or not is irrelevant.

> 2. The comparison routine compares oids, right? If the oids wrap around
> when the enum values are created, the ordering isn't what the user expects.

This is a fair point --- it'd be better if the ordering were not
dependent on chance OID assignments. Not sure what we are willing
to pay to have that though.

> 3. 4 bytes per value is wasteful if you're storing simple status codes
> etc.

I've forgotten exactly which design Tom is proposing to implement here,
but at least one of the contenders involved storing an OID that would be
unique across all enum types. 1 byte is certainly not enough for that
and even 2 bytes would be pretty marginal. I'm unconvinced by arguments
about 2 bytes being so much better than 4 anyway --- in the majority of
real table layouts, the hoped-for savings would disappear into alignment
padding.

regards, tom lane


From: David Fetter <david(at)fetter(dot)org>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2006-12-19 08:51:01
Message-ID: 20061219085101.GC14423@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Tue, Dec 19, 2006 at 08:09:47AM +0000, Heikki Linnakangas wrote:
> Tom Dunstan wrote:
> >Here is an updated version of the enums patch. It has been brought up to
> >date and applies against current CVS HEAD. The original email is at [1],
> >and describes the implementation.
>
> I'm sorry I missed the original discussions, but I have to ask: Why do
> we want enums in core? The only potential advantage I can see over using
> a look-up table and FK references is performance.

A natural ordering is another. I'd love to be able to make a type
color that has

Red
Orange
Yellow
Green
Blue
Indigo
Violet

and then be able to do an ORDER BY color;

> And I'd rather spend time improving the performance of FK checks
> than add extra machinery to do the same thing in a different way.

Not the same thing.

> Ignoring my general dislike of enums, I have a few issues with the patch
> as it is:
>
> 1. What's the point of having comparison operators for enums? For most
> use cases, there's no natural ordering of enum values.

A natural ordering is precisely the use case for enums. Otherwise,
you just use a FK to a one-column table and have done.

Cheers,
D
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>
Subject: Re: Enums patch v2
Date: 2006-12-19 09:58:48
Message-ID: 200612191058.49304.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Heikki Linnakangas wrote:
> I'm sorry I missed the original discussions, but I have to ask: Why
> do we want enums in core? The only potential advantage I can see over
> using a look-up table and FK references is performance.

The difference is that foreign-key-referenced data is part of your data
whereas enums would be part of the type system used to model the data.

An objection to enums on the ground that foreign keys can accomplish the
same thing could be extended to object to any data type with a finite
domain.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2006-12-19 14:34:27
Message-ID: 4587F873.7070202@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>
>> 1. What's the point of having comparison operators for enums? For most
>> use cases, there's no natural ordering of enum values.
>>
>
> If you would like to be able to index enum columns, or even GROUP BY one,
> you need those; whether the ordering is arbitrary or not is irrelevant.
>

Heikki's assertion is wrong in any case. The enumeration definition
defines the ordering, and I can think of plenty of use cases where it
does matter. We do not use an arbitrary ordering. An enum type is an
*ordered* set of string labels. Without this the feature would be close
to worthless. But if a particular application doesn't need them ordered,
it need not use the comparison operators. Leaving aside the uses for
GROUP BY and indexes, I would ask what the justification would be for
leaving off comparison operators?

>
>> 2. The comparison routine compares oids, right? If the oids wrap around
>> when the enum values are created, the ordering isn't what the user expects.
>>
>
> This is a fair point --- it'd be better if the ordering were not
> dependent on chance OID assignments. Not sure what we are willing
> to pay to have that though.
>

This is a non-issue. The code sorts the oids before assigning them:

/* allocate oids */
oids = (Oid *) palloc(sizeof(Oid) * n);
for(i = 0; i < n; i++)
{
oids[i] = GetNewOid(pg_enum);
}
/* wraparound is unlikely, but just to be safe...*/
qsort(oids, n, sizeof(Oid), oid_cmp);

>
>> 3. 4 bytes per value is wasteful if you're storing simple status codes
>> etc.
>>
>
> I've forgotten exactly which design Tom is proposing to implement here,
> but at least one of the contenders involved storing an OID that would be
> unique across all enum types. 1 byte is certainly not enough for that
> and even 2 bytes would be pretty marginal. I'm unconvinced by arguments
> about 2 bytes being so much better than 4 anyway --- in the majority of
> real table layouts, the hoped-for savings would disappear into alignment
> padding.
>
>
>

Globally unique is the design adopted, after much on-list discussion.
That was a way of getting it *down* to 4 bytes. The problem is that the
output routines need enough info from just the internal representation
of the type value to do their work. The original suggestions was for 8
bytes - type oid + offset in value set. Having them globally unique lets
us get down to 4.

As for efficiency, I agree with what Tom says about alignment and
padding dissolving away any perceived advantage in most cases. If we
ever get around to optimising record layout we could revisit it.

cheers

andrew


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Enums patch v2
Date: 2006-12-19 14:58:48
Message-ID: 20061219145848.GK27098@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:

> As for efficiency, I agree with what Tom says about alignment and
> padding dissolving away any perceived advantage in most cases. If we
> ever get around to optimising record layout we could revisit it.

I don't, because there are always those that are knowledgeable enough to
know how to reduce space lost to padding. So it would be nice to have
2-byte enums on-disk, and resolve them based on the column's typid. But
then, I'm not familiar with the patch at all so I'm not sure if it's
possible.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Enums patch v2
Date: 2006-12-19 15:11:42
Message-ID: 4588012E.7020207@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>
>> As for efficiency, I agree with what Tom says about alignment and
>> padding dissolving away any perceived advantage in most cases. If we
>> ever get around to optimising record layout we could revisit it.
>>
>
> I don't, because there are always those that are knowledgeable enough to
> know how to reduce space lost to padding. So it would be nice to have
> 2-byte enums on-disk, and resolve them based on the column's typid. But
> then, I'm not familiar with the patch at all so I'm not sure if it's
> possible.
>
>

The trouble is that we have one output routine for all enum types. See
previous discussions about disallowing extra params to output routines.
So if all we have is a 2 byte offset into the list of values for the
given type, we do not have enough info to allow the output routine to
deduce which particular enum type it is dealing with. With the globally
unique oid approach it doesn't even need to care - it just looks up the
corresponding value. Note that this was a reduction from the previously
suggested (by TGL) 8 bytes.

I'm not a big fan of ordering columns to optimise record layout, except
in the most extreme cases (massive DW type apps). I think visible column
order should be logical, not governed by physical considerations.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Enums patch v2
Date: 2006-12-19 15:19:34
Message-ID: 23973.1166541574@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I don't, because there are always those that are knowledgeable enough to
> know how to reduce space lost to padding. So it would be nice to have
> 2-byte enums on-disk, and resolve them based on the column's typid. But
> then, I'm not familiar with the patch at all so I'm not sure if it's
> possible.

Remember that the value has to be decodable by the output routine.
So the only way we could do that would be by creating a separate output
function for each enum type. (That is, a separate pg_proc entry
... they could all point at the same C function, which would have to
check which OID it was called as and work backward to determine the enum
type.)

While this is doubtless doable, it's slow, it bloats pg_proc, and
frankly no argument has been offered that's compelling enough to
require it. The alignment issue takes enough air out of the
space-saving argument that it doesn't seem sufficient to me.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Tom Dunstan" <pgsql(at)tomd(dot)cc>, <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] Enums patch v2
Date: 2006-12-19 15:22:13
Message-ID: 87psafkj16.fsf@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"Andrew Dunstan" <andrew(at)dunslane(dot)net> writes:

> I'm not a big fan of ordering columns to optimise record layout, except in the
> most extreme cases (massive DW type apps). I think visible column order should
> be logical, not governed by physical considerations.

Well as long as we're talking "should"s the database should take care of this
for you anyways.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-19 15:48:41
Message-ID: 458809D9.5030109@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Gregory Stark wrote:
> "Andrew Dunstan" <andrew(at)dunslane(dot)net> writes:
>
>
>> I'm not a big fan of ordering columns to optimise record layout, except in the
>> most extreme cases (massive DW type apps). I think visible column order should
>> be logical, not governed by physical considerations.
>>
>
> Well as long as we're talking "should"s the database should take care of this
> for you anyways.
>
>

Sure, but the only sane way I can think of to do that would be have
separate logical and physical orderings, with a map between the two. I
guess we'd need to see what the potential space savings would be and
establish what the processing overhead would be, before considering it.
One side advantage would be that it would allow us to do the often
requested "add column at position x".

cheers

andrew


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-19 16:25:04
Message-ID: 20061219162504.GC21385@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Tue, Dec 19, 2006 at 10:48:41AM -0500, Andrew Dunstan wrote:
> Sure, but the only sane way I can think of to do that would be have
> separate logical and physical orderings, with a map between the two. I
> guess we'd need to see what the potential space savings would be and
> establish what the processing overhead would be, before considering it.
> One side advantage would be that it would allow us to do the often
> requested "add column at position x".

A patch to allow seperate physical and logical orderings was submitted
and rejected. Unless something has changed on that front, any
discussion in this direction isn't really useful.

Once this is possible it would allow a lot of simple savings. For
example, shifting all fixed width fields to the front means they can
all be accessed without looping through the previous columns, for
example.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Dunstan <pgsql(at)tomd(dot)cc>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2006-12-20 01:23:02
Message-ID: 45889076.5020507@tomd.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Heikki Linnakangas wrote:
> I'm sorry I missed the original discussions, but I have to ask: Why do
> we want enums in core? The only potential advantage I can see over using
> a look-up table and FK references is performance.

Well, there are a few things. Sometimes its tidiness, sometimes
integrity... I've seen more than one system with hundreds of these
things, and they've either gone down the table-per-enum solution, with
LOTS of extra tables whose values never change, or the EAV solution,
with one or two globally referenced tables to which everything in your
system has as a FK, and an integrity check in a trigger if you're very
lucky. Yuck on both accounts. Enums hit a sweet spot in the middle and
provide data integrity and performance for non-changing values.

> 1. What's the point of having comparison operators for enums? For most
> use cases, there's no natural ordering of enum values.

Well, there are a number of cases where ordering IS important, and
indeed, enums provide a way to do it easily where many of the
alternative solutions do not. It's one of the key benefits.

> 2. The comparison routine compares oids, right? If the oids wrap around
> when the enum values are created, the ordering isn't what the user expects.

As has been pointed out by others quicker on the draw than me, I do sort
the OIDs at enum creation time, for exactly this reason.

> 3. 4 bytes per value is wasteful if you're storing simple status codes
> etc. Especially if you're doing this for performance, let's do no harm
> by wasting space. One byte seems enough for the typical use cases. I'd
> even argue that having a high upper limit on the number of enum values
> encourages misuse of the feature.

I'd really love to have these fit into a 1 or 2 byte value on disk, but
AFAIK there's simply no way to do it currently in postgresql. If we ever
move to a solution where on-disk representation is noticeably different
from in-memory representation, then it might be doable. If that does
happen, we might benefit from other improvements such as being able to
order columns in a tuple on disk so as to minimize alignment padding,
not having to store a composite type's oid, etc. Until that happens,
though, if it ever does, this is probably the tightest on-disk
representation we're likely to get, unless we're happy to impose some
pretty severe restrictions, like 8 bits per enum, and only 256 enums in
total (for a 2 byte total). I was already shot down trying to make
similar restrictions when I first brought it up. :) The OID solution
seems to offend the least. :)

We did discuss this somewhat earlier, and I'm happy to take alternative
suggestions, but AFAIK this is about as good as it's going to get right now.

Cheers

Tom


From: Tom Dunstan <pgsql(at)tomd(dot)cc>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Enums patch v2
Date: 2006-12-20 01:39:58
Message-ID: 4588946E.90505@tomd.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Alvaro Herrera wrote:
> I don't, because there are always those that are knowledgeable enough to
> know how to reduce space lost to padding. So it would be nice to have
> 2-byte enums on-disk, and resolve them based on the column's typid. But
> then, I'm not familiar with the patch at all so I'm not sure if it's
> possible.

Not with this patch, and AFAIK not possible generally, without writing
separate I/O functions for each type. I'd love to be able to do that,
but I don't think it's possible currently. The main stumbling block is
the output function (and cast-to-text function), because output
functions do not get provided the oid of the type that they're dealing
with, for security reasons IIRC. It was never clear to me why I/O
functions should ever be directly callable by a user (and hence open to
security issues), but apparently it was enough to purge any that were
designed like that from the system, so I wasn't going to go down that
road with the patch.

Cheers

Tom


From: Tom Dunstan <pgsql(at)tomd(dot)cc>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Subject: Re: Enums patch v2
Date: 2006-12-20 02:00:48
Message-ID: 45889950.5000603@tomd.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Peter Eisentraut wrote:
> An objection to enums on the ground that foreign keys can accomplish the
> same thing could be extended to object to any data type with a finite
> domain.

Exactly. The extreme case is the boolean type, which could easily be
represented by a two-value enum. Or, if you were feeling masochistic, a
FK to a separate table. Which is easier?

People regularly do stuff like having domains over finite text values,
or having a FK to a separate (static) table, or using some sort of EAV.
Enums are type-safe, easily ordered, relatively efficient and don't
leave zillions of little static tables all over the place, a combination
of attributes that none of the alternative solutions in this space present.

Cheers

Tom


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 03:35:21
Message-ID: 200612192235.22169.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Tuesday 19 December 2006 11:25, Martijn van Oosterhout wrote:
> On Tue, Dec 19, 2006 at 10:48:41AM -0500, Andrew Dunstan wrote:
> > Sure, but the only sane way I can think of to do that would be have
> > separate logical and physical orderings, with a map between the two. I
> > guess we'd need to see what the potential space savings would be and
> > establish what the processing overhead would be, before considering it.
> > One side advantage would be that it would allow us to do the often
> > requested "add column at position x".
>
> A patch to allow seperate physical and logical orderings was submitted
> and rejected. Unless something has changed on that front, any
> discussion in this direction isn't really useful.
>

The patch was rejected on technical means, and the author decided it was too
much work to finish it. If someone wanted to try and complete that work I
don't think anyone would stand against it.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 03:44:53
Message-ID: 16297.1166586293@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> On Tuesday 19 December 2006 11:25, Martijn van Oosterhout wrote:
>> A patch to allow seperate physical and logical orderings was submitted
>> and rejected. Unless something has changed on that front, any
>> discussion in this direction isn't really useful.

> The patch was rejected on technical means, and the author decided it was too
> much work to finish it. If someone wanted to try and complete that work I
> don't think anyone would stand against it.

Apparently you don't remember the discussion. The fundamental objection
to it was that it would create a never-ending source of bugs, ie, using
the logical column number where the physical number was required or vice
versa. Even assuming that we could eliminate all such bugs in the code
base at any instant, what would prevent introduction of another such bug
in every patch? Most ordinary test cases would fail to expose the
difference.

If you can show me a reasonably bulletproof or machine-checkable way to
keep the two kinds of column numbers distinct, I'd be all for it. But
without that, the answer will remain no.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 04:12:50
Message-ID: 20061220041250.GF24675@kenobi.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> If you can show me a reasonably bulletproof or machine-checkable way to
> keep the two kinds of column numbers distinct, I'd be all for it. But
> without that, the answer will remain no.

Force references to go through macros which implement the lookup for the
appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic. I guess my
feeling on how this would be approached would be that there'd simply be
a level where logical columns are used and a seperate level where
physical columns are used. Perhaps the storage layer isn't well enough
abstracted for that though. Another possibility would be to declare
seperate structures for them (or do something else along those lines,
aka, whatever it is the Linux kernel does) and get the compiler to whine
whenever the typing isn't followed correctly.

Just tossing some thoughts out there, I'd *really* like to have
movable-columns and the ability to add columns in where they're most
appropriate instead of off on the end... If we can settle on an
approach to deal with Tom's concern I'd be willing to look at updating
the patch to implement it though it's not really high enough that I can
promise anything.

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 04:29:24
Message-ID: 16675.1166588964@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> Force references to go through macros which implement the lookup for the
> appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
> PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.

It doesn't really address the question of how you know which one to
use at any particular line of code; or even more to the point, what
mechanism will warn you if you use the wrong one.

My gut feeling about this is that we could probably enforce such a
distinction if we were using C++, but while coding in C I have no
confidence in it. (And no, that's not a vote to move to C++ ...)

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Enums patch v2
Date: 2006-12-20 11:03:11
Message-ID: 20061220110311.GA12639@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, Dec 20, 2006 at 01:39:58AM +0000, Tom Dunstan wrote:
> Not with this patch, and AFAIK not possible generally, without writing
> separate I/O functions for each type. I'd love to be able to do that,
> but I don't think it's possible currently. The main stumbling block is
> the output function (and cast-to-text function), because output
> functions do not get provided the oid of the type that they're dealing
> with, for security reasons IIRC. It was never clear to me why I/O
> functions should ever be directly callable by a user (and hence open to
> security issues), but apparently it was enough to purge any that were
> designed like that from the system, so I wasn't going to go down that
> road with the patch.

I worked around this in taggedtypes by indeed creating seperate copies
of the i/o functions on demand and at execution time looking up the
required type from the function signiture.

The only solution indeed is to change the calling convention if the I/O
functions so that the relevent datatype oid stored in a safe place,
that isn't set for normal function calls.

BTW, being able to call type i/o functions directly is very useful. For
example date_in(text_out(blah)) is a form of cast between types that
don't usually have a cast. If you change the calling convention as
indicated, that trick will still work, just not for types with the
restricted i/o functions.

Also, it's not just I/O functions that are the issue, consider the
enum-to-integer cast.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 11:33:53
Message-ID: 45891FA1.5000902@pws.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
>
>> Force references to go through macros which implement the lookup for the
>> appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
>> PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.
>>
>
> It doesn't really address the question of how you know which one to
> use at any particular line of code; or even more to the point, what
> mechanism will warn you if you use the wrong one.
>
> My gut feeling about this is that we could probably enforce such a
> distinction if we were using C++, but while coding in C I have no
> confidence in it. (And no, that's not a vote to move to C++ ...)
>
What about a comprimise...

The 8.1 documentation for ALTER TABLE states the following.

Adding a column with a non-null default or changing the type of an
existing column will require the entire table to be rewritten. This may
take a significant amount of time for a large table; and it will
temporarily require double the disk space.

Now, we are rewriting the table from scratch anyway, the on disk format
is changing. What is stopping us from switching the column order at the
same time. The only thing I can think is that the catalogs will need
more work to update them. It's a middle sized price to pay for being
able to reorder the columns in the table. One of the problems I have is
wanting to add a column in the middle of the table, but FK constraints
stop me dropping the table to do the reorder. If ALTER TABLE would let
me stick it in the middle and rewrite the table on disk, I wouldn't
care. It's likely that I would be rewriting the table anyway. And by
specifying AT POSITION, or BEFORE/AFTER you know for big tables it's
going to take a while.

Not that I'm able to code this at all, but I'm interested in feedback on
this option.

Regards

Russell Smith
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate
>
>
>


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 12:06:54
Message-ID: 20061220120654.GC12639@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Tue, Dec 19, 2006 at 11:29:24PM -0500, Tom Lane wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > Force references to go through macros which implement the lookup for the
> > appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
> > PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.
>
> It doesn't really address the question of how you know which one to
> use at any particular line of code; or even more to the point, what
> mechanism will warn you if you use the wrong one.

There's one method: Set it up so that when you create a table, it
randomizes the order of the fields on disk. Obviously for production
this isn't smart, but it would test the code a lot. Though in the
regression tests many tables only have one column so they won't be
affected.

If we had unit tests you could create a function called
"heap_mangle_tuple" which simply does physical reordering but logically
does nothing and feed it in at each point to check the code is
invarient.

Another approach is to number logical columns starting at 1000. This
would mean that at a glance you could tell what you're talking about.
And code using the wrong one will do something obviously bad. If
performance is an issue you could only enable the offset for
--enable-assert builds.

Personally I like this approach because it would "encourage" everyone
to use the macro to access the fields, since not doing so will place a
constant in an obvious place. It's also trivial for the system to
check.

Personally I'm unsure of the scope of the problem. AFAICS there's
hardly anywhere that would use physical offsets...

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 12:26:59
Message-ID: 200612201327.02142.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
> If you can show me a reasonably bulletproof or machine-checkable way to
> keep the two kinds of column numbers distinct, I'd be all for it.

The only way I can see is to make the domains of the numbers distinct.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: Kenneth Marshall <ktm(at)it(dot)is(dot)rice(dot)edu>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 13:20:14
Message-ID: 20061220132014.GY27765@it.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote:
> Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
> > If you can show me a reasonably bulletproof or machine-checkable way to
> > keep the two kinds of column numbers distinct, I'd be all for it.
>
> The only way I can see is to make the domains of the numbers distinct.
>
Negative vs. positive numbers?

Ken


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Kenneth Marshall <ktm(at)it(dot)is(dot)rice(dot)edu>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 13:38:30
Message-ID: 20061220133830.GE12639@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, Dec 20, 2006 at 07:20:14AM -0600, Kenneth Marshall wrote:
> On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote:
> > Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
> > > If you can show me a reasonably bulletproof or machine-checkable way to
> > > keep the two kinds of column numbers distinct, I'd be all for it.
> >
> > The only way I can see is to make the domains of the numbers distinct.
> >
> Negative vs. positive numbers?

Negative is used by system columns. Just adding some large constant
(say 10000) should be enough.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Kenneth Marshall <ktm(at)it(dot)is(dot)rice(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 13:59:21
Message-ID: 200612201459.24704.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Am Mittwoch, 20. Dezember 2006 14:20 schrieb Kenneth Marshall:
> On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote:
> > Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
> > > If you can show me a reasonably bulletproof or machine-checkable way to
> > > keep the two kinds of column numbers distinct, I'd be all for it.
> >
> > The only way I can see is to make the domains of the numbers distinct.
>
> Negative vs. positive numbers?

That would be an obvious choice, but negative column numbers are already in
use for system columns.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Kenneth Marshall <ktm(at)is(dot)rice(dot)edu>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 14:11:48
Message-ID: 458944A4.8080704@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Martijn van Oosterhout wrote:
> On Wed, Dec 20, 2006 at 07:20:14AM -0600, Kenneth Marshall wrote:
>
>> On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote:
>>
>>> Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
>>>
>>>> If you can show me a reasonably bulletproof or machine-checkable way to
>>>> keep the two kinds of column numbers distinct, I'd be all for it.
>>>>
>>> The only way I can see is to make the domains of the numbers distinct.
>>>
>>>
>> Negative vs. positive numbers?
>>
>
> Negative is used by system columns. Just adding some large constant
> (say 10000) should be enough.
>
> Have a nice day,
>

Or we could divide the positive number space in two, by starting at 2^14
(attnums are int2). Then a simple bitmask test would work to distinguish
them.

cheers

andrew


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 14:15:05
Message-ID: 20061220141504.GG24675@kenobi.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > Force references to go through macros which implement the lookup for the
> > appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
> > PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.
>
> It doesn't really address the question of how you know which one to
> use at any particular line of code; or even more to the point, what
> mechanism will warn you if you use the wrong one.

That'd be the point of doing the typing, you then declare functions as
accepting the type and then if someone passes the wrong type to a
function the compiler will complain. Inside of a particular function it
would hopefully be easier to keep it clear. I'd think that most
functions would deal with one type or the other (which would be declared
in the arguments or in the local variables) and that functions which
have to deal with both would be able to keep them straight.

> My gut feeling about this is that we could probably enforce such a
> distinction if we were using C++, but while coding in C I have no
> confidence in it. (And no, that's not a vote to move to C++ ...)

I need to go research what Linux does for this because aiui it's
pretty good about being able to enforce better type-checking than the
stock C types. The only downside is that I *think* it might be a
GCC-only thing. In that case I'd think we would still use it but build
some macros which essentially disable it for non-GCC compilers. As a
mainly-for-developers compile-time check I think as long as a build-farm
member is running GCC and complaining when there are errors (and it can
be disabled on non-GCC compilers) we won't lose any portability from it.

Thanks,

Stephen


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 14:19:12
Message-ID: 45894660.3070706@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Russell Smith wrote:
> Tom Lane wrote:
>> Stephen Frost <sfrost(at)snowman(dot)net> writes:
>>
>>> Force references to go through macros which implement the lookup for
>>> the
>>> appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
>>> PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.
>>>
>>
>> It doesn't really address the question of how you know which one to
>> use at any particular line of code; or even more to the point, what
>> mechanism will warn you if you use the wrong one.
>>
>> My gut feeling about this is that we could probably enforce such a
>> distinction if we were using C++, but while coding in C I have no
>> confidence in it. (And no, that's not a vote to move to C++ ...)
>>
> What about a comprimise...
>
> The 8.1 documentation for ALTER TABLE states the following.
>
> Adding a column with a non-null default or changing the type of an
> existing column will require the entire table to be rewritten. This
> may take a significant amount of time for a large table; and it will
> temporarily require double the disk space.
>
>
> Now, we are rewriting the table from scratch anyway, the on disk
> format is changing. What is stopping us from switching the column
> order at the same time. The only thing I can think is that the
> catalogs will need more work to update them. It's a middle sized
> price to pay for being able to reorder the columns in the table. One
> of the problems I have is wanting to add a column in the middle of the
> table, but FK constraints stop me dropping the table to do the
> reorder. If ALTER TABLE would let me stick it in the middle and
> rewrite the table on disk, I wouldn't care. It's likely that I would
> be rewriting the table anyway. And by specifying AT POSITION, or
> BEFORE/AFTER you know for big tables it's going to take a while.
>

This isn't really a compromise. Remember that this discussion started
with consideration of optimal record layout (minimising space use by
reducing or eliminating alignment padding). The above proposal really
does nothing for that.

cheers

andrew


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Russell Smith <mr-russ(at)pws(dot)com(dot)au>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 14:27:13
Message-ID: 20061220142713.GH24675@kenobi.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

* Andrew Dunstan (andrew(at)dunslane(dot)net) wrote:
> This isn't really a compromise. Remember that this discussion started
> with consideration of optimal record layout (minimising space use by
> reducing or eliminating alignment padding). The above proposal really
> does nothing for that.

While I agree that's how the discussion started the column ordering
issue can stand on its own and any proposal which provides that feature
should be considered. I don't think we should throw out the
rewrite-the-table idea because it doesn't solve other problems.

Thanks,

Stephen


From: Richard Huxton <dev(at)archonet(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Kenneth Marshall <ktm(at)is(dot)rice(dot)edu>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 14:36:32
Message-ID: 45894A70.4020509@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
>
> Or we could divide the positive number space in two, by starting at 2^14
> (attnums are int2). Then a simple bitmask test would work to distinguish
> them.

Perhaps divide-by-four, then it would be possible to have calculated
columns (as mentioned recently on one of the lists). In particular, that
would let you have FK constraints with a constant as part of the key.

--
Richard Huxton
Archonet Ltd


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Enums patch v2
Date: 2006-12-20 14:49:32
Message-ID: 45894D7C.2000202@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Martijn van Oosterhout wrote:
> Also, it's not just I/O functions that are the issue, consider the
> enum-to-integer cast.
>

er, what cast? :-)

IIRC Tom hasn't provided one. If you don't break the enum abstraction
there should be no need for one, and given the implementation it's not
quite trivial - probably the best way if this is needed would be to
precalculate it at type creation time and store the value in an extra
column in pg_enum.

cheers

andrew


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Russell Smith <mr-russ(at)pws(dot)com(dot)au>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch
Date: 2006-12-20 16:20:16
Message-ID: 200612201620.kBKGKGW17573@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
> > Now, we are rewriting the table from scratch anyway, the on disk
> > format is changing. What is stopping us from switching the column
> > order at the same time. The only thing I can think is that the
> > catalogs will need more work to update them. It's a middle sized
> > price to pay for being able to reorder the columns in the table. One
> > of the problems I have is wanting to add a column in the middle of the
> > table, but FK constraints stop me dropping the table to do the
> > reorder. If ALTER TABLE would let me stick it in the middle and
> > rewrite the table on disk, I wouldn't care. It's likely that I would
> > be rewriting the table anyway. And by specifying AT POSITION, or
> > BEFORE/AFTER you know for big tables it's going to take a while.
> >
>
> This isn't really a compromise. Remember that this discussion started
> with consideration of optimal record layout (minimising space use by
> reducing or eliminating alignment padding). The above proposal really
> does nothing for that.

I assume space waste will be mostly fixed when we have 0/1 byte headers
for varlena data types.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 16:24:02
Message-ID: 24395.1166631842@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> I assume space waste will be mostly fixed when we have 0/1 byte headers
> for varlena data types.

Hardly. int float timestamp etc types will all still have alignment issues.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 20:17:12
Message-ID: 20061220201712.GG12639@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, Dec 20, 2006 at 09:15:05AM -0500, Stephen Frost wrote:
> > It doesn't really address the question of how you know which one to
> > use at any particular line of code; or even more to the point, what
> > mechanism will warn you if you use the wrong one.
>
> That'd be the point of doing the typing, you then declare functions as
> accepting the type and then if someone passes the wrong type to a
> function the compiler will complain. Inside of a particular function it
> would hopefully be easier to keep it clear. I'd think that most
> functions would deal with one type or the other (which would be declared
> in the arguments or in the local variables) and that functions which
> have to deal with both would be able to keep them straight.

I'm not sure how much you can do with typing. Things like heap_getattr
are macros, and thus untyped. Most places use attr as an index to an
array, which also can't be type checked.

If you switched everything over to inline functions you might get it to
work, but that's about it.

IMHO the best solution is to offset the logical numbers by some
constant...

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-20 21:23:49
Message-ID: 4589A9E5.40509@pws.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
> Russell Smith wrote:
>> Tom Lane wrote:
>>> Stephen Frost <sfrost(at)snowman(dot)net> writes:
>>>
>>>> Force references to go through macros which implement the lookup
>>>> for the
>>>> appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
>>>> PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.
>>>>
>>>
>>> It doesn't really address the question of how you know which one to
>>> use at any particular line of code; or even more to the point, what
>>> mechanism will warn you if you use the wrong one.
>>>
>>> My gut feeling about this is that we could probably enforce such a
>>> distinction if we were using C++, but while coding in C I have no
>>> confidence in it. (And no, that's not a vote to move to C++ ...)
>>>
>> What about a comprimise...
>>
>> The 8.1 documentation for ALTER TABLE states the following.
>>
>> Adding a column with a non-null default or changing the type of an
>> existing column will require the entire table to be rewritten. This
>> may take a significant amount of time for a large table; and it will
>> temporarily require double the disk space.
>>
>>
>> Now, we are rewriting the table from scratch anyway, the on disk
>> format is changing. What is stopping us from switching the column
>> order at the same time. The only thing I can think is that the
>> catalogs will need more work to update them. It's a middle sized
>> price to pay for being able to reorder the columns in the table. One
>> of the problems I have is wanting to add a column in the middle of
>> the table, but FK constraints stop me dropping the table to do the
>> reorder. If ALTER TABLE would let me stick it in the middle and
>> rewrite the table on disk, I wouldn't care. It's likely that I would
>> be rewriting the table anyway. And by specifying AT POSITION, or
>> BEFORE/AFTER you know for big tables it's going to take a while.
>>
>
> This isn't really a compromise. Remember that this discussion started
> with consideration of optimal record layout (minimising space use by
> reducing or eliminating alignment padding). The above proposal really
> does nothing for that.
>
> cheers
>
> andrew
>
>
This is partly true. If you have the ability to rewrite the table and
put columns in a specific order you can "manually" minimize the
alignment padding. However that will probably produce a table that is
not in the logical order you would like. I still see plenty of use case
for both my initial case as the alignment padding case, even without
logical layout being different to disk layout.

Also there has been a large about of discussion on performance relating
to having firm numbers for proposals for different compiler options. Do
anybody have tested numbers, and known information about where/how you
can eliminate padding by column ordering? Tom suggests in this thread
that lots of types have padding issues, so how much is it really going
to buy us space wise if we re-order the table in optimal format. What
is the optimal ordering to reduce disk usage?

Russell.


From: "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 14:39:07
Message-ID: E1539E0ED7043848906A8FF995BDA57901A34E7A@m0143.s-mxs.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


> I'm not sure how much you can do with typing. Things like heap_getattr
> are macros, and thus untyped. Most places use attr as an index to an
> array, which also can't be type checked.
>
> If you switched everything over to inline functions you might
> get it to
> work, but that's about it.
>
> IMHO the best solution is to offset the logical numbers by some
> constant...

Um, surely you meant "offset the physical numbers". Imho the logical
numbers
need to stay 1-n, because those numbers are used way more often and are
more user visible than the physical.

Andreas


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 15:27:12
Message-ID: 458AA7D0.1020802@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Zeugswetter Andreas ADI SD wrote:
>> I'm not sure how much you can do with typing. Things like heap_getattr
>> are macros, and thus untyped. Most places use attr as an index to an
>> array, which also can't be type checked.
>>
>> If you switched everything over to inline functions you might
>> get it to
>> work, but that's about it.
>>
>> IMHO the best solution is to offset the logical numbers by some
>> constant...
>>
>
> Um, surely you meant "offset the physical numbers". Imho the logical
> numbers
> need to stay 1-n, because those numbers are used way more often and are
> more user visible than the physical.
>
>
>

I don't think we should expose the offset to user view at all - this is
just for internal use, no?

cheers

andrew


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 15:37:03
Message-ID: 20061221153703.GD14992@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, Dec 21, 2006 at 10:27:12AM -0500, Andrew Dunstan wrote:
> >Um, surely you meant "offset the physical numbers". Imho the logical
> >numbers
> >need to stay 1-n, because those numbers are used way more often and are
> >more user visible than the physical.
> >
> >
> >
>
> I don't think we should expose the offset to user view at all - this is
> just for internal use, no?

The thing is, physical index numbers has meaning, the logical index
number does not. In a view definition we're going to store the physical
index, not the logical one, for example. We don't want rearranging
columns to invalidate view definitions or plans.

The number of places needing the logical index are not that man,
relativelyy, and given it has no intrinsic meaning, it's better to give
it a numeric value which is obviously abritrary (like 10001).

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 15:50:59
Message-ID: 19759.1166716259@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> The thing is, physical index numbers has meaning, the logical index
> number does not. In a view definition we're going to store the physical
> index, not the logical one, for example.

Really? To me that's one of a large number of questions that are
unresolved about how we'd do this. You can make a case for either
choice in quite a number of places.

regards, tom lane


From: "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:06:53
Message-ID: E1539E0ED7043848906A8FF995BDA57901A34ECC@m0143.s-mxs.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


> > I don't think we should expose the offset to user view at all - this
is
> > just for internal use, no?
>
> The thing is, physical index numbers has meaning, the logical index
> number does not. In a view definition we're going to store the
physical
> index, not the logical one, for example. We don't want rearranging
> columns to invalidate view definitions or plans.

I think we lack a definition here:

logical number: the order of columns when doing select *
physical number: the position inside the heap tuple (maybe with
offset)

All views and plans and index definitions and most everyting else
needs to reference the logical number.

Andreas


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:07:20
Message-ID: 20061221160720.GE14992@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, Dec 21, 2006 at 10:50:59AM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > The thing is, physical index numbers has meaning, the logical index
> > number does not. In a view definition we're going to store the physical
> > index, not the logical one, for example.
>
> Really? To me that's one of a large number of questions that are
> unresolved about how we'd do this. You can make a case for either
> choice in quite a number of places.

Can we? For anything of any permenence (view definitions, rules,
compiled functions, plans, etc) you're going to want the physical
number, for the same reason we store the oids of functions and tables.

I can't see the optimiser or executor caring about logical numbers
either. The planner would use it only when looking up column names.

The logical number isn't going to be used much I think. You can go from
column name to physical index directly, without ever looking up the
logical index. That's why I'm suggesting adding some large constant to
the logical numbers, since they're going to be less used in general.

Where do you think we have the choice?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:11:46
Message-ID: 20061221161146.GF14992@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, Dec 21, 2006 at 05:06:53PM +0100, Zeugswetter Andreas ADI SD wrote:
> > The thing is, physical index numbers has meaning, the logical index
> > number does not. In a view definition we're going to store the
> > physical index, not the logical one, for example. We don't want
> > rearranging columns to invalidate view definitions or plans.
>
> I think we lack a definition here:
>
> logical number: the order of columns when doing select *
> physical number: the position inside the heap tuple (maybe with
> offset)
>
> All views and plans and index definitions and most everyting else
> needs to reference the logical number.

Huh? If I have an index on the first two columns of a table, it's going
to refernce columns 1 and 2.

If you alter the table to put a column in front of those two, the new
column will be physical 3, logical 1.

If the index references logical numbers, the index has just been
broken. If the index references physical numbers, everything works
without changes.

Same with views, if you use logical numbers you have to rebuild the
view each time. Why bother, when physical numbers work and don't have
that problem?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:15:38
Message-ID: 20099.1166717738@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> On Thu, Dec 21, 2006 at 10:50:59AM -0500, Tom Lane wrote:
>> Really? To me that's one of a large number of questions that are
>> unresolved about how we'd do this. You can make a case for either
>> choice in quite a number of places.

> Can we? For anything of any permenence (view definitions, rules,
> compiled functions, plans, etc) you're going to want the physical
> number, for the same reason we store the oids of functions and tables.

Not if we intend to rearrange the physical numbers during column
add/drop to provide better packing.

You could make a case that we need *three* numbers: a permanent column
ID, a display position, and a storage position.

regards, tom lane


From: "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:18:02
Message-ID: E1539E0ED7043848906A8FF995BDA57901A34ECD@m0143.s-mxs.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


> > > The thing is, physical index numbers has meaning, the logical
index
> > > number does not. In a view definition we're going to store the
> > > physical index, not the logical one, for example. We don't want
> > > rearranging columns to invalidate view definitions or plans.
> >
> > I think we lack a definition here:
> >
> > logical number: the order of columns when doing select *
> > physical number: the position inside the heap tuple (maybe with
> > offset)
> >
> > All views and plans and index definitions and most everyting else
> > needs to reference the logical number.
>
> Huh? If I have an index on the first two columns of a table,
> it's going
> to refernce columns 1 and 2.
>
> If you alter the table to put a column in front of those two, the new
> column will be physical 3, logical 1.

No, you change pg_index to now contain 2,3.

> If the index references logical numbers, the index has just been
> broken. If the index references physical numbers, everything works
> without changes.

yup, sinval

> Same with views, if you use logical numbers you have to rebuild the
> view each time. Why bother, when physical numbers work and don't have
> that problem?

Because it would imho be a nightmare to handle ...

Andreas


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Russell Smith" <mr-russ(at)pws(dot)com(dot)au>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Stephen Frost" <sfrost(at)snowman(dot)net>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch
Date: 2006-12-21 16:26:30
Message-ID: 458A6155.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

>>> On Wed, Dec 20, 2006 at 5:33 AM, in message
<45891FA1(dot)5000902(at)pws(dot)com(dot)au>,
Russell Smith <mr-russ(at)pws(dot)com(dot)au> wrote:
>
> The 8.1 documentation for ALTER TABLE states the following.
>
> Adding a column with a non-null default or changing the type of an
> existing column will require the entire table to be rewritten. This
may
> take a significant amount of time for a large table; and it will
> temporarily require double the disk space.
>
>
> Now, we are rewriting the table from scratch anyway, the on disk
format
> is changing. What is stopping us from switching the column order at
the
> same time. The only thing I can think is that the catalogs will need

> more work to update them. It's a middle sized price to pay for being

> able to reorder the columns in the table. One of the problems I have
is
> wanting to add a column in the middle of the table, but FK
constraints
> stop me dropping the table to do the reorder. If ALTER TABLE would
let
> me stick it in the middle and rewrite the table on disk, I wouldn't
> care. It's likely that I would be rewriting the table anyway. And
by
> specifying AT POSITION, or BEFORE/AFTER you know for big tables it's

> going to take a while.
>
> Not that I'm able to code this at all, but I'm interested in feedback
on
> this option.

+1

Currently, I often have to make the choice between adding a column at
the "logical" place in relation to the other columns or adding it at the
end. The former requires creating a whole new table, populating it with
INSERT/SELECT, dropping the old table, renaming the new table, and
restoring permissions, constraints, indexes, etc. The latter is a
simple ALTER TABLE. When I choose the former, I save significant time
and reduce errors by using pg_dump to generate a lot of the code; but it
should would be a nice feature if ALTER TABLE could do all this "under
the covers".

-Kevin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:27:44
Message-ID: 458AB600.7060401@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
>
>> On Thu, Dec 21, 2006 at 10:50:59AM -0500, Tom Lane wrote:
>>
>>> Really? To me that's one of a large number of questions that are
>>> unresolved about how we'd do this. You can make a case for either
>>> choice in quite a number of places.
>>>
>
>
>> Can we? For anything of any permenence (view definitions, rules,
>> compiled functions, plans, etc) you're going to want the physical
>> number, for the same reason we store the oids of functions and tables.
>>
>
> Not if we intend to rearrange the physical numbers during column
> add/drop to provide better packing.
>
> You could make a case that we need *three* numbers: a permanent column
> ID, a display position, and a storage position.
>
>
>

Could this not be handled by some catalog fixup after an add/drop? If we
get the having 3 numbers you will almost have me convinced that this
might be too complicated after all.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:43:27
Message-ID: 20414.1166719407@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> You could make a case that we need *three* numbers: a permanent column
>> ID, a display position, and a storage position.

> Could this not be handled by some catalog fixup after an add/drop? If we
> get the having 3 numbers you will almost have me convinced that this
> might be too complicated after all.

Actually, the more I think about it the more I think that 3 numbers
might be the answer. 99% of the code would use only the permanent ID.
Display position would be used in *exactly* one place, namely while
expanding "SELECT foo.*" --- I can't think of any other part of the
backend that would care about it. (Obviously, client-side code such
as psql's \d would use it too.) Use of storage position could be
localized into a few low-level tuple access functions, probably.

The problems we've been having with the concept stem precisely from
trying to misuse either display or storage position as a permanent ID.
That's fine as long as it actually is permanent, but as soon as you
want to change it then you have problems. We should all understand
this perfectly well from a database theory standpoint: pg_attribute
has to have a persistent primary key. (attrelid, attnum) is that key,
and we can't go around altering a column's attnum without creating
problems for ourselves.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:47:56
Message-ID: 20061221164756.GG14992@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, Dec 21, 2006 at 11:15:38AM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > Can we? For anything of any permenence (view definitions, rules,
> > compiled functions, plans, etc) you're going to want the physical
> > number, for the same reason we store the oids of functions and tables.
>
> Not if we intend to rearrange the physical numbers during column
> add/drop to provide better packing.

Urk! If that's what people are suggesting, I'd run away very quickly.
Getting better packing during table create is a nice idea, but
preserving it across add/drop column is just... evil.

Run CLUSTER is you want that, I was expecting add/drop to be a simple
catalog change, nothing more.

> You could make a case that we need *three* numbers: a permanent column
> ID, a display position, and a storage position.

That's just way too complicated IMHO. It add's extra levels of
indirection all over the place.

I was envisiging the physical number to be fixed and immutable (ie
storage position = permanent position).

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 16:59:47
Message-ID: 20900.1166720387@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> I was envisiging the physical number to be fixed and immutable (ie
> storage position = permanent position).

There are two different problems being discussed here, and one of them
is insoluble if we take that position: people would like the system to
automatically lay out tables to minimize alignment overhead and access
costs (eg, put fixed-width columns first). This is not the same as
"I would like to change the display column order".

It's true that for an ADD COLUMN that doesn't already force a table
rewrite, forcing one to improve packing is probably bad. My thought
would be that we leave the column storage order alone if we don't have
to rewrite the table ... but any rewriting variant of ALTER TABLE could
optimize the storage order while it was at it.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-21 17:09:46
Message-ID: 458ABFDA.5090407@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Actually, the more I think about it the more I think that 3 numbers
> might be the answer. 99% of the code would use only the permanent ID.
> Display position would be used in *exactly* one place, namely while
> expanding "SELECT foo.*" --- I can't think of any other part of the
> backend that would care about it.

Insert without a column list will need the logical ordering, I think.
Also use of "like foo" in a create table statement. I'm not dead sure
there aren't one or two others lurking. But I agree that the number is
small.

cheers

andrew


From: Gregory Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-22 01:48:49
Message-ID: 8764c4em4e.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Actually, the more I think about it the more I think that 3 numbers
> might be the answer. 99% of the code would use only the permanent ID.

Don't we already have such a permanent number -- just one we don't use
anywhere in the data model? Namely the oid of the pg_attribute entry. It's
actually a bit odd that we don't use it since we use the oid of just about
every other system catalog record as the primary key.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <gsstark(at)mit(dot)edu>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-22 02:28:58
Message-ID: 18938.1166754538@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Gregory Stark <gsstark(at)mit(dot)edu> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> Actually, the more I think about it the more I think that 3 numbers
>> might be the answer. 99% of the code would use only the permanent ID.

> Don't we already have such a permanent number -- just one we don't use
> anywhere in the data model? Namely the oid of the pg_attribute entry.

Nope, because pg_attribute hasn't got OIDs.

regards, tom lane


From: "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-22 09:25:08
Message-ID: E1539E0ED7043848906A8FF995BDA57901A34F65@m0143.s-mxs.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


> >> You could make a case that we need *three* numbers: a permanent
column
> >> ID, a display position, and a storage position.
>
> > Could this not be handled by some catalog fixup after an add/drop?
If we
> > get the having 3 numbers you will almost have me convinced that this

> > might be too complicated after all.
>
> Actually, the more I think about it the more I think that 3 numbers
> might be the answer. 99% of the code would use only the permanent ID.

I am still of the opinion, that the system tables as such are too
visible
to users and addon developers as to change the meaning of attnum.

And I don't quite see what the point is. To alter a table's column you
need
an exclusive lock, and plan invalidation (or are you intending to
invalidate only
plans that reference * ?). Once there you can just as well fix the
numbering.
Yes, it is more work :-(

Andreas


From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2006-12-30 04:46:27
Message-ID: 20061230044626.GF71246@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, Dec 21, 2006 at 11:43:27AM -0500, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > Tom Lane wrote:
> >> You could make a case that we need *three* numbers: a permanent column
> >> ID, a display position, and a storage position.
>
> > Could this not be handled by some catalog fixup after an add/drop? If we
> > get the having 3 numbers you will almost have me convinced that this
> > might be too complicated after all.
>
> Actually, the more I think about it the more I think that 3 numbers
> might be the answer. 99% of the code would use only the permanent ID.
> Display position would be used in *exactly* one place, namely while
> expanding "SELECT foo.*" --- I can't think of any other part of the
> backend that would care about it. (Obviously, client-side code such
> as psql's \d would use it too.) Use of storage position could be
> localized into a few low-level tuple access functions, probably.
>
> The problems we've been having with the concept stem precisely from
> trying to misuse either display or storage position as a permanent ID.
> That's fine as long as it actually is permanent, but as soon as you
> want to change it then you have problems. We should all understand
> this perfectly well from a database theory standpoint: pg_attribute
> has to have a persistent primary key. (attrelid, attnum) is that key,
> and we can't go around altering a column's attnum without creating
> problems for ourselves.

Is there enough consensus on this to add it to the TODO?
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2007-02-02 03:50:12
Message-ID: 200702020350.l123oCx13234@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Where are we on this?

---------------------------------------------------------------------------

Tom Dunstan wrote:
> Hi all
>
> Here is an updated version of the enums patch. It has been brought up to
> date and applies against current CVS HEAD. The original email is at [1],
> and describes the implementation.
>
> This version contains sgml documentation, and contains the missing
> bounds checks and error codes noted in the last email.
>
> As mentioned before, the only part that I'm not super keen on is the
> hack required to pass the type oid in to the text-to-enum cast function,
> since normally those take type mods but not type oids. I wasn't sure how
> else to get a cast working though, short of allowing another type of
> cast function which accepts type oids as well. That seemed overkill for
> just one case, though, and was getting a bit beyond the realms of what I
> wanted to get done with this patch.
>
> Anyway, it's reasonably feature complete and should be appropriately
> documented now, so feedback gratefully accepted.
>
> Cheers
>
> Tom
>
> [1] http://archives.postgresql.org/pgsql-patches/2006-09/msg00000.php

[ application/x-gzip is not supported, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: column ordering, was Re: [PATCHES] Enums patch v2
Date: 2007-02-02 05:43:04
Message-ID: 200702020543.l125h4o10529@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Added to TODO:

> o Allow column display reordering by recording a display,
> storage, and permanent id for every column?
>
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00782.php
>

---------------------------------------------------------------------------

Jim C. Nasby wrote:
> On Thu, Dec 21, 2006 at 11:43:27AM -0500, Tom Lane wrote:
> > Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > > Tom Lane wrote:
> > >> You could make a case that we need *three* numbers: a permanent column
> > >> ID, a display position, and a storage position.
> >
> > > Could this not be handled by some catalog fixup after an add/drop? If we
> > > get the having 3 numbers you will almost have me convinced that this
> > > might be too complicated after all.
> >
> > Actually, the more I think about it the more I think that 3 numbers
> > might be the answer. 99% of the code would use only the permanent ID.
> > Display position would be used in *exactly* one place, namely while
> > expanding "SELECT foo.*" --- I can't think of any other part of the
> > backend that would care about it. (Obviously, client-side code such
> > as psql's \d would use it too.) Use of storage position could be
> > localized into a few low-level tuple access functions, probably.
> >
> > The problems we've been having with the concept stem precisely from
> > trying to misuse either display or storage position as a permanent ID.
> > That's fine as long as it actually is permanent, but as soon as you
> > want to change it then you have problems. We should all understand
> > this perfectly well from a database theory standpoint: pg_attribute
> > has to have a persistent primary key. (attrelid, attnum) is that key,
> > and we can't go around altering a column's attnum without creating
> > problems for ourselves.
>
> Is there enough consensus on this to add it to the TODO?
> --
> Jim Nasby jim(at)nasby(dot)net
> EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Neil Conway <neilc(at)samurai(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2007-02-02 15:54:44
Message-ID: 1170431684.5447.59.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, 2007-02-01 at 22:50 -0500, Bruce Momjian wrote:
> Where are we on this?

I can commit to reviewing this. The patch looked fairly solid on a quick
glance through, but I won't have the cycles to review it properly for a
week or two.

-Neil


From: Tom Dunstan <pgsql(at)tomd(dot)cc>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2007-02-14 00:03:47
Message-ID: 45D251E3.7040908@tomd.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Neil Conway wrote:
> On Thu, 2007-02-01 at 22:50 -0500, Bruce Momjian wrote:
>> Where are we on this?
>
> I can commit to reviewing this. The patch looked fairly solid on a quick
> glance through, but I won't have the cycles to review it properly for a
> week or two.

I've brought this up to date with the operator family stuff now, and the
attached patch once more applies cleanly against HEAD. Hopefully that'll
make life a little easier when reviewing it. Thanks.

I got sick of manually shifting the new OID values every time someone
had added something new to the catalogs, so I moved all of my OIDs up to
the 9000 range. Attached is a hacky bash script which can move them back
to something sane. The idea is to run unused_oids first, see where the
main block of unused OIDs starts, and then run shift_oids on the
(unzipped) patch file before applying the patch. We discussed something
similar a while ago, but no-one ever got around to implementing the script.

I've attached the script and left the OIDs in my patch in the upper
range rather than just running it myself before submitting the patch as
I reckon that this might be a useful convention for authors of patches
which add a non-trivial number of OIDs to the catalogs. If there's
agreement then anyone can feel free to commit the script to CVS or put
it on the wiki or whatever.

Cheers

Tom

Attachment Content-Type Size
enums.patch.gz application/x-gzip 23.7 KB
shift_oids text/plain 823 bytes

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>
Cc: Neil Conway <neilc(at)samurai(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Enums patch v2
Date: 2007-02-21 17:06:10
Message-ID: 200702211706.l1LH6Av24112@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


I am putting this in the patches queue so it is not lost. I believe
Neil is working applying this.

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Tom Dunstan wrote:
> Neil Conway wrote:
> > On Thu, 2007-02-01 at 22:50 -0500, Bruce Momjian wrote:
> >> Where are we on this?
> >
> > I can commit to reviewing this. The patch looked fairly solid on a quick
> > glance through, but I won't have the cycles to review it properly for a
> > week or two.
>
> I've brought this up to date with the operator family stuff now, and the
> attached patch once more applies cleanly against HEAD. Hopefully that'll
> make life a little easier when reviewing it. Thanks.
>
> I got sick of manually shifting the new OID values every time someone
> had added something new to the catalogs, so I moved all of my OIDs up to
> the 9000 range. Attached is a hacky bash script which can move them back
> to something sane. The idea is to run unused_oids first, see where the
> main block of unused OIDs starts, and then run shift_oids on the
> (unzipped) patch file before applying the patch. We discussed something
> similar a while ago, but no-one ever got around to implementing the script.
>
> I've attached the script and left the OIDs in my patch in the upper
> range rather than just running it myself before submitting the patch as
> I reckon that this might be a useful convention for authors of patches
> which add a non-trivial number of OIDs to the catalogs. If there's
> agreement then anyone can feel free to commit the script to CVS or put
> it on the wiki or whatever.
>
> Cheers
>
> Tom

[ application/x-gzip is not supported, skipping... ]

> #!/bin/sh
>
> start=$1
> end=$2
> new_start=$3
> filename=$4
>
> if [ -z "$filename" ] ; then
> echo Usage: $0 start-oid end-oid new-start-oid filename
> exit 1
> fi
>
> if [ ! -f $filename ] ; then
> echo $0: $filename is not a file
> exit 1
> fi
>
> if [ $end -le $start ] ; then
> echo $0: End of OID range must be greater than or equal to the start
> exit 1
> fi
>
> start_len=`echo -n $start | wc -c`
> end_len=`echo -n $end | wc -c`
>
> if [ $start_len -ne 4 -o $end_len -ne 4 ] ; then
> echo $0: Source OID range must have 4 digits
> exit 1
> fi
>
> let new_end=$new_start+$end-$start
> if [ $start -le $new_start -a $end -ge $new_start -o $new_start -le $start -a $new_end -ge $start ] ; then
> echo $0: OID ranges may not overlap
> exit 1
> fi
>
>
> i=$start
> j=$new_start
> while [ $i -le $end ] ; do
> #echo $i $j
> sed -i "s/$i/$j/g" $filename
> let i=i+1
> let j=j+1
> done

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +