Re: t_self as system column

Lists: pgsql-hackers
From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: t_self as system column
Date: 2010-07-05 17:40:18
Message-ID: 1278351547-sup-2819@alvh.no-ip.org

Is there a reason we don't have t_self as one of the system columns that
you can examine from SQL? I'd propose its addition otherwise.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-05 18:08:07
Message-ID: 19647.1278353287@sss.pgh.pa.us

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> Is there a reason we don't have t_self as one of the system columns that
> you can examine from SQL? I'd propose its addition otherwise.

pg_attribute bloat? I'm a bit hesitant to add a row per table for
something we've gotten along without for so long, especially something
with as bizarre a definition as "t_self" has got.

At one time I was hoping to get rid of explicit entries in pg_attribute
for system columns, which would negate this concern. I think we're
stuck with them now, though, because of per-column permissions.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-05 19:20:53
Message-ID: AANLkTik9mPTnAHa1L-ZALT4IWkqUV6fGMxDerDlxUN0A@mail.gmail.com

On Mon, Jul 5, 2010 at 2:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
>> Is there a reason we don't have t_self as one of the system columns that
>> you can examine from SQL?  I'd propose its addition otherwise.
>
> pg_attribute bloat?  I'm a bit hesitant to add a row per table for
> something we've gotten along without for so long, especially something
> with as bizarre a definition as "t_self" has got.
>
> At one time I was hoping to get rid of explicit entries in pg_attribute
> for system columns, which would negate this concern.  I think we're
> stuck with them now, though, because of per-column permissions.

Because someone might want to grant per-column permissions on those
columns? That seems like an awfully thin reason to keep all that
bloat around. I bet the number of people who have granted per-column
permissions on, say, cmax can be counted on one hand - possibly with
five fingers left over.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-05 19:26:54
Message-ID: 7444.1278358014@sss.pgh.pa.us

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Jul 5, 2010 at 2:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> At one time I was hoping to get rid of explicit entries in pg_attribute
>> for system columns, which would negate this concern. I think we're
>> stuck with them now, though, because of per-column permissions.

> Because someone might want to grant per-column permissions on those
> columns? That seems like an awfully thin reason to keep all that
> bloat around. I bet the number of people who have granted per-column
> permissions on, say, cmax can be counted on one hand - possibly with
> five fingers left over.

I'd agree with that argument for the most part, but I'm not entirely
sure about oid, which has some characteristics of a user-data column.

(OTOH, maybe we could allow just oid to retain an explicit pg_attribute
entry... could be messy though.)

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-06 14:08:21
Message-ID: AANLkTin7ab3GOqViwgR3TlyearhRSDujR9OdyStrduBk@mail.gmail.com

On Mon, Jul 5, 2010 at 3:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, Jul 5, 2010 at 2:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> At one time I was hoping to get rid of explicit entries in pg_attribute
>>> for system columns, which would negate this concern.  I think we're
>>> stuck with them now, though, because of per-column permissions.
>
>> Because someone might want to grant per-column permissions on those
>> columns?  That seems like an awfully thin reason to keep all that
>> bloat around.  I bet the number of people who have granted per-column
>> permissions on, say, cmax can be counted on one hand - possibly with
>> five fingers left over.
>
> I'd agree with that argument for the most part, but I'm not entirely
> sure about oid, which has some characteristics of a user-data column.
>
> (OTOH, maybe we could allow just oid to retain an explicit pg_attribute
> entry... could be messy though.)

[woops, forgot to reply on-list]

Treating OID as a user-defined column seems reasonable, and probably
not even that messy if we put some appropriate macros in place. I'm
guessing the messy part would be finding all the places that expect to
be consulting a real pg_attribute row and supplying them with a
faked-up one in its place.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-06 18:49:37
Message-ID: 1278442093-sup-6629@alvh.no-ip.org

Excerpts from Robert Haas's message of Tue Jul 06 10:08:21 -0400 2010:

> Treating OID as a user-defined column seems reasonable, and probably
> not even that messy if we put some appropriate macros in place. I'm
> guessing the messy part would be finding all the places that expect to
> be consulting a real pg_attribute row and supplying them with a
> faked-up one in its place.

Agreed.

I'm intending to work on logical column identifiers for 9.1. Perhaps I
could try to have a look at this, too, while at it.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-06 20:29:53
Message-ID: AANLkTinruZWVLvk7g5-HogSN3ZA-QluwXW__sJxbT02I@mail.gmail.com

On Tue, Jul 6, 2010 at 2:49 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Robert Haas's message of Tue Jul 06 10:08:21 -0400 2010:
>
>> Treating OID as a user-defined column seems reasonable, and probably
>> not even that messy if we put some appropriate macros in place.  I'm
>> guessing the messy part would be finding all the places that expect to
>> be consulting a real pg_attribute row and supplying them with a
>> faked-up one in its place.
>
> Agreed.
>
> I'm intending to work on logical column identifiers for 9.1.  Perhaps I
> could try to have a look at this, too, while at it.

I have a strong suspicion that's going to be a, ahem, challenging
project. But it would be great to have. Getting rid of the system
column entries from pg_attribute is probably easy by comparison.

When we discussed this previously, Tom suggested that we might want to
have a three-tiered structure: (1) permanent identifier (never
changes, used by other system catalogs to reference the attribute in
question), (2) display position, and (3) physical storage position.
I'm not sure if it's feasible to think about splitting out (2) and (3)
in a single patch, but either one would be useful by itself. Which
are you planning to work on?
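
A minimal sketch of the three-tiered structure described above, in Python purely for illustration. The field names `attid`, `attlognum`, and `attphysnum` are hypothetical labels for the three roles, not existing pg_attribute columns, and the sample table is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    attid: int       # (1) permanent identifier, never changes (hypothetical name)
    attlognum: int   # (2) display position (hypothetical name)
    attphysnum: int  # (3) physical storage position (hypothetical name)


def display_order(attrs):
    """Column names in the order a client would see them."""
    return [a.name for a in sorted(attrs, key=lambda a: a.attlognum)]


def storage_order(attrs):
    """Column names in the order they would be laid out in a tuple."""
    return [a.name for a in sorted(attrs, key=lambda a: a.attphysnum)]


def by_identifier(attrs, attid):
    """Look a column up the way another catalog would reference it."""
    return next(a for a in attrs if a.attid == attid)


# Made-up table: the three numberings deliberately disagree.
columns = [
    Attribute("id", attid=1, attlognum=1, attphysnum=2),
    Attribute("note", attid=2, attlognum=3, attphysnum=3),
    Attribute("flag", attid=3, attlognum=2, attphysnum=1),
]
```

Here `display_order(columns)` and `storage_order(columns)` give different orderings, while `by_identifier(columns, 3)` keeps resolving to `flag` no matter how the other two numbers are changed.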

One other thought for you to mull over. Currently, we can never
really totally get rid of an attribute because it would leave us at a
loss as to how to interpret the tuples already on disk - we can't cope
with HeapTupleHeaderGetNatts ever decreasing. But maybe instead of
storing natts in the tuple header, we could store a "tuple version
number". Whenever a column is added or dropped, physical storage
layout is changed, etc., we bump the tuple version number but retain
the information necessary to interpret old tuple versions. After
CLUSTER or VACUUM FULL, we can forget about all the old tuple
versions. A regular VACUUM can, if it visits the entire table, forget
about all tuple versions other than the latest which are observed not
to be in use. It's a bit awkward if you go to make a change and
discover that all 2047 possible tuple versions are in use, because now
you have to force a table rewrite for an operation that doesn't
normally require one. But in the immortal words of Bill Gates, 640K
ought to be enough for anybody.
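
A rough model of the tuple version number idea, in Python and under stated assumptions: the class and method names are hypothetical, the layouts are just tuples of column names, and reusing a freed version number is an assumption of this sketch rather than something proposed above.

```python
class TupleVersionRegistry:
    """Maps tuple version numbers to the column layout they were written with."""

    def __init__(self, columns, max_versions=2047):
        # 2047 is the figure quoted above; here it is a plain parameter.
        self.max_versions = max_versions
        self.layouts = {1: tuple(columns)}
        self.current = 1

    def alter(self, new_columns):
        """Register a new layout (column added, dropped, or reordered)."""
        free = next(
            (v for v in range(1, self.max_versions + 1) if v not in self.layouts),
            None,
        )
        if free is None:
            # Every version number is still in use: a table rewrite is needed.
            raise RuntimeError("all tuple versions in use; table rewrite needed")
        self.layouts[free] = tuple(new_columns)
        self.current = free
        return free

    def interpret(self, version, values):
        """Decode stored values using the layout recorded for that version."""
        return dict(zip(self.layouts[version], values))

    def rewrite(self):
        """CLUSTER or VACUUM FULL: only the current layout remains in use."""
        self.layouts = {self.current: self.layouts[self.current]}

    def vacuum(self, versions_seen):
        """Whole-table VACUUM: forget layouts that no remaining tuple uses."""
        keep = set(versions_seen) | {self.current}
        self.layouts = {v: c for v, c in self.layouts.items() if v in keep}
```

The awkward case mentioned above corresponds to `alter` raising once every number up to `max_versions` is taken, at which point `rewrite` is the only way to free any.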

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-06 21:18:37
Message-ID: 4C339DAD.2090808@dunslane.net

Robert Haas wrote:
> On Tue, Jul 6, 2010 at 2:49 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
>
>>
>> I'm intending to work on logical column identifiers for 9.1. Perhaps I
>> could try to have a look at this, too, while at it.
>>
>
> I have a strong suspicion that's going to be a, ahem, challenging
> project. But it would be great to have. Getting rid of the system
> column entries from pg_attribute is probably easy by comparison.
>

It will be a bit invasive, but I'm not so sure that it's difficult, just
a mass of details to take care of. Like you I'd be very glad to see it done.

> When we discussed this previously, Tom suggested that we might want to
> have a three-tiered structure: (1) permanent identifier (never
> changes, used by other system catalogs to reference the attribute in
> question), (2) display position, and (3) physical storage position.
> I'm not sure if it's feasible to think about splitting out (2) and (3)
> in a single patch, but either one would be useful by itself. Which
> are you planning to work on?
>

Why wouldn't it be feasible? In any case, having a mutable logical
column position is the feature that's been most requested.

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-06 21:24:21
Message-ID: AANLkTim6nwSVPbA9_7m_xv4Vy5C_ehWfRAJz3o9_0AAu@mail.gmail.com

On Tue, Jul 6, 2010 at 5:18 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> I have a strong suspicion that's going to be a, ahem, challenging
>> project.  But it would be great to have.  Getting rid of the system
>> column entries from pg_attribute is probably easy by comparison.
>
> It will be a bit invasive, but I'm not so sure that it's difficult, just a
> mass of details to take care of. Like you I'd be very glad to see it done.

I guess we'll find out...!

>> When we discussed this previously, Tom suggested that we might want to
>> have a three-tiered structure: (1) permanent identifier (never
>> changes, used by other system catalogs to reference the attribute in
>> question), (2) display position, and (3) physical storage position.
>> I'm not sure if it's feasible to think about splitting out (2) and (3)
>> in a single patch, but either one would be useful by itself.  Which
>> are you planning to work on?
>
> Why wouldn't it be feasible?

Just because it might be too much to do all at once.

> In any case, having a mutable logical column
> position is the feature that's been most requested.

I think that's true. But the physical storage position would give us
a performance benefit, by allowing us to try to avoid useless
alignment padding.
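
To make the padding point concrete, here is a small Python sketch that totals the bytes needed for a run of fixed-width columns in a given order. It is a simplified model: it ignores the tuple header, variable-length columns, and any trailing padding, and the (length, alignment) pairs are assumptions matching common 64-bit platforms.

```python
def padded_size(columns):
    """Bytes needed to store fixed-width columns in the given order.

    Each column is a (length, alignment) pair; a column starts at the
    next offset that is a multiple of its alignment.
    """
    offset = 0
    for length, align in columns:
        offset = (offset + align - 1) // align * align  # pad up to alignment
        offset += length
    return offset


# Assumed (length, alignment) pairs for three fixed-width types.
BOOL = (1, 1)
INT4 = (4, 4)
INT8 = (8, 8)

# Order as the user declared the columns.
declared = [BOOL, INT8, BOOL, INT4]

# One possible storage order: widest alignment first.
reordered = sorted(declared, key=lambda col: col[1], reverse=True)
```

With these assumptions, `padded_size(declared)` is 24 bytes and `padded_size(reordered)` is 14, which is the kind of saving a separate physical storage position would make possible without changing what the user sees.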

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-06 21:32:39
Message-ID: 14669.1278451959@sss.pgh.pa.us

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Jul 6, 2010 at 5:18 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> Why wouldn't it be feasible?

> Just because it might be too much to do all at once.

My thought is that the hardest part of this is going to be making sure
that every "column index" usage in the code is properly categorized as
to whether it's physical, logical, or identifier index. If we try to
divide the problem into sub-patches, that will probably just increase
the amount of effort because all that code will have to be looked at
twice.

Think of it as Polya's paradox in action.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-07 17:29:04
Message-ID: 1278523544-sup-4241@alvh.no-ip.org

Excerpts from Robert Haas's message of Tue Jul 06 17:24:21 -0400 2010:
> On Tue, Jul 6, 2010 at 5:18 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> > In any case, having a mutable logical column
> > position is the feature that's been most requested.
>
> I think that's true. But the physical storage position would give us
> a performance benefit, by allowing us to try to avoid useless
> alignment padding.

That's true too. I intend to look at both problems simultaneously, i.e.
decoupling the current attnum into three columns as previously discussed;
as Tom says, I think it'll end up being less work than attacking them
separately. However, I will not attempt to include optimizations such
as avoiding padding in the first patch, just leave open the possibility
of adding them later.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: t_self as system column
Date: 2010-07-07 20:44:10
Message-ID: AANLkTimvi1QvlLSeUfmI_dcqPplF3bKMXZ3VfotVJFtL@mail.gmail.com

On Wed, Jul 7, 2010 at 1:29 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Robert Haas's message of Tue Jul 06 17:24:21 -0400 2010:
>> On Tue, Jul 6, 2010 at 5:18 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
>> > In any case, having a mutable logical column
>> > position is the feature that's been most requested.
>>
>> I think that's true.  But the physical storage position would give us
>> a performance benefit, by allowing us to try to avoid useless
>> alignment padding.
>
> That's true too.  I intend to look at both problems simultaneously, i.e.
> decoupling the current attnum into three columns as previously discussed;
> as Tom says, I think it'll end up being less work than attacking them
> separately.  However, I will not attempt to include optimizations such
> as avoiding padding in the first patch, just leave open the possibility
> of adding them later.

Sounds great.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company