Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jameison Martin <jameisonb(at)yahoo(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-24 20:17:28
Message-ID: 24266.1359058648@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jameison Martin <jameisonb(at)yahoo(dot)com> writes:
> there have been a lot of different threads on this patch. i'm going to take a stab at teasing them out, summarizing them, and addressing them individually.

Thanks for reviewing the bidding so carefully.

> 4.3) does it violate a principle in the code (Tom's latest email)

> from my perspective, the code has to deal with the fact that natts on the persistent row is different than the tupdesc already, this is the foundation upon which ALTER TABLE ADD NULL and DROP COLUMN as a metadata operation are built. so i believe that this patch strengthens the code's ability to handle the ALTER TABLE ADD NULL case by generalizing it: now the code will more frequently need to deal with the disparity between natts and tupdesc rather than only after someone did an ALTER TABLE. i'm an advocate of making corner cases go through generalized code where possible.

I think we're already pretty convinced that that case works, since ALTER
TABLE ADD COLUMN has been around for a very long time.

> however, in all honestly i hadn't consider Tom's point that the patch has created a situation where natts on a row can deviate from the tupdesc that it was built with (as opposed to the current tupdesc which it could always deviate due to a subsequent ALTER TABLE). this is a subtle point and i don't think i have enough experience in the Postgres code to argue one way or another about it. so i'll have to leave that determination up to people that are more familiar with the code.

To be a bit more concrete, the thing that is worrying me is that the
executor frequently transforms tuples between "virtual" and HeapTuple
formats (look at the TupleTableSlot infrastructure, junk filters, and
tuplesort/tuplestore, for example). Up to now such transformation could
be counted on not to affect the apparent number of columns in the tuple;
but with this patch, a tuple materialized as a HeapTuple might well show
an natts smaller than what you'd conclude from looking at it in virtual
slot format. Now it's surely true that any code that's broken by that
would be broken anyway if it were fed a tuple direct-from-disk from a
relation that had suffered a later ADD COLUMN, but there are lots of
code paths where raw disk tuples would never appear. So IMO there's a
pretty fair chance of exposing latent bugs with this.

As an example that this type of concern isn't hypothetical, and that
identifying such bugs can be very difficult, see commit 039680aff.
That bug had been latent for years before it was exposed by an unrelated
change, and it took several more years to track it down.

As I said, I'd be willing to take this risk if the patch showed more
attractive performance benefits ... but it still seems mighty marginal
from here.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2013-01-24 20:35:24 Re: gistchoose vs. bloat
Previous Message Kevin Grittner 2013-01-24 20:14:15 Re: Materialized views WIP patch