Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

Lists: pgsql-hackers
From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>,"Amit Kapila" <amit(dot)kapila(at)huawei(dot)com>
Cc: robertmhaas(at)gmail(dot)com,josh(at)agliodbs(dot)com,pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-24 14:28:10
Message-ID: 20121224142811.144680@gmx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:

> Not really sure about the 100s of columns use case.
>
> But showing gain in useful places in these more common cases wins
> my vote.
>
> Thanks for testing. Barring objections, will commit.

Do we have any results on just a plain, old stock pgbench run, with
the default table definitions?

That would be a reassuring complement to the other tests.

-Kevin


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>
Cc: <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-24 14:49:30
Message-ID: 00b501cde1e5$e19fd080$a4df7180$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, December 24, 2012 7:58 PM Kevin Grittner wrote:
> Simon Riggs wrote:
>
> > Not really sure about the 100s of columns use case.
> >
> > But showing gain in useful places in these more common cases wins
> > my vote.
> >
> > Thanks for testing. Barring objections, will commit.
>
> Do we have any results on just a plain, old stock pgbench run, with
> the default table definitions?
>
> That would be a reassuring complement to the other tests.

Shall run pgbench tpc_b with scalefactor - 50 for 10 mins with/without patch and will post the results.
Kindly let me know if you feel any other test needs to be run.

With Regards,
Amit Kapila.


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>
Cc: <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-24 16:57:21
Message-ID: 00bc01cde1f7$be4b4cb0$3ae1e610$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, December 24, 2012 7:58 PM Kevin Grittner wrote:
> Simon Riggs wrote:
>
> > Not really sure about the 100s of columns use case.
> >
> > But showing gain in useful places in these more common cases wins
> > my vote.
> >
> > Thanks for testing. Barring objections, will commit.
>
> Do we have any results on just a plain, old stock pgbench run, with
> the default table definitions?
>
> That would be a reassuring complement to the other tests.

Sever Configuration:
The database cluster will be initialized with locales
COLLATE: C
CTYPE: C
MESSAGES: en_US.UTF-8
MONETARY: en_US.UTF-8
NUMERIC: en_US.UTF-8
TIME: en_US.UTF-8

shared_buffers = 1GB
checkpoint_segments = 255
checkpoint_timeout = 15min

pgbench:
transaction type: TPC-B (sort of)
scaling factor: 75
query mode: simple
number of clients: 8
number of threads: 8
duration: 600 s

Performance: Average of 3 runs of pgbench in tps
9.3devel | with trailing null patch
----------+--------------------------
578.9872 | 573.4980

With Regards,
Amit Kapila.


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-09 11:22:06
Message-ID: CA+U5nMJgbpT67cft5FyCcyz2LUV5Zv+doo6yip7e8cs0r71X0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 December 2012 16:57, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:

> Performance: Average of 3 runs of pgbench in tps
> 9.3devel | with trailing null patch
> ----------+--------------------------
> 578.9872 | 573.4980

On balance, it would seem optimizing for this special case would
affect everybody negatively; not much, but enough. Which means we
should rekect this patch.

Do you have a reason why a different view might be taken?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>
Cc: "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-09 12:06:14
Message-ID: 00b301cdee61$b9994340$2ccbc9c0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday, January 09, 2013 4:52 PM Simon Riggs wrote:
> On 24 December 2012 16:57, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>
> > Performance: Average of 3 runs of pgbench in tps
> > 9.3devel | with trailing null patch
> > ----------+--------------------------
> > 578.9872 | 573.4980
>
> On balance, it would seem optimizing for this special case would
> affect everybody negatively; not much, but enough. Which means we
> should rekect this patch.
>
> Do you have a reason why a different view might be taken?

I have tried to dig why this gap is coming. I have observed that there is
very less change in normal path.
I wanted to give it some more time to exactly find if something can be done
to avoid performance dip in normal execution.

Right now I am busy in certain other work. But definitely in coming week or
so, I shall spare time to work on it again.

With Regards,
Amit Kapila.


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-09 12:14:34
Message-ID: CA+U5nMJgHY9=ZA88cH41bYaiH1DOcmF5SymYEe6+HrFEXCy_iw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 9 January 2013 12:06, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> On Wednesday, January 09, 2013 4:52 PM Simon Riggs wrote:
>> On 24 December 2012 16:57, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>>
>> > Performance: Average of 3 runs of pgbench in tps
>> > 9.3devel | with trailing null patch
>> > ----------+--------------------------
>> > 578.9872 | 573.4980
>>
>> On balance, it would seem optimizing for this special case would
>> affect everybody negatively; not much, but enough. Which means we
>> should rekect this patch.
>>
>> Do you have a reason why a different view might be taken?
>
> I have tried to dig why this gap is coming. I have observed that there is
> very less change in normal path.
> I wanted to give it some more time to exactly find if something can be done
> to avoid performance dip in normal execution.
>
> Right now I am busy in certain other work. But definitely in coming week or
> so, I shall spare time to work on it again.

Perhaps. Not every idea produces useful outcomes. Even after your
excellent research, it appears we haven't made this work yet. It's a
shame. Should we invest more time? It's considered rude to advise
others how to spend their time, but let me say this: we simply don't
have enough time to do everything and we need to be selective,
prioritising our time on to the things that look to give the best
benefit.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>
Cc: "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-10 09:25:17
Message-ID: 006b01cdef14$67e11100$37a33300$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday, January 09, 2013 5:45 PM Simon Riggs wrote:
> On 9 January 2013 12:06, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> > On Wednesday, January 09, 2013 4:52 PM Simon Riggs wrote:
> >> On 24 December 2012 16:57, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
> wrote:
> >>
> >> > Performance: Average of 3 runs of pgbench in tps
> >> > 9.3devel | with trailing null patch
> >> > ----------+--------------------------
> >> > 578.9872 | 573.4980
> >>
> >> On balance, it would seem optimizing for this special case would
> >> affect everybody negatively; not much, but enough. Which means we
> >> should rekect this patch.
> >>
> >> Do you have a reason why a different view might be taken?
> >
> > I have tried to dig why this gap is coming. I have observed that
> there is
> > very less change in normal path.
> > I wanted to give it some more time to exactly find if something can
> be done
> > to avoid performance dip in normal execution.
> >
> > Right now I am busy in certain other work. But definitely in coming
> week or
> > so, I shall spare time to work on it again.
>
> Perhaps. Not every idea produces useful outcomes. Even after your
> excellent research, it appears we haven't made this work yet. It's a
> shame. Should we invest more time? It's considered rude to advise
> others how to spend their time, but let me say this: we simply don't
> have enough time to do everything and we need to be selective,
> prioritising our time on to the things that look to give the best
> benefit.

I think we can reject this patch.

With Regards,
Amit Kapila.


From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, jameisonb(at)yahoo(dot)com
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-24 02:12:05
Message-ID: 20130124021205.GB29954@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 09, 2013 at 11:22:06AM +0000, Simon Riggs wrote:
> On 24 December 2012 16:57, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> > Performance: Average of 3 runs of pgbench in tps
> > 9.3devel | with trailing null patch
> > ----------+--------------------------
> > 578.9872 | 573.4980
>
> On balance, it would seem optimizing for this special case would
> affect everybody negatively; not much, but enough. Which means we
> should rekect this patch.
>
> Do you have a reason why a different view might be taken?

We've mostly ignored performance changes of a few percentage points, because
the global consequences of adding or removing code to the binary image can
easily change performance that much. It would be great to get to the point
where we can reason about 1% performance changes, but we're not there. If a
measured +1% performance change would have yielded a different decision, we
should rethink the decision based on more-robust criteria.

(Most of this was said in initial April 2012 discussion.) This patch is a
tough one, because it will rarely help the most-common workloads. If it can
reduce the tuple natts from 9 to 8, the tuple shrinks by a respectable eight
bytes. But if it reduces natts from 72 to 9, we save nothing. It pays off
nicely for the widest, sparsest tables: say, a table with 1000 columns, all
but ten are NULL, and those non-NULL columns are near the front of the table.
I've never seen such a design, but a few folks seemed to find it credible.

I've failed to come up with a non-arbitrary reason to recommend for or against
this patch, so I profess neutrality on the ultimate issue.

Thanks,
nm


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, jameisonb(at)yahoo(dot)com
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-24 04:13:33
Message-ID: 25975.1359000813@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Wed, Jan 09, 2013 at 11:22:06AM +0000, Simon Riggs wrote:
>> On balance, it would seem optimizing for this special case would
>> affect everybody negatively; not much, but enough. Which means we
>> should rekect this patch.

> I've failed to come up with a non-arbitrary reason to recommend for or against
> this patch, so I profess neutrality on the ultimate issue.

I think if we can't convincingly show an attractive performance gain,
we should reject the patch on the grounds of added complexity. Now,
the amount of added code surely isn't much, but nonetheless this patch
eliminates an invariant that's always been there (ie, that a tuple's
natts matches the tupdesc it was built with). That will at the very
least complicate forensic work, and it might well break third-party
code, or corners of the core system that we haven't noticed yet.

I'd be okay with this patch if it had more impressive performance
results, but as it is I think we're better off without it.

regards, tom lane


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Noah Misch'" <noah(at)leadboat(dot)com>, "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>
Cc: "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, <jameisonb(at)yahoo(dot)com>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-24 10:26:36
Message-ID: 003601cdfa1d$4a8fb520$dfaf1f60$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, January 24, 2013 7:42 AM Noah Misch wrote:
> On Wed, Jan 09, 2013 at 11:22:06AM +0000, Simon Riggs wrote:
> > On 24 December 2012 16:57, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
> wrote:
> > > Performance: Average of 3 runs of pgbench in tps
> > > 9.3devel | with trailing null patch
> > > ----------+--------------------------
> > > 578.9872 | 573.4980
> >
> > On balance, it would seem optimizing for this special case would
> > affect everybody negatively; not much, but enough. Which means we
> > should rekect this patch.
> >
> > Do you have a reason why a different view might be taken?
>
> We've mostly ignored performance changes of a few percentage points,
> because
> the global consequences of adding or removing code to the binary image
> can
> easily change performance that much. It would be great to get to the
> point
> where we can reason about 1% performance changes, but we're not there.
> If a
> measured +1% performance change would have yielded a different
> decision, we
> should rethink the decision based on more-robust criteria.

Today, I had taken one more set of readings of original pg_bench which are
below in this mail. The difference is that this set of readings are on Suse
11 and with Shared buffers - 4G
IMO, the changes in this patch are not causing any regression, however it
gives performance/size reduction
in some of the cases as shown by data in previous mails.
So the point to decide is whether the usecases in which it is giving benefit
are worth to have this Patch?
As Tom had already raised some concern's about the code, I think the Patch
can only have merit if the usecase
makes sense to people.

System Configuration:
Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz) & RAM : 24GB
Operating System: Suse-Linux 11.2 x86_64

Sever Configuration:
The database cluster will be initialized with locales
COLLATE: C
CTYPE: C
MESSAGES: en_US.UTF-8
MONETARY: en_US.UTF-8
NUMERIC: en_US.UTF-8
TIME: en_US.UTF-8

shared_buffers = 4GB
checkpoint_segments = 255
checkpoint_timeout = 10min

pgbench:
transaction type: TPC-B (sort of)
scaling factor: 75
query mode: simple
number of clients: 4
number of threads: 4
duration: 300 s

original patch
Run-1: 311 312
Run-2: 307 302
Run-3: 300 312
Run-4: 285 295
Run-5: 308 305

Avg : 302.2 305.2

With Regards,
Amit Kapila.


From: Jameison Martin <jameisonb(at)yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-24 19:43:59
Message-ID: 1359056639.38918.YahooMailNeo@web160101.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

there have been a lot of different threads on this patch. i'm going to take a stab at teasing them out, summarizing them, and addressing them individually.

here is my categorization of the issues that have been raised about this patch:
1. is there really anyone that would actually benefit from this change?
2. what is the performance impact on more canonical use-cases (e.g. no trailing nulls)?
1. look at the impact on a micro benchmark of insert performance (Tom's original suggestion)
2. look at the impact on pgbench (the recent thread on this)
3. what is statistically relevant when interpreting benchmarks (Noah's thoughtful comments)?
3. what is the impact on the use-case this is meant to address?
1. space impact
2. performance impact
3. is this impact relevant ("attractive" in Tom's latest email)
* is the proposed change itself problematic?
1. is it overly complicated?
2. is there a more generalized approach that would be better (e.g. compress the bitmap)?
3. does it violate a principle in the code (Tom's latest email)?

1) is there really anyone (other than me) that would benefit from this change?

this is a tough one since it is so subjective. i mentioned a particular real world case of mine where this patch as is makes a very significant improvement. others have chimed in with examples where they have seen this in the wild (Josh Berkus and his telemetry example, and Greg Stark mentioned seeing patterns where users have wide schemas and trailing nulls though i'm not sure about the context). that said, it is hard to turn these examples into a generalized notion of how applicable it might be.

i think the primary beneficiaries of this change are enterprise application vendors (though there are clearly other users that would benefit like Josh). enterprise applications often need to deal with customizable fields in their object model. they typically model this with very wide schemas with a lot of generic nullable columns that have no semantic meaning until the customer decides to 'create' a custom field. i honestly suspect that this is actually a fairly common pattern with enterprise applications, though it is hard to say how big of a constituency this may be in the Postgres community. application developers at enterprise software vendors are often very savvy database users and generally know exactly how the database lays out rows and will consequently optimize their use of their wide schemas to minimize the space and performance impact of their custom fields. on other commercial databases trailing nulls are 'free' so enterprise application
developers carefully make use of generic columns designated as custom fields by using the first ones first. they may even 'recycle' columns in crafty ways to ensure they are using early columns as much as possible.

i can give more detail if folks would like to hear more about this.

also, i know there have been some suggestions about different ways to approach this issue of customized columns without without using such wide schemas. i can assure you that they are all even more problematic (generally in terms of performance). we can discuss this further as well. but i guess the main point here isn't whether this is the 'right' approach to application development, but that it is what people in fact are doing. 

in summary, it is difficult to quantify how often Postgres users have wide schemas with trailing nulls, but i believe it might be a fairly common pattern with enterprise software vendors that support customization, and it may even show up a bit in end-user applications as well (e.g. Josh's example).

2)  what is the performance impact on more canonical use-cases (e.g. no trailing nulls)?

a number of people (Tom, Simon, Kevin etc) have expressed a concern that any improvement to the particular use-case this patch is meant to address shouldn't negatively impact the performance of the more canonical use cases (e.g. no trailing nulls). i think there is a broad consensus on this (i agree as well). there are 3 sets of benchmark data addressing this concern. i generated a bunch of numbers doing a very specific micro-benchmark that attempts to isolate insert performance as suggested by Tom. that showed no performance degradation. Amit did some additional benchmarking. and finally there was a recent run of pgbench by Amit that originally showed a 1% degradation and a more recent run with 5 runs of both that showed a 0.2% degradation. [i'm very grateful to Amit for all of the work he has done on behalf of this, i'm sure it was more than he bargained for] 

since the pgbench numbers were the ones that caused the committed patch to be backed out it makes sense to address this at length. i agree with Noah's comments that maybe 1% performance changes shouldn't necessarily be treated as significant because of the inherent variance in test runs (stdev), because of the necessarily stricter and potentially onerous requirements this would likely impose on benchmarking in general (e.g. potentially require the benchmarker to flush the filesystem across runs, isolate the machine from the network, follow some rules for proper statical analysis of results, etc), and because even the changing size of a binary can have and a demonstrable impact some benchmarks. [parenthetically it is probably a good idea for someone to take a stab and formalizing the interpretation of pgbench results (what degree of isolation is required and so forth, thoughts about variance and so forth). if folks think this is a good idea a colleague is
willing to take a stab at this]

nonetheless, it seemed reasonable to me to look into Amit's original numbers more closely. Amit's more recent 5 iteration run number showed a ~0.2% performance degradation with the patch. that is arguably statistically irrelevant but it did seemed fairly repeatable, so a colleague of mine was kind enough to look into it. instead of trying to formally explain or eliminate this degradation he made a slight adjustment to the order of if/else in the patch and and now the patch seems marginally faster in pgbench (though arguably statistically irrelevant). i have attached this latest version of the patch and have included his pgbench numbers below.

so, i personally believe that the patch doesn't have a negative performance impact on the more canonical use-cases. there is a pretty significant amount of benchmarking to show this, and this makes intuitive sense if you examine the patch. so i believe the original reason for backing out the patch has been remedied by more extensive pgbench runs to remove a bit more noise and by a tiny adjustment in the code to make a minuscule performance improvement.

and, just to be clear, i'm not claiming that the patch actually improves the performance of canonical use cases. and i don't believe that anyone suggested that improving the performance of canonical use cases is a necessary requirement for this patch, but i might be mistaken about that.

3) what is the impact on the use-case this is meant to address?

i have produced some numbers that show significant space savings and performance improvements for the real world use-cases that i'm seeking to address. Amit and i have also demonstrated the impact of this change across a number of other scenarios (maybe 10 different scenarios in total).  in the particular real world use-case i'm trying to address -- a table with 800+ nullable columns of which only 50 are set on average -- the impact is  significant. there is a ~20% performance improvement on insert microbenchmarks and, more importantly to me, ~100 bytes saved per row on average. that is certainly relevant to me. given the challenges of achieving high page utilization with the mvcc/vacuuming model it might be argued that 100 bytes per row isn't important but it adds up in my case. 

but of course many people have pointed out that the impact of this change is roughly proportional to the number of trailing nulls we are talking about. this comes back to the original question of whether the benefit is general enough. it is hard for me to assess whether the impact is relevant for other use cases without a clear statement of those other use cases. Josh may have an opinion on his particular use case.

4) is the proposed change itself problematic?
4.1) complexity

the change complexity vs. the original code complexity is pretty subjective. the change basically just uses the number of non-null attributes in the row to format the row header instead of the tuple descriptor's natts. i obviously don't find the change excessively complicated but i have bias. i think the only thing to do is for individuals to look at the patch and see if the code has grown a more complicated or too complicated.

4.2) is there a more generalized approach that would be better (e.g. compress the bitmap)

i think it was Simon that suggested perhaps a more general approach such as compressing the bitmap might be better because it might be more widely applicable. this is certainly a valid point. ironically i took the simplistic approach of truncating the bitmap expressly because of its simplicity, and because i thought there might be a decent chance that a more complicated change might be rejected on the basis of the complexity alone (if nothing else there would be two bitmap formats that would need to be supported assuming no upgrade). that said, it would be interesting to prototype a bitmap compression approach and see how much it complicates the code and how it impacts performance. the added complexity vs. the potential gain of this approach may not make that much sense for my particular use-case, but it may for the community as a whole.

4.3) does it violate a principle in the code (Tom's latest email)

from my perspective, the code has to deal with the fact that natts on the persistent row is different than the tupdesc already, this is the foundation upon which ALTER TABLE ADD NULL and DROP COLUMN as a metadata operation are built. so i believe that this patch strengthens the code's ability to handle the ALTER TABLE ADD NULL case by generalizing it: now the code will more frequently need to deal with the disparity between natts and tupdesc rather than only after someone did an ALTER TABLE. i'm an advocate of making corner cases go through generalized code where possible.

however, in all honestly i hadn't consider Tom's point that the patch has created a situation where natts on a row can deviate from the tupdesc that it was built with (as opposed to the current tupdesc which it could always deviate due to a subsequent ALTER TABLE). this is a subtle point and i don't think i have enough experience in the Postgres code to argue one way or another about it. so i'll have to leave that determination up to people that are more familiar with the code.

thanks for reading all of this.

cheers.

-jamie

here are the performance numbers generated by my colleague against the current version of this patch (which is attached). to me this demonstrates that pgbench is not degraded, which was the reason the patch was backed out.

| | 20130121 | 20130121 |
| | vanilla | Modified patch |
|-------+----------+----------------|
| | 5213.30 | 5168.70 |
| | 5229.80 | 5158.28 |
| | 5183.47 | 5183.01 |
| | 5140.82 | 5217.11 |
| | 5180.93 | 5194.44 |
| | 5169.55 | 5202.02 |
| | 5199.78 | 5197.82 |
| | 5228.76 | 5175.75 |
| | 5187.35 | 5233.96 |
| | 5118.66 | 5221.44 |
| | 5176.20 | 5148.09 |
| | 5215.89 | 5192.26 |
| | 5221.13 | 5126.39 |
| | 5232.44 | 5164.58 |
| | 5188.14 | 5210.91 |
| | 5176.77 | 5200.41 |
| | 5218.18 | 5189.85 |
| | 5207.82 | 5158.01 |
| | 5203.77 | 5144.67 |
| | 5213.74 | 5171.70 |
| | 5154.11 | 5171.24 |
| | 5165.22 | 5174.00 |
| | 5196.96 | 5136.03 |
| | 5166.75 | 5176.05 |
| | 5116.69 | 5188.85 |
| | 5170.82 | 5197.51 |
| | 5124.63 | 5145.85 |
| | 5162.05 | 5190.66 |
| | 5198.82 | 5187.08 |
| | 5155.55 | 5199.11 |
| | 5166.95 | 5195.08 |
| | 5197.23 | 5170.88 |
| | 5145.07 | 5152.88 |
| | 5178.24 | 5170.48 |
| | 5128.73 | 5228.55 |
| | 5201.38 | 5189.90 |
| | 5161.96 | 5163.39 |
| | 5191.68 | 5164.13 |
| | 5193.44 | 5172.01 |
| | 5150.87 | 5140.21 |
| | 5118.95 | 5163.93 |
| | 5184.59 | 5145.37 |
| | 5135.52 | 5183.75 |
| | 5197.49 | 5173.54 |
| | 5186.67 | 5207.20 |
| | 5176.33 | 5183.88 |
| | 5185.09 | 5210.38 |
| | 5124.34 | 5182.11 |
|-------+----------+----------------|
| avg | 5178.50 | 5180.27 |
| stdev | 31.51 | 24.53 |
|-------+----------+----------------| Here is the output of pgbench: transaction type: TPC-B (sort of)
scaling factor: 75
query mode: simple
number of clients: 8
number of threads: 8
duration: 300 s I'm running with the database in /run/shm, and the following changes to
the standard postgresql.conf: shared_buffers = 1024MB
checkpoint_segments = 255
checkpoint_timeout = 15min

________________________________
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>; Amit Kapila <amit(dot)kapila(at)huawei(dot)com>; Kevin Grittner <kgrittn(at)mail(dot)com>; robertmhaas(at)gmail(dot)com; josh(at)agliodbs(dot)com; pgsql-hackers(at)postgresql(dot)org; jameisonb(at)yahoo(dot)com
Sent: Wednesday, January 23, 2013 8:13 PM
Subject: Re: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Wed, Jan 09, 2013 at 11:22:06AM +0000, Simon Riggs wrote:
>> On balance, it would seem optimizing for this special case would
>> affect everybody negatively; not much, but enough. Which means we
>> should rekect this patch.

> I've failed to come up with a non-arbitrary reason to recommend for or against
> this patch, so I profess neutrality on the ultimate issue.

I think if we can't convincingly show an attractive performance gain,
we should reject the patch on the grounds of added complexity.  Now,
the amount of added code surely isn't much, but nonetheless this patch
eliminates an invariant that's always been there (ie, that a tuple's
natts matches the tupdesc it was built with).  That will at the very
least complicate forensic work, and it might well break third-party
code, or corners of the core system that we haven't noticed yet.

I'd be okay with this patch if it had more impressive performance
results, but as it is I think we're better off without it.

            regards, tom lane

Attachment Content-Type Size
1-24-2013-trailing-nulls.patch text/x-patch 28.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jameison Martin <jameisonb(at)yahoo(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-24 20:17:28
Message-ID: 24266.1359058648@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jameison Martin <jameisonb(at)yahoo(dot)com> writes:
> there have been a lot of different threads on this patch. i'm going to take a stab at teasing them out, summarizing them, and addressing them individually.

Thanks for reviewing the bidding so carefully.

> 4.3) does it violate a principle in the code (Tom's latest email)

> from my perspective, the code has to deal with the fact that natts on the persistent row is different than the tupdesc already, this is the foundation upon which ALTER TABLE ADD NULL and DROP COLUMN as a metadata operation are built. so i believe that this patch strengthens the code's ability to handle the ALTER TABLE ADD NULL case by generalizing it: now the code will more frequently need to deal with the disparity between natts and tupdesc rather than only after someone did an ALTER TABLE. i'm an advocate of making corner cases go through generalized code where possible.

I think we're already pretty convinced that that case works, since ALTER
TABLE ADD COLUMN has been around for a very long time.

> however, in all honestly i hadn't consider Tom's point that the patch has created a situation where natts on a row can deviate from the tupdesc that it was built with (as opposed to the current tupdesc which it could always deviate due to a subsequent ALTER TABLE). this is a subtle point and i don't think i have enough experience in the Postgres code to argue one way or another about it. so i'll have to leave that determination up to people that are more familiar with the code.

To be a bit more concrete, the thing that is worrying me is that the
executor frequently transforms tuples between "virtual" and HeapTuple
formats (look at the TupleTableSlot infrastructure, junk filters, and
tuplesort/tuplestore, for example). Up to now such transformation could
be counted on not to affect the apparent number of columns in the tuple;
but with this patch, a tuple materialized as a HeapTuple might well show
an natts smaller than what you'd conclude from looking at it in virtual
slot format. Now it's surely true that any code that's broken by that
would be broken anyway if it were fed a tuple direct-from-disk from a
relation that had suffered a later ADD COLUMN, but there are lots of
code paths where raw disk tuples would never appear. So IMO there's a
pretty fair chance of exposing latent bugs with this.

As an example that this type of concern isn't hypothetical, and that
identifying such bugs can be very difficult, see commit 039680aff.
That bug had been latent for years before it was exposed by an unrelated
change, and it took several more years to track it down.

As I said, I'd be willing to take this risk if the patch showed more
attractive performance benefits ... but it still seems mighty marginal
from here.

regards, tom lane


From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-25 01:35:54
Message-ID: CAM-w4HOiT8actZr_HUeoB7m+ddU91_fZaeLEE-7PKijKTPa+Ew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 24, 2013 at 8:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> As I said, I'd be willing to take this risk if the patch showed more
> attractive performance benefits ... but it still seems mighty marginal
> from here.

I think the benchmarks given so far are actually barking up the wrong
tree though. There are usage patterns that some people do engage in
that involve dozens or hundreds of columns that are mostly NULL. If
they're saving 10-20 bytes per row that's not insignificant. And they
could be saving much more than that.

That said I don't know just how common that usage pattern is. And I'm
not sure how many of those people would be able to arrange for the
null columns to land at the end of the row.

It's a bit frustrating because it does seem like if it had been
written this way to begin with nobody would every have questioned
whether it was a good idea and nobody would ever have suggested
ripping it out.

--
greg


From: Craig Ringer <craig(at)2ndQuadrant(dot)com>
To: Jameison Martin <jameisonb(at)yahoo(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-25 02:01:58
Message-ID: 5101E796.60404@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/25/2013 03:43 AM, Jameison Martin wrote:
> there have been a lot of different threads on this patch. i'm going to
> take a stab at teasing them out, summarizing them, and addressing them
> individually.

Is this patch on the CF app? I can't seem to find it in 2013-01 or
2013-next, and I wanted to add your summary.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-25 02:02:16
Message-ID: 14877.1359079336@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> That said I don't know just how common that usage pattern is. And I'm
> not sure how many of those people would be able to arrange for the
> null columns to land at the end of the row.

Obviously, they can't arrange that all the time (else their trailing
columns are unnecessary). It'd have to be something like ordering the
columns in descending probability of being not-null --- and unless the
later columns *all* have very small probability of being not-null,
you're not gonna win much. Much as I hate to suggest that somebody
consider EAV-like storage, one does have to wonder whether a schema
like that doesn't need redesign more than it needs an implementation
tweak.

> It's a bit frustrating because it does seem like if it had been
> written this way to begin with nobody would every have questioned
> whether it was a good idea and nobody would ever have suggested
> ripping it out.

True, but we'd also have a much higher probability that there aren't
latent bugs because of expectations that t_natts is trustworthy.
It's the cost of getting from here to there that's scaring me.

regards, tom lane


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, "'Jameison Martin'" <jameisonb(at)yahoo(dot)com>
Cc: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Noah Misch'" <noah(at)leadboat(dot)com>, "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>, "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-01-25 03:36:12
Message-ID: 008001cdfaad$203da710$60b8f530$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/25/2013 03:43 AM, Jameison Martin wrote:
> there have been a lot of different threads on this patch. i'm going to
take a stab at > teasing them out, summarizing them, and addressing them
individually.

> Is this patch on the CF app? I can't seem to find it in 2013-01 or
2013-next, and I
> wanted to add your summary.

It is in previous CF (2012-11) in Returned with Feedback section
https://commitfest.postgresql.org/action/commitfest_view?id=16

With Regards,
Amit Kapila.


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Jameison Martin' <jameisonb(at)yahoo(dot)com>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'Noah Misch' <noah(at)leadboat(dot)com>, 'Simon Riggs' <simon(at)2ndQuadrant(dot)com>, 'Kevin Grittner' <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 12:17:51
Message-ID: 20130624121751.GE6471@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-01-25 09:06:12 +0530, Amit Kapila wrote:
> On 01/25/2013 03:43 AM, Jameison Martin wrote:
> > there have been a lot of different threads on this patch. i'm going to
> take a stab at > teasing them out, summarizing them, and addressing them
> individually.
>
> > Is this patch on the CF app? I can't seem to find it in 2013-01 or
> 2013-next, and I
> > wanted to add your summary.
>
> It is in previous CF (2012-11) in Returned with Feedback section
> https://commitfest.postgresql.org/action/commitfest_view?id=16

I'd argue that the patch ought to be still marked as "Returned with
Feedback" instead of again being marked as "Needs Review". I don't
really see anything new that has come up?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Andres Freund'" <andres(at)2ndquadrant(dot)com>
Cc: "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, "'Jameison Martin'" <jameisonb(at)yahoo(dot)com>, "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Noah Misch'" <noah(at)leadboat(dot)com>, "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>, "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 14:23:39
Message-ID: 002e01ce70e6$6c861a80$45924f80$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, June 24, 2013 5:48 PM Andres Freund wrote:
> On 2013-01-25 09:06:12 +0530, Amit Kapila wrote:
> > On 01/25/2013 03:43 AM, Jameison Martin wrote:
> > > there have been a lot of different threads on this patch. i'm going
> to
> > take a stab at > teasing them out, summarizing them, and addressing
> them
> > individually.
> >
> > > Is this patch on the CF app? I can't seem to find it in 2013-01 or
> > 2013-next, and I
> > > wanted to add your summary.
> >
> > It is in previous CF (2012-11) in Returned with Feedback section
> > https://commitfest.postgresql.org/action/commitfest_view?id=16
>
> I'd argue that the patch ought to be still marked as "Returned with
> Feedback" instead of again being marked as "Needs Review". I don't
> really see anything new that has come up?

You are right that nothing new has come up. The major reason why this patch
could not go into 9.3 was that it is not clearly evident whether
Performance gains of this patch are sufficient to take risks (Tom points out
that bugs caused by such changes can be difficult to identify) of committing
this code.

I will summarize the results, and if most of us feel that they are not good
enough, then we can return this patch.

Observations from Performance Results for tables with less columns
--------------------------------------------------------------------
1. Approximately 13% space savings for table with 12 columns, out of which
last 4 are Nulls.
This table schema is such first 3 columns are integers and last 9 are
boolean's.

Refer below link for detailed results:
http://www.postgresql.org/message-id/00b401cde1d8$746c90f0$5d45b2d0$@kapila@
huawei.com

Observations from Performance Results for tables with large columns
--------------------------------------------------------------------
1. There is a visible performance increase when number of columns containing
NULLS are more than > 60~70% in table have large number of columns.
Approximately 12% for table (having 800 cols, out of which 600 are nulls
OR having 300 columns, out of which 250 are nulls)
2. There are visible space savings when number of columns containing NULLS
are more than > 60~70% in table have large number of columns.
Approximately 11% for table (having 800 cols, out of which 600 are nulls
OR having 300 columns, out of which 250 are nulls)

Refer below link for detailed results:
http://www.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C382853AA4
0(at)szxeml509-mbs

Observation from original pgbench
----------------------------------

1. There is < 1% performance dip for original pgbench, this could be due to
fluctuation in readings or some consequences of addition of code which is
difficult to reason out.
Refer below link for detailed results:
http://www.postgresql.org/message-id/00bc01cde1f7$be4b4cb0$3ae1e610$@kapila@
huawei.com

With Regards,
Amit Kapila.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: "'Andres Freund'" <andres(at)2ndquadrant(dot)com>, "'Craig Ringer'" <craig(at)2ndquadrant(dot)com>, "'Jameison Martin'" <jameisonb(at)yahoo(dot)com>, "'Noah Misch'" <noah(at)leadboat(dot)com>, "'Simon Riggs'" <simon(at)2ndquadrant(dot)com>, "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 14:49:53
Message-ID: 13816.1372085393@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila(at)huawei(dot)com> writes:
> I will summarize the results, and if most of us feel that they are not good
> enough, then we can return this patch.

Aside from the question of whether there's really any generally-useful
performance improvement from this patch, it strikes me that this change
forecloses other games that we might want to play with interpretation of
the value of a tuple's natts.

In particular, when I was visiting Salesforce last week, the point came
up that they'd really like ALTER TABLE ADD COLUMN to be "free" even when
the column had a non-null DEFAULT. It's not too difficult to imagine
how we might support that, at least when the default expression is a
constant: decree that if the requested attribute number is beyond natts,
look aside at the tuple descriptor to see if the column is marked as
having a magic default value, and if so, substitute that rather than
just returning NULL. (This has to be a "magic" value separate from
the current default, else a subsequent DROP or SET DEFAULT would do
the wrong thing.)

However, this idea conflicts with an optimization that supposes it can
reduce natts to suppress null columns: if the column was actually stored
as null, you'd lose that information, and would incorrectly return the
magic default on subsequent reads.

I think it might be possible to make both ideas play together, by
not reducing natts further than the last column with a magic default.
However, that means extra complexity in heap_form_tuple, which would
invalidate the performance measurements done in support of this patch.

In short, there's no free lunch ...

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 20:05:27
Message-ID: CA+U5nMJOsrVGC7oAPRoca0+tQiYzk1sb+QFXMix0WCEU9qGw7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 June 2013 15:49, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Kapila <amit(dot)kapila(at)huawei(dot)com> writes:
>> I will summarize the results, and if most of us feel that they are not good
>> enough, then we can return this patch.
>
> Aside from the question of whether there's really any generally-useful
> performance improvement from this patch, it strikes me that this change
> forecloses other games that we might want to play with interpretation of
> the value of a tuple's natts.
>
> In particular, when I was visiting Salesforce last week, the point came
> up that they'd really like ALTER TABLE ADD COLUMN to be "free" even when
> the column had a non-null DEFAULT. It's not too difficult to imagine
> how we might support that, at least when the default expression is a
> constant: decree that if the requested attribute number is beyond natts,
> look aside at the tuple descriptor to see if the column is marked as
> having a magic default value, and if so, substitute that rather than
> just returning NULL. (This has to be a "magic" value separate from
> the current default, else a subsequent DROP or SET DEFAULT would do
> the wrong thing.)

Now that is a mighty fine idea.

> However, this idea conflicts with an optimization that supposes it can
> reduce natts to suppress null columns: if the column was actually stored
> as null, you'd lose that information, and would incorrectly return the
> magic default on subsequent reads.
>
> I think it might be possible to make both ideas play together, by
> not reducing natts further than the last column with a magic default.
> However, that means extra complexity in heap_form_tuple, which would
> invalidate the performance measurements done in support of this patch.
>
> In short, there's no free lunch ...

Agreed.

I think its rather a shame that the proponents of this patch have
tried so hard to push this through without working variations on the
theme. Please guys, go away and get creative; rethink the approach so
that you can have your lunch without anybody else losing theirs. I
would add that I have seen the use case and want to support it, but
we're looking to add to the codebase not just steal small bites of
performance from existing use cases.

As a practical suggestion, why not change the macro so it doesn't even
try to do anything different unless the number of columns is > 72. A
constant comparison should go very quickly and isolate the code better
from the more typical code path. If not, lets try some other ideas,
like Tom just did.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 20:30:41
Message-ID: CA+TgmobbcNpbju6WnWnkhhcx-XK+ZKVu=avfrdLQCuXFynozRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 24, 2013 at 4:05 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> I think its rather a shame that the proponents of this patch have
> tried so hard to push this through without working variations on the
> theme. Please guys, go away and get creative; rethink the approach so
> that you can have your lunch without anybody else losing theirs. I
> would add that I have seen the use case and want to support it, but
> we're looking to add to the codebase not just steal small bites of
> performance from existing use cases.

If there's an actual performance consequence of applying this patch,
then I think that's a good reason for rejecting it. But if the best
argument we can come up with is that we might someday try to do even
more clever things with the tuple's natts value, I guess I'm not very
impressed. The reason why we have to rewrite the table when someone
adds a column with a non-NULL default is because we need to store the
new value in every row. Sure, we could defer that in this particular
case. But users might be mighty dismayed to see CLUSTER or VACUUM
FULL -- or a dump-and-reload! -- cause the table to become much LARGER
than it was before. Having some kind of column-oriented compression
would be darn nifty, but this particular path doesn't seem
particularly promising to me.

I think the larger and more fundamental problem with natts is that
there's just not very much bit space available there. Even as it is,
someone who adds and drops columns with any regularity will eventually
hit the wall, and there aren't any good alternatives at that point.
Were there more bit space available here, we could even do things like
allow some special cases of ALTER TABLE .. SET DATA TYPE not to
rewrite the table; natts could become a sort of tuple version number,
and we'd remember how to convert to newer versions on the fly. But
with only 2048 values available, it's not really feasible to start
consuming them for any more operations than we already do. I'm
generally a big fan of the principle that no single patch should be
allowed to crowd out room for future extensibility in this particular
area, but in this case I think the bit space available is *already* so
limited that we're not likely to get much further with it without
changing the storage format.

So, Tom, how's that pluggable storage format coming? :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 20:32:41
Message-ID: 51C8ACE9.2090901@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon,

> I think its rather a shame that the proponents of this patch have
> tried so hard to push this through without working variations on the
> theme. Please guys, go away and get creative; rethink the approach so
> that you can have your lunch without anybody else losing theirs. I
> would add that I have seen the use case and want to support it, but
> we're looking to add to the codebase not just steal small bites of
> performance from existing use cases.

Do we really have ironclad numbers which show that the patch affects
performance negatively? I didn't think the previous performance test
was comprehensive; I was planning to develop one myself this week.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 20:50:58
Message-ID: 21604.1372107058@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> If there's an actual performance consequence of applying this patch,
> then I think that's a good reason for rejecting it. But if the best
> argument we can come up with is that we might someday try to do even
> more clever things with the tuple's natts value, I guess I'm not very
> impressed. The reason why we have to rewrite the table when someone
> adds a column with a non-NULL default is because we need to store the
> new value in every row. Sure, we could defer that in this particular
> case. But users might be mighty dismayed to see CLUSTER or VACUUM
> FULL -- or a dump-and-reload! -- cause the table to become much LARGER
> than it was before. Having some kind of column-oriented compression
> would be darn nifty, but this particular path doesn't seem
> particularly promising to me.

The point of what I was suggesting isn't to conserve storage, but to
reduce downtime during a schema change. Remember that a rewriting ALTER
TABLE locks everyone out of that table for a long time.

> So, Tom, how's that pluggable storage format coming? :-)

Well, actually, it's looking to me like heap_form_tuple will be
underneath the API cut, because alternate storage managers will probably
have other tuple storage formats --- column stores being the poster
child here, but in any case the tuple header format is very unlikely
to be universal.

Which means that whether this patch gets applied to mainline is going
to be moot for Salesforce's purposes; they will certainly want the
equivalent logic in their storage code, because they've got tables with
many hundreds of mostly-null columns, but whether heap_form_tuple acts
this way or not won't affect them.

So unless we consider that many-hundreds-of-columns is a design center
for general purpose use of Postgres, we should be evaluating this patch
strictly on its usefulness for more typical table widths. And my take
on that is that (1) lots of columns isn't our design center (for the
reasons you mentioned among others), and (2) the case for the patch
looks pretty weak otherwise.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 21:02:02
Message-ID: CA+U5nMJG6gQLHxB93JNnYRMcmMEj_WMOM2t2UDz87oHS8VYKfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 June 2013 21:50, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> > So, Tom, how's that pluggable storage format coming? :-)
>
> Well, actually, it's looking to me like heap_form_tuple will be
> underneath the API cut, because alternate storage managers will probably
> have other tuple storage formats --- column stores being the poster
> child here, but in any case the tuple header format is very unlikely
> to be universal.
>

Hopefully we would allow multiple storage managers to be active at once,
one per table?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Jameison Martin <jameisonb(at)yahoo(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 21:03:21
Message-ID: 1372107801.90455.YahooMailNeo@web160103.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

i believe the last submission of the patch had no negative performance impact on any of the tested use cases, but you'd have to go back and see the last exchange on that. 

i think it was the concern about potentially regressing the codeline in unforeseen ways without a clear benefit to all but one use case (wide tables) that stalled things.

>________________________________
> From: Josh Berkus <josh(at)agliodbs(dot)com>
>To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
>Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Amit Kapila <amit(dot)kapila(at)huawei(dot)com>; Andres Freund <andres(at)2ndquadrant(dot)com>; Craig Ringer <craig(at)2ndquadrant(dot)com>; Jameison Martin <jameisonb(at)yahoo(dot)com>; Noah Misch <noah(at)leadboat(dot)com>; Kevin Grittner <kgrittn(at)mail(dot)com>; robertmhaas(at)gmail(dot)com; pgsql-hackers(at)postgresql(dot)org
>Sent: Monday, June 24, 2013 10:32 PM
>Subject: Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
>
>
>Simon,
>
>> I think its rather a shame that the proponents of this patch have
>> tried so hard to push this through without working variations on the
>> theme. Please guys, go away and get creative; rethink the approach so
>> that you can have your lunch without anybody else losing theirs. I
>> would add that I have seen the use case and want to support it, but
>> we're looking to add to the codebase not just steal small bites of
>> performance from existing use cases.
>
>Do we really have ironclad numbers which show that the patch affects
>performance negatively?  I didn't think the previous performance test
>was comprehensive; I was planning to develop one myself this week.
>
>--
>Josh Berkus
>PostgreSQL Experts Inc.
>http://pgexperts.com
>
>
>


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 21:09:18
Message-ID: 51C8B57E.6040902@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/24/2013 01:50 PM, Tom Lane wrote:
> The point of what I was suggesting isn't to conserve storage, but to
> reduce downtime during a schema change. Remember that a rewriting ALTER
> TABLE locks everyone out of that table for a long time.

Right, but I'm worried about the "surprise!" factor. That is, if we
make the first default "free" by using a magic value, then a SET DEFAULT
on that column is going to have some very surprising results as suddenly
the whole table needs to get written out for the old default. In many
use cases, this would still be a net win, since 80% of the time users
don't change defaults after column creation. But we'd have to make it
much less surprising somehow. Also for the reason Tom pointed out, the
optimization would only work on with NOT NULL columns ... leading to
another potential unexpected surprise when the column is marked NULLable.

> So unless we consider that many-hundreds-of-columns is a design center
> for general purpose use of Postgres, we should be evaluating this patch
> strictly on its usefulness for more typical table widths. And my take
> on that is that (1) lots of columns isn't our design center (for the
> reasons you mentioned among others), and (2) the case for the patch
> looks pretty weak otherwise.

Well, actually, hundreds of columns is reasonably common for a certain
user set (ERP, CRM, etc.). If we could handle that use case very
efficiently, then it would win us some users, since other RDMBSes don't.
However, there are multiple issues with having hundreds of columns, of
which storage optimization is only one ... and probably the smallest one
at that.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 21:21:06
Message-ID: 22240.1372108866@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> On 06/24/2013 01:50 PM, Tom Lane wrote:
>> The point of what I was suggesting isn't to conserve storage, but to
>> reduce downtime during a schema change. Remember that a rewriting ALTER
>> TABLE locks everyone out of that table for a long time.

> Right, but I'm worried about the "surprise!" factor. That is, if we
> make the first default "free" by using a magic value, then a SET DEFAULT
> on that column is going to have some very surprising results as suddenly
> the whole table needs to get written out for the old default.

No, that's why we'd store the magic default separately. That will be
permanent and unaffected by later SET DEFAULT operations. (This
requires that every subsequently created tuple store the column
explicitly so that the magic default doesn't affect it; which is exactly
why there's a conflict with the currently-proposed patch.)

> ... Also for the reason Tom pointed out, the
> optimization would only work on with NOT NULL columns ... leading to
> another potential unexpected surprise when the column is marked NULLable.

Huh? We already have the case of null default handled.

> Well, actually, hundreds of columns is reasonably common for a certain
> user set (ERP, CRM, etc.). If we could handle that use case very
> efficiently, then it would win us some users, since other RDMBSes don't.
> However, there are multiple issues with having hundreds of columns, of
> which storage optimization is only one ... and probably the smallest one
> at that.

Agreed; there are a lot of things we'd have to address if we really
wanted to claim this is a domain we work well in. (I suspect Salesforce
will be chipping away at some of those issues, but as I said,
heap_form_tuple is not in their critical path.)

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 22:46:09
Message-ID: 51C8CC31.8070402@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/24/2013 02:21 PM, Tom Lane wrote:
>> Right, but I'm worried about the "surprise!" factor. That is, if we
>> make the first default "free" by using a magic value, then a SET DEFAULT
>> on that column is going to have some very surprising results as suddenly
>> the whole table needs to get written out for the old default.
>
> No, that's why we'd store the magic default separately. That will be
> permanent and unaffected by later SET DEFAULT operations. (This
> requires that every subsequently created tuple store the column
> explicitly so that the magic default doesn't affect it; which is exactly
> why there's a conflict with the currently-proposed patch.)

Yah. So how likely is this to get done sometime in the next 2 releases?
It's only a conflict if someone is going to write the code ...

> Agreed; there are a lot of things we'd have to address if we really
> wanted to claim this is a domain we work well in. (I suspect Salesforce
> will be chipping away at some of those issues, but as I said,
> heap_form_tuple is not in their critical path.)

Well, doing the trailing column truncation as part of a general plan to
make having 600 columns less painful would make more sense than doing it
on its own. For my personal use case(s), I don't really care about the
null bitmap that much; that amount of space simply isn't that
significant compared to the other performance issues involved. I
started evaluating this patch just because I'm one of the few people
with a realistic test case.

Anyway, let's decide if acceptance of this patch hinges on performance
or not. I'll take me a whole evening to set up a good performance test,
so I don't want to do it if it's going to be a moot point.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-25 01:13:00
Message-ID: CA+TgmoYEgRvoTnCeKm9fZKVZgRXFGpnFpRWwCjWsBHdmppS8gQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 24, 2013 at 4:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The point of what I was suggesting isn't to conserve storage, but to
> reduce downtime during a schema change. Remember that a rewriting ALTER
> TABLE locks everyone out of that table for a long time.

Sure, I understand. As Josh says, though, it'd still be potentially
quite surprising.

>> So, Tom, how's that pluggable storage format coming? :-)
>
> Well, actually, it's looking to me like heap_form_tuple will be
> underneath the API cut, because alternate storage managers will probably
> have other tuple storage formats --- column stores being the poster
> child here, but in any case the tuple header format is very unlikely
> to be universal.

I've had the same thought. Unfortunately, there are a lot of things -
like the TupleTableSlot abstraction - that are fairly deeply in bed
with the format used by our current heap AM; so we might be imposing a
performance overhead for anyone who uses some other representation.
Unless you have some idea how to avoid that?

> Which means that whether this patch gets applied to mainline is going
> to be moot for Salesforce's purposes; they will certainly want the
> equivalent logic in their storage code, because they've got tables with
> many hundreds of mostly-null columns, but whether heap_form_tuple acts
> this way or not won't affect them.

OK.

> So unless we consider that many-hundreds-of-columns is a design center
> for general purpose use of Postgres, we should be evaluating this patch
> strictly on its usefulness for more typical table widths. And my take
> on that is that (1) lots of columns isn't our design center (for the
> reasons you mentioned among others), and (2) the case for the patch
> looks pretty weak otherwise.

I guess I have yet to be convinced that it really hurts anything.
Your example seems fairly hypothetical and could easily be something
no one ever implements. I don't feel terribly strongly that we need
to take this patch, and I don't really care that much whether we do or
not; I'm just not really convinced there's any actual evidence that we
shouldn't. The standard isn't that it's got to make something really
important better; it's just got to make something better without
making anything more important worse.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Noah Misch <noah(at)leadboat(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-25 02:14:22
Message-ID: 20130625021422.GC840390@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 24, 2013 at 01:32:41PM -0700, Josh Berkus wrote:
> Do we really have ironclad numbers which show that the patch affects
> performance negatively? I didn't think the previous performance test
> was comprehensive; I was planning to develop one myself this week.

Not ironclad, no:
http://www.postgresql.org/message-id/20130124021205.GB29954@tornado.leadboat.com

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'Andres Freund'" <andres(at)2ndquadrant(dot)com>, "'Craig Ringer'" <craig(at)2ndquadrant(dot)com>, "'Jameison Martin'" <jameisonb(at)yahoo(dot)com>, "'Noah Misch'" <noah(at)leadboat(dot)com>, "'Simon Riggs'" <simon(at)2ndquadrant(dot)com>, "'Kevin Grittner'" <kgrittn(at)mail(dot)com>, <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-25 11:26:51
Message-ID: 00bd01ce7196$e4447d90$accd78b0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, June 24, 2013 8:20 PM Tom Lane wrote:
> Amit Kapila <amit(dot)kapila(at)huawei(dot)com> writes:
> > I will summarize the results, and if most of us feel that they are
> not good
> > enough, then we can return this patch.
>
> Aside from the question of whether there's really any generally-useful
> performance improvement from this patch, it strikes me that this change
> forecloses other games that we might want to play with interpretation
> of
> the value of a tuple's natts.
>
> In particular, when I was visiting Salesforce last week, the point came
> up that they'd really like ALTER TABLE ADD COLUMN to be "free" even
> when
> the column had a non-null DEFAULT. It's not too difficult to imagine
> how we might support that, at least when the default expression is a
> constant: decree that if the requested attribute number is beyond
> natts,
> look aside at the tuple descriptor to see if the column is marked as
> having a magic default value, and if so, substitute that rather than
> just returning NULL. (This has to be a "magic" value separate from
> the current default, else a subsequent DROP or SET DEFAULT would do
> the wrong thing.)
>
> However, this idea conflicts with an optimization that supposes it can
> reduce natts to suppress null columns: if the column was actually
> stored
> as null, you'd lose that information, and would incorrectly return the
> magic default on subsequent reads.
>
> I think it might be possible to make both ideas play together, by
> not reducing natts further than the last column with a magic default.
> However, that means extra complexity in heap_form_tuple, which would
> invalidate the performance measurements done in support of this patch.

It can have slight impact on normal scenario's, but I am not sure how much
because
the change will be very less(may be one extra if check and one assignment)

For this Patch's scenario, I think the major benefit for Performance is in
heap_fill_tuple() where the
For loop is reduced. However added some logic in heap_form_tuple can
reduce the performance improvement,
but there can still be space saving benefit.

With Regards,
Amit Kapila.


From: Jamie Martin <jameisonb(at)yahoo(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "<robertmhaas(at)gmail(dot)com>" <robertmhaas(at)gmail(dot)com>, "<josh(at)agliodbs(dot)com>" <josh(at)agliodbs(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-26 14:05:35
Message-ID: 3049CCE0-5FEF-45D1-931E-F3B08222E620@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

FYI I submitted a slightly modified patch since Amit's measurements that is slightly faster.

On Jun 25, 2013, at 1:26 PM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:

> On Monday, June 24, 2013 8:20 PM Tom Lane wrote:
>> Amit Kapila <amit(dot)kapila(at)huawei(dot)com> writes:
>>> I will summarize the results, and if most of us feel that they are
>> not good
>>> enough, then we can return this patch.
>>
>> Aside from the question of whether there's really any generally-useful
>> performance improvement from this patch, it strikes me that this change
>> forecloses other games that we might want to play with interpretation
>> of
>> the value of a tuple's natts.
>>
>> In particular, when I was visiting Salesforce last week, the point came
>> up that they'd really like ALTER TABLE ADD COLUMN to be "free" even
>> when
>> the column had a non-null DEFAULT. It's not too difficult to imagine
>> how we might support that, at least when the default expression is a
>> constant: decree that if the requested attribute number is beyond
>> natts,
>> look aside at the tuple descriptor to see if the column is marked as
>> having a magic default value, and if so, substitute that rather than
>> just returning NULL. (This has to be a "magic" value separate from
>> the current default, else a subsequent DROP or SET DEFAULT would do
>> the wrong thing.)
>>
>> However, this idea conflicts with an optimization that supposes it can
>> reduce natts to suppress null columns: if the column was actually
>> stored
>> as null, you'd lose that information, and would incorrectly return the
>> magic default on subsequent reads.
>>
>> I think it might be possible to make both ideas play together, by
>> not reducing natts further than the last column with a magic default.
>> However, that means extra complexity in heap_form_tuple, which would
>> invalidate the performance measurements done in support of this patch.
>
> It can have slight impact on normal scenario's, but I am not sure how much
> because
> the change will be very less(may be one extra if check and one assignment)
>
> For this Patch's scenario, I think the major benefit for Performance is in
> heap_fill_tuple() where the
> For loop is reduced. However added some logic in heap_form_tuple can
> reduce the performance improvement,
> but there can still be space saving benefit.
>
> With Regards,
> Amit Kapila.
>


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Jamie Martin <jameisonb(at)yahoo(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "<robertmhaas(at)gmail(dot)com>" <robertmhaas(at)gmail(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-27 17:12:28
Message-ID: 51CC727C.7020500@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

All,

So, is this patch currently depending on performance testing, or not?
Like I said, it'll be a chunk of time to set up what I beleive is a
realistic performance test, so I don't want to do it if the patch is
likely to be bounced for other reasons.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Jamie Martin <jameisonb(at)yahoo(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "<robertmhaas(at)gmail(dot)com>" <robertmhaas(at)gmail(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-27 18:11:59
Message-ID: 26839.1372356719@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> So, is this patch currently depending on performance testing, or not?
> Like I said, it'll be a chunk of time to set up what I beleive is a
> realistic performance test, so I don't want to do it if the patch is
> likely to be bounced for other reasons.

AFAICS, the threshold question here is whether the patch helps usefully
for tables with typical numbers of columns (ie, not 800 ;-)), and
doesn't hurt materially in any common scenario. If it does, I think
we'd take it. I've not read the patch, so I don't know if there are
minor cosmetic or correctness issues, but at bottom this seems to be a
very straightforward performance tradeoff.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jamie Martin <jameisonb(at)yahoo(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "<robertmhaas(at)gmail(dot)com>" <robertmhaas(at)gmail(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-27 21:12:29
Message-ID: 51CCAABD.4090206@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/27/2013 11:11 AM, Tom Lane wrote:
> AFAICS, the threshold question here is whether the patch helps usefully
> for tables with typical numbers of columns (ie, not 800 ;-)), and

I wouldn't expect it to help at all for < 50 columns, and probably not
measurably below 200.

> doesn't hurt materially in any common scenario. If it does, I think

Yeah, I think that's be bigger question. Ok, I'll start working on a
new test case. Here's my thinking on performance tests:

1. run pgbench 10 times both with and without the patch. See if there's
any measurable difference. There should not be.

2. Run with/without comparisons for the following scenarios:

Each table would have a SERIAL pk, a timestamp, and:

Chains of booleans:
# attributes NULL probability
16 0% 50% 90%
128 0% 50% 90%
512 0% 50% 90%

Chains of text:
16 0% 50% 90%
256 0% 50% 90%

... for 100,000 rows each.

Comparisons would include:

1) table size before and after testing
2) time required to read 1000 rows, by key
3) time required to read rows with each of
100 random column positions = some value
4) time required to insert 1000 rows
5) time required to update 1000 rows

Geez, I feel tired just thinking about it. Jamie, can your team do any
of this?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Jamie Martin <jameisonb(at)yahoo(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "<robertmhaas(at)gmail(dot)com>" <robertmhaas(at)gmail(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-07-08 18:25:51
Message-ID: 51DB042F.6070400@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/26/2013 07:05 AM, Jamie Martin wrote:
> FYI I submitted a slightly modified patch since Amit's measurements that is slightly faster.

Yes. My perspective is that this is a worthwhile optimization for a
minority, but fairly well-known, use case, provided that it doesn't
negatively impact any other, more common use case. Potential cases
where I can see negative impact are:

A) normal table with a few, mostly non-null columns (recent pgbench
testing seems to have validated no measurable impact).

B) table with many (400+) mostly non-null columns

C) table with many (400+) mostly null columns, where column #390 was
null and gets updated to a not null value

I don't *think* that Jamie's performance tests have really addressed the
above cases. However, do people agree that if performance on the patch
passes for all of A, B and C, then it's OK to apply?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Greg Stark <stark(at)mit(dot)edu>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jamie Martin <jameisonb(at)yahoo(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "<robertmhaas(at)gmail(dot)com>" <robertmhaas(at)gmail(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-07-11 16:28:37
Message-ID: CAM-w4HNkQXUKe29z1wN4f4YP9XHqj-5X+ui7TFsKBOkNHn2T0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 27, 2013 at 10:12 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Yeah, I think that's be bigger question. Ok, I'll start working on a
> new test case. Here's my thinking on performance tests:
>
> 1. run pgbench 10 times both with and without the patch. See if there's
> any measurable difference. There should not be.

I don't even see the point in extensive empirical results. Empirical
results are somewhat useful for measuring the cpu cycle cost when the
optimization doesn't give any benefit. That should be one fairly
simple test and my understanding is that it's been done and shown no
significant cost.

When the optimization does kick in it saves space. Saving space is
something we can calculate the effect precisely of and don't need
empirical tests to validate. I would still want to check that it
actually works as expected of course but that's a matter of measuring
the row sizes, not timing lengthy pgbench runs.

Neither of these address Tom's concerns about API changes and future
flexibility. I was assigned this patch in the rreviewers list and my
inclination would be to take it but I wasn't about to
overrule Tom. If he says he's ok with it then I'm fine going ahead and
reviewing the code. If I still have commit bits I could even commit
it.

--
greg


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jamie Martin <jameisonb(at)yahoo(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "<robertmhaas(at)gmail(dot)com>" <robertmhaas(at)gmail(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-07-16 21:34:55
Message-ID: 51E5BC7F.20107@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 07/11/2013 09:28 AM, Greg Stark wrote:
> Neither of these address Tom's concerns about API changes and future
> flexibility. I was assigned this patch in the rreviewers list and my
> inclination would be to take it but I wasn't about to
> overrule Tom. If he says he's ok with it then I'm fine going ahead and
> reviewing the code. If I still have commit bits I could even commit
> it.

API changes? I can't find that issue in the discussion.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jameison Martin <jameisonb(at)yahoo(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-08-22 00:40:22
Message-ID: 1377132022.11715.2.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

This patch is in the 2013-09 commitfest but needs a rebase.