Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

Lists: pgsql-hackers
From: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To: "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-10-13 07:54:59
Message-ID: 6C0B27F7206C9E4CA54AE035729E9C382853A391@szxeml509-mbs

Tue, 26 Jun 2012 17:04:42 -0400 Robert Haas wrote:

> I see you posted up a follow-up email asking Tom what he had in mind.
> Personally, I don't think this needs incredibly complicated testing.
> I think you should just test a workload involving inserting and/or
> updating rows with lots of trailing NULL columns, and then another
> workload with a table of similar width that... doesn't. If we can't
> find a regression - or, better, we find a win in one or both cases -
> then I think we're done here.

As per the last discussion for this patch, performance data needs to be provided before this patch's Review can proceed further.

So as per your suggestion and from the discussions about this patch, I have collected the performance data as below:

The results were taken with the following configuration:
1. Schema - UNLOGGED TABLE with 2,000,000 records; all columns are of INT type.
2. shared_buffers = 10GB
3. All performance results were taken with a single connection.
4. Performance was measured for the INSERT operation (insert into temptable select * from inittable)

Platform details:
Operating System: Suse-Linux 10.2 x86_64
Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
RAM : 24GB

Documents Attached:
init.sh : creates the schema
sql_used.sql : SQL statements used for taking the results
Trim_Nulls_Perf_Report.html : performance data

Observations from Performance Results
------------------------------------------------

1. There is no performance change for columns that have all valid (non-NULL) values.

2. There is a visible performance increase when more than 60~70% of the columns contain NULLs in a table with a large number of columns.

3. There are visible space savings when more than 60~70% of the columns contain NULLs in a table with a large number of columns.

Let me know if more performance data needs to be collected for this patch.

With Regards,

Amit Kapila.

Attachment Content-Type Size
init.sh application/octet-stream 745 bytes
sql_used.sql text/plain 279 bytes
Trim_Nulls_Perf_Report.html text/html 42.5 KB

From: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To: "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-10-15 13:58:52
Message-ID: 6C0B27F7206C9E4CA54AE035729E9C382853AA40@szxeml509-mbs

On Saturday, October 13, 2012 1:24 PM Amit kapila wrote:
Tue, 26 Jun 2012 17:04:42 -0400 Robert Haas wrote:

>> I see you posted up a follow-up email asking Tom what he had in mind.
>> Personally, I don't think this needs incredibly complicated testing.
>> I think you should just test a workload involving inserting and/or
>> updating rows with lots of trailing NULL columns, and then another
>> workload with a table of similar width that... doesn't. If we can't
>> find a regression - or, better, we find a win in one or both cases -
>> then I think we're done here.

>As per the last discussion for this patch, performance data needs to be provided before this patch's Review can proceed further.
>So as per your suggestion and from the discussions about this patch, I have collected the performance data as below:

>Results are taken with following configuration.
>1. Schema - UNLOGGED TABLE with 2,000,000 records having all columns are INT type.
>2. shared_buffers = 10GB
>3. All the performance result are taken with single connection.
>4. Performance is collected for INSERT operation (insert into temptable select * from inittable)

>Platform details:
> Operating System: Suse-Linux 10.2 x86_64
> Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
> RAM : 24GB

In addition to the performance data, I have completed the review of the patch.

Basic stuff:
------------
- The patch needs a rebase, as the heap_fill_tuple function prototype has moved to a different file [htup.h to htup_details.h].
- Compiles cleanly without any errors/warnings.
- Regression tests pass.

Code Review comments:
---------------------
1. There is a possibility of memory growth for tables with toasted columns if trailing toasted columns are updated to NULL;
i.e. in function toast_insert_or_update, for tuples where the 'need_change' variable is true, numAttrs is reduced to the position of the last non-null column,
so the de-toasted columns of the old tuple beyond that position are not freed. If this repeats for a large number of tuples, there is a chance of running out of memory.

    if (need_change)
    {
        numAttrs = lastNonNullValOffset + 1;
        ....
    }

    if (need_delold)
        /* <== Trailing toasted value wouldn't be freed, as it is updated to NULL and numAttrs is reduced to a smaller value. */
        for (i = 0; i < numAttrs; i++)
            if (toast_delold[i])
                toast_delete_datum(rel, toast_oldvalues[i]);

2. Comments need to be updated in the following functions to describe how trailing null columns are skipped in the header part:
heap_fill_tuple - function header
heap_form_tuple, heap_form_minimal_tuple.

3. Why is the following change required in function toast_flatten_tuple_attribute?
- numAttrs = tupleDesc->natts;
+ numAttrs = HeapTupleHeaderGetNatts(olddata);
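
For review comment 1, the concern is the loop bound used during cleanup. The following is only a self-contained sketch of that concern, not code from the patch: truncated_natts and count_deleted_old_values are hypothetical helpers, and the boolean arrays stand in for the per-attribute state kept by toast_insert_or_update.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: number of attributes left once trailing NULLs are
 * dropped, i.e. the index of the last non-NULL attribute plus one
 * (0 when every attribute is NULL).
 */
int
truncated_natts(const bool *isnull, int natts)
{
    assert(natts >= 0);
    while (natts > 0 && isnull[natts - 1])
        natts--;
    return natts;
}

/*
 * Hypothetical stand-in for the cleanup loop: counts how many old toasted
 * values would be deleted when the loop runs over the first 'bound'
 * attributes.
 */
int
count_deleted_old_values(const bool *delold, int bound)
{
    int     deleted = 0;

    for (int i = 0; i < bound; i++)
        if (delold[i])
            deleted++;          /* stands in for toast_delete_datum() */
    return deleted;
}
```

With a 4-column tuple whose last two columns are now NULL and whose old values in columns 1 and 3 were toasted, looping to the truncated count (2) visits only one of the two old values, while looping to the original count (4) visits both. Keeping the original attribute count for the cleanup loop is one possible fix under these assumptions.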

A detailed performance report for INSERT and UPDATE operations is attached to this mail.

Observations from Performance Results
------------------------------------------------
1. There is no performance change for columns that have all valid (non-NULL) values.
2. There is a visible performance increase when more than 60~70% of the columns contain NULLs in a table with a large number of columns.
3. There are visible space savings when more than 60~70% of the columns contain NULLs in a table with a large number of columns.

With Regards,
Amit Kapila.

Attachment Content-Type Size
Trim_Tailing_Nulls_Perf_Report.html text/html 52.5 KB

From: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To: "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-10-19 06:46:42
Message-ID: 6C0B27F7206C9E4CA54AE035729E9C382853BB58@szxeml509-mbs

From: pgsql-hackers-owner(at)postgresql(dot)org [pgsql-hackers-owner(at)postgresql(dot)org] on behalf of Amit kapila [amit(dot)kapila(at)huawei(dot)com]
Sent: Monday, October 15, 2012 7:28 PM
To: robertmhaas(at)gmail(dot)com; josh(at)agliodbs(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

On Monday, October 15, 2012 7:28 PM Amit kapila wrote:
On Saturday, October 13, 2012 1:24 PM Amit kapila wrote:
Tue, 26 Jun 2012 17:04:42 -0400 Robert Haas wrote:

>> I see you posted up a follow-up email asking Tom what he had in mind.
>> Personally, I don't think this needs incredibly complicated testing.
>> I think you should just test a workload involving inserting and/or
>> updating rows with lots of trailing NULL columns, and then another
>> workload with a table of similar width that... doesn't. If we can't
>> find a regression - or, better, we find a win in one or both cases -
>> then I think we're done here.

>As per the last discussion for this patch, performance data needs to be provided before this patch's Review can proceed further.
>So as per your suggestion and from the discussions about this patch, I have collected the performance data as below:

>Results are taken with following configuration.
>1. Schema - UNLOGGED TABLE with 2,000,000 records having all columns are INT type.
>2. shared_buffers = 10GB
>3. All the performance result are taken with single connection.
>4. Performance is collected for INSERT operation (insert into temptable select * from inittable)

>Platform details:
> Operating System: Suse-Linux 10.2 x86_64
> Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
> RAM : 24GB

> Further to Performance data, I have completed the review of the Patch.

Please find attached the patch that addresses the review comments.

IMO, it is now ready for a committer.

With Regards,
Amit Kapila.

Attachment Content-Type Size
Truncate-trailing-null-columns-from-heap-rows.v2.patch application/octet-stream 28.4 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "josh(at)agliodbs(dot)com" <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-20 12:16:10
Message-ID: CA+U5nM+7Oj88ihyjwp2DRPERafSX0qj5_1V7TA58fUKVn9Jwkg@mail.gmail.com

On 13 October 2012 08:54, Amit kapila <amit(dot)kapila(at)huawei(dot)com> wrote:

> As per the last discussion for this patch, performance data needs to be
> provided before this patch's Review can proceed further.
>
> So as per your suggestion and from the discussions about this patch, I have
> collected the performance data as below:
>
>
>
> Results are taken with following configuration.
> 1. Schema - UNLOGGED TABLE with 2,000,000 records having all columns are INT
> type.
> 2. shared_buffers = 10GB
> 3. All the performance result are taken with single connection.
> 4. Performance is collected for INSERT operation (insert into temptable
> select * from inittable)
>
> Platform details:
> Operating System: Suse-Linux 10.2 x86_64
> Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
> RAM : 24GB
>
> Documents Attached:
> init.sh : Which will create the schema
> sql_used.sql : sql's used for taking results
>
> Trim_Nulls_Perf_Report.html : Performance data
>
>
> Observations from Performance Results
>
> ------------------------------------------------
>
> 1. There is no performance change for cloumns that have all valid
> values(non- NULLs).
>
> 2. There is a visible performance increase when number of columns containing
> NULLS are more than > 60~70% in table have large number of columns.
>
> 3. There are visible space savings when number of columns containing NULLS
> are more than > 60~70% in table have large number of columns.
>
>
> Let me know if there is more performance data needs to be collected for this
> patch?

I can't make sense of your performance report. Because of that I can't
derive the same conclusions from it that you do.

Can you explain the performance results in more detail, so we can see
what they mean? Like which are the patched, which are the unpatched
results? Which results are comparable, what the percentages mean etc.

We might then move quickly towards commit, or at least more tests.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>
Cc: <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-20 14:56:56
Message-ID: 007f01cddec2$41fa8050$c5ef80f0$@kapila@huawei.com

On Thursday, December 20, 2012 5:46 PM Simon Riggs wrote:
> On 13 October 2012 08:54, Amit kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>
> > As per the last discussion for this patch, performance data needs to be
> > provided before this patch's Review can proceed further.
> >
> > So as per your suggestion and from the discussions about this patch, I have

> >
> > ------------------------------------------------
> >
> > 1. There is no performance change for cloumns that have all valid
> > values(non- NULLs).
> >
> > 2. There is a visible performance increase when number of columns containing
> > NULLS are more than > 60~70% in table have large number of columns.
> >
> > 3. There are visible space savings when number of columns containing NULLS
> > are more than > 60~70% in table have large number of columns.
> >
> >
> > Let me know if there is more performance data needs to be collected for this
> > patch?
>
>
> I can't make sense of your performance report. Because of that I can't
> derive the same conclusions from it you do.
>
> Can you explain the performance results in more detail, so we can see
> what they mean? Like which are the patched, which are the unpatched
> results?
On the extreme left, the results are marked as Original Code / Trim Trailing Nulls Patch.
In any case, I have framed the results again as below:
1. Table with 800 columns
A. INSERT tuples with 600 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
      |     Original Code     |  Trim Trailing NULLs  | Improvement (%)
 Case | TPS     Space (pages) | TPS     Space (pages) |   TPS    Space
------+-----------------------+-----------------------+----------------
 1A   | 0.2068         250000 | 0.2302         222223 |  10.1     11.1
 1B   | 0.0448         500000 | 0.0481         472223 |   6.8      5.6
 1C   | 0.0433         750000 | 0.0493         694445 |  12.2      7.4

2. Table with 800 columns
A. INSERT tuples with 300 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
      |     Original Code     |  Trim Trailing NULLs  | Improvement (%)
 Case | TPS     Space (pages) | TPS     Space (pages) |   TPS    Space
------+-----------------------+-----------------------+----------------
 2A   | 0.0280         666667 | 0.0287         666667 |   2.3      0.0
 2B   | 0.0143        1333334 | 0.0152        1333334 |   5.3      0.0
 2C   | 0.0145        2000000 | 0.0149        2000000 |   2.9      0.0

3. Table with 300 columns
A. INSERT tuples with 150 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
      |     Original Code     |  Trim Trailing NULLs  | Improvement (%)
 Case | TPS     Space (pages) | TPS     Space (pages) |   TPS    Space
------+-----------------------+-----------------------+----------------
 3A   | 0.2815         166667 | 0.2899         166667 |   2.9      0.0
 3B   | 0.0851         333334 | 0.0870         333334 |   2.2      0.0
 3C   | 0.0846         500000 | 0.0852         500000 |   0.7      0.0

4. Table with 300 columns
A. INSERT tuples with 250 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
      |     Original Code     |  Trim Trailing NULLs  | Improvement (%)
 Case | TPS     Space (pages) | TPS     Space (pages) |   TPS    Space
------+-----------------------+-----------------------+----------------
 4A   | 0.5447          66667 | 0.5996          58824 |   9.2     11.8
 4B   | 0.1251         135633 | 0.1232         127790 |  -1.5      5.8
 4C   | 0.1223         202299 | 0.1361         186613 |  10.1      7.5
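
On what the space percentages mean: the reported figures are consistent with the relative reduction in pages, (original - patched) / original. That formula is inferred from the numbers and is not stated in the report; space_saving_pct below is a hypothetical helper used only to illustrate it.

```c
#include <assert.h>

/*
 * Relative reduction in pages, in percent.  Assumed formula, inferred from
 * the reported figures rather than taken from the report itself.
 */
double
space_saving_pct(long orig_pages, long patched_pages)
{
    assert(orig_pages > 0);
    return 100.0 * (double) (orig_pages - patched_pages) / (double) orig_pages;
}
```

For case 1A (250000 vs. 222223 pages) this gives roughly 11.1, and for case 4A (66667 vs. 58824 pages) roughly 11.8, matching the space figures reported for those cases.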

Please let me know if it is still not clear.

With Regards,
Amit Kapila.


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-23 14:41:24
Message-ID: CA+U5nMKJw6fEduwb6LZKkiBZAjAe6m7+5boaZFEa6jM5EZyo0Q@mail.gmail.com

On 20 December 2012 14:56, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:

>> > 1. There is no performance change for cloumns that have all valid
>> > values(non- NULLs).

I don't see any tests (at all) that measure this.

I'm particularly interested in lower numbers of columns, so we can
show no regression for the common case.

>> > 2. There is a visible performance increase when number of columns containing
>> > NULLS are more than > 60~70% in table have large number of columns.
>> >
>> > 3. There are visible space savings when number of columns containing NULLS
>> > are more than > 60~70% in table have large number of columns.

Agreed.

I would call that quite disappointing though and was expecting better.
Are we sure the patch works and the tests are correct?

The lack of any space saving for lower % values is strange and
somewhat worrying. There should be a 36? byte saving for 300 null
columns in an 800 column table - how does that not show up at all?
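
The per-row estimate can be checked with a rough sketch. It assumes a fixed heap tuple header of 23 bytes before the null bitmap, one bitmap bit per attribute, and 8-byte maximum alignment (a 64-bit build); the constant and function names are made up for illustration and are not PostgreSQL identifiers.

```c
#include <assert.h>

enum
{
    TUPLE_HEADER_FIXED = 23,    /* assumed: header bytes before the null bitmap */
    MAX_ALIGN = 8               /* assumed: maximum alignment on a 64-bit build */
};

/* Bytes needed for a null bitmap covering natts attributes, one bit each. */
int
null_bitmap_len(int natts)
{
    assert(natts >= 0);
    return (natts + 7) / 8;
}

/* Header size including the null bitmap, rounded up to MAX_ALIGN. */
int
header_size_with_bitmap(int natts)
{
    int     raw = TUPLE_HEADER_FIXED + null_bitmap_len(natts);

    return (raw + MAX_ALIGN - 1) / MAX_ALIGN * MAX_ALIGN;
}
```

Under these assumptions, going from 800 to 500 stored attributes shrinks the bitmap from 100 to 63 bytes (37 bytes) and the aligned header from 128 to 88 bytes (40 bytes) per row, which is in the same range as the estimate above.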

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-23 17:38:02
Message-ID: 26779.1356284282@sss.pgh.pa.us

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> The lack of any space saving for lower % values is strange and
> somewhat worrying. There should be a 36? byte saving for 300 null
> columns in an 800 column table - how does that not show up at all?

You could only fit about 4 such rows in an 8K page (assuming the columns
are all int4s). Unless the savings is enough to allow 5 rows to fit in
a page, the effective savings will be zilch.

This may well mean that the whole thing is a waste of time in most
scenarios --- the more likely it is to save anything, the more likely
that the savings will be lost anyway due to page alignment
considerations, because wider rows inherently pack less efficiently.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-23 18:04:51
Message-ID: CA+U5nM+5Ufq5VmgjSaxEd4Q5=SqAAC+4eZzbXS2bGRAHPH-7Xg@mail.gmail.com

On 23 December 2012 17:38, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> The lack of any space saving for lower % values is strange and
>> somewhat worrying. There should be a 36? byte saving for 300 null
>> columns in an 800 column table - how does that not show up at all?
>
> You could only fit about 4 such rows in an 8K page (assuming the columns
> are all int4s). Unless the savings is enough to allow 5 rows to fit in
> a page, the effective savings will be zilch.

If that's the case, the use case is tiny, especially considering how
sensitive the saving is to the exact location of the NULLs.

> This may well mean that the whole thing is a waste of time in most
> scenarios --- the more likely it is to save anything, the more likely
> that the savings will be lost anyway due to page alignment
> considerations, because wider rows inherently pack less efficiently.

ISTM that we'd get a better gain and a wider use case by compressing
the whole block, with some bits masked out to allow updates/deletes.
The string of zeroes in the null bitmap would compress easily, but so
would other aspects also.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>
Cc: <robertmhaas(at)gmail(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-24 13:13:23
Message-ID: 00b401cde1d8$746c90f0$5d45b2d0$@kapila@huawei.com

On Sunday, December 23, 2012 8:11 PM Simon Riggs wrote:
> On 20 December 2012 14:56, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>
> >> > 1. There is no performance change for cloumns that have all valid
> >> > values(non- NULLs).
>
> I don't see any tests (at all) that measure this.
>
> I'm particularly interested in lower numbers of columns, so we can
> show no regression for the common case.

For now I have taken readings for 300 columns; I can take readings for 10~30
columns as well if required.

1. Table with 300 columns (all integer columns)
A. INSERT tuples without trailing nulls
B. UPDATE last column value to "null"
      | Original Code | Trim Trailing NULLs | Improvement (%)
 Case |      TPS      |         TPS         |       TPS
------+---------------+---------------------+-----------------
 1A   |    0.1348     |       0.1352        |       0.3
 1B   |    0.0495     |       0.0495        |       0.0

> >> > 2. There is a visible performance increase when number of columns containing
> >> > NULLS are more than > 60~70% in table have large number of columns.
> >> >
> >> > 3. There are visible space savings when number of columns containing NULLS
> >> > are more than > 60~70% in table have large number of columns.
>
> Agreed.
>
> I would call that quite disappointing though and was expecting better.
> Are we sure the patch works and the tests are correct?
>
> The lack of any space saving for lower % values is strange and
> somewhat worrying. There should be a 36? byte saving for 300 null
> columns in an 800 column table - how does that not show up at all?

The 300 NULLs case will save approximately 108 bytes per page, as 3 tuples
are accommodated on a page in that case.
So the total space left in the page will be approximately 1900 bytes
(including the 108 bytes saved by the optimization).
The point is that in the existing test case all rows are the same size
(approx. 2100 bytes), so no space saving is shown; but if the last row is
small enough to fit in the free space (remaining space of the page + space
saved due to the NULLs optimization), then a space saving shows up as well.
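
The packing argument can be sketched as follows. The sketch assumes an 8192-byte page, a 24-byte page header and a 4-byte line pointer per row, and takes the tuple size as an input. The sizes 2128 (unpatched) and 2088 (patched) used afterwards are illustrative values close to the "approx 2100 bytes" mentioned above, not measured ones, and the names are hypothetical.

```c
#include <assert.h>

enum
{
    HEAP_PAGE_SIZE = 8192,      /* assumed: default block size */
    HEAP_PAGE_HEADER = 24,      /* assumed: page header size */
    HEAP_LINE_POINTER = 4       /* assumed: per-row line pointer size */
};

/* Number of whole rows of the given size that fit on one page. */
int
rows_per_page(int tuple_size)
{
    assert(tuple_size > 0);
    return (HEAP_PAGE_SIZE - HEAP_PAGE_HEADER) / (tuple_size + HEAP_LINE_POINTER);
}

/* Bytes left unused on a page filled with rows of the given size. */
int
free_space_after_packing(int tuple_size)
{
    return (HEAP_PAGE_SIZE - HEAP_PAGE_HEADER)
        - rows_per_page(tuple_size) * (tuple_size + HEAP_LINE_POINTER);
}
```

With these assumptions both 2128-byte and 2088-byte rows fit 3 per page, so the page count does not change; the patched rows simply leave about 1892 bytes free instead of about 1772. A fourth row would only fit once each row shrinks to 2038 bytes or less.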

In any case, there is a performance gain for the 300 NULLs case as well.

Apart from the above, the performance data for a smaller number of columns
(where the trailing nulls are such that they cross a word boundary) also shows
similar gains:

The cases below (2 & 3) can give a benefit, as the patch will save 4 bytes per tuple.

2. Table with 12 columns (first 3 integer followed by 9 Boolean columns)
A. INSERT tuples with 9 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
      |     Original Code     |  Trim Trailing NULLs  | Improvement (%)
 Case | TPS     Space (pages) | TPS     Space (pages) |   TPS    Space
------+-----------------------+-----------------------+----------------
 2A   | 0.8485          12739 | 0.8524          10811 |   0.4     15.1
 2B   | 0.5847          25478 | 0.5749          23550 |  -1.5      7.5
 2C   | 0.5591          38217 | 0.5545          34361 |   0.8     10.0

3. Table with 12 columns (first 3 integer followed by 9 Boolean columns)
A. INSERT tuples with 4 trailing nulls
B. UPDATE last column value to "non-null"
C. UPDATE last column value to "null"
      |     Original Code     |  Trim Trailing NULLs  | Improvement (%)
 Case | TPS     Space (pages) | TPS     Space (pages) |   TPS    Space
------+-----------------------+-----------------------+----------------
 3A   | 0.8443          14706 | 0.8626          12739 |   2.3     13.3
 3B   | 0.5307          29412 | 0.5272          27445 |  -0.6      6.7
 3C   | 0.5102          44118 | 0.5218          40184 |   2.2      8.9
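
For narrow rows a few bytes per tuple change how many rows fit on a page, which is where the page counts above come from. The sketch below assumes an 8192-byte page, a 24-byte page header and a 4-byte line pointer per row. The tuple sizes 56, 48 and 40 bytes are assumptions chosen because they reproduce the reported INSERT page counts for 2,000,000 rows; they were not measured, and pages_needed is a hypothetical helper.

```c
#include <assert.h>

/*
 * Pages needed to store nrows rows of tuple_size bytes each, assuming an
 * 8192-byte page, a 24-byte page header and a 4-byte line pointer per row.
 */
long
pages_needed(long nrows, int tuple_size)
{
    long    per_page;

    assert(nrows >= 0 && tuple_size > 0);
    per_page = (8192 - 24) / (tuple_size + 4);
    assert(per_page > 0);
    return (nrows + per_page - 1) / per_page;   /* round up */
}
```

With these assumed sizes, 2,000,000 rows need 14706 pages at 56 bytes, 12739 pages at 48 bytes and 10811 pages at 40 bytes, the same page counts as in rows 2A and 3A of the tables above.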

In conclusion, I would like to say that this patch doesn't cause a
performance regression in the most commonly used scenarios,
and it gives a benefit in some of the trailing-NULL cases.

With Regards,
Amit Kapila.


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: robertmhaas(at)gmail(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2012-12-24 14:13:32
Message-ID: CA+U5nM+j23n0FqLM3AADefrXheH1Ln==HYQ3ubcVdqRHOiWtkQ@mail.gmail.com

On 24 December 2012 13:13, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:

> Apart from above, the performance data for less number of columns (where the
> trailing nulls are such that they cross word boundary) also show similar
> gains:
>
> The below cases (2 & 3) can give benefit as it will save 4 bytes per tuple
>
> 2. Table with 12 columns (first 3 integer followed by 9 Boolean columns)
> A. INSERT tuples with 9 trailing nulls
> B. UPDATE last column value to "non-null"
> C. UPDATE last column value to "null"
>       |     Original Code     |  Trim Trailing NULLs  | Improvement (%)
>  Case | TPS     Space (pages) | TPS     Space (pages) |   TPS    Space
> ------+-----------------------+-----------------------+----------------
>  2A   | 0.8485          12739 | 0.8524          10811 |   0.4     15.1
>  2B   | 0.5847          25478 | 0.5749          23550 |  -1.5      7.5
>  2C   | 0.5591          38217 | 0.5545          34361 |   0.8     10.0
>
>
> 3. Table with 12 columns (first 3 integer followed by 9 Boolean columns)
> A. INSERT tuples with 4 trailing nulls
> B. UPDATE last column value to "non-null"
> C. UPDATE last column value to "null"
>       |     Original Code     |  Trim Trailing NULLs  | Improvement (%)
>  Case | TPS     Space (pages) | TPS     Space (pages) |   TPS    Space
> ------+-----------------------+-----------------------+----------------
>  3A   | 0.8443          14706 | 0.8626          12739 |   2.3     13.3
>  3B   | 0.5307          29412 | 0.5272          27445 |  -0.6      6.7
>  3C   | 0.5102          44118 | 0.5218          40184 |   2.2      8.9
>
> As a conclusion point, I would like to say that this patch doesn't have
> performance regression for most used scenario's
> and it gives benefit in some of the trailing null's cases.

Not really sure about the 100s of columns use case.

But showing gain in useful places in these more common cases wins my vote.

Thanks for testing. Barring objections, will commit.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services