Quick Links

Re: [WIP] Performance Improvement by reducing WAL for Update Operation

Lists:	pgsql-hackers

From:	Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"heikki(dot)linnakangas(at)enterprisedb(dot)com" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-21 14:57:25
Message-ID:	6C0B27F7206C9E4CA54AE035729E9C382852C382@szxeml509-mbs
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

From: Heikki Linnakangas [mailto:heikki(dot)linnakangas(at)enterprisedb(dot)com]
Sent: Saturday, August 04, 2012 1:33 AM
On 03.08.2012 14:46, Amit kapila wrote:

>> Currently the change is done only for fixed length columns for simple tables and the tuple should not contain >>NULLS.
>
>> This is a Proof of concept, the design and implementation needs to be
>>changed based on final design required for handling other scenario's
>

> The performance will need to be re-verified after you fix these
> limitations. Those limitations need to be fixed before this can be applied.

> It would be nice to use some well-known binary delta algorithm for this,
> rather than invent our own. OTOH, we have more knowledge of the
> attribute boundaries, so a custom algorithm might work better. In any
> case, I'd like to see the code to do the delta encoding/decoding to be
> put into separate functions, outside of heapam.c. It would be good for
> readability, and we might want to reuse this in other places too.

I have modified the patch to handle varible length fields and NULLS as well. I have used custom algorithm based on
attribute boundaries.
Currently it handles updates across pages, but as a concern is raised by Robert and Simon about handle it only if the update is on same page, I can do it in the next version of patch if that is required.

Design of Update operation:
---------------------------------------------
1. Check for tuple whether the optimization can be applicable or not. For following it is applicable
a. Tuple should not be toasted tuple
b. Before update triggers should return NEW
c. Length of modified tuple should be more than 128 bytes.
d. When the pull page write is enabled. Currently this is checked inside heap_update, however it can be pulled
to ExecUpdate()
2. Identify the modified columns from the target entry.
3. Based on the modified column list, frame the wal record in the below format.
WAL update header + Tuple header(no change from previous format) +
COPY + offset until copy required from old tuple.
IGN + length needs to be ignored from the old tuple.
PAD + length needs to padded with zero in new tuple because of alignment.
ADD + length of data + data which is modified.
....
....

Recovery Design
------------------------------
Frame the new tuple from old tuple and WAL record during recovery:

1. For the COPY command, copy the specified length from old tuple.
Once the old tuple data copied, then increase the offset by the
copied length.
2. For the IGN command, ignore the specified length in the old tuple.
3. For the PAD command, fill with zeros of the specified length in the new tuple.
4. For the ADD command, copy the corresponding length of data from WAL record to the new tuple.
5. Repeat this procedure until the WAL record reaches the end.
6. If any remaining left out old tuple data will be copied at last.

Test results:

1. The pgbench test run for 10min.
2. The test reult is for modified pgbench (such that total row size is 1800 and updated columns are of length 300) tpc-b testcase.
The result and modified pgbench code is attached with mail.
3. The performance improvement shown in the m/c I have tested is quite good (more than 100% for sync commit = off).

Comments and further suggestions?

With Regards,
Amit Kapila.

Attachment	Content-Type	Size
pgbench.c	application/octet-stream	63.2 KB
modified_pgbench.htm	text/html	24.4 KB
wal_update_changes2.patch	application/octet-stream	28.0 KB

From:	Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"'Jesper Krogh'" <jesper(at)krogh(dot)cc>
Cc:	<pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-22 03:04:05
Message-ID:	003a01cd8012$caf3bbb0$60db3310$@kapila@huawei.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

From: Jesper Krogh [mailto:jesper(at)krogh(dot)cc]
Sent: Wednesday, August 22, 2012 1:13 AM
On 21/08/12 16:57, Amit kapila wrote:

>>Test results:

>>1. The pgbench test run for 10min.
>> 2. The test reult is for modified pgbench (such that total row size is
1800 and updated columns are of length 300) tpc-b testcase.
>> The result and modified pgbench code is attached with mail.

>> 3. The performance improvement shown in the m/c I have tested is quite
good (more than 100% for sync commit = off).

> I cannot comment on completeness or correctness of the code, but I do
think a relevant test would be
> to turn synchronous_commit on as default.

> Even though you aim at an improved performance, it would be nice to see
the reduction in WAL-size
>as an effect of this patch.

Yes, I shall take care of doing both the above tests and send the report.

With Regards,

Amit Kapila.

From:	Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"'Amit Kapila'" <amit(dot)kapila(at)huawei(dot)com>, "'Jesper Krogh'" <jesper(at)krogh(dot)cc>
Cc:	<pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-22 14:08:33
Message-ID:	008201cd806f$9e084640$da18d2c0$@kapila@huawei.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Amit Kapila
Sent: Wednesday, August 22, 2012 8:34 AM
From: Jesper Krogh [mailto:jesper(at)krogh(dot)cc]
Sent: Wednesday, August 22, 2012 1:13 AM
On 21/08/12 16:57, Amit kapila wrote:

>>Test results:

>>>1. The pgbench test run for 10min.
>>> 2. The test reult is for modified pgbench (such that total row size is
1800 and updated columns are of length 300) tpc-b testcase.
>>> The result and modified pgbench code is attached with mail.

>>> 3. The performance improvement shown in the m/c I have tested is quite
good (more than 100% for sync commit = off).

>> I cannot comment on completeness or correctness of the code, but I do
think a relevant test would be
>> to turn synchronous_commit on as default.

>> Even though you aim at an improved performance, it would be nice to see
the reduction in WAL-size
>>as an effect of this patch.

> Yes, I shall take care of doing both the above tests and send the report.

The data for WAL reduction is as below:

The number of transactions processed are 16000 by doing update only of size
250 bytes with an record size of 1800.

I had made sure no full_page_write happens by making checkpoint interval and
checkpoints segments longer.

Original code - 1.8G Modified code - 1.1G Diff - 63% reduction, incase
of fill factor 100.
Original code - 1.6G Modified code - 1.1G Diff - 45% reduction, incase
of fill factor 80.

I am still in process of collecting synchronous commit mode on data.

Please let me know what more kind of data will be helpful to indicate the
benefits of this implementation.

With Regards,

Amit Kapila.

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc:	'Jesper Krogh' <jesper(at)krogh(dot)cc>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-23 20:41:51
Message-ID:	20120823204151.GD5971@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Aug 22, 2012 at 07:38:33PM +0530, Amit Kapila wrote:
> I had made sure no full_page_write happens by making checkpoint interval and
> checkpoints segments longer.
>
>
>
> Original code - 1.8G Modified code - 1.1G Diff - 63% reduction, incase of
> fill factor 100.
> Original code - 1.6G Modified code - 1.1G Diff - 45% reduction, incase of
> fill factor 80.
>
>
>
> I am still in process of collecting synchronous commit mode on data.

Wow, that sounds promising.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

From:	Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"'Bruce Momjian'" <bruce(at)momjian(dot)us>
Cc:	"'Jesper Krogh'" <jesper(at)krogh(dot)cc>, <pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-24 03:46:05
Message-ID:	001801cd81aa$fdca00f0$f95e02d0$@kapila@huawei.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

From: Bruce Momjian [mailto:bruce(at)momjian(dot)us]
Sent: Friday, August 24, 2012 2:12 AM
On Wed, Aug 22, 2012 at 07:38:33PM +0530, Amit Kapila wrote:
>> I had made sure no full_page_write happens by making checkpoint interval
and
>> checkpoints segments longer.
>>
>
>
>> Original code - 1.8G Modified code - 1.1G Diff - 63% reduction,
incase of
>> fill factor 100.
>> Original code - 1.6G Modified code - 1.1G Diff - 45% reduction,
incase of
>> fill factor 80.
>
>
>
>> I am still in process of collecting synchronous commit mode on data.

> Wow, that sounds promising.
Thanks you.

Right now I am collecting the data for Synchronous_commit =on mode; My
initial observation is that
incase fsync is off, the results are good(around 50% perf improvement).
However if fsync is on, the performance results fall down to 3~5%. I am not
sure even if the data for I/O is reduced,
Still why there is no big performance gain as in case of Synchronous_commit
= off or when fsync is off.

I am trying with different methods of wal_sync_method parameter and by
setting some value of commit_delay as suggested by Peter Geoghegan in one of
his mails.

Please suggest me if anyone has any thoughts on what kind of parameter's are
best for such a use case or let me know if I am missing anything and such
kind of performance improvement can only improve performance for fsync =off
case.

With Regards,
Amit Kapila.

From:	Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"'Bruce Momjian'" <bruce(at)momjian(dot)us>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgreSQL(dot)org>
Cc:	"'Jesper Krogh'" <jesper(at)krogh(dot)cc>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-25 11:46:11
Message-ID:	6C0B27F7206C9E4CA54AE035729E9C382852DB51@szxeml509-mbs
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

From: pgsql-hackers-owner(at)postgresql(dot)org [pgsql-hackers-owner(at)postgresql(dot)org] on behalf of Amit Kapila [amit(dot)kapila(at)huawei(dot)com]
Sent: Friday, August 24, 2012 9:16 AM
From: Bruce Momjian [mailto:bruce(at)momjian(dot)us]
Sent: Friday, August 24, 2012 2:12 AM
On Wed, Aug 22, 2012 at 07:38:33PM +0530, Amit Kapila wrote:
>> I had made sure no full_page_write happens by making checkpoint interval
and
>> checkpoints segments longer.
>>
>
>
>> Original code - 1.8G Modified code - 1.1G Diff - 63% reduction,
incase of
>> fill factor 100.
>> Original code - 1.6G Modified code - 1.1G Diff - 45% reduction,
incase of
>> fill factor 80.
>
>
>
>> I am still in process of collecting synchronous commit mode on data.

> Wow, that sounds promising.
Thanks you.

> Right now I am collecting the data for Synchronous_commit =on mode; My
> initial observation is that
> incase fsync is off, the results are good(around 50% perf improvement).
> However if fsync is on, the performance results fall down to 3~5%. I am not
> sure even if the data for I/O is reduced,
> Still why there is no big performance gain as in case of Synchronous_commit
> = off or when fsync is off.

The modified pgbench test and testdata for synchronous commit mode is attached with this mail.
The test has shown upto 13% performance improvement in one of the cases.

I am still working on to collect some more performance data for wal_sync_method - OPEN_SYNC and by varying XLOG buffer size.

With Regards,
Amit Kapila.

Attachment	Content-Type	Size
pgbench_modified_for_sync_commit_on_data.c	text/plain	64.3 KB
pgbench_sync_commit_on.htm	text/html	23.6 KB

From:	Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-27 12:18:46
Message-ID:	6C0B27F7206C9E4CA54AE035729E9C382852DE51@szxeml509-mbs
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

From: pgsql-hackers-owner(at)postgresql(dot)org [pgsql-hackers-owner(at)postgresql(dot)org] on behalf of Amit kapila [amit(dot)kapila(at)huawei(dot)com]
From: pgsql-hackers-owner(at)postgresql(dot)org [pgsql-hackers-owner(at)postgresql(dot)org] on behalf of Amit Kapila [amit(dot)kapila(at)huawei(dot)com]
Sent: Friday, August 24, 2012 9:16 AM
From: Bruce Momjian [mailto:bruce(at)momjian(dot)us]
Sent: Friday, August 24, 2012 2:12 AM
On Wed, Aug 22, 2012 at 07:38:33PM +0530, Amit Kapila wrote:

I have implemented the WAL Reduction Patch for the case of HOT Update as pointed out by Simon and Robert. In this patch it only goes for Optimized WAL in case of HOT Update with other restrictions same as in previous patch.

The performance numbers for this patch are attached in this mail. It has improved by 90% if the page has fillfactor 80.

Now going forward I have following options:
a. Upload the patch in Open CF for WAL Reduction which contains reductution for HOT and non-HOT updates.
b. Upload the patch in Open CF for WAL Reduction which contains reductution for HOT updates.
c. Upload both the patches as different versions.

Can you suggest which should be the best way to go for this Patch?

Based on above I should send the testing I have done for this Patch.

Any other Suggestions?

With Regards,
Amit Kapila.

Attachment	Content-Type	Size
wal_update_changes_hot_update.patch	text/plain	26.8 KB
pgbench_hotupdate.htm	text/html	24.9 KB
pgbench_for_hot_update.c	text/plain	63.2 KB

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Cc:	"pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-27 12:28:19
Message-ID:	503B67E3.7060405@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 27.08.2012 15:18, Amit kapila wrote:
> I have implemented the WAL Reduction Patch for the case of HOT Update as pointed out by Simon and Robert. In this patch it only goes for Optimized WAL in case of HOT Update with other restrictions same as in previous patch.
>
> The performance numbers for this patch are attached in this mail. It has improved by 90% if the page has fillfactor 80.
>
> Now going forward I have following options:
> a. Upload the patch in Open CF for WAL Reduction which contains reductution for HOT and non-HOT updates.
> b. Upload the patch in Open CF for WAL Reduction which contains reductution for HOT updates.
> c. Upload both the patches as different versions.

Let's do it for HOT updates only. Simon & Robert made good arguments on
why this is a bad idea for non-HOT updates.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	<pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-27 13:30:40
Message-ID:	004501cd8458$26f2fea0$74d8fbe0$@kapila@huawei.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

From: Heikki Linnakangas [mailto:heikki(dot)linnakangas(at)enterprisedb(dot)com]
Sent: Monday, August 27, 2012 5:58 PM
To: Amit kapila
On 27.08.2012 15:18, Amit kapila wrote:
>> I have implemented the WAL Reduction Patch for the case of HOT Update as
pointed out by Simon and Robert. In this patch it only goes for Optimized
WAL in case of HOT Update with other restrictions same as in previous patch.
>>
>> The performance numbers for this patch are attached in this mail. It has
improved by 90% if the page has fillfactor 80.
>>
>> Now going forward I have following options:
>> a. Upload the patch in Open CF for WAL Reduction which contains
reductution for HOT and non-HOT updates.
>> b. Upload the patch in Open CF for WAL Reduction which contains
reductution for HOT updates.
>> c. Upload both the patches as different versions.

> Let's do it for HOT updates only. Simon & Robert made good arguments on
> why this is a bad idea for non-HOT updates.

Okay, I shall do it that way.
So now I shall send information about all the testing I have done for this
Patch and then Upload it in CF.

With Regards,
Amit Kapila.

From:	Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date:	2012-08-29 04:53:10
Message-ID:	6C0B27F7206C9E4CA54AE035729E9C382852F41E@szxeml509-mbs
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On August 27, 2012 7:00 PM Amit Kapila wrote:
On August 27, 2012 5:58 PM Heikki Linnakangas wrote:
On 27.08.2012 15:18, Amit kapila wrote:
>>> I have implemented the WAL Reduction Patch for the case of HOT Update as

>> Let's do it for HOT updates only. Simon & Robert made good arguments on
>> why this is a bad idea for non-HOT updates.

> Okay, I shall do it that way.
> So now I shall send information about all the testing I have done for this
> Patch and then Upload it in CF.

Test Scenario's are below and testcases for same are attached with this mail.

Scenario1:
Recover the data where the field data is updated with different value from an exisitng data of an integer field.
Steps:
1. Start the server, create table, insert one record into the table.
2. update the integer field with other than existing data.
3. Shutdown the server immediately.
4. Start the server and connect the client and check the data in the table.
Expected behavior:
The updated data should present in the table after database recovery.

Scenario2:
Recover the data where the field data is updated with different value from an exisitng data of char and varchar fields.
Steps:
1. Start the server, create table, insert one record into the table.
2. update both char and varchar fields with other than existing data.
3. Shutdown the server immediately.
4. Start the server and connect the client and check the data in the table.
Expected behavior:
The updated data should present in the table after database recovery.

Scenario3:
Recover the data where the field data is updated with NULL value from an exisitng data of a field.
Steps:
1. Start the server, create table, insert one record into the table.
2. update a field with NULL value.
3. Shutdown the server immediately.
4. Start the server and connect the client and check the data in the table.
Expected behavior:
The updated data should present in the table after database recovery.

Scenario4:
Recover the data where the field data is updated with a proper value from an exisitng data of a field where the row contains NULL data.
Steps:
1. Start the server, create table, insert one record into the table.
2. update a field with a different value other than existing data.
3. Shutdown the server immediately.
4. Start the server and connect the client and check the data in the table.
Expected behavior:
The updated data should present in the table after database recovery.

Scenario5:
Recover the data where all fields data is updated with NULL value from an exisitng data of a fields.
Steps:
1. Start the server, create table, insert one record into the table.
2. update all fields with NULL values.
3. Shutdown the server immediately.
4. Start the server and connect the client and check the data in the table.
Expected behavior:
The updated data should present in the table after database recovery.

Scenario6:
Recover the data of updated field of a table where the table contains a toast table.
Steps:
1. Start the server, create table, insert one record into the table.
2. update a field with a different value other than existing data.
3. Shutdown the server immediately.
4. Start the server and connect the client and check the data in the table.
Expected behavior:
The updated data should present in the table after database recovery.

Scenario7:
Recover the data of updated field of a table where the row length is less than 128 bytes.
Steps:
1. Start the server, create table, insert one record into the table.
2. update a field with a different value other than existing data.
3. Shutdown the server immediately.
4. Start the server and connect the client and check the data in the table.
Expected behavior:
The updated data should present in the table after database recovery.

Scenario8:
Recover the data of updated field of a table where the before trigger modifies the tuple before the tuple updates.
Steps:
1. Start the server, create table, insert one record into the table.
2. create a before trigger which modifies the same record.
3. update a field with a different value other than existing data.
4. Shutdown the server immediately.
5. Start the server and connect the client and check the data in the table.
Expected behavior:
The updated data should present in the table after database recovery.

Scenario9:
Recover the data where the update operation fails because of trigger returns NULL.
Steps:
1. Start the server, create table, insert one record into the table.
2. update a field fails as before trigger returns NULL.
3. Shutdown the server immediately.
4. Start the server and connect the client and check the data in the table.
Expected behavior:
The update command shouldn't be effective after recovery also.

With Regards,
Amit Kapila.

Attachment	Content-Type	Size
test_wal_reduction_update_sql.txt	text/plain	3.8 KB