Re: Performance Improvement by reducing WAL for Update Operation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2014-01-27 17:03:41
Message-ID: CAA4eK1KZ27VANOfXFnS5UoTXi38adamx=n-=A-7JOoe2DPfCNQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 22, 2014 at 12:41 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jan 21, 2014 at 2:00 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Mon, Jan 20, 2014 at 9:49 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> I ran Heikki's test suite on latest master and latest master plus
>>> pgrb_delta_encoding_v4.patch on a PPC64 machine, but the results
>>> didn't look too good. The only tests where the WAL volume changed by
>>> more than half a percent were the "one short and one long field, no
>>> change" test, where it dropped by 17%, but at the expense of an
>>> increase in duration of 38%; and the "hundred tiny fields, half
>>> nulled" test, where it dropped by 2% without a change in runtime.
>>
>>> Unfortunately, some of the tests where WAL didn't change significantly
>>> took a runtime hit - in particular, "hundred tiny fields, half
>>> changed" slowed down by 10% and "hundred tiny fields, all changed" by
>>> 8%.
>>
>> I think this part of the result is positive, as with earlier approaches the
>> dip here was > 20%. Refer to the results posted at:
>> http://www.postgresql.org/message-id/51366323.8070606@vmware.com
>>
>> Basically, if we don't go for a longer match, then for a test where most of
>> the data ("one short and one long field, no change") is similar, it has to
>> do the extra steps below with no advantage:
>> a. copy extra tags
>> b. calculation for rolling hash
>> c. finding the match
>> I think here major cost is due to 'a', but others might also not be free.
>> To confirm the theory, if we run the test by just un-commenting above
>> code, there can be significant change in both WAL reduction and
>> runtime for this test.
>>
>> I have one idea to avoid the overhead of step a), which is to combine
>> the tags: don't write a tag until un-matching data is found.
>> When any unmatched data is found, combine all the previously
>> matched data and write it as one tag.
>> This should eliminate the overhead due to step a.
>
> I think that's a good thing to try. Can you code it up?

I have tried to improve the algorithm in another way, so that we get the
benefit of identical chunks while finding a match (something similar to LZ).
The main change is to consider chunks at a fixed boundary (4 bytes)
and, after finding a match, to check whether there is a longer match than the
current chunk. While looking for a longer match, it still takes care that
the next bigger match is at a chunk boundary. I am not
completely sure about the chunk boundary; maybe 8 or 16 would give
better results.
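
To make the scheme described above concrete, here is a minimal sketch in Python (not the C code in the patch). All names (`index_chunks`, `delta_encode`, `delta_decode`, the `("copy", ...)`/`("lit", ...)` ops) are hypothetical, and the sketch assumes that only the old tuple is indexed at aligned 4-byte chunks while the new tuple is scanned byte by byte; the patch may differ on both points.

```python
CHUNK = 4  # chunk size in bytes; 8 or 16 are the alternatives mentioned above


def index_chunks(old: bytes, chunk: int = CHUNK) -> dict:
    """Map each aligned chunk of the old tuple to its first offset."""
    table = {}
    for off in range(0, len(old) - chunk + 1, chunk):
        table.setdefault(old[off:off + chunk], off)
    return table


def delta_encode(old: bytes, new: bytes, chunk: int = CHUNK) -> list:
    """Return a list of ("copy", old_offset, length) and ("lit", bytes) ops."""
    table = index_chunks(old, chunk)
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        off = table.get(new[pos:pos + chunk]) if pos + chunk <= len(new) else None
        if off is None:
            # No chunk match here: keep the byte as literal data.
            literal.append(new[pos])
            pos += 1
            continue
        # Extend the match one whole chunk at a time, so that the longer
        # match still ends on a chunk boundary of the old tuple.
        length = chunk
        while (off + length + chunk <= len(old)
               and pos + length + chunk <= len(new)
               and old[off + length:off + length + chunk]
               == new[pos + length:pos + length + chunk]):
            length += chunk
        if literal:
            ops.append(("lit", bytes(literal)))
            literal = bytearray()
        ops.append(("copy", off, length))
        pos += length
    if literal:
        ops.append(("lit", bytes(literal)))
    return ops


def delta_decode(old: bytes, ops: list) -> bytes:
    """Rebuild the new tuple from the old tuple and the op list."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)
```

With an unchanged tuple the whole thing collapses into a single copy op, which is the "no change" case where the longer match pays off.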

I think we can now run the tests once with this patch on a high-end machine.

Below is the data on my laptop.

Non-Default Settings
checkpoint_segments = 128
checkpoint_timeout = 15min
autovacuum = off

Before Patch

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 one short and one long field, no change |    1054922336 | 25.4784970283508
 one short and one long field, no change |    1054914728 | 45.9248871803284
 one short and one long field, no change |    1054911288 | 42.0877709388733
 hundred tiny fields, all changed        |     633946880 | 21.4810841083527
 hundred tiny fields, all changed        |     633943520 | 29.5192229747772
 hundred tiny fields, all changed        |     633943944 | 38.1980679035187
 hundred tiny fields, half changed       |     633946784 | 36.0654091835022
 hundred tiny fields, half changed       |     638136544 |  36.231675863266
 hundred tiny fields, half changed       |     633944072 | 30.7445759773254
 hundred tiny fields, half nulled        |     570130888 | 28.6964628696442
 hundred tiny fields, half nulled        |     569755584 | 32.7119750976562
 hundred tiny fields, half nulled        |     569760312 | 32.4714169502258
(12 rows)

After Patch

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 one short and one long field, no change |     662239704 | 22.8768830299377
 one short and one long field, no change |     662896760 |  22.466646194458
 one short and one long field, no change |     662878736 | 17.6034708023071
 hundred tiny fields, all changed        |     633946192 | 24.5791938304901
 hundred tiny fields, all changed        |     634161120 | 25.7798039913177
 hundred tiny fields, all changed        |     633946416 |  23.761885881424
 hundred tiny fields, half changed       |     633945512 | 24.7001428604126
 hundred tiny fields, half changed       |     633947944 | 25.2069280147552
 hundred tiny fields, half changed       |     633946480 | 26.6489980220795
 hundred tiny fields, half nulled        |     492199720 | 28.7052059173584
 hundred tiny fields, half nulled        |     492194576 | 26.6311559677124
 hundred tiny fields, half nulled        |     492449408 | 25.2788209915161
(12 rows)

With the above modifications, I see ~37% WAL reduction for the best case,
"one short and one long field, no change", and ~13% for
"hundred tiny fields, half nulled". The duration fluctuates quite a lot in most
runs, so maybe running it on a better machine can give us a clearer picture.
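
For reference, the percentages can be rechecked from the tables with a few lines of Python. This is only a sketch: the helper names `reduction_pct` and `spread` are made up for illustration, and averaging the three runs per test is an assumption about how to summarize them.

```python
from statistics import mean

# wal_generated values copied from the "Before Patch" and "After Patch" tables.
wal_before = {
    "one short and one long field, no change": [1054922336, 1054914728, 1054911288],
    "hundred tiny fields, half nulled": [570130888, 569755584, 569760312],
}
wal_after = {
    "one short and one long field, no change": [662239704, 662896760, 662878736],
    "hundred tiny fields, half nulled": [492199720, 492194576, 492449408],
}


def reduction_pct(before: list, after: list) -> float:
    """Percentage drop of the mean 'after' value relative to the mean 'before'."""
    return 100.0 * (1.0 - mean(after) / mean(before))


def spread(values: list) -> float:
    """Relative gap between the slowest and fastest run, as a fraction of the fastest."""
    return (max(values) - min(values)) / min(values)


for name in wal_before:
    print(f"{name}: {reduction_pct(wal_before[name], wal_after[name]):.1f}% less WAL")

# Durations of the three "no change" runs before the patch, to show the noise.
no_change_durations = [25.4784970283508, 45.9248871803284, 42.0877709388733]
print(f"duration spread before patch: {spread(no_change_durations):.0%}")
```

The spread in the unpatched "no change" durations is large compared with any of the runtime differences between the two tables, which is why the timing numbers from this laptop should not be read too closely.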

Any suggestions?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
pgrb_delta_encoding_v5.patch application/octet-stream 39.7 KB
