Re: Performance Improvement by reducing WAL for Update Operation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2014-02-06 12:43:55
Message-ID: CAA4eK1JwMaYZUYh8N+TsTnVRO-XZ-fpg22a_WqRRdo2RjpU_MA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 5, 2014 at 8:56 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Feb 5, 2014 at 5:13 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> On 02/05/2014 07:54 AM, Amit Kapila wrote:
>>
>> That's not the worst case, by far.
>>
>> First, note that the skipping while scanning new tuple is only performed in
>> the first loop. That means that as soon as you have a single match, you fall
>> back to hashing every byte. So for the worst case, put one 4-byte field as
>> the first column, and don't update it.
>>
>> Also, I suspect the runtimes in your test were dominated by I/O. When I
>> scale down the number of rows involved so that the whole test fits in RAM, I
>> get much bigger differences with and without the patch. You might also want
>> to turn off full_page_writes, to make the effect clear with less data.
>>
>> So with this test, the overhead is very significant.
>>
>> With the skipping logic, another kind of "worst case" case is that you have
>> a lot of similarity between the old and new tuple, but you miss it because
>> you skip.
>
> This is exactly the reason why I have not kept skipping logic in second
> pass(loop), but I think may be it would have been better to keep it not
> as aggressive as in first pass.

I have tried merging pass-1 and pass-2 while keeping the skipping logic
the same. This has reduced the overhead to a good extent, but not
completely, for the new case you have added. The change is only meant to
check whether merging can reduce the overhead; if we want to proceed with
it, maybe we can limit the skip factor so that the chance of skipping
over matching data is reduced.
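
To make the skip-factor idea concrete, here is a minimal sketch of a
single scanning loop with a capped skip step. It is not the code from the
attached patch: CHUNK_LEN, MAX_SKIP, chunk_in_old() and
count_matched_bytes() are hypothetical names, and a plain memcmp() search
stands in for the patch's hash lookup to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_LEN 4     /* hypothetical minimum match length */
#define MAX_SKIP  8     /* hypothetical cap on the skip step */

/* Return 1 if the CHUNK_LEN bytes at p occur anywhere in the old tuple. */
static int
chunk_in_old(const char *old_data, size_t old_len, const char *p)
{
    size_t      i;

    for (i = 0; i + CHUNK_LEN <= old_len; i++)
    {
        if (memcmp(old_data + i, p, CHUNK_LEN) == 0)
            return 1;
    }
    return 0;
}

/*
 * Scan the new tuple in a single loop.  On a miss, advance by a skip step
 * that doubles after each consecutive miss but never exceeds MAX_SKIP; on
 * a hit, consume the chunk and reset the step to 1.  Returns the number
 * of new-tuple bytes covered by matched chunks.
 */
size_t
count_matched_bytes(const char *old_data, size_t old_len,
                    const char *new_data, size_t new_len)
{
    size_t      pos = 0;
    size_t      matched = 0;
    size_t      skip = 1;

    assert(old_data != NULL && new_data != NULL);

    while (pos + CHUNK_LEN <= new_len)
    {
        if (chunk_in_old(old_data, old_len, new_data + pos))
        {
            matched += CHUNK_LEN;
            pos += CHUNK_LEN;
            skip = 1;           /* a hit resets the skip step */
        }
        else
        {
            pos += skip;
            if (skip < MAX_SKIP)
                skip *= 2;      /* back off, but never beyond MAX_SKIP */
        }
    }
    return matched;
}
```

With old data "abcdefgh" and new data "zzabcdzz", this loop probes
offsets 0, 1 and 3 and steps over the match that starts at offset 2,
which is the kind of missed similarity a smaller cap is meant to reduce.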

A new version of the patch is attached to this mail.

Unpatched

           testname           | wal_generated |     duration
------------------------------+---------------+------------------
 ten long fields, all changed |     348842856 | 6.93688106536865
 ten long fields, all changed |     348843672 | 7.53063702583313
 ten long fields, all changed |     352662344 | 7.76640701293945
(3 rows)

pgrb_delta_encoding_v8.patch

             testname             | wal_generated |     duration
----------------------------------+---------------+------------------
 ten long fields, but one changed |     348848144 | 9.22694897651672
 ten long fields, but one changed |     348841376 | 9.11818099021912
 ten long fields, but one changed |     352963488 | 8.37875485420227
(3 rows)

pgrb_delta_encoding_v9.patch

             testname             | wal_generated |     duration
----------------------------------+---------------+------------------
 ten long fields, but one changed |     350166320 | 8.84561610221863
 ten long fields, but one changed |     348840728 | 8.45299792289734
 ten long fields, but one changed |     348846656 | 8.34846496582031
(3 rows)

It appears to me that it could be a good idea to merge both patches
(prefix-suffix encoding + delta encoding) such that if we get reasonable
compression (50% or so) with prefix-suffix encoding, we return without
doing delta encoding, and if the compression is less than that, we do
delta encoding for the rest of the tuple. I think this would be good
because doing only prefix-suffix encoding might miss many cases where
good compression is possible.
If you think this is a viable approach, I can merge both patches and
check the results.
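
As a rough illustration of that decision (not code from either patch), the
sketch below measures the common prefix and suffix of the old and new
tuple data and applies a 50% threshold. The names EncodeChoice and
choose_encoding() are hypothetical, and the exact threshold test is an
assumption based on the "50% or so" figure above.

```c
#include <assert.h>
#include <stddef.h>

typedef enum
{
    ENCODE_PREFIX_SUFFIX_ONLY,          /* prefix-suffix saving is enough */
    ENCODE_PREFIX_SUFFIX_PLUS_DELTA     /* also delta-encode the middle */
} EncodeChoice;

/*
 * Compute the lengths of the common prefix and common suffix of the old
 * and new data (the suffix never overlaps the prefix), then decide whether
 * prefix-suffix encoding alone covers at least half of the new tuple.
 */
EncodeChoice
choose_encoding(const char *old_data, size_t old_len,
                const char *new_data, size_t new_len,
                size_t *prefix_len, size_t *suffix_len)
{
    size_t      min_len = (old_len < new_len) ? old_len : new_len;
    size_t      prefix = 0;
    size_t      suffix = 0;

    assert(prefix_len != NULL && suffix_len != NULL);

    while (prefix < min_len && old_data[prefix] == new_data[prefix])
        prefix++;

    while (suffix < min_len - prefix &&
           old_data[old_len - 1 - suffix] == new_data[new_len - 1 - suffix])
        suffix++;

    *prefix_len = prefix;
    *suffix_len = suffix;

    /* Assumed threshold: shared bytes make up at least 50% of the new tuple. */
    if ((prefix + suffix) * 2 >= new_len)
        return ENCODE_PREFIX_SUFFIX_ONLY;
    return ENCODE_PREFIX_SUFFIX_PLUS_DELTA;
}
```

In the second case the caller would run delta encoding only on the bytes
between prefix_len and new_len - suffix_len, so the more expensive pass
is limited to the part that prefix-suffix matching could not cover.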

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment                   | Content-Type             | Size
pgrb_delta_encoding_v9.patch | application/octet-stream | 37.6 KB
