Re: Performance Improvement by reducing WAL for Update Operation

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2014-02-05 11:43:28
Message-ID: 52F223E0.6030306@vmware.com
Lists: pgsql-hackers

On 02/05/2014 07:54 AM, Amit Kapila wrote:
> On Tue, Feb 4, 2014 at 11:58 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Tue, Feb 4, 2014 at 12:39 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> Now there is approximately 1.4~5% CPU gain for
>>> "hundred tiny fields, half nulled" case
>
>> Assuming that the logic isn't buggy, a point in need of further study,
>> I'm starting to feel like we want to have this. And I might even be
>> tempted to remove the table-level off switch.
>
> I have stressed the worst case further, since you are thinking of
> removing the table-level switch, and found that even if we increase the
> data by approximately 8 times ("ten long fields, all changed", each
> field containing 80 bytes of data), the CPU overhead is still < 5%,
> which clearly shows that the overhead doesn't grow much even when the
> amount of unmatched data grows by a much larger factor.
> So the worst-case data adds more weight to your statement
> ("remove table-level switch"); however, there is no harm in keeping the
> table-level option with a default of 'true', so that users who are
> really sure the updates in their system will have nothing in common can
> set this new option to 'false'.
>
> Below is data for the new case "ten long fields, all changed", added
> in the attached script file:

That's not the worst case, by far.

First, note that the skipping while scanning the new tuple is only
performed in the first loop. That means that as soon as you have a single
match, you fall back to hashing every byte. So for the worst case, put
one 4-byte field as the first column, and don't update it.

Also, I suspect the runtimes in your test were dominated by I/O. When I
scale down the number of rows involved so that the whole test fits in
RAM, I get much bigger differences with and without the patch. You might
also want to turn off full_page_writes, to make the effect clear with
less data.

So, I came up with the attached worst case test, modified from your
latest test suite.

unpatched:

               testname               | wal_generated |     duration
--------------------------------------+---------------+------------------
 ten long fields, all but one changed |     343385312 | 2.20806908607483
 ten long fields, all but one changed |     336263592 | 2.18997097015381
 ten long fields, all but one changed |     336264504 | 2.17843413352966
(3 rows)

pgrb_delta_encoding_v8.patch:

               testname               | wal_generated |     duration
--------------------------------------+---------------+------------------
 ten long fields, all but one changed |     338356944 | 3.33501315116882
 ten long fields, all but one changed |     344059272 | 3.37364101409912
 ten long fields, all but one changed |     336257840 | 3.36244201660156
(3 rows)

So with this test, the overhead is very significant.

With the skipping logic, another kind of "worst case" is that you have a
lot of similarity between the old and new tuple, but you miss it because
you skip. For example, if you change the first few columns, but leave a
large text column at the end of the tuple unchanged.

- Heikki

Attachment Content-Type Size
wal-update-testsuite.sh application/x-sh 14.9 KB
