Re: Performance Improvement by reducing WAL for Update Operation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2014-02-06 03:59:41
Message-ID: CA+TgmoYho9dRbdBvE8Dvw1jAohJceC4Qrax-zm=BcWtyt0TWUg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 5, 2014 at 6:43 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> So, I came up with the attached worst case test, modified from your latest
> test suite.
>
> unpatched:
>
>
>                testname               | wal_generated |     duration
> --------------------------------------+---------------+------------------
>  ten long fields, all but one changed |     343385312 | 2.20806908607483
>  ten long fields, all but one changed |     336263592 | 2.18997097015381
>  ten long fields, all but one changed |     336264504 | 2.17843413352966
> (3 rows)
>
>
> pgrb_delta_encoding_v8.patch:
>
>                testname               | wal_generated |     duration
> --------------------------------------+---------------+------------------
>  ten long fields, all but one changed |     338356944 | 3.33501315116882
>  ten long fields, all but one changed |     344059272 | 3.37364101409912
>  ten long fields, all but one changed |     336257840 | 3.36244201660156
> (3 rows)
>
> So with this test, the overhead is very significant.

Yuck. Well, that sucks.

> With the skipping logic, another kind of "worst case" case is that you have
> a lot of similarity between the old and new tuple, but you miss it because
> you skip. For example, if you change the first few columns, but leave a
> large text column at the end of the tuple unchanged.

I suspect there's no way to have our cake and eat it, too. Most of
the work that Amit has done on this patch in the last few revs is to
cut back CPU overhead in the cases where the patch can't help because
the tuple has been radically modified. If we're trying to get maximum
compression, we need to go the other way: for example, we could just
feed both the old and new tuples through pglz (or snappy, or
whatever). That would allow us to take advantage not only of
similarity between the old and new tuples but also internal
duplication within either the old or the new tuple, but it would also
cost more CPU. The concern about minimizing overhead in cases where
the compression doesn't help has so far pushed us in the opposite
direction: we pass over compression opportunities that a more
aggressive algorithm could find, in order to keep the overhead low.
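To make that tradeoff concrete, here is a toy LZ77-style sketch (entirely
mine, not pglz or the patch): it encodes the new tuple with the old tuple
prepended as history, so back-references can exploit both cross-tuple
similarity and internal duplication, at the price of a match search over
everything seen so far.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MIN_MATCH 4             /* shortest back-reference worth emitting */

/*
 * Toy sketch: compress the new tuple with the old tuple prepended as
 * history.  Back-references can then point into the old tuple (cross-tuple
 * similarity) or into earlier parts of the new tuple (internal
 * duplication).  Returns the number of output tokens; a real encoder would
 * emit literal bytes and (offset, length) pairs instead of counting them.
 */
static size_t
lz_token_count(const char *oldp, size_t oldlen,
               const char *newp, size_t newlen)
{
    size_t  buflen = oldlen + newlen;
    char   *buf = malloc(buflen);
    size_t  pos,
            tokens = 0;

    memcpy(buf, oldp, oldlen);
    memcpy(buf + oldlen, newp, newlen);

    /* Encode only the new-tuple portion, starting at offset oldlen. */
    for (pos = oldlen; pos < buflen;)
    {
        size_t  best = 0;

        /* Brute-force longest-match search over everything seen so far. */
        for (size_t cand = 0; cand < pos; cand++)
        {
            size_t  len = 0;

            while (pos + len < buflen && buf[cand + len] == buf[pos + len])
                len++;
            if (len > best)
                best = len;
        }
        if (best >= MIN_MATCH)
            pos += best;        /* one token covers 'best' bytes */
        else
            pos++;              /* literal byte */
        tokens++;
    }
    free(buf);
    return tokens;
}
```

The quadratic match search is what makes this kind of scheme expensive;
pglz bounds it with a history table, and the cost of building exactly that
kind of structure is the problem discussed further down.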

Off-hand, I'm wondering why we shouldn't apply the same skipping
algorithm that Amit is using at the beginning of the string for the
rest of it as well. It might be a little too aggressive (maybe the
skip distance shouldn't increase by quite as much as doubling every
time, or not beyond 16/32 bytes?) but I don't see why the general
principle isn't sound wherever we are in the tuple.
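For illustration, the skip-distance idea might look something like this
(a toy sketch with names of my own, not the actual code from Amit's
patch; the doubling cap reflects the 16/32-byte suggestion above):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SKIP 32             /* cap on skip growth, per the suggestion above */

/*
 * Count bytes of 'newp' that match 'oldp' at the same offset, probing with
 * an adaptive skip: double the skip distance after each failed probe
 * (capped at MAX_SKIP), and reset it to 1 whenever a match is found.
 */
static size_t
skip_scan_matched(const char *oldp, const char *newp, size_t len)
{
    size_t  matched = 0;
    size_t  skip = 1;
    size_t  i = 0;

    while (i < len)
    {
        if (oldp[i] == newp[i])
        {
            matched++;
            skip = 1;           /* found similarity: probe densely again */
            i++;
        }
        else
        {
            i += skip;          /* no match: accelerate past the dissimilar region */
            if (skip < MAX_SKIP)
                skip *= 2;
        }
    }
    return matched;
}
```

Note that this also demonstrates Heikki's worst case: if the first few
bytes differ, the growing skip can jump past bytes that would have
matched, trading missed compression for lower scan cost.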

Unfortunately, despite changing things to make a history entry only
every 4th character, building the history is still pretty expensive.
By the time we even begin looking at the tuple we're gonna compress,
we've already spent something like half the total effort, and of
course we have to go further than that before we know whether our
attempt to compress is actually going anywhere. I think that's the
central problem here. pglz has several safeguards to ensure that it
doesn't do too much work in vain: we abort if we find nothing
compressible within first_success_by bytes, or if we emit enough total
output to be certain that we won't meet the need_rate threshold.
Those safeguards are a lot less effective here because they can't be
applied until *after* we've already paid the cost of building the
history. If we could figure out some way to apply those guards, or
other guards, earlier in the algorithm, we could do a better job
mitigating the worst-case scenarios, but I don't have a good idea.
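The two guards can be sketched as follows (a toy of mine, not the real
pglz code: "first_success_by" follows pglz's strategy field, "need_rate"
follows the wording above, and the run-length "compressor" inside the loop
is a stand-in, since the point is only where the guards fire):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
    size_t  first_success_by;   /* abort if nothing compressed in this many bytes */
    int     need_rate;          /* required savings, as a percentage of input */
} strategy;

/*
 * Try to compress src; return false as soon as either guard proves the
 * attempt is hopeless, true (with *dstlen set) on success.
 */
static bool
try_compress(const unsigned char *src, size_t srclen,
             size_t *dstlen, const strategy *strat)
{
    size_t  consumed = 0;
    size_t  emitted = 0;
    bool    found_match = false;
    /* Largest output size that still achieves need_rate percent savings. */
    size_t  result_max = srclen - srclen * (size_t) strat->need_rate / 100;

    while (consumed < srclen)
    {
        /* Guard 1: give up if nothing compressible was found early on. */
        if (!found_match && consumed >= strat->first_success_by)
            return false;
        /* Guard 2: give up once the output is provably too large. */
        if (emitted > result_max)
            return false;

        /* Stand-in compressor: runs of >= 3 bytes become a 2-byte token. */
        size_t  run = 1;

        while (consumed + run < srclen && run < 255 &&
               src[consumed + run] == src[consumed])
            run++;
        if (run >= 3)
        {
            emitted += 2;
            found_match = true;
        }
        else
            emitted += run;     /* short runs are emitted as literals */
        consumed += run;
    }
    if (emitted > result_max)
        return false;
    *dstlen = emitted;
    return true;
}
```

In the delta-encoding case, neither check can run until the history over
the old tuple has already been built, which is precisely the up-front cost
that the guards fail to protect.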

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
