Re: Performance Improvement by reducing WAL for Update Operation

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Andres Freund'" <andres(at)2ndquadrant(dot)com>, "'Hari Babu'" <haribabu(dot)kommi(at)huawei(dot)com>
Cc: "'Greg Smith'" <greg(at)2ndQuadrant(dot)com>, "'Mike Blackwell'" <mike(dot)blackwell(at)rrd(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2013-07-23 13:29:11
Message-ID: 003401ce87a8$9f8322b0$de896810$@kapila@huawei.com
Lists: pgsql-hackers

On Tuesday, July 23, 2013 12:27 AM Andres Freund wrote:
> On 2013-07-19 10:40:01 +0530, Hari Babu wrote:
> >
> > On Friday, July 19, 2013 4:11 AM Greg Smith wrote:
> > >On 7/9/13 12:09 AM, Amit Kapila wrote:
> > > >> I think the first thing to verify is whether the results posted
> > > >> can be validated in some other environment setup by another person.
> > > >> The testcase used is posted at below link:
> > > >> http://www.postgresql.org/message-id/51366323(dot)8070606(at)vmware(dot)com
> >
> > > >That seems easy enough to do here, Heikki's test script is excellent.
> > > >The latest patch Hari posted on July 2 has one hunk that doesn't apply
> > > >anymore now.
> >
> > The Head code change from Heikki is correct.
> > During the patch rebase to latest PG LZ optimization code, the above
> > code change is missed.
> >
> > Apart from the above changed some more changes are done in the patch,
> > those are.
>
> FWIW I don't like this approach very much:
>
> * I'd be very surprised if this doesn't make WAL replay of update heavy
> workloads slower by at least factor of 2.

Yes, if you consider only the cost of replay. But other operations are
involved as well: in the standby case, the WAL has to be transferred,
written, and read before it is applied.
So most of those operations will benefit from the reduced WAL size;
the exception is apply, where you need to decode.

> * It makes data recovery from WAL *noticeably* harder since data
> corruption now is carried forwards and you need the old data to decode
> new data

This is one of the reasons why this optimization is done only when the
new row goes on the same page as the old one.
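
To make the dependency concrete, here is a minimal sketch in Python. It
is only an illustration, not the encoding the patch uses: it assumes a
simple common-prefix/common-suffix delta, and the names encode_delta and
decode_delta are hypothetical. It shows that the new row can be rebuilt
only from an intact copy of the old row.

```python
from typing import Tuple

Delta = Tuple[int, int, bytes]  # (prefix_len, suffix_len, changed middle bytes)


def encode_delta(old: bytes, new: bytes) -> Delta:
    """Describe `new` relative to `old` (hypothetical, simplified scheme)."""
    limit = min(len(old), len(new))
    # Length of the leading run of bytes shared by both rows.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    # Length of the trailing run shared by both rows, not overlapping the prefix.
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    # Only the bytes between prefix and suffix would need to be logged.
    return prefix, suffix, new[prefix:len(new) - suffix]


def decode_delta(old: bytes, delta: Delta) -> bytes:
    """Rebuild the new row; this cannot be done without the old row."""
    prefix, suffix, middle = delta
    tail = old[len(old) - suffix:] if suffix else b""
    return old[:prefix] + middle + tail


old_row = b"id=42|name=alice|balance=100"
new_row = b"id=42|name=alice|balance=250"
delta = encode_delta(old_row, new_row)

# With the intact old row, replay reproduces the new row exactly.
restored = decode_delta(old_row, delta)

# With a damaged old row, the same delta silently yields a different row.
damaged_old = b"id=42|name=XXXXX|balance=100"
wrong = decode_delta(damaged_old, delta)
```

In this sketch the delta carries only the changed middle bytes, which is
where the size reduction comes from, and decoding against damaged_old
returns a row that differs from new_row, which is the corruption
carry-forward concern raised above.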

> * It makes changeset extraction either more expensive or it would have
> to be disabled there.

I think if there is any such implication, we can probably provide an
option to disable it.

> I think my primary issue is that philosophically/architecturally I am of
> the opinion that a wal record should make sense on its own without
> depending on heap data. And this patch loses that.

Is the main worry about corruption getting propagated?

With Regards,
Amit Kapila.
