Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Heikki Linnakangas'" <hlinnakangas(at)vmware(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>, <noah(at)leadboat(dot)com>
Subject: Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date: 2012-09-25 15:27:06
Message-ID: 009001cd9b32$394a4090$abdec1b0$@kapila@huawei.com
Lists: pgsql-hackers

> On Tuesday, September 25, 2012 7:30 PM Heikki Linnakangas wrote:
> On 24.09.2012 13:57, Amit kapila wrote:
> > Rebased version of patch based on latest code.
>
> When HOT was designed, we decided that heap_update needs to compare the
> old and new attributes directly, with memcmp(), to determine whether
> any of the indexed columns have changed. It was deemed infeasible to
> pass down that information from the executor. I don't remember the
> details of why that was, but you seem to be trying to do the same thing
> in this patch, and pass the bitmap of modified cols from the executor to
> heap_update(). I'm pretty sure that won't work, for the same reasons we
> didn't do it for HOT.

I think the reason for not relying on the modified columns is that there can
be cases where they do not give the correct information.
It may be because BEFORE triggers can change which columns are modified; that
is why a HOT update needs to do the comparison.
In our case, we have handled such cases by skipping the optimization, so we do
not rely on the modified columns there.

If you feel the comparison is a must, we can do it in the same way as it is
done for HOT?
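
To make the comparison approach concrete, here is a minimal sketch in Python
(not PostgreSQL code). It assumes each tuple is a list of raw attribute
values, with None standing for NULL; the names modified_columns and
touches_indexed_column are hypothetical, and the bytes comparison only stands
in for the memcmp() check discussed above.

```python
from typing import List, Optional, Set

Attr = Optional[bytes]  # None stands in for a NULL attribute (assumption)


def modified_columns(old_attrs: List[Attr], new_attrs: List[Attr]) -> List[int]:
    """Return the positions whose raw bytes differ between old and new tuple.

    The comparison is purely byte-wise, in the spirit of memcmp(); it does not
    depend on what the executor believes was modified.
    """
    if len(old_attrs) != len(new_attrs):
        raise ValueError("tuples must have the same number of attributes")
    return [i for i, (old, new) in enumerate(zip(old_attrs, new_attrs))
            if old != new]


def touches_indexed_column(old_attrs: List[Attr], new_attrs: List[Attr],
                           indexed_cols: Set[int]) -> bool:
    """True if any indexed column changed, i.e. the update cannot be HOT."""
    return any(i in indexed_cols
               for i in modified_columns(old_attrs, new_attrs))
```

Because the result is derived from the final old and new tuples, a change made
by a BEFORE trigger is detected like any other change.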

> I still feel that it would probably be better to use a generic delta
> encoding scheme, instead of inventing one. How about VCDIFF
> (http://tools.ietf.org/html/rfc3284), for example? Or you could reuse
> the LZ compressor that we already have in the source tree. You can use
> LZ for delta compression by initializing the history buffer of the
> algorithm with the old tuple, and then compressing the new tuple as
> usual.
> Or you could still use the knowledge of where the attributes
> begin and end and which attributes were updated, and do the encoding
> similar to how you did in the patch, but use LZ as the output format.
> That way the decoding would be the same as LZ decompression.
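
As a rough illustration of the "initialize the history buffer with the old
tuple" idea, here is a minimal sketch in Python. It uses zlib's preset
dictionary (the zdict argument) as a stand-in for the LZ compressor in the
source tree, which is a different implementation; encode_delta and
decode_delta are hypothetical names, and tuples are treated as opaque byte
strings.

```python
import zlib


def encode_delta(old_tuple: bytes, new_tuple: bytes) -> bytes:
    """Compress new_tuple with the LZ window preloaded with old_tuple.

    Unchanged runs of the new tuple become back-references into the old one,
    so mostly the changed bytes end up in the output.
    """
    comp = zlib.compressobj(level=9, zdict=old_tuple)
    return comp.compress(new_tuple) + comp.flush()


def decode_delta(old_tuple: bytes, delta: bytes) -> bytes:
    """Rebuild the new tuple; this is plain decompression with the same dictionary."""
    decomp = zlib.decompressobj(zdict=old_tuple)
    return decomp.decompress(delta) + decomp.flush()
```

The decoder needs nothing beyond the old tuple and the delta, which matches
the point that decoding would be the same as LZ decompression.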

Can you please explain why you think that doing LZ compression after the
encoding is better, given that we have already reduced the amount of WAL
for an update by storing only the changed column information?

a. Is it to further reduce the size of WAL?
b. Is it to store the WAL diff in some standard format?
c. Or does it give some other kind of benefit?

With Regards,
Amit Kapila.
