Re: Performance Improvement by reducing WAL for Update Operation

From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2013-12-02 14:10:16
Message-ID: 8977CB36860C5843884E0A18D8747B0372BF1167@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 29 November 2013 03:05 Robert Haas wrote:
> On Wed, Nov 27, 2013 at 9:31 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> > Sure, but to explore (a), the scope is a bit bigger. We have the below
> > options to explore (a):
> > 1. try to optimize existing algorithm as used in patch, which we have
> > tried but of course we can spend some more time to see if anything
> > more can be tried out.
> > 2. try fingerprint technique as suggested by you above.
> > 3. try some other standard methods like vcdiff, lz4 etc.
>
> Well, obviously, I'm hot on idea #2 and think that would be worth
> spending some time on. If we can optimize the algorithm used in the
> patch some more (option #1), that would be fine, too, but the code
> looks pretty tight to me, so I'm not sure how successful that's likely
> to be. But if you have an idea, sure.

I tried modifying the existing patch to support dynamic rollup, as follows:
the rollup kicks in after every 32 bytes of mismatch between the old and new
tuples, and it resets whenever a match is found.

1. pglz-with-micro-optimization-compress-using-newdata-5:

Adds all of the old tuple data to the history and then checks for matches
from the new tuple. After every 32 bytes of mismatch, it checks for a match
only once every 2 bytes, and it repeats this until it finds a match or
reaches the end of the data (see the sketch below).
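
To make the rollup concrete, here is a minimal standalone sketch. It is
not the patch code: the history here is just a flat byte buffer with a
linear scan, and all names (hist_add, hist_find, ROLLUP_THRESHOLD) are
made up for illustration; the real patch applies the same idea inside
pglz_compress()'s hash-based history.

/*
 * Minimal standalone sketch of the dynamic rollup described above.
 * NOT the patch code: the history is a flat byte buffer with a linear
 * scan, and hist_add/hist_find/ROLLUP_THRESHOLD are illustrative names.
 */
#include <stdio.h>
#include <string.h>

#define ROLLUP_THRESHOLD 32     /* mismatched bytes before probing thins out */
#define MIN_MATCH        4      /* shortest match worth reporting */

static char   history[8192];
static size_t hist_len = 0;

static void
hist_add(char c)
{
    if (hist_len < sizeof(history))
        history[hist_len++] = c;
}

/* Return the length of the longest history match at p, or 0 if too short. */
static size_t
hist_find(const char *p, size_t remaining)
{
    size_t  best = 0;

    for (size_t h = 0; h < hist_len; h++)
    {
        size_t  l = 0;

        while (l < remaining && h + l < hist_len && history[h + l] == p[l])
            l++;
        if (l > best)
            best = l;
    }
    return (best >= MIN_MATCH) ? best : 0;
}

static void
scan_with_rollup(const char *newdata, size_t len)
{
    size_t  mismatched = 0;     /* consecutive bytes without any match */
    size_t  i = 0;

    while (i < len)
    {
        /* After 32 bytes of mismatch, probe only once every 2 bytes. */
        size_t  step = (mismatched >= ROLLUP_THRESHOLD) ? 2 : 1;
        size_t  m = hist_find(newdata + i, len - i);

        if (m > 0)
        {
            printf("match of %zu bytes at new-data offset %zu\n", m, i);
            mismatched = 0;     /* a match resets the rollup */
            while (m-- > 0)
                hist_add(newdata[i++]);
        }
        else
        {
            mismatched += step;
            for (size_t s = 0; s < step && i < len; s++)
                hist_add(newdata[i++]);     /* goes out as literal(s) */
        }
    }
}

int
main(void)
{
    const char *olddata = "12345 abcdefgh";
    const char *newdata = "abcdefgh 56789";

    /* Approach 1: seed the history with all of the old tuple up front. */
    for (size_t i = 0; olddata[i] != '\0'; i++)
        hist_add(olddata[i]);
    scan_with_rollup(newdata, strlen(newdata));
    return 0;
}

Because the whole old tuple is in the history before scanning starts, the
"abcdefgh" at the front of the new data is found immediately here.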

2. pglz-with-micro-optimization-compress-using-newdata_snappy_hash-1:

Adds only the first byte of the old tuple data to the history and then checks
for a match from the new tuple. If a match is found, the next unmatched byte
from the old tuple is added to the history, and the process repeats.

If no match is found, it adds the next byte of the old tuple to the history,
followed by the unmatched byte from the new tuple data.

In this case the performance is good, but if the new data contains any forward
references into the old data, then it will not compress the data.

E.g. old data: 12345 abcdefgh
     new data: abcdefgh 56789
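
Here is a sketch of this second approach, reusing history[], hist_add()
and hist_find() from the sketch above (again illustrative only, not the
patch code):

/*
 * Sketch of approach 2, reusing history[], hist_add() and hist_find()
 * from the sketch above (illustrative only, not the patch code).
 * Old-tuple bytes are fed into the history lazily, one per iteration,
 * instead of all up front.
 */
static void
scan_with_lazy_history(const char *olddata, size_t old_len,
                       const char *newdata, size_t new_len)
{
    size_t  old_pos = 0;
    size_t  new_pos = 0;

    hist_add(olddata[old_pos++]);   /* seed with the first old byte only */

    while (new_pos < new_len)
    {
        size_t  m = hist_find(newdata + new_pos, new_len - new_pos);

        if (m > 0)
        {
            /* Match: add the next unmatched old-tuple byte and repeat. */
            printf("match of %zu bytes at new-data offset %zu\n", m, new_pos);
            if (old_pos < old_len)
                hist_add(olddata[old_pos++]);
            new_pos += m;
        }
        else
        {
            /*
             * No match: add the next old-tuple byte to the history,
             * then the unmatched new-tuple byte (which goes out as a
             * literal).
             */
            if (old_pos < old_len)
                hist_add(olddata[old_pos++]);
            hist_add(newdata[new_pos++]);
        }
    }
}

Run with a fresh (empty) history on the example above, this never finds a
match: by the time the old tuple's "abcdefgh" has trickled into the
history, the matching "abcdefgh" at the front of the new data has already
gone out as literals. That is the forward-reference limitation.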

The updated patches and performance data are attached to this mail.
Please let me know your suggestions.

Regards,
Hari babu.

Attachment Content-Type Size
pglz-with-micro-optimization-compress-using-newdata-5.patch application/octet-stream 39.3 KB
pglz-with-micro-optimization-compress-using-newdata_snappy_hash-1.patch application/octet-stream 39.5 KB
test_readings_with_rollup.txt text/plain 3.7 KB
