Re: Performance Improvement by reducing WAL for Update Operation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <greg(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2013-12-15 07:12:41
Message-ID: CAA4eK1+XOxnCzMBbywrtSN_xgLn9LgNBpwHnm75ddPTKViccEQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 12, 2013 at 8:53 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Dec 12, 2013 at 12:27 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Thu, Dec 12, 2013 at 3:43 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>>> This patch fails the regression tests; see attachment.
>>
>> However, to keep the sanity of patch, I will do that and post an
>> updated patch, but I think the main idea behind new approach at this
>> point is to get feedback on if such an optimization is acceptable
>> for worst case scenarios and if not whether we can get this done
>> with table level or GUC option.
>
> I don't understand why lack of decoding support should cause
> regression tests to fail. I thought decoding was only being done
> during WAL replay, a case not exercised by the regression tests.

I had mentioned decoding because the messages in the regression failures can come
from the decompress/decode functions. Today I debugged the regression failures and
found that they come from pglz_decompress(); the cause is that the optimization in
pglz_find_match() to reduce the length 'maxlen' had a problem, due to which the data
was not being compressed properly. I have corrected the calculation of 'maxlen';
compression now works properly and the regression tests pass.
This problem had been observed earlier as well and was corrected in one of the
previous versions of the patch, but I forgot to carry that fix over while preparing
the combined patch of chunk-wise and byte-by-byte encoding.
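To make the nature of that fix easier to follow, below is a minimal sketch
(a hypothetical helper, not the patch code) of the kind of 'maxlen' calculation
the pglz history search depends on, using the pg_lzcompress.c naming conventions
('dp' for the current input position, 'dend' for the input end):

#include <stdint.h>

/*
 * Illustrative sketch only: if maxlen is computed too small, the
 * history search never reports a usable match and the data is stored
 * essentially uncompressed, which is what the regression failures
 * exposed.
 */
#define PGLZ_MAX_MATCH 273      /* longest match the pglz format can encode */

static int32_t
clamp_match_len(const char *dp, const char *dend)
{
    int32_t     maxlen = PGLZ_MAX_MATCH;

    if (dend - dp < maxlen)
        maxlen = (int32_t) (dend - dp); /* never look past the input end */
    return maxlen;
}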

> A few other comments:
>
> +#define PGRB_HKEY_PRIME 11 /* prime number used for rolling hash */
> +#define PGRB_HKEY_SQUARE_PRIME 11 * 11 /* prime number used for rolling hash */
> +#define PGRB_HKEY_CUBE_PRIME 11 * 11 * 11 /* prime number used for rolling hash */
>
> 11 * 11 can't accurately be described as a prime number. Nor can 11 *
> 11 * 11. Please adjust the comment.

Fixed.

>Also, why 11?

I tried 11, 31, and 101 to see which generates better chunks, and I found that 11
chooses better chunks (with 31 and 101 there was almost no chunking; the whole
data was treated as one chunk).
The data I used was the test data of the current test case we are using to
evaluate this patch. It is mostly repetitive, so we might need to test with other
kinds of data as well to verify that the currently used number is okay.
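For anyone who wants to reproduce that comparison, here is a minimal sketch (not
the patch code; the boundary mask and the helper are hypothetical) of a polynomial
rolling hash over a 4-byte window using the multipliers the patch defines, and of
how the choice of multiplier affects how often a chunk boundary fires:

#include <stddef.h>
#include <stdint.h>

#define PGRB_HKEY_PRIME         11              /* multiplier for rolling hash */
#define PGRB_HKEY_SQUARE_PRIME  (11 * 11)
#define PGRB_HKEY_CUBE_PRIME    (11 * 11 * 11)
#define PGRB_CHUNK_MASK         0x1F    /* hypothetical: boundary when low 5 bits are zero */

/* Count how many chunk boundaries the current multiplier produces on 'data'. */
static size_t
count_chunk_boundaries(const unsigned char *data, size_t len)
{
    uint32_t    hash;
    size_t      nboundaries = 0;
    size_t      i;

    if (len < 4)
        return 0;

    /* hash of the first 4-byte window */
    hash = data[0] * PGRB_HKEY_CUBE_PRIME +
           data[1] * PGRB_HKEY_SQUARE_PRIME +
           data[2] * PGRB_HKEY_PRIME +
           data[3];

    for (i = 4; i <= len; i++)
    {
        if ((hash & PGRB_CHUNK_MASK) == 0)
            nboundaries++;
        if (i == len)
            break;
        /* roll the window: drop data[i - 4], shift in data[i] */
        hash = (hash - data[i - 4] * PGRB_HKEY_CUBE_PRIME) * PGRB_HKEY_PRIME
               + data[i];
    }
    return nboundaries;
}

Running this with different multipliers over the test data should show directly
how often each one produces a boundary; a multiplier that mixes the byte values
poorly makes the boundary condition almost never true, which matches the
"whole data as one chunk" behaviour I saw with 31 and 101.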

> It doesn't appear that pglz_hist_idx is changed except for whitespace;
> please revert that hunk.

Fixed.

Thanks for review.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
pgrb_delta_encoding_v2.patch application/octet-stream 46.3 KB
