Quick Links

Re: Performance Improvement by reducing WAL for Update Operation

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject:	Re: Performance Improvement by reducing WAL for Update Operation
Date:	2014-02-10 15:02:21
Message-ID:	CAA4eK1LOR9OYiOuDwg0SfgrQ0_Zw1q5MckB6THyUUK=2oY4rKA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Feb 6, 2014 at 5:57 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Thu, Feb 6, 2014 at 9:13 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Considering above change as correct, I have tried to see the worst
> case overhead for this patch by having new tuple such that after
> 25% or so of suffix/prefix match, there is a small change in tuple
> and kept rest of tuple same as old tuple and it shows overhead
> for this patch as well.
>
> Updated test script is attached.
>
> Unpatched
> testname | wal_generated | duration
> ----------------------------------+---------------+------------------
> ten long fields, 8 bytes changed | 348843824 | 5.56866788864136
> ten long fields, 8 bytes changed | 348844800 | 5.84434294700623
> ten long fields, 8 bytes changed | 350500000 | 5.92329406738281
> (3 rows)
>
>
>
> wal-update-prefix-suffix-encode-1.patch
>
> testname | wal_generated | duration
> ----------------------------------+---------------+------------------
> ten long fields, 8 bytes changed | 348845624 | 6.92243480682373
> ten long fields, 8 bytes changed | 348847000 | 8.35828399658203
> ten long fields, 8 bytes changed | 350204752 | 7.61826491355896
> (3 rows)
>
> One minor point, can we avoid having prefix tag if prefixlen is 0.
>
> + /* output prefix as a tag */
> + pgrb_out_tag(ctrlp, ctrlb, ctrl, bp, prefixlen, hlen);

I think generating out tag for suffix/prefix has one bug i.e it doesn't
consider the max length of 273 bytes (PGLZ_MAX_MATCH ) which
is mandatory for LZ format.

One more point about this patch is that in function pgrb_delta_encode(),
is it mandatory to return at end in below check:

if (result_size > result_max)
return false;

I mean to say as before starting for copying literal bytes we have
already ensured that the compressed data is greater than >25%, so
may be we can avoid this check. I have tried to take the data by
removing this check and found that it reduces overhead and improves
WAL reduction as well. The data is as below (compare this with data
in above mail for unpatched version data):

If we want to go with this approach, then I think apart from above
points there is no major change required (may be some comments,
function names etc. can be improved).

> But if we really just want to do prefix/suffix compression, this is a
> crappy and expensive way to do it. We needn't force everything
> through the pglz tag format just because we elide a common prefix or
> suffix.

Here are you bothered about below code where the patch is
doing byte-by-byte copy after prefix/suffix match?

/* output bytes between prefix and suffix as literals */
dp = &source[prefixlen];
dend = &source[slen - suffixlen];
while (dp < dend)
{
pgrb_out_literal(ctrlp, ctrlb, ctrl, bp, *dp);
dp++; /* Do not do this ++ in the line above! */
}

I think if we want to change LZ format, it will be bit more work and
verification for decoding has to be done much more strenuously.

Note - During performance test, I have focussed mainly on worst case,
because we already know that this idea is good for best and average cases.
However if we decide that this is better and good to proceed, I can take
the data for other cases as well.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Re: Performance Improvement by reducing WAL for Update Operation at 2014-02-06 12:27:50 from Amit Kapila

Responses

Re: Performance Improvement by reducing WAL for Update Operation at 2014-02-13 05:01:53 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2014-02-10 15:13:57	Re: jsonb and nested hstore
Previous Message	Marios Vodas	2014-02-10 14:59:15	GiST, getting the record in table that the leaf entry points to