Re: Performance Improvement by reducing WAL for Update Operation

From: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
To: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2013-07-08 21:21:43
Message-ID: CANPAkgvzGLxURUUoX7ZXMZWeO9r=VjMXM6T4KQ7L=nV4ZZL2BQ@mail.gmail.com
Lists: pgsql-hackers

I can't comment on further direction for the patch, but since it was marked
as Needs Review in the CF app I took a quick look at it.

It patches and compiles clean against the current Git HEAD, and 'make
check' runs successfully.

Does it need documentation for the GUC variable
'wal_update_compression_ratio'?

__________________________________________________________________________________
*Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley*
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
Mike(dot)Blackwell(at)rrd(dot)com
http://www.rrdonnelley.com

On Tue, Jul 2, 2013 at 2:26 AM, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com> wrote:

> On Friday, June 07, 2013 5:07 PM Amit Kapila wrote:
> >On Wednesday, March 06, 2013 2:57 AM Heikki Linnakangas wrote:
> >> On 04.03.2013 06:39, Amit Kapila wrote:
> >> > On Sunday, March 03, 2013 8:19 PM Craig Ringer wrote:
> >> >> On 02/05/2013 11:53 PM, Amit Kapila wrote:
> >> >>>> Performance data for the patch is attached with this mail.
> >> >>>> Conclusions from the readings (these are same as my previous patch):
> >> >>>>
> >>
> >> The attached patch also just adds overhead in most cases, but the
> >> overhead is much smaller in the worst case. I think that's the right
> >> tradeoff here - we want to avoid scenarios where performance falls off
> >> the cliff. That said, if you usually just get a slowdown, we certainly
> >> can't make this the default, and if we can't turn it on by default,
> >> this probably just isn't worth it.
> >>
> >> The attached patch contains the variable-hash-size changes I posted in
> >> the "Optimizing pglz compressor" thread. But in the delta encoding
> >> function, it goes further than that, and contains some further
> >> micro-optimizations: the hash is calculated in a rolling fashion, and it
> >> uses a specialized version of the pglz_hist_add macro that knows that the
> >> input can't exceed 4096 bytes. Those changes shaved off some cycles, but
> >> you could probably do more. One idea is to only add every 10 bytes or so
> >> to the history lookup table; that would sacrifice some compressibility
> >> for speed.
> >>
> >> If you could squeeze the pglz_delta_encode function to be cheap enough
> >> that we could enable this by default, this would be a pretty cool patch.
> >> Or at least, the overhead in the cases where you get no compression
> >> needs to be brought down, to about 2-5% at most I think. If it can't
> >> be done easily, I feel that this probably needs to be dropped.
>
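
For readers following along, the two ideas mentioned above (computing the
hash in a rolling fashion, and adding only every Nth position to the history
lookup table) can be illustrated with a minimal Python sketch. This is not
the pglz code: WINDOW, BASE, MOD and all function names are assumptions made
up for this example.

```python
WINDOW = 4        # bytes hashed per history entry (assumed for illustration)
BASE = 257        # polynomial base (assumed)
MOD = 1 << 13     # number of hash buckets (assumed)


def direct_hash(data, pos):
    """Hash the WINDOW bytes starting at pos from scratch."""
    h = 0
    for b in data[pos:pos + WINDOW]:
        h = (h * BASE + b) % MOD
    return h


def rolling_hashes(data):
    """Yield (pos, hash) for every window, updating the hash incrementally."""
    if len(data) < WINDOW:
        return
    top = pow(BASE, WINDOW - 1, MOD)   # weight of the byte leaving the window
    h = direct_hash(data, 0)
    yield 0, h
    for pos in range(1, len(data) - WINDOW + 1):
        # Drop the oldest byte, shift, and append the newest byte.
        h = ((h - data[pos - 1] * top) * BASE + data[pos + WINDOW - 1]) % MOD
        yield pos, h


def build_history(data, stride=1):
    """Map hash -> positions, keeping only every stride-th window position."""
    table = {}
    for pos, h in rolling_hashes(data):
        if pos % stride == 0:
            table.setdefault(h, []).append(pos)
    return table
```

With stride=10 the table holds roughly a tenth of the entries, so building it
is cheaper, but a match that starts between two indexed positions is missed,
which is the compressibility-for-speed trade mentioned above.
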
> >After trying some more on optimizing pglz_delta_encode(), I found that if
> >we use new data also in history, then the results of compression and CPU
> >utilization are much better.
>
> >In addition to the pglz micro-optimization changes, the following changes
> >are done in the modified patch:
>
> >1. The unmatched new data is also added to the history, which can be
> >referenced later.
> >2. To incorporate this change in the LZ algorithm, 1 extra control bit is
> >needed to indicate whether the data is from the old or the new tuple.
>
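
The two changes listed above can be sketched roughly as follows. This is a
toy Python model, not the patch: the op tuples, MIN_MATCH, and the function
names are assumptions for illustration, and the boolean from_new stands in
for the extra control bit that says whether a match refers to the old tuple
or to new data already emitted.

```python
MIN_MATCH = 4  # shortest match worth encoding as a copy (assumed)


def delta_encode(old: bytes, new: bytes):
    """Encode new as literals plus copies from old or from earlier new data."""
    history = {}
    for p in range(len(old) - MIN_MATCH + 1):
        history[old[p:p + MIN_MATCH]] = (False, p)   # False: from old tuple
    ops, i = [], 0
    while i < len(new):
        key = new[i:i + MIN_MATCH]
        hit = history.get(key)
        length = 0
        if hit is not None:
            from_new, pos = hit
            # Matches in new data may only use bytes already emitted.
            src, limit = (new, i) if from_new else (old, len(old))
            while (i + length < len(new) and pos + length < limit
                   and src[pos + length] == new[i + length]):
                length += 1
        if length >= MIN_MATCH:
            ops.append(("copy", from_new, pos, length))
            i += length
        else:
            # Unmatched new data also goes into the history.
            if len(key) == MIN_MATCH:
                history.setdefault(key, (True, i))   # True: from new tuple
            ops.append(("lit", new[i]))
            i += 1
    return ops


def delta_decode(old: bytes, ops):
    """Rebuild the new tuple from the old tuple and the op list."""
    out = bytearray()
    for op in ops:
        if op[0] == "lit":
            out.append(op[1])
        else:
            _, from_new, pos, length = op
            src = out if from_new else old
            out += src[pos:pos + length]
    return bytes(out)
```

In this model a value that is repeated inside the new tuple but absent from
the old one is stored once as literals and then referenced by a copy with
from_new set, which is where the additional compression would come from.
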
> The patch is rebased to use the new pglz algorithm optimization changes
> which got committed recently.
>
> Performance Data
> -----------------
>
> Head code:
>
>                 testname                 | wal_generated |     duration
> -----------------------------------------+---------------+------------------
>  two short fields, no change             |    1232911016 | 35.1784930229187
>  two short fields, one changed           |    1240322016 | 35.0436308383942
>  two short fields, both changed          |    1235318352 | 35.4989421367645
>  one short and one long field, no change |    1042332336 | 23.4457180500031
>  ten tiny fields, all changed            |    1395194136 | 41.9023628234863
>  hundred tiny fields, first 10 changed   |     626725984 | 21.2999589443207
>  hundred tiny fields, all changed        |     621899224 | 21.6676609516144
>  hundred tiny fields, half changed       |     623998272 | 21.2745981216431
>  hundred tiny fields, half nulled        |     557714088 | 19.5902800559998
>
>
> pglz-with-micro-optimization-compress-using-newdata-2:
>
>                 testname                 | wal_generated |     duration
> -----------------------------------------+---------------+------------------
>  two short fields, no change             |    1232903384 | 35.0115969181061
>  two short fields, one changed           |    1232906960 | 34.3333759307861
>  two short fields, both changed          |    1232903520 | 35.7665238380432
>  one short and one long field, no change |     649647992 | 19.4671010971069
>  ten tiny fields, all changed            |    1314957136 | 39.9727990627289
>  hundred tiny fields, first 10 changed   |     458684024 | 17.8197758197784
>  hundred tiny fields, all changed        |     461028464 | 17.3083391189575
>  hundred tiny fields, half changed       |     456528696 | 17.1769199371338
>  hundred tiny fields, half nulled        |     480548936 | 18.81720495224
>
> Observations
> ---------------
> 1. It yielded compression in more cases (see all of the "hundred tiny
> fields" cases).
> 2. CPU utilization is also better.
>
>
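
To make the comparison between the two tables above easier to read, the
relative change per test can be computed with a few lines of Python. The
numbers are copied from the tables; the function names and the choice to
express the change as a percentage of the head value are assumptions of this
sketch.

```python
# (testname, head wal_generated, patched wal_generated,
#  head duration, patched duration), copied from the two tables.
RESULTS = [
    ("two short fields, no change",
     1232911016, 1232903384, 35.1784930229187, 35.0115969181061),
    ("two short fields, one changed",
     1240322016, 1232906960, 35.0436308383942, 34.3333759307861),
    ("two short fields, both changed",
     1235318352, 1232903520, 35.4989421367645, 35.7665238380432),
    ("one short and one long field, no change",
     1042332336, 649647992, 23.4457180500031, 19.4671010971069),
    ("ten tiny fields, all changed",
     1395194136, 1314957136, 41.9023628234863, 39.9727990627289),
    ("hundred tiny fields, first 10 changed",
     626725984, 458684024, 21.2999589443207, 17.8197758197784),
    ("hundred tiny fields, all changed",
     621899224, 461028464, 21.6676609516144, 17.3083391189575),
    ("hundred tiny fields, half changed",
     623998272, 456528696, 21.2745981216431, 17.1769199371338),
    ("hundred tiny fields, half nulled",
     557714088, 480548936, 19.5902800559998, 18.81720495224),
]


def pct_reduction(before, after):
    """Reduction relative to the head value, in percent (negative = worse)."""
    return 100.0 * (before - after) / before


def summarize(rows):
    """Return {testname: (WAL reduction %, duration reduction %)}."""
    return {
        name: (pct_reduction(w0, w1), pct_reduction(d0, d1))
        for name, w0, w1, d0, d1 in rows
    }


if __name__ == "__main__":
    for name, (wal, dur) in summarize(RESULTS).items():
        print(f"{name:<40} WAL {wal:7.2f}%  duration {dur:7.2f}%")
```

A negative duration figure means the patched run was slower than head for
that test.
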
> Performance data for pgbench-related scenarios is attached in the document
> (pgbench_lz_opt_compress_using_newdata-2.htm):
>
> 1. Better reduction in WAL.
> 2. A TPS increase can be observed once the record size is >= 250.
> 3. There is a small performance penalty for a single thread (0.36~3.23),
>    but where the penalty is 3.23 for a single thread, the TPS improvement
>    for 8 threads is high.
>
> Please suggest how to proceed further with this patch.
>
> Regards,
> Hari babu.