Re: Performance Improvement by reducing WAL for Update Operation

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject:	Re: Performance Improvement by reducing WAL for Update Operation
Date:	2014-02-04 17:39:00
Message-ID:	CAA4eK1LGoDL_P6fq3j+RdjT829Fg-Y5or5MYquLXCYXLeF7_eA@mail.gmail.com
Lists:	pgsql-hackers

On Fri, Jan 31, 2014 at 1:35 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Jan 31, 2014 at 12:33 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Thu, Jan 30, 2014 at 12:23 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> On Wed, Jan 29, 2014 at 8:13 PM, Heikki Linnakangas
>>> <hlinnakangas(at)vmware(dot)com> wrote:
>>>
>>> After basic verification of back-to-pglz-like-delta-encoding-1, I will
>>> take the data with both the patches and report the same.
>>
>> I have corrected the problems reported in back-to-pglz-like-delta-encoding-1
>> and removed hindex from pgrb_delta_encoding_v6 and attached are
>> new versions of both patches.
>>
>> I/O Reduction Data
>> -----------------------------
>> Non-Default settings
>> autovacuum = off
>> checkpoint_segments = 256
>> checkpoint_timeout = 15min
>>
>> Observations
>> --------------------
>> 1. With both the patches WAL reduction is similar i.e ~37% for
>> "one short and one long field, no change" and 12% for
>> "hundred tiny fields, half nulled"
>> 2. With pgrb_delta_encoding_v7, there is ~19% CPU reduction for best
>> case "one short and one long field, no change".
>> 3. With pgrb_delta_encoding_v7, there is approximately 8~9% overhead
>> for cases where there is no match
>> 4. With pgrb_delta_encoding_v7, there is approximately 15~18% overhead
>> for "hundred tiny fields, half nulled" case
>> 5. With back-to-pglz-like-delta-encoding-2, the data is mostly similar except
>> for "hundred tiny fields, half nulled" where CPU overhead is much more.
>>
>> I think the main reason for the overhead is that we store the last offset
>> of matching data at the front of the history, so during a match it has to
>> traverse back many times to find the longest possible match. In the real
>> world it won't be the case that most history entries share the same hash
>> index, so it should not have an effect.
>
> If we want to improve CPU usage for cases like "hundred tiny fields, half
> nulled" (which I think is not important), forming the history table by
> traversing from the end rather than the beginning can serve the purpose. I
> have not tried it, but I think it can certainly help.

I have implemented the above idea of forming the history table by traversing
the old tuple from the end instead of from the beginning, and have made some
optimizations in find match to break out of the loop based on a "good match"
concept similar to pglz. The advantage is that longer matches are found
quickly, so even for the "hundred tiny fields, half nulled" case there is now
no CPU overhead, without any significant effect on the other cases.
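
To make the idea concrete, here is a minimal Python sketch of the mechanics. It
is not the code from the attached patch: the names build_history and
find_match, the 4-byte chunk size, and the fixed good_match threshold are all
assumptions made for illustration (pglz itself lowers its good-match threshold
as it walks the chain, which this sketch does not do).

```python
def build_history(old, chunk=4):
    """Index every `chunk`-byte window of `old`, walking from the end.

    Each new entry is pushed at the front of its chain, so after the
    backward walk the head of a chain is the EARLIEST offset in `old`.
    """
    head = {}                # chunk bytes -> first offset in the chain
    nxt = [-1] * len(old)    # offset -> next offset in the same chain
    for pos in range(len(old) - chunk, -1, -1):
        key = old[pos:pos + chunk]
        nxt[pos] = head.get(key, -1)
        head[key] = pos
    return head, nxt


def find_match(old, new, new_pos, head, nxt, chunk=4, good_match=128):
    """Return (offset in old, match length) for the data at new[new_pos:].

    The chain walk stops early once a match of at least `good_match`
    bytes has been found (hypothetical fixed threshold).
    """
    best_off, best_len = -1, 0
    cand = head.get(new[new_pos:new_pos + chunk], -1)
    while cand != -1:
        length = 0
        while (cand + length < len(old)
               and new_pos + length < len(new)
               and old[cand + length] == new[new_pos + length]):
            length += 1
        if length > best_len:
            best_off, best_len = cand, length
        if best_len >= good_match:
            break            # good enough, skip the rest of the chain
        cand = nxt[cand]
    return best_off, best_len


old_tuple = b"abcdXXXXabcdefgh"
new_tuple = b"abcdefgh"
head, nxt = build_history(old_tuple)
print(find_match(old_tuple, new_tuple, 0, head, nxt))
print(find_match(old_tuple, new_tuple, 0, head, nxt, good_match=4))
```

With the default threshold the whole chain is examined; with a small
good_match the search stops at the first candidate that is long enough, which
is the trade-off between match quality and CPU time discussed above.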

Please find the updated patch attached to this mail; the new data is below:

Non-Default settings
--------------------
autovacuum = off
checkpoint_segments = 256
checkpoint_timeout = 15min

Unpatched
---------
                    testname                    | wal_generated |     duration
------------------------------------------------+---------------+------------------
 one short and one long field, no change        |    1055025424 | 14.3506939411163
 one short and one long field, no change        |    1056580160 | 18.1261160373688
 one short and one long field, no change        |    1054914792 |  15.104973077774
 hundred tiny fields, all changed               |     636948992 | 16.3172590732574
 hundred tiny fields, all changed               |     633943680 |  16.308168888092
 hundred tiny fields, all changed               |     636516776 | 16.4316298961639
 hundred tiny fields, half changed              |     633948288 | 16.5795118808746
 hundred tiny fields, half changed              |     636068648 | 16.2913551330566
 hundred tiny fields, half changed              |     635848432 | 15.9602961540222
 hundred tiny fields, half nulled               |     569758744 | 15.9501180648804
 hundred tiny fields, half nulled               |     569760112 | 15.9422838687897
 hundred tiny fields, half nulled               |     570609712 | 16.5659689903259
 nine short and one long field, thirty % change |     698908824 | 12.7938749790192
 nine short and one long field, thirty % change |     698905400 | 12.0160901546478
 nine short and one long field, thirty % change |     698909720 | 12.2999179363251

After pgrb_delta_encoding_v8.patch
----------------------------------
                    testname                    | wal_generated |     duration
------------------------------------------------+---------------+------------------
 one short and one long field, no change        |     680203392 | 12.4820687770844
 one short and one long field, no change        |     677340120 | 11.8634090423584
 one short and one long field, no change        |     677333288 | 11.9269840717316
 hundred tiny fields, all changed               |     633950264 | 16.7694170475006
 hundred tiny fields, all changed               |     635496520 | 16.9294109344482
 hundred tiny fields, all changed               |     633942832 | 18.0690770149231
 hundred tiny fields, half changed              |     633948024 | 17.0814690589905
 hundred tiny fields, half changed              |     633947488 | 17.0073189735413
 hundred tiny fields, half changed              |     633949224 | 17.0454230308533
 hundred tiny fields, half nulled               |     499950184 | 16.3303508758545
 hundred tiny fields, half nulled               |     499952888 | 15.7197980880737
 hundred tiny fields, half nulled               |     499958120 | 15.7198679447174
 nine short and one long field, thirty % change |     559831384 | 12.0672481060028
 nine short and one long field, thirty % change |     559829472 | 11.8555760383606
 nine short and one long field, thirty % change |     559832760 | 11.9470820426941
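
For anyone who wants to reproduce the percentages from the raw numbers, the
following Python sketch averages the three runs of each test and prints the
relative change in WAL volume and duration. The helper names (pct_change,
summarize) and the choice of the mean over the three runs are assumptions made
here for illustration; the figures quoted in the observations may have been
derived differently.

```python
from statistics import mean

# Each run is (wal_generated in bytes, duration in seconds), copied from the
# unpatched and pgrb_delta_encoding_v8 tables.
UNPATCHED = {
    "one short and one long field, no change": [
        (1055025424, 14.3506939411163), (1056580160, 18.1261160373688),
        (1054914792, 15.104973077774)],
    "hundred tiny fields, all changed": [
        (636948992, 16.3172590732574), (633943680, 16.308168888092),
        (636516776, 16.4316298961639)],
    "hundred tiny fields, half changed": [
        (633948288, 16.5795118808746), (636068648, 16.2913551330566),
        (635848432, 15.9602961540222)],
    "hundred tiny fields, half nulled": [
        (569758744, 15.9501180648804), (569760112, 15.9422838687897),
        (570609712, 16.5659689903259)],
    "nine short and one long field, thirty % change": [
        (698908824, 12.7938749790192), (698905400, 12.0160901546478),
        (698909720, 12.2999179363251)],
}
PATCHED = {
    "one short and one long field, no change": [
        (680203392, 12.4820687770844), (677340120, 11.8634090423584),
        (677333288, 11.9269840717316)],
    "hundred tiny fields, all changed": [
        (633950264, 16.7694170475006), (635496520, 16.9294109344482),
        (633942832, 18.0690770149231)],
    "hundred tiny fields, half changed": [
        (633948024, 17.0814690589905), (633947488, 17.0073189735413),
        (633949224, 17.0454230308533)],
    "hundred tiny fields, half nulled": [
        (499950184, 16.3303508758545), (499952888, 15.7197980880737),
        (499958120, 15.7198679447174)],
    "nine short and one long field, thirty % change": [
        (559831384, 12.0672481060028), (559829472, 11.8555760383606),
        (559832760, 11.9470820426941)],
}


def pct_change(before, after):
    """Percentage change from before to after; negative means a reduction."""
    return 100.0 * (after - before) / before


def summarize(testname):
    """Return (WAL change %, duration change %) using the mean of the runs."""
    wal_before = mean(w for w, _ in UNPATCHED[testname])
    wal_after = mean(w for w, _ in PATCHED[testname])
    dur_before = mean(d for _, d in UNPATCHED[testname])
    dur_after = mean(d for _, d in PATCHED[testname])
    return pct_change(wal_before, wal_after), pct_change(dur_before, dur_after)


for name in UNPATCHED:
    wal_pct, dur_pct = summarize(name)
    print(f"{name:<46} WAL {wal_pct:+6.1f}%  duration {dur_pct:+6.1f}%")
```

Negative values mean the patched build generated less WAL or ran faster than
the unpatched one. With only three runs per test the duration figures are
noisy, so small differences should be read with care.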

The observations are almost the same as before, except for the
"hundred tiny fields, half nulled" case, which I have updated below:

>> Observations
>> --------------------
>> 1. With both the patches WAL reduction is similar i.e ~37% for
>> "one short and one long field, no change" and 12% for
>> "hundred tiny fields, half nulled"
>> 2. With pgrb_delta_encoding_v7, there is ~19% CPU reduction for best
>> case "one short and one long field, no change".
>> 3. With pgrb_delta_encoding_v7, there is approximately 8~9% overhead
>> for cases where there is no match
>> 4. With pgrb_delta_encoding_v7, there is approximately 15~18% overhead
>> for "hundred tiny fields, half nulled" case

Now there is approximately a 1.4~5% CPU gain for the
"hundred tiny fields, half nulled" case.

>> 5. With back-to-pglz-like-delta-encoding-2, the data is mostly similar except
>> for "hundred tiny fields, half nulled" where CPU overhead is much more.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment	Content-Type	Size
pgrb_delta_encoding_v8.patch	application/octet-stream	38.4 KB

In response to

Re: Performance Improvement by reducing WAL for Update Operation at 2014-01-31 08:05:46 from Amit Kapila

Responses

Re: Performance Improvement by reducing WAL for Update Operation at 2014-02-04 18:28:38 from Robert Haas
