Re: [REVIEW] Re: Compression of full-page-writes

From: Arthur Silva <arthurprs(at)gmail(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-12-10 14:39:34
Message-ID: CAO_YK0VJYH-bc616y=O5S9FnbZ8vCDpkzAUeL+tRzopvxWVMGQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 10, 2014 at 12:10 PM, Rahila Syed <rahilasyed90(at)gmail(dot)com>
wrote:

> >What I would suggest is instrument the backend with getrusage() at
> >startup and shutdown and have it print the difference in user time and
> >system time. Then, run tests for a fixed number of transactions and
> >see how the total CPU usage for the run differs.
>
> Following are the numbers obtained from tests measuring absolute CPU
> usage, with a fixed number of transactions and a longer duration, using
> the latest FPW compression patch.
>
> pgbench command: pgbench -r -t 250000 -M prepared
>
> To ensure that the data is not highly compressible, the empty filler
> column was altered using (gen_random_uuid() is provided by the pgcrypto
> extension):
>
> alter table pgbench_accounts alter column filler type text using
> gen_random_uuid()::text
>
> checkpoint_segments = 1024
> checkpoint_timeout = 5min
> fsync = on
>
> The tests ran for around 30 minutes. A manual checkpoint was run before
> each test.
>
> Compression | WAL generated | %compression | Latency avg | Latency stddev | TPS    | CPU usage
> on          | 1531.4 MB     | ~35 %        | 7.351 ms    | 13.759 ms      | 135.96 | user diff: 562.67s, system diff: 41.40s
> off         | 2373.1 MB     | -            | 6.781 ms    | 14.152 ms      | 147.40 | user diff: 354.20s, system diff: 39.67s
>
> The compression obtained is quite high, close to 35 % (2373.1 MB down to
> 1531.4 MB of WAL). User-level CPU usage is noticeably higher with
> compression on than with it off (562.67 s vs. 354.20 s, about 59 % more),
> but the reduction in WAL volume is also large.
>
> Server specifications:
> Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
> RAM: 32GB
> Disk: HDD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
> 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm
>
>
>
> Thank you,
>
> Rahila Syed
>
>
>
>
>
> On Fri, Dec 5, 2014 at 10:38 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
>
>> On Fri, Dec 5, 2014 at 1:49 AM, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>
>> wrote:
>> >>If that's really true, we could consider having no configuration any
>> >>time, and just compressing always. But I'm skeptical that it's
>> >>actually true.
>> >
>> > I was referring to this for CPU utilization:
>> >
>> http://www.postgresql.org/message-id/1410414381339-5818552.post@n5.nabble.com
>> >
>> > The above tests were performed on machine with configuration as follows
>> > Server specifications:
>> > Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2
>> > nos
>> > RAM: 32GB
>> > Disk: HDD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
>> > 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm
>>
>> I think that measurement methodology is not very good for assessing
>> the CPU overhead, because you are only measuring the percentage CPU
>> utilization, not the absolute amount of CPU utilization. It's not
>> clear whether the duration of the tests was the same for all the
>> configurations you tried - in which case the number of transactions
>> might have been different - or whether the number of operations was
>> exactly the same - in which case the runtime might have been
>> different. Either way, it could obscure an actual difference in
>> absolute CPU usage per transaction. It's unlikely that both the
>> runtime and the number of transactions were identical for all of your
>> tests, because that would imply that the patch makes no difference to
>> performance; if that were true, you wouldn't have bothered writing
>> it....
>>
>> What I would suggest is instrument the backend with getrusage() at
>> startup and shutdown and have it print the difference in user time and
>> system time. Then, run tests for a fixed number of transactions and
>> see how the total CPU usage for the run differs.
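
For what it's worth, a minimal sketch of that instrumentation (illustrative
names, not from the actual patch; assumes POSIX getrusage() and a pair of
hooks run once per backend):

    #include <stdio.h>
    #include <sys/resource.h>

    static struct rusage startup_usage;

    /* at backend startup: remember the CPU time consumed so far */
    static void
    rusage_start(void)
    {
        getrusage(RUSAGE_SELF, &startup_usage);
    }

    /* at backend shutdown: print the CPU time consumed in between */
    static void
    rusage_report(void)
    {
        struct rusage end_usage;
        double      user_diff;
        double      sys_diff;

        getrusage(RUSAGE_SELF, &end_usage);
        user_diff = (end_usage.ru_utime.tv_sec - startup_usage.ru_utime.tv_sec) +
            (end_usage.ru_utime.tv_usec - startup_usage.ru_utime.tv_usec) / 1e6;
        sys_diff = (end_usage.ru_stime.tv_sec - startup_usage.ru_stime.tv_sec) +
            (end_usage.ru_stime.tv_usec - startup_usage.ru_stime.tv_usec) / 1e6;
        fprintf(stderr, "user diff: %.2fs system diff: %.2fs\n",
                user_diff, sys_diff);
    }

Summing those per-backend lines across a fixed-transaction run is what
produces the absolute "user diff / system diff" numbers reported above.
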
>>
>> Last cycle, Amit Kapila did a bunch of work trying to compress the WAL
>> footprint for updates, and we found that compression was pretty darn
>> expensive there in terms of CPU time. So I am suspicious of the
>> finding that it is free here. It's not impossible that there's some
>> effect which causes us to recoup more CPU time than we spend
>> compressing in this case that did not apply in that case, but the
>> projects are awfully similar, so I tend to doubt it.
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>
>
The CPU overhead seen here can be improved in the future by using other
compression algorithms (lz4 or snappy, for example) in place of pglz.
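
For instance, a sketch of compressing one full-page image with lz4 instead
of pglz (assumes liblz4 is available; the function name and the fallback
policy are illustrative, not part of the patch):

    #include <lz4.h>

    #define BLCKSZ 8192         /* PostgreSQL page size, 8 kB by default */

    /*
     * Try to compress one full-page image into dest (which should hold
     * LZ4_compressBound(BLCKSZ) bytes).  Returns the compressed length,
     * or 0 if the page did not shrink and should be stored uncompressed.
     */
    static int
    compress_fpw_lz4(const char *page, char *dest, int dest_size)
    {
        int         compressed_len;

        compressed_len = LZ4_compress_default(page, dest, BLCKSZ, dest_size);

        if (compressed_len <= 0 || compressed_len >= BLCKSZ)
            return 0;           /* incompressible; keep the raw page */

        return compressed_len;
    }

In benchmarks published elsewhere, lz4 typically reaches a compression
ratio comparable to pglz at a small fraction of its CPU cost, which would
shrink the user-time gap measured above.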
