Quick Links

Re: [REVIEW] Re: Compression of full-page-writes

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Rahila Syed <rahilasyed90(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [REVIEW] Re: Compression of full-page-writes
Date:	2014-12-13 14:36:20
Message-ID:	CAB7nPqTM6-3WTnsddRi=8P6904m9d8BogMiAmJj28DCGMo_MUw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Dec 12, 2014 at 11:50 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
>
>
> On Wed, Dec 10, 2014 at 11:25 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>
>> On Wed, Dec 10, 2014 at 07:40:46PM +0530, Rahila Syed wrote:
>> > The tests ran for around 30 mins.Manual checkpoint was run before each
>> > test.
>> >
>> > Compression WAL generated %compression Latency-avg CPU usage
>> > (seconds) TPS
>> > Latency
>> > stddev
>> >
>> >
>> > on 1531.4 MB ~35 % 7.351 ms
>> > user diff: 562.67s system diff: 41.40s 135.96
>> > 13.759 ms
>> >
>> >
>> > off 2373.1 MB 6.781
>> > ms
>> > user diff: 354.20s system diff: 39.67s 147.40
>> > 14.152 ms
>> >
>> > The compression obtained is quite high close to 35 %.
>> > CPU usage at user level when compression is on is quite noticeably high
>> > as
>> > compared to that when compression is off. But gain in terms of reduction
>> > of WAL
>> > is also high.
>>
>> I am sorry but I can't understand the above results due to wrapping.
>> Are you saying compression was twice as slow?
>
>
> I got curious to see how the compression of an entire record would perform
> and how it compares for small WAL records, and here are some numbers based
> on the patch attached, this patch compresses the whole record including the
> block headers, letting only XLogRecord out of it with a flag indicating that
> the record is compressed (note that this patch contains a portion for replay
> untested, still this patch gives an idea on how much compression of the
> whole record affects user CPU in this test case). It uses a buffer of 4 *
> BLCKSZ, if the record is longer than that compression is simply given up.
> Those tests are using the hack upthread calculating user and system CPU
> using getrusage() when a backend.
>
> Here is the simple test case I used with 512MB of shared_buffers and small
> records, filling up a bunch of buffers, dirtying them and them compressing
> FPWs with a checkpoint.
> #!/bin/bash
> psql <<EOF
> SELECT pg_backend_pid();
> CREATE TABLE aa (a int);
> CREATE TABLE results (phase text, position pg_lsn);
> CREATE EXTENSION IF NOT EXISTS pg_prewarm;
> ALTER TABLE aa SET (FILLFACTOR = 50);
> INSERT INTO results VALUES ('pre-insert', pg_current_xlog_location());
> INSERT INTO aa VALUES (generate_series(1,7000000)); -- 484MB
> SELECT pg_size_pretty(pg_relation_size('aa'::regclass));
> SELECT pg_prewarm('aa'::regclass);
> CHECKPOINT;
> INSERT INTO results VALUES ('pre-update', pg_current_xlog_location());
> UPDATE aa SET a = 7000000 + a;
> CHECKPOINT;
> INSERT INTO results VALUES ('post-update', pg_current_xlog_location());
> SELECT * FROM results;
> EOF
Re-using this test case, I have produced more results by changing the
fillfactor of the table:
=# select test || ', ffactor ' || ffactor, pg_size_pretty(post_update
- pre_update), user_diff, system_diff from results;
?column? | pg_size_pretty | user_diff | system_diff
-------------------------------+----------------+-----------+-------------
FPW on + 2 bytes, ffactor 50 | 582 MB | 42.391894 | 0.807444
FPW on + 2 bytes, ffactor 20 | 229 MB | 14.330304 | 0.729626
FPW on + 2 bytes, ffactor 10 | 117 MB | 7.335442 | 0.570996
FPW off + 2 bytes, ffactor 50 | 746 MB | 25.330391 | 1.248503
FPW off + 2 bytes, ffactor 20 | 293 MB | 10.537475 | 0.755448
FPW off + 2 bytes, ffactor 10 | 148 MB | 5.762775 | 0.763761
HEAD, ffactor 50 | 746 MB | 25.181729 | 1.133433
HEAD, ffactor 20 | 293 MB | 9.962242 | 0.765970
HEAD, ffactor 10 | 148 MB | 5.693426 | 0.775371
Record, ffactor 50 | 582 MB | 54.904374 | 0.678204
Record, ffactor 20 | 229 MB | 19.798268 | 0.807220
Record, ffactor 10 | 116 MB | 9.401877 | 0.668454
(12 rows)

The following tests are run:
- "Record" means the record-level compression
- "HEAD" is postgres at 1c5c70df
- "FPW off" is HEAD + patch with switch set to off
- "FPW on" is HEAD + patch with switch set to on
The gain in compression has a linear profile with the length of page
hole. There was visibly some noise in the tests: you can see that the
CPU of "FPW off" is a bit higher than HEAD.

Something to be aware of btw is that this patch introduces an
additional 8 bytes per block image in WAL as it contains additional
information to control the compression. In this case this is the
uint16 compress_len present in XLogRecordBlockImageHeader. In the case
of the measurements done, knowing that 63638 FPWs have been written,
there is a difference of a bit less than 500k in WAL between HEAD and
"FPW off" in favor of HEAD. The gain with compression is welcome,
still for the default there is a small price to track down if a block
is compressed or not. This patch still takes advantage of it by not
compressing the hole present in page and reducing CPU work a bit.

Attached are as well updated patches, switching wal_compression to
USERSET and cleaning up things related to this switch from
PGC_POSTMASTER. I am attaching as well the results I got, feel free to
have a look.
Regards,
--
Michael

Attachment	Content-Type	Size
0001-Move-pg_lzcompress.c-to-src-common.patch	text/x-diff	52.2 KB
0002-Support-compression-for-full-page-writes-in-WAL.patch	text/x-diff	20.7 KB
results.sql	application/octet-stream	1.2 KB

In response to

Re: [REVIEW] Re: Compression of full-page-writes at 2014-12-12 14:50:43 from Michael Paquier

Responses

Re: [REVIEW] Re: Compression of full-page-writes at 2014-12-13 20:45:22 from Simon Riggs
Re: [REVIEW] Re: Compression of full-page-writes at 2014-12-15 18:46:14 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Mark Dilger	2014-12-13 14:45:57	duplicate #define
Previous Message	Andrew Gierth	2014-12-13 12:40:53	Re: Final Patch for GROUPING SETS