fun with unlogged tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: fun with unlogged tables
Date: 2011-10-27 19:49:50
Message-ID: CA+Tgmob9Vz-M2r2uE8pCM=40Ag1M2dzSA6gqOiWMD4+UbFwSuw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

One of the optimizations that I did for 9.1 was to make transactions
that touch only temporary and/or unlogged tables always commit
asynchronously, because if the database crashes the table contents
will be blown away in their entirety, and whether or not the commit
made it down to disk won't matter a bit. In my testing, this hugely
improved the performance of unlogged tables. However, Heikki recently
reported to me off-list that this can actually cause a significant
performance problem in some circumstances, because committing
asynchronously means that we can't set hint bits right away - we must
wait until the WAL writer has completed its background flush. With
default settings, this takes long enough to cause lots of extra clog
traffic. He ran into the problem while running pgbench at scale
factor 15, and I reproduced it the same way. I believe that at a
large scale factor the effect is lessened because you're less likely
to reread the same row multiple times before the commit hits the disk.

It strikes me that, while it's not safe to set hint bits until the
commit record hits the disk for *permanent* relations, it ought to be
just fine for temporary and unlogged relations, because those pages
will be gone after a crash, and their hint bits with them. Attached
is a patch taking that approach. Another approach would be to have
transactions that only touch temporary or unlogged relations to avoid
changing the value that will be returned by
TransactionIdGetCommitLSN(). This approach is better when
synchronous_commit=off and a transaction touches both permanent and
non-permanent tables, because it makes the decision as to whether hint
bits can be set early based on which page is being updated rather than
on some characteristic of the transaction. However, it also adds a
small amount of overhead to the case where we're doing an asynchronous
commit on a permanent table, because we do one more check before
concluding that hint bits can't be set.

I did some benchmarking of this approach using pgbench with scale
factor 15. shared_buffers = 8GB, maintenance_work_mem = 1GB,
checkpoint_segments = 30, checkpoint_timeout = 15min,
checkpoint_completion_target = 0.9, synchronous_commit = off. On
permanent tables 8 clients came out slower and 16 came out faster; I'm
inclined to believe it's all in the noise. On unlogged tables the
patch appears to be a clear win, massively so at 32 clients. Results
below. The first number on each line is the client count, and the
remaining numbers are tps including connections establishing, from
individual 5-minute runs.

Unlogged Tables, unpatched:
1 861.841894 752.762490 837.847109
8 3379.832456 4100.539369 3751.842036
16 6259.907605 5523.406202 4437.648873
32 4547.725770 5360.246166 4958.086754

Unlogged Tables, with patch:
1 887.562141 785.539717 920.275452
8 4436.366884 4374.135712 4335.791842
16 7518.908796 7478.427691 7476.817757
32 10433.615767 10450.577573 10374.186566

Permanent Tables, unpatched:
1 648.824800 647.239277 652.116249
8 3785.485647 3481.021391 3827.455756
16 5652.069678 4004.780105 4207.354612
32 5084.804778 4645.997471 5222.387075

Permanent Tables, with patch:
1 636.162775 637.145834 640.298944
8 3383.817707 3388.273809 3815.298676
16 5543.585926 5093.483757 5112.854318
32 5229.295024 4985.736460 5103.441187

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
bip.patch application/octet-stream 3.3 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-10-27 19:55:57 Re: (PATCH) Adding CORRESPONDING (NULL error)
Previous Message fschmidt 2011-10-27 19:43:06 portal with hold