Re: Optimize kernel readahead using buffer access strategy

From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Subject: Re: Optimize kernel readahead using buffer access strategy
Date: 2014-01-14 11:58:20
Message-ID: 52D5265C.7060906@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I fix and submit this patch in CF4.

In my past patch, it is significant bug which is mistaken caluculation of
offset in posix_fadvise():-( However it works well without problem in pgbench.
Because pgbench transactions are always random access...

And I test my patch in DBT-2 benchmark. Results are under following.

* Test server
Server: HP Proliant DL360 G7
CPU: Xeon E5640 2.66GHz (1P/4C)
Memory: 18GB(PC3-10600R-9)
Disk: 146GB(15k)*4 RAID1+0
RAID controller: P410i/256MB
OS: RHEL 6.4(x86_64)
FS: Ext4

* DBT-2 result(WH400, SESSION=100, ideal_score=5160)
Method | score | average | 90%tile | Maximum
------------------------------------------------
plain | 3589 | 9.751 | 33.680 | 87.8036
option=off | 3670 | 9.107 | 34.267 | 79.3773
option=on | 4222 | 5.140 | 7.619 | 102.473

"option" is "readahead_strategy" option, and "on" is my proposed.
"average", "90%tile", and Maximum represent latency.
Average_latency is 2 times faster than plain!

* Detail results (uploading now. please wait for a hour...)
[plain]
http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/normal_20140109/HTML/index_thput.html
[option=off]
http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/readahead_off_20140109/HTML/index_thput.html
[option=on]
http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/readahead_on_20140109/HTML/index_thput.html

We can see part of super slow latency in my proposed method test.
Part of transaction active is 20%, and part of slow transactions is 80%.
It might be Pareto principle in CHECKPOINT;-)
#It's joke.

I will test some join sqls performance and TPC-3 benchmark in this or next week.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

Attachment Content-Type Size
optimizing_kernel-readahead_using_buffer-access-strategy_v5.patch text/x-diff 14.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2014-01-14 12:28:34 Re: plpgsql.consistent_into
Previous Message Marko Tiikkaja 2014-01-14 11:46:39 Re: plpgsql.consistent_into