gaussian distribution pgbench

From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: gaussian distribution pgbench
Date: 2013-09-20 06:42:53
Message-ID: 523BEE6D.3080500@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

I have created a gaussian distribution patch for pgbench that makes it access
records with gaussian frequency. I am submitting it to this commit fest.

* Purpose of this patch
In typical transaction workloads, clients rarely access all records with equal
frequency. I think a gaussian distribution describes most transaction access
patterns in general, and my patch approximates such a pattern.

Besides simulating a general access pattern, I think this patch is also useful
for developing new features, such as more effective use and freeing of
shared_buffers, readahead optimization in the OS, and speeding up tuple-level
locking.

* Usage
It is easy to use: just pass -g with a standard deviation threshold parameter.
The larger the standard deviation threshold, the more pgbench's access pattern
is concentrated on a specific subset of records. The minimum standard deviation
threshold is 2.

Here is an example run.
> [mitsu-ko(at)localhost postgresql]$ bin/pgbench -g 10 -c 16 -j 8 -T 300
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 1
> standard deviation threshold: 10.00000
> access probability of top 20%, 10% and 5% records: 0.95450 0.68269 0.38292
> query mode: simple
> number of clients: 16
> number of threads: 8
> duration: 300 s
> number of transactions actually processed: 566367
> tps = 1887.821409 (including connections establishing)
> tps = 1887.949390 (excluding connections establishing)

The "access probability" line shows the probability of accessing the top N% of
records in this benchmark. The larger the standard deviation threshold, the
larger these probabilities become.
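
As a rough cross-check of the "access probability" line, here is a minimal
Python sketch. It assumes (this is not taken from the patch) that the record
range is mapped linearly onto [-threshold, +threshold] of a standard normal
distribution, so that the top f of records corresponds to |z| < threshold * f.
The function name access_probability is hypothetical. Under that assumption,
threshold 10 gives the same three figures as the output above.

```python
import math


def access_probability(threshold, fraction):
    """Probability mass of a standard normal within |z| < threshold * fraction.

    Assumption (not from the patch): the record range maps linearly onto
    [-threshold, +threshold], and the tails beyond the threshold are
    negligible, which holds for a threshold of 10.
    """
    return math.erf(threshold * fraction / math.sqrt(2.0))


if __name__ == "__main__":
    # Top 20%, 10% and 5% of records for "-g 10".
    for fraction in (0.20, 0.10, 0.05):
        print("%.5f" % access_probability(10.0, fraction))
```

For small thresholds such as 2, the tails cut off at the threshold are no
longer negligible, so this sketch would need an extra normalization step.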

The attached png files "gausian_2.png" and "gaussian_10.png" show the gaussian
distribution access pattern produced by my patch. "no_gaussian.png" was taken
without the -g option (normal). I think they show that my patch achieves a
gaussian distribution access pattern.

* Approach
It replaces the uniform random number generator with a gaussian distribution
random number generator based on the Box-Muller transform. The standard
deviation threshold parameter is then used to map the normal distribution onto
the records and to normalize it. The mapping is linear, from a floating point
value to an integer value.
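
The approach above can be sketched in Python as follows. This is only an
illustration, not the code in the patch (which is C inside pgbench). The
function name gaussian_rand is hypothetical, and so is the choice to redraw
values that fall outside the threshold; the patch may handle the tails
differently.

```python
import math
import random


def gaussian_rand(lo, hi, threshold, rng=random):
    """Pick an integer in [lo, hi] with gaussian frequency around the middle.

    Hypothetical sketch: values beyond +/- threshold standard deviations
    are redrawn (an assumption, not confirmed by the patch).
    """
    if threshold < 2.0:
        raise ValueError("standard deviation threshold must be at least 2")
    while True:
        # Box-Muller transform: two uniforms in (0, 1] give one
        # standard normal deviate. 1 - random() avoids log(0).
        u1 = 1.0 - rng.random()
        u2 = 1.0 - rng.random()
        z = math.sqrt(-2.0 * math.log(u1)) * math.sin(2.0 * math.pi * u2)
        if -threshold <= z < threshold:
            break
    # Linear mapping: [-threshold, threshold) -> [0, 1) -> integer offset.
    frac = (z + threshold) / (2.0 * threshold)
    # min() guards against floating point rounding pushing frac up to 1.0.
    return min(hi, lo + int(frac * (hi - lo + 1)))


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [gaussian_rand(1, 100000, 10.0, rng) for _ in range(20000)]
    middle = sum(1 for s in samples if 40001 <= s <= 60000)
    print("share of accesses in the central 20%%: %.3f"
          % (middle / len(samples)))
```

With a threshold of 10, most draws should land in the central 20% of the
range, in line with the "access probability" figures in the example output.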

* Other
I have also written other patches that make pgbench benchmark results more
accurate, and I will submit them to this commit fest as well. They are similar
in spirit to the checkpoint patch I submitted in the past. They work well, too!

Any questions?

Best regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

Attachment Content-Type Size
gaussian_pgbench_v0.patch text/x-diff 5.3 KB
image/png 25.9 KB
image/png 23.2 KB
image/png 26.7 KB
