Re: gaussian distribution pgbench

From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gaussian distribution pgbench
Date: 2014-03-14 06:02:57
Message-ID: 53229B91.5050508@lab.ntt.co.jp
Lists: pgsql-hackers

(2014/03/13 23:00), Fujii Masao wrote:
> On Thu, Mar 13, 2014 at 10:51 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> On 03/13/2014 03:17 PM, Fujii Masao wrote:
>>>
>>> On Tue, Mar 11, 2014 at 1:49 PM, KONDO Mitsumasa
>>> <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>>
>>>> (2014/03/09 1:49), Fabien COELHO wrote:
>>>>>
>>>>>
>>>>> I'm okay with this UI and its implementation.
>>>>
>>>>
>>>> OK.
>>>
>>>
>>> We should do the same discussion for the UI of command-line option?
>>> The patch adds two options --gaussian and --exponential, but this UI
>>> seems to be a bit inconsistent with the UI for \setrandom. Instead,
>>> we can use something like --distribution=[uniform | gaussian |
>>> exponential].
>>
>>
>> IMHO we should just implement the \setrandom changes, and not add any of
>> these options to modify the standard test workload. If someone wants to run
>> TPC-B workload with gaussian or exponential distribution, they can implement
>> it as a custom script. The docs include the script for the standard TPC-B
>> workload; just copy-paste that and modify the \setrandom lines.

Well, when we set '--gaussian=NUM' or '--exponential=NUM' on the command line, the
final output reports the access probability of the top N% of records. The output
looks like this:

> [mitsu-ko(at)localhost pgbench]$ ./pgbench --exponential=10 postgres
> starting vacuum...end.
> transaction type: Exponential distribution TPC-B (sort of)
> scaling factor: 1
> exponential threshold: 10.00000
> access probability of top 20%, 10% and 5% records: 0.86466 0.63212 0.39347
> ~
This feature helps users understand how biased the distribution is when tuning the
threshold parameter.

Without it, it is difficult to understand the distribution of the access
pattern, and it cannot be implemented for a custom script, because the range of
the distribution (min, max, and the SQL pattern) is unknown for a custom script. So I
think the present UI is not bad and should not be changed.
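
For reference, the three probabilities in the sample output above can be
reproduced with a few lines of Python. This is only a sketch and not the code in
the patch: the function name is hypothetical, and it assumes the reported value
is 1 - exp(-threshold * fraction), which is the formula the printed numbers for
--exponential=10 are consistent with.

```python
import math


def top_access_probability(threshold: float, fraction: float) -> float:
    """Share of accesses that hit the first `fraction` of the key range.

    Assumption: exponential access pattern, with the reported value taken
    as 1 - exp(-threshold * fraction). This matches the sample output for
    threshold 10 but is not copied from the pgbench patch.
    """
    return 1.0 - math.exp(-threshold * fraction)


# Same fractions as in the pgbench output line: top 20%, 10% and 5%.
for fraction in (0.20, 0.10, 0.05):
    prob = top_access_probability(10.0, fraction)
    print(f"top {fraction:.0%}: {prob:.5f}")
```

With threshold 10 this prints 0.86466, 0.63212 and 0.39347, the same values as
in the output above. A larger threshold concentrates more accesses on the top
records, which is the bias the output line is meant to show.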

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
