Re: gaussian distribution pgbench

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: gaussian distribution pgbench
Date: 2014-07-02 19:59:25
Message-ID: 53B4649D.2020403@archidevsys.co.nz
Lists: pgsql-hackers

On 02/07/14 21:05, Fabien COELHO wrote:
>
> Hello Mitsumasa-san,
>
>> And I'm also interested in your "decile percents" output, like the
>> following:
>> decile percents: 39.6% 24.0% 14.6% 8.8% 5.4% 3.3% 2.0% 1.2% 0.7% 0.4%
>
> Sure, I'm really fine with that.
>
>> I think that it is easier than before. Sum of decile percents is just
>> 100%.
>
> That's a good property:-)
>
>> However, I don't like "highest/lowest percentage" because users will
>> confuse it with the decile percentages, and nobody can understand
>> these digits. I could not understand "4.9%, 0.0%" when I first saw
>> it. Only after checking the source code did I understand it:(
>> It's not good design... #Why does this parameter use 100?
>
> What else? People have ten fingers and like powers of 10, and are used
> to percents?
>
>> So I'd like to remove it if you like. It will be more simple.
>
> I think that for the exponential distribution it helps, especially for
> high threshold, to have the lowest/highest percent density. For low
> thresholds, the decile is also definitely useful. So I'm fine with
> both outputs as you have put them.
>
> I have just updated the wording so that it may be clearer:
>
> decile percents: 69.9% 21.0% 6.3% 1.9% 0.6% 0.2% 0.1% 0.0% 0.0% 0.0%
> probability of first/last percent of the range: 11.3% 0.0%
>
>> Attached patch is fixed version, please confirm it.
>
> Attached a v15 which just fixes a typo and the above wording update.
> I'm validating it for committers.
>
>> #Of course, the World Cup is being held now. I'm in no hurry at all.
>
> I'm not a soccer kind of person, so it does not influence my
> availability.:-)
>
>
> Suggested commit message:
>
> Add drawing random integers with a Gaussian or truncated exponential
> distribution to pgbench.
>
> Test variants with these distributions are also provided and triggered
> with options "--gaussian=..." and "--exponential=...".
>
>
> Have a nice day/night,
>
>
>
I would suggest that probabilities should NEVER be expressed as
percentages! Expressed as a percentage, a probability looks weird, and
that form is never used for serious statistical work - in my experience
at least.

I think probabilities should be expressed in the range 0 ... 1 - i.e.
0.35 rather than 35%.
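
To make that concrete, here is a rough sketch (not the code from the
patch) that prints the same kind of summary as fractions in 0 ... 1.
It assumes a truncated exponential density proportional to
exp(-threshold * x) on the range 0 ... 1, and a threshold of 12; both
are my assumptions, picked because they seem to reproduce the figures
quoted above. The function name is hypothetical.

```python
import math


def bucket_probabilities(threshold, buckets=10):
    """Probability mass of each equal-width bucket of [0, 1).

    Assumes a density proportional to exp(-threshold * x), truncated
    to [0, 1). This is an assumption, not taken from the patch itself.
    """
    norm = 1.0 - math.exp(-threshold)  # total mass before normalising
    return [
        (math.exp(-threshold * i / buckets)
         - math.exp(-threshold * (i + 1) / buckets)) / norm
        for i in range(buckets)
    ]


# Threshold 12 is an assumed value, inferred from the quoted output.
deciles = bucket_probabilities(12.0)
percentiles = bucket_probabilities(12.0, buckets=100)

print("decile probabilities:", " ".join(f"{p:.3f}" for p in deciles))
print("probability of first/last percent of the range:",
      f"{percentiles[0]:.3f} {percentiles[-1]:.3f}")
```

With that threshold the first decile comes out near 0.699 and the first
percent of the range near 0.113, i.e. the quoted 69.9% and 11.3% written
as plain probabilities.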

Cheers,
Gavin
