Re: gaussian distribution pgbench -- splits v4

From: Mitsumasa KONDO <kondo(dot)mitsumasa(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gaussian distribution pgbench -- splits v4
Date: 2014-08-01 07:58:01
Message-ID: CADupcHWoU=L+1zPY+WfxXH=VoofeZkMGVhDJpWD4nWiW2H8oSQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

2014-08-01 16:26 GMT+09:00 Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>

>> Maybe somebody who knows more math than I do (like you, probably!) can
>> come up with something more clever.
>
> I can certainly suggest other formulas, but that does not mean beautiful
> code, thus would probably be rejected. I'll see.
>
> An alternative to this whole process may be to hash/modulo a non-uniform
> random value.
>
> id = 1 + hash(some-random()) % n
>
> But the hashing changes the distribution as it adds collisions, so I have
> to think about how to be able to control the distribution in that case, and
> what hash function to use.
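
The collision effect mentioned in the quoted text can be illustrated with a
minimal sketch. It assumes Python and uses SHA-256 as a stand-in hash; neither
is from the thread, and hash_id and distinct_ids are hypothetical names. It
feeds every value 1..n through id = 1 + hash(v) % n and counts how many
distinct ids come out. For an ideal hash about 1 - 1/e (roughly 63%) of the
ids are reached, so the rest are never selected.

```python
import hashlib


def hash_id(value, n):
    """Map an integer to an id in 1..n as id = 1 + hash(value) % n.

    SHA-256 is an arbitrary stand-in here; the thread does not pick a hash.
    """
    digest = hashlib.sha256(str(value).encode("ascii")).digest()
    return 1 + int.from_bytes(digest[:8], "big") % n


def distinct_ids(n):
    """Count how many ids in 1..n are reached when each input 1..n is hashed."""
    return len({hash_id(v, n) for v in range(1, n + 1)})


if __name__ == "__main__":
    n = 100_000
    reached = distinct_ids(n)
    print(f"{reached} of {n} ids reached ({reached / n:.1%})")
```

Because several inputs share one id, the probability mass of the original
non-uniform draw is merged at those ids, which is why the resulting
distribution is harder to control.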

I think we have to choose a reproducible method, because a benchmark
always needs robust and reproducible results. And if we implement this
idea, we might need a more accurate random number generator, such as the
Mersenne Twister algorithm. The erand48 algorithm is slow and not very
accurate.
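
As a rough illustration of the reproducibility point, here is a minimal
sketch in Python, whose random module is based on the Mersenne Twister. This
is not pgbench code: gaussian_ids and hash_id are hypothetical names, the
mean and standard deviation are arbitrary choices, and SHA-256 is an assumed
stand-in hash. With a fixed seed and a deterministic hash, the same sequence
of ids is produced on every run.

```python
import hashlib
import random


def hash_id(value, n):
    """Deterministic id = 1 + hash(value) % n (SHA-256 is an assumed stand-in).

    Python's built-in hash() is avoided: it is randomized per process for
    strings, so it would not give reproducible ids.
    """
    digest = hashlib.sha256(str(value).encode("ascii")).digest()
    return 1 + int.from_bytes(digest[:8], "big") % n


def gaussian_ids(seed, n, count):
    """Return `count` ids in 1..n drawn from a seeded Gaussian, then hashed."""
    rng = random.Random(seed)  # private Mersenne Twister instance
    ids = []
    while len(ids) < count:
        # Mean n/2 and standard deviation n/10 are illustrative assumptions.
        v = round(rng.gauss(n / 2, n / 10))
        if 1 <= v <= n:  # reject draws that fall outside the id range
            ids.append(hash_id(v, n))
    return ids


if __name__ == "__main__":
    first = gaussian_ids(seed=42, n=1000, count=5)
    second = gaussian_ids(seed=42, n=1000, count=5)
    print(first == second)  # same seed, same sequence
```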

By the way, I don't see how this topic relates to the command-line
option... Well, whatever...

Regards,
--
Mitsumasa KONDO
