Re: Obtaining random rows from a result set

From: Alban Hertroys <alban(at)magproductions(dot)nl>
To: Alban Hertroys <alban(at)magproductions(dot)nl>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Obtaining random rows from a result set
Date: 2007-09-01 12:24:25
Message-ID: F0955B4D-885F-4DED-A116-99B912705178@magproductions.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sep 1, 2007, at 12:44, Alban Hertroys wrote:

>> It would be possible to write an aggregate that returns a single
>> random
>> value from a set. The algorithm is something like:
>>
>> n = 1
>> v = null
>> for each row
>> if random() < 1/n:
>> v = value of row
>> n = n + 1
>>
>> return v
>
> Doesn't this always return the first record, since random() is
> always less than 1/1?
> I don't think this method has a linear distribution, but then again
> I don't understand what 'value of row' refers to...

Oh, now I see... The first time guarantees that v has a value (as
random() < 1/1), and after that there is a decreasing chance that a
new row gets re-assigned to v. That means the last row has a chance
of 1/n, which would be it's normal chance if the distribution were
linear, but doesn't the first row have a chance of 1/(n!) to be
returned?

--
Alban Hertroys
alban(at)magproductions(dot)nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

!DSPAM:737,46d9551a289904044091126!

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bill Moran 2007-09-01 12:24:36 Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)
Previous Message Alban Hertroys 2007-09-01 12:13:02 Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem