Quick Links

Re: Obtaining random rows from a result set

From:	Alban Hertroys <alban(at)magproductions(dot)nl>
To:	Alban Hertroys <alban(at)magproductions(dot)nl>
Cc:	Martijn van Oosterhout <kleptog(at)svana(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Obtaining random rows from a result set
Date:	2007-09-01 12:24:25
Message-ID:	F0955B4D-885F-4DED-A116-99B912705178@magproductions.nl
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Sep 1, 2007, at 12:44, Alban Hertroys wrote:

>> It would be possible to write an aggregate that returns a single
>> random
>> value from a set. The algorithm is something like:
>>
>> n = 1
>> v = null
>> for each row
>> if random() < 1/n:
>> v = value of row
>> n = n + 1
>>
>> return v
>
> Doesn't this always return the first record, since random() is
> always less than 1/1?
> I don't think this method has a linear distribution, but then again
> I don't understand what 'value of row' refers to...

Oh, now I see... The first time guarantees that v has a value (as
random() < 1/1), and after that there is a decreasing chance that a
new row gets re-assigned to v. That means the last row has a chance
of 1/n, which would be it's normal chance if the distribution were
linear, but doesn't the first row have a chance of 1/(n!) to be
returned?

--
Alban Hertroys
alban(at)magproductions(dot)nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

!DSPAM:737,46d9551a289904044091126!

In response to

Re: Obtaining random rows from a result set at 2007-09-01 10:44:08 from Alban Hertroys

Responses

Re: Obtaining random rows from a result set at 2007-09-01 12:44:13 from Martijn van Oosterhout

Browse pgsql-general by date

	From	Date	Subject
Next Message	Bill Moran	2007-09-01 12:24:36	Re: Bigtime scaling of Postgresql (cluster and stuff I suppose)
Previous Message	Alban Hertroys	2007-09-01 12:13:02	Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem