Skip site navigation (1) Skip section navigation (2)

Peripheral Links

Header And Logo

PostgreSQL
| The world's most advanced open source database.

Site Navigation

Search for
  Advanced Search

Re: Query only slow on first run



tmp wrote:
what exactly is that
"random_number" column

A random float that is initialized when the row is created and never modified afterwards. The physical row ordering will clearly not match the random_number ordering. However, other queries uses a row ordering by the primary key so I don't think it would make much sense to make the index on random_number a clustering index just in order to speed up this single query.

 and why are you desirous of ordering by it?

In order to simulate a random pick of K rows. See [1].

A trick that I used is to sample the random column once, and create a much smaller table of the first N rows, where N is the sample size you want, and use that.

If you need a different N samples each time, you can create a temporary table, put your random N rows into that, do an ANALYZE, and then join to this smaller table.  The overall performance can be MUCH faster even though you're creating and populating a whole table, than the plan that Postgres comes up with. This seems wrong-headed (why shouldn't Postgres be able to be as efficient on its own?), but it works.

Craig




Home | Main Index | Thread Index

Privacy Policy | PostgreSQL Archives hosted by Command Prompt, Inc. | Designed by tinysofa
Copyright © 1996 – 2007 PostgreSQL Global Development Group