Re: How can I speed up with query?

From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: How can I speed up with query?
Date: 2003-02-05 20:51:01
Message-ID: 8e2f64fc42bbf31e88e5fc5c796c62a1@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> I have a table (see below) with an ID, a date, and a value. The date
> specifies when the entry was added. I want to query the table to determine
> what the value was on a specific day.
> ...
> select a.* from sample a where (id,date) in (select a.id,max(date) from
> sample where date<='<<the date>>' and id=a.id) order by id;

Actually, your query as written wants to find the latest value added before or
on a certain date. A subtle difference, but it makes a difference. Some
quick notes before I jump into my shot at this:

* Avoid the max() function - it is not fully optimized yet. Instead use
SELECT foo FROM table ORDER BY foo DESC LIMIT 1
(and use ASC to replace the min() function)

* Try not to use keywords such as "date" for your column names.

* Always run VACUUM ANALYZE and create an index: in this case, on the "date" column

* If you can't match on a single column (as in the id,date from your original
query, use the oid)

(The explain analyze below is for a 200,000 row table with 20 distinct ids and a
time period of abot a month.)

VACUUM ANALYZE sample;
CREATE INDEX sample_date on sample(date);

EXPLAIN ANALYZE
SELECT id, date, value FROM sample a WHERE date <='2003-01-01' AND oid =
(SELECT oid FROM sample WHERE id = a.id AND DATE <='2003-01-01' ORDER BY date DESC LIMIT 1)
ORDER BY id;

QUERY PLAN
- --------------------------------------------------------------------------------------------------
Sort (cost=1712.13..1712.14) (actual time=5292.35..5292.38 rows=20 loops=1)
Sort Key: id
-> Index Scan using sample_date on sample a (cost=0.00..1712.12) (actual time=5258.10..5292.22 rows=20 loops=1)
Index Cond: (date <= '2003-01-01'::date)
Filter: (oid = (subplan))
SubPlan
-> Limit (cost=0.00..81.53) (actual time=0.71..0.78 rows=1 loops=6532)
-> Index Scan Backward using sample_date on sample (cost=0.00..81.53) (actual time=0.70..0.78 rows=2 loops=6532)
Index Cond: (date <= '2003-01-01'::date)
Filter: (id = $0)
Total runtime: 5292.50 msec

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200302051538
-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE+QXnMvJuQZxSWSsgRAhmDAKDwMmf0GvhnVFKeiDPnVolx3wOO1gCgsasJ
3t3LCOa6Q5uOCJpawodJO54=
=dGj1
-----END PGP SIGNATURE-----

Browse pgsql-novice by date

  From Date Subject
Next Message Andreas Plesch 2003-02-05 21:02:27 keys missing in reference display in DbVisualizer
Previous Message Mike Grommet 2003-02-05 20:42:21 Re: how can I tell it's postgresql data?