A "set seqscan='f'" query improvement example

From: Jon Lapham <lapham(at)extracta(dot)com(dot)br>
To: pgsql-general(at)postgresql(dot)org
Subject: A "set seqscan='f'" query improvement example
Date: 2002-06-05 15:31:46
Message-ID: 3CFE2EE2.3080709@extracta.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello-

I'm not sure this is a problem, but I thought I woud report this anyway.
If I "set seqscan='f'" on the following query, I see a HUGE
improvement in query time (factor of 6). Maybe this example will be
helpful to someone working on the planner.

Note that tmp_platedataset is a temporary table. Everything has been
VACUUM ANALYZE'd before issuing any queries, including the temp table.

The 3 table join query:
SELECT pp.sampleid
FROM tmp_platedataset tpd, plate_dataset_result pdr, plate_pos pp
WHERE pdr.pp_plateid=pp.plateid
AND pdr.col=pp.col
AND pdr.row=pp.row
AND pdr.platedatasetid=tpd.platedatasetid
AND tpd.toview='t';

The tables:
It seems that "\d" doesn't work with temp tables... hmm, anyway,
tmp_platedataset contains 8 columns, but the important one is
"platedatasetid int4". There are no triggers, rules, associated with
this table.
main_v0_8=# select count(*) from tmp_platedataset;
count
-------
401
(1 row)

main_v0_8=# \d plate_dataset_result
Table "plate_dataset_result"
Column | Type | Modifiers
----------------+--------------+-----------
platedatasetid | integer | not null
row | character(1) |
col | smallint |
result | real |
pp_plateid | integer | not null
Indexes: pdsr_pp_plateid
Unique keys: plate_dataset_platedataseti_key
Triggers: RI_ConstraintTrigger_26388004,
RI_ConstraintTrigger_23833439
main_v0_8=# select count(*) from plate_dataset_result;
count
---------
2192544
(1 row)

main_v0_8=# \d plate_pos
Table "plate_pos"
Column | Type | Modifiers
-------------+------------------------+-----------
plateid | integer | not null
sampleid | integer |
description | character varying(255) |
row | character(1) |
col | smallint |
conc | real |
Indexes: plate_pos_sampleid
Unique keys: plate_pos_plateid_key
Triggers: RI_ConstraintTrigger_23833631,
RI_ConstraintTrigger_23833625
main_v0_8=# select count(*) from plate_pos;
count
-------
46393
(1 row)

Here is the long query.

main_v0_8=# set enable_seqscan='t';
SET VARIABLE
main_v0_8=# explain ANALYZE SELECT pp.sampleid FROM tmp_platedataset
tpd, plate_dataset_result pdr, plate_pos pp WHERE
pdr.pp_plateid=pp.plateid AND pdr.col=pp.col AND pdr.row=pp.row AND
pdr.platedatasetid=tpd.platedatasetid AND tpd.toview='t';
NOTICE: QUERY PLAN:

Merge Join (cost=48222.32..50949.26 rows=30380 width=34) (actual
time=18350.80..19147.17 rows=32078 loops=1)
-> Index Scan using plate_pos_plateid_key on plate_pos pp
(cost=0.00..1989.92 rows=46393 width=15) (actual time=0.05..195.70
rows=19342 loops=1)
-> Sort (cost=48222.32..48222.32 rows=35759 width=19) (actual
time=18350.60..18396.26 rows=38496 loops=1)
-> Hash Join (cost=22.02..45517.89 rows=35759 width=19)
(actual time=10184.60..17237.49 rows=38496 loops=1)
-> Seq Scan on plate_dataset_result pdr
(cost=0.00..35073.93 rows=1994993 width=15) (actual time=0.07..12277.11
rows=2192544 loops=1)
-> Hash (cost=21.01..21.01 rows=401 width=4) (actual
time=3.57..3.57 rows=0 loops=1)
-> Seq Scan on tmp_platedataset tpd
(cost=0.00..21.01 rows=401 width=4) (actual time=0.41..2.60 rows=401
loops=1)
Total runtime: 19199.12 msec

Now, if I run this with heavy seqscan penalties, I can use "EXPLAIN
ANALYZE":

main_v0_8=# set enable_seqscan='f';
SET VARIABLE
main_v0_8=# explain ANALYZE SELECT pp.sampleid FROM tmp_platedataset
tpd, plate_dataset_result pdr, plate_pos pp WHERE
pdr.pp_plateid=pp.plateid AND pdr.col=pp.col AND pdr.row=pp.row AND
pdr.platedatasetid=tpd.platedatasetid AND tpd.toview='t';
NOTICE: QUERY PLAN:

Merge Join (cost=100134246.14..100136973.08 rows=30380 width=34)
(actual time=1697.54..3011.73 rows=32078 loops=1)
-> Sort (cost=100134246.14..100134246.14 rows=35759 width=19)
(actual time=1697.30..1745.48 rows=38496 loops=1)
-> Nested Loop (cost=100000000.00..100131541.72 rows=35759
width=19) (actual time=0.44..619.23 rows=38496 loops=1)
-> Seq Scan on tmp_platedataset tpd
(cost=100000000.00..100000021.01 rows=401 width=4) (actual
time=0.37..4.05 rows=401 loops=1)
-> Index Scan using plate_dataset_platedataseti_key on
plate_dataset_result pdr (cost=0.00..326.87 rows=89 width=15) (actual
time=0.03..1.01 rows=96 loops=401)
-> Index Scan using plate_pos_plateid_key on plate_pos pp
(cost=0.00..1989.92 rows=46393 width=15) (actual time=0.08..429.31
rows=43181 loops=1)
Total runtime: 3069.29 msec

-Jon

PS: If I can provide or do anything else that may be useful to anyone,
let me know!

--

-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
Jon Lapham
Extracta Moléculas Naturais, Rio de Janeiro, Brasil
email: lapham(at)extracta(dot)com(dot)br web: http://www.extracta.com.br/
***-*--*----*-------*------------*--------------------*---------------

Browse pgsql-general by date

  From Date Subject
Next Message Andrew Sullivan 2002-06-05 15:36:31 Re: Re^3 : Solaris Performance - 64 bit puzzle
Previous Message tony 2002-06-05 15:14:10 need magic formula for euro conversion