Re: Similar querys, better execution time on worst execution plan

Lists: pgsql-performance
From: "Fernando Papa" <fpapa(at)claxson(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Similar querys, better execution time on worst execution plan
Date: 2003-06-25 19:25:44
Message-ID: F1DC5B511E2D1C499E5E20FC6D74160D036421D6@exch2000.buehuergo.corp.claxson.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi all!

I have a strange behavior with this query:

SELECT c.id_contenido,p.fecha_publicacion,c.titulo_esp,c.activo,c.activo,s.label_esp as label_sbc,p.orden
,p.tapa_spc,p.tapa_cat,p.tapa_principal,p.id_publicacion,ca.label_esp as label_cat,sp.label_esp as label_spc
FROM cont_contenido c ,cont_publicacion p ,cont_sbc s ,cont_cat ca ,cont_spc sp
WHERE c.id_instalacion = 2
AND s.id_instalacion = 2
AND p.id_instalacion = 2
AND c.id_contenido = p.id_contenido
AND c.id_sbc = s.id_sbc
--AND (c.activo = 'S' or c.activo = 's')
--AND (s.activo = 'S' or s.activo = 's')
AND upper(c.activo) = 'S'
AND upper(s.activo) = 'S'
AND ca.id_instalacion = 2
AND sp.id_instalacion = 2
AND ca.id_cat = s.id_cat
AND sp.id_spc = ca.id_spc
ORDER BY sp.label_esp ,ca.label_esp ,p.orden

This is the execution plan:
Sort (cost=128.81..128.83 rows=5 width=189)
Sort Key: sp.label_esp, ca.label_esp, p.orden
-> Nested Loop (cost=0.00..128.76 rows=5 width=189)
Join Filter: ("outer".id_contenido = "inner".id_contenido)
-> Nested Loop (cost=0.00..24.70 rows=1 width=134)
Join Filter: ("inner".id_spc = "outer".id_spc)
-> Nested Loop (cost=0.00..22.46 rows=1 width=111)
-> Nested Loop (cost=0.00..6.89 rows=1 width=68)
Join Filter: ("inner".id_cat = "outer".id_cat)
-> Seq Scan on cont_sbc s (cost=0.00..4.44 rows=1 width=35)
Filter: ((id_instalacion = 2::numeric) AND (upper((activo)::text) = 'S'::text))
-> Seq Scan on cont_cat ca (cost=0.00..2.31 rows=11 width=33)
Filter: (id_instalacion = 2::numeric)
-> Index Scan using cont_cont_cont_sbc_fk_i on cont_contenido c (cost=0.00..15.56 rows=1 width=43)
Index Cond: ((c.id_instalacion = 2::numeric) AND (c.id_sbc = "outer".id_sbc))
Filter: (upper((activo)::text) = 'S'::text)
-> Seq Scan on cont_spc sp (cost=0.00..2.16 rows=6 width=23)
Filter: (id_instalacion = 2::numeric)
-> Seq Scan on cont_publicacion p (cost=0.00..98.54 rows=442 width=55)
Filter: (id_instalacion = 2::numeric)

If I replace both "uppers" with "...= 'S' or ...= 's'":

SELECT c.id_contenido,p.fecha_publicacion,c.titulo_esp,c.activo,c.activo,s.label_esp as label_sbc,p.orden
,p.tapa_spc,p.tapa_cat,p.tapa_principal,p.id_publicacion,ca.label_esp as label_cat,sp.label_esp as label_spc
FROM cont_contenido c ,cont_publicacion p ,cont_sbc s ,cont_cat ca ,cont_spc sp
WHERE c.id_instalacion = 2
AND s.id_instalacion = 2
AND p.id_instalacion = 2
AND c.id_contenido = p.id_contenido
AND c.id_sbc = s.id_sbc
AND (c.activo = 'S' or c.activo = 's')
AND (s.activo = 'S' or s.activo = 's')
AND ca.id_instalacion = 2
AND sp.id_instalacion = 2
AND ca.id_cat = s.id_cat
AND sp.id_spc = ca.id_spc
ORDER BY sp.label_esp ,ca.label_esp ,p.orden

This is the Execution plan:

Sort (cost=193.98..194.62 rows=256 width=189)
Sort Key: sp.label_esp, ca.label_esp, p.orden
-> Merge Join (cost=178.07..183.75 rows=256 width=189)
Merge Cond: ("outer".id_contenido = "inner".id_contenido)
-> Sort (cost=60.11..60.25 rows=56 width=134)
Sort Key: c.id_contenido
-> Merge Join (cost=57.31..58.50 rows=56 width=134)
Merge Cond: ("outer".id_sbc = "inner".id_sbc)
-> Sort (cost=10.60..10.64 rows=15 width=91)
Sort Key: s.id_sbc
-> Merge Join (cost=10.00..10.32 rows=15 width=91)
Merge Cond: ("outer".id_cat = "inner".id_cat)
-> Sort (cost=5.10..5.12 rows=10 width=56)
Sort Key: ca.id_cat
-> Merge Join (cost=4.74..4.94 rows=10 width=56)
Merge Cond: ("outer".id_spc = "inner".id_spc)
-> Sort (cost=2.50..2.53 rows=11 width=33)
Sort Key: ca.id_spc
-> Seq Scan on cont_cat ca (cost=0.00..2.31 rows=11 width=33)
Filter: (id_instalacion = 2::numeric)
-> Sort (cost=2.24..2.26 rows=6 width=23)
Sort Key: sp.id_spc
-> Seq Scan on cont_spc sp (cost=0.00..2.16 rows=6 width=23)
Filter: (id_instalacion = 2::numeric)
-> Sort (cost=4.90..4.96 rows=21 width=35)
Sort Key: s.id_cat
-> Seq Scan on cont_sbc s (cost=0.00..4.44 rows=21 width=35)
Filter: ((id_instalacion = 2::numeric) AND ((activo = 'S'::character varying) OR (activo = 's'::character varying)))
-> Sort (cost=46.70..46.94 rows=93 width=43)
Sort Key: c.id_sbc
-> Seq Scan on cont_contenido c (cost=0.00..43.66 rows=93 width=43)
Filter: ((id_instalacion = 2::numeric) AND ((activo = 'S'::character varying) OR (activo = 's'::character varying)))
-> Sort (cost=117.96..119.06 rows=442 width=55)
Sort Key: p.id_contenido
-> Seq Scan on cont_publicacion p (cost=0.00..98.54 rows=442 width=55)
Filter: (id_instalacion = 2::numeric)

The question is, why the query with the worst execution plan (most expensive, the second) runs faster the query with the better execution plan?
First Query: 10 runs, avg: 8 sec.
Second Query: 10 runs, avg: 1.8 sec.

I see a fail on the "best" exec plan, the rows I get are around 430, so the first EP expect only 5 rows and the second EP expect 256.

I run 7.3.2 over Solaris.
I did "vacuum full analyze" before

Thanks in advance!

Fernando.-


From: SZUCS Gábor <surrano(at)mailbox(dot)hu>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Similar querys, better execution time on worst execution plan
Date: 2003-06-26 10:30:59
Message-ID: 001c01c33bce$090aa460$0403a8c0@fejleszt4
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Fernando,

1. Try EXPLAIN ANALYZE. Cost alone isn't an absolute measure. I think it's
only to see which parts of the query are expected to be slowest. However,
EXP ANA will give you exact times in msec (which effectively means it
executes the query).

2. I think calling upper() for each row costs more than direct comparison,
but not sure

3. Notice that there are seq scans with filter conditions like
"id_instalacion = 2::numeric"
Do you have indices on id_instalacion, which seems to be a numeric field?
if so, try casting the constant expressions in the query to numeric so that
postgresql may find the index. If you don't have such indices, it may be
worth to create them. (I guess you only have it on the table aliased with c,
since it does an index scan there.

4. another guess may be indices on (id_instalacion, activo), or, if activo
has few possible values (for example, it may be only one of three letters,
say, 'S', 'A' or 'K'), partial indices like:

CREATE INDEX cont_sbc_id_ins_S ON cont_sbc (id_instalacion)
WHERE activo in ('S', 's');
CREATE INDEX cont_sbc_id_ins_A ON cont_sbc (id_instalacion)
WHERE activo in ('A', 'a');
CREATE INDEX cont_sbc_id_ins_K ON cont_sbc (id_instalacion)
WHERE activo in ('K', 'k');

G.
------------------------------- cut here -------------------------------
WHERE c.id_instalacion = 2
AND s.id_instalacion = 2
AND p.id_instalacion = 2
...

-> Seq Scan on cont_sbc s (cost=0.00..4.44 rows=1 width=35)
Filter: ((id_instalacion = 2::numeric)
AND (upper((activo)::text) = 'S'::text))
-> Index Scan using cont_cont_cont_sbc_fk_i on cont_contenido c
(cost=0.00..15.56 rows=1 width=43)
Index Cond: ((c.id_instalacion = 2::numeric)
AND (c.id_sbc = "outer".id_sbc))
Filter: (upper((activo)::text) = 'S'::text)
-> Seq Scan on cont_publicacion p (cost=0.00..98.54 rows=442 width=55)
Filter: (id_instalacion = 2::numeric)