Lists: | pgsql-performance |
---|
From: | Eugene <emelamud(at)yahoo(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Forcing HashAggregation prior to index scan? |
Date: | 2004-07-05 23:14:34 |
Message-ID: | 20040705231434.81471.qmail@web50708.mail.yahoo.com |
I have a very simple problem. I run two select statements that are identical except for a single
where condition. The first select statement runs in 9 ms, while the second runs for about 4000
ms.
SQL1 - fast 9ms
explain analyse select seq_ac from refseq_sequence S where seq_ac in (select seq_ac2 from
refseq_refseq_hits where seq_ac1 = 'NP_001217')
SQL2 - very slow 4000ms
explain analyse select seq_ac from refseq_sequence S where seq_ac in (select seq_ac2 from
refseq_refseq_hits where seq_ac1 = 'NP_001217') AND S.species = 'Homo sapiens'
I think the second SQL statement is slower than the first one because the planner is not using
HashAggregate. Can I force HashAggregation before the index scan?
Here is the full output from EXPLAIN ANALYZE:
explain analyse select seq_ac from refseq_sequence S where seq_ac in (select seq_ac2 from
refseq_refseq_hits where seq_ac1 = 'NP_001217');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop (cost=169907.83..169919.88 rows=3 width=24) (actual time=1.450..8.707 rows=53 loops=1)
   -> HashAggregate (cost=169907.83..169907.83 rows=2 width=19) (actual time=1.192..1.876 rows=53 loops=1)
         -> Index Scan using refseq_refseq_hits_pkey on refseq_refseq_hits (cost=0.00..169801.33 rows=42600 width=19) (actual time=0.140..0.894 rows=54 loops=1)
               Index Cond: ((seq_ac1)::text = 'NP_001217'::text)
   -> Index Scan using refseq_sequence_pkey on refseq_sequence s (cost=0.00..6.01 rows=1 width=24) (actual time=0.105..0.111 rows=1 loops=53)
         Index Cond: ((s.seq_ac)::text = ("outer".seq_ac2)::text)
Total runtime: 9.110 ms
explain analyse select seq_ac from refseq_sequence S where seq_ac in (select seq_ac2 from
refseq_refseq_hits where seq_ac1 = 'NP_001217') and S.species = 'Homo sapiens';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop IN Join (cost=0.00..4111.66 rows=1 width=24) (actual time=504.176..3857.340 rows=30 loops=1)
   -> Index Scan using refseq_sequence_key2 on refseq_sequence s (cost=0.00..1516.06 rows=389 width=24) (actual time=0.352..491.107 rows=27391 loops=1)
         Index Cond: ((species)::text = 'Homo sapiens'::text)
   -> Index Scan using refseq_refseq_hits_pkey on refseq_refseq_hits (cost=0.00..858.14 rows=213 width=19) (actual time=0.114..0.114 rows=0 loops=27391)
         Index Cond: (((refseq_refseq_hits.seq_ac1)::text = 'NP_001217'::text) AND (("outer".seq_ac)::text = (refseq_refseq_hits.seq_ac2)::text))
Total runtime: 3857.636 ms
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Eugene <emelamud(at)yahoo(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Forcing HashAggregation prior to index scan? |
Date: | 2004-07-10 03:10:04 |
Message-ID: | 2132.1089429004@sss.pgh.pa.us |
Eugene <emelamud(at)yahoo(dot)com> writes:
> Can I force HashAggregation before index scan?
No. But look into why the planner's rows estimate is so bad here:
> -> Index Scan using refseq_sequence_key2 on refseq_sequence s (cost=0.00..1516.06 rows=389
> width=24) (actual time=0.352..491.107 rows=27391 loops=1)
> Index Cond: ((species)::text = 'Homo sapiens'::text)
Have you ANALYZEd this table recently? If so, maybe you need a larger
statistics target for the species column. The estimated row count
shouldn't be off by a factor of seventy...
regards, tom lane
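For anyone following along, the two suggestions above (re-ANALYZE, then a larger statistics target for the species column) might be tried roughly as follows. This is only a sketch: the table and column names are taken from the plans in this thread, the target value of 100 is an arbitrary illustration rather than a recommendation, and the pg_stats query assumes a server version whose pg_stats view exposes these columns.

```sql
-- Refresh the planner statistics for the table.
ANALYZE refseq_sequence;

-- Look at what the planner currently knows about the species column.
SELECT n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename = 'refseq_sequence'
   AND attname = 'species';

-- If the estimate is still far off, raise the per-column statistics
-- target (100 is an illustrative value only) and analyze again.
ALTER TABLE refseq_sequence ALTER COLUMN species SET STATISTICS 100;
ANALYZE refseq_sequence;

-- Compare the estimated and actual row counts for the species condition.
EXPLAIN ANALYZE
SELECT seq_ac FROM refseq_sequence WHERE species = 'Homo sapiens';
```

If the "rows=" estimate on the species index scan moves close to the actual count (27391 in the plan above), the planner has better information for costing the two join strategies; whether it then picks the HashAggregate plan is not guaranteed.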