Re: Encouraging multi-table join order

From: Dan Harris <fbsd(at)drivefaster(dot)net>
To: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Encouraging multi-table join order
Date: 2006-04-10 23:51:55
Message-ID: 443AEF9B.3000909@drivefaster.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tom Lane wrote:
> That's very strange --- the estimated cost of the seqscan is high enough
> that the planner should have chosen a nestloop with inner indexscan on
> the big table. I'm not sure about the join-order point, but the hash
> plan for the first join seems wrong in any case.
>
> Um, you do have an index on eventactivity.incidentid, right? What's the
> datatype(s) of the incidentid columns? What happens to the plan if you
> turn off enable_hashjoin and enable_mergejoin?
>
> regards, tom lane
>
Yes, eventactivity.incidentid is indexed. The datatype is varchar(40).
Although, by checking this, I noticed that k_h.incidentid was
varchar(100). Perhaps the difference in length between the keys caused
the planner to not use the fastest method? I have no defense as to why
those aren't the same.. I will make them so and check.

Here's the EXPLAIN analyze with enable_hashjoin = off and
enable_mergejoin = off :

Limit (cost=4226535.73..4226544.46 rows=698 width=82) (actual
time=74339.016..74356.521 rows=888 loops=1)
-> Unique (cost=4226535.73..4226544.46 rows=698 width=82) (actual
time=74339.011..74354.073 rows=888 loops=1)
-> Sort (cost=4226535.73..4226537.48 rows=698 width=82)
(actual time=74339.003..74344.031 rows=3599 loops=1)
Sort Key: eventmain.entrydate, eventmain.incidentid,
eventgeo.eventlocation, eventactivity.recordtext
-> Nested Loop (cost=0.00..4226502.76 rows=698
width=82) (actual time=921.325..74314.959 rows=3599 loops=1)
-> Nested Loop (cost=0.00..4935.61 rows=731
width=72) (actual time=166.354..14638.308 rows=1162 loops=1)
-> Nested Loop (cost=0.00..2482.47 rows=741
width=50) (actual time=150.396..7348.013 rows=1162 loops=1)
-> Index Scan using k_h_id_idx on k_h
(cost=0.00..217.55 rows=741 width=14) (actual time=129.540..1022.243
rows=1162 loops=1)
Index Cond: (id = 33396)
Filter: ((entrydate >=
'2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate <
'2006-04-08 00:00:00'::timestamp without time zone))
-> Index Scan using
eventgeo_incidentid_idx on eventgeo (cost=0.00..3.04 rows=1 width=36)
(actual time=5.260..5.429 rows=1 loops=1162)
Index Cond:
((eventgeo.incidentid)::text = ("outer".incidentid)::text)
-> Index Scan using eventmain_incidentid_idx
on eventmain (cost=0.00..3.30 rows=1 width=22) (actual
time=5.976..6.259 rows=1 loops=1162)
Index Cond:
((eventmain.incidentid)::text = ("outer".incidentid)::text)
-> Index Scan using eventactivity1 on
eventactivity (cost=0.00..5774.81 rows=20 width=52) (actual
time=29.768..51.334 rows=3 loops=1162)
Index Cond: (("outer".incidentid)::text =
(eventactivity.incidentid)::text)
Filter: ((((' '::text || (recordtext)::text)
|| ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01
00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08
00:00:00'::timestamp without time zone))

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-04-11 01:01:37 Re: Encouraging multi-table join order
Previous Message Tom Lane 2006-04-10 23:12:36 Re: Encouraging multi-table join order