Re: Simple join optimized badly?

From: "Denis Lussier" <denisl(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>, "Jonah H(dot) Harris" <jharris(at)enterprisedb(dot)com>
Cc: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Simple join optimized badly?
Date: 2006-10-08 01:50:14
Message-ID: 5f820f750610071850o6ed0b19o162d65d1987b2a8c@mail.gmail.com
Lists: pgsql-performance

Wouldn't PG supporting simple optimizer hints get around this kind of
problem? Seems to me that at least one customer posting per week
could be solved via the use of simple hints.

If the community is interested... EnterpriseDB has added support for
a few simple types of hints (optimize for speed, optimize for first
rows, use particular indexes) in our upcoming 8.2 version. We would
be glad to submit them into the community process if there is any
chance they will eventually be accepted for 8.3.

I don't think there is an ANSI standard for hints, but that doesn't
mean they are not occasionally extremely useful. All hints are
effectively harmless/helpful suggestions; the planner is free to
ignore them if they are not feasible.
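
To make the idea concrete, here is a rough sketch of what such hints
look like in other databases, borrowing Oracle-style comment hints.
This is NOT our actual syntax, and the table, column, and index names
are invented for the example:

    -- "optimize for first rows" and "use a particular index":
    SELECT /*+ FIRST_ROWS INDEX(s search_pkey) */ s.*
      FROM search s
      JOIN hitlist_rows h ON h.searchid = s.id;

    -- "optimize for overall speed" (throughput):
    SELECT /*+ ALL_ROWS */ count(*)
      FROM search s
      JOIN hitlist_rows h ON h.searchid = s.id;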

--Denis Lussier
Founder
http://www.enterprisedb.com

On 10/7/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Craig A. James" <cjames(at)modgraph-usa(dot)com> writes:
> > There are two plans below. The first is before an ANALYZE HITLIST_ROWS, and it's horrible -- it looks to me like it's sorting the 16 million rows of the SEARCH table. Then I run ANALYZE HITLIST_ROWS, and the plan is pretty decent.
>
> It would be interesting to look at the before-ANALYZE cost estimate for
> the hash join, which you could get by setting enable_mergejoin off (you
> might have to turn off enable_nestloop too). I recall though that
> there's a fudge factor in costsize.c that penalizes hashing on a column
> that no statistics are available for. The reason for this is the
> possibility that the column has only a small number of distinct values,
> which would make a hash join very inefficient (in the worst case all
> the values might end up in the same hash bucket, making it no better
> than a nestloop). Once you've done ANALYZE it plugs in a real estimate
> instead, and evidently the cost estimate drops enough to make hashjoin
> the winner.
>
> You might be able to persuade it to use a hashjoin anyway by increasing
> work_mem enough, but on the whole my advice is to do the ANALYZE after
> you load up the temp table. The planner really can't be expected to be
> very intelligent when it has no stats.
>
> regards, tom lane
>
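
A concrete sketch of Tom's suggestion above, for anyone who wants to
try it (the join columns are a guess; adjust to the real schema):

    -- Look at the pre-ANALYZE hash join cost estimate by disabling the
    -- merge join (and, if necessary, nested loop) paths.
    SET enable_mergejoin = off;
    SET enable_nestloop = off;
    EXPLAIN
    SELECT s.*
      FROM hitlist_rows h
      JOIN search s ON s.id = h.searchid;
    RESET enable_mergejoin;
    RESET enable_nestloop;

    -- The real fix: gather statistics right after loading the temp table,
    -- so the planner has something better than default guesses.
    ANALYZE hitlist_rows;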
