Incorrect estimates on correlated filters

From: "Chris Kratz" <chris(dot)kratz(at)vistashare(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Incorrect estimates on correlated filters
Date: 2008-08-12 21:59:27
Message-ID: 3642025c0808121459x75c8d9baq66194b87f5053ea3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello All,

Ran into a re-occuring performance problem with some report queries again
today. In a nutshell, we have filters on either multiple joined tables, or
multiple columns on a single table that are highly correlated. So, the
estimates come out grossly incorrect (the planner has no way to know they
are correlated). 2000:1 for one I'm looking at right now. Generally this
doesn't matter, except in complex reporting queries like these when this is
the first join of 40 other joins. Because the estimate is wrong at the
lowest level, it snowballs up through the rest of the joins causing the
query to run very, very slowly. In many of these cases, forcing nested
loops off for the duration of the query fixes the problem. But I have a
couple that still are painfully slow and shouldn't be.

I've been reading through the archives with others having similar problems
(including myself a year ago). Am I right in assuming that at this point
there is still little we can do in postgres to speed up this kind of query?
Right now the planner has no way to know the correlation between different
columns in the same table, let alone columns in different tables. So, it
just assumes no correlation and returns incorrectly low estimates in cases
like these.

The only solution I've come up with so far is to materialize portions of the
larger query into subqueries with these correlated filters which are indexed
and analyzed before joining into the larger query. This would keep the
incorrect estimates from snowballing up through the chain of joins.

Are there any other solutions to this problem?

Thanks,

-Chris

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Carey 2008-08-13 00:23:31 Re: Filesystem benchmarking for pg 8.3.3 server
Previous Message Ron Mayer 2008-08-12 21:47:57 Re: Filesystem benchmarking for pg 8.3.3 server