Re: proposal : cross-column stats

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: proposal : cross-column stats
Date: 2010-12-13 15:38:52
Message-ID: 7913.1292254732@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, Dec 12, 2010 at 9:16 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> The proposed solution is based on contingency tables, built for selected
>> groups of columns (not for each possible group). And the contingency
>> table gives you the ability to estimate the probabilities needed to
>> compute the selectivity. Or am I missing something?

> Well, I'm not real familiar with contingency tables, but it seems like
> you could end up needing to store a huge amount of data to get any
> benefit out of it, in some cases.

The reason that this wasn't done years ago is precisely that nobody's
figured out how to do it with a tolerable amount of stats data and a
tolerable amount of processing time (both at ANALYZE time and during
query planning). It's not hard to see what we'd ideally like to do;
it's getting from there to something useful in production that's hard.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-12-13 15:40:18 Re: Problem with pg_upgrade (8.4 -> 9.0) due to ALTER DATABASE SET ROLE
Previous Message Tom Lane 2010-12-13 15:34:50 Re: proposal : cross-column stats