Re: Cross-column statistics revisited

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Joshua Tolley <eggyknap(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cross-column statistics revisited
Date: 2008-10-16 17:50:49
Message-ID: 20081016175049.GC19967@svana.org
Lists: pgsql-hackers

On Thu, Oct 16, 2008 at 01:34:59PM -0400, Robert Haas wrote:
> I suspect that a lot of the correlations people care about are
> extreme. For example, it's fairly common for me to have a table where
> column B is only used at all for certain values of column A. Like,
> atm_machine_id is usually or always NULL unless transaction_type is
> ATM, or something. So a clause of the form transaction_type = 'ATM'
> and atm_machine_id < 10000 looks more selective than it really is
> (because the first half is redundant).

That case could be handled simply by considering the indexed values of
a column with a partial index. This should be fairly easy to do, I think.
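
To make the misestimate concrete, here is a toy sketch in Python. It is
not planner code: the table contents, the 20% ATM share, and the helper
names are all invented for illustration. It compares the estimate you
get by multiplying the two clause selectivities (assuming independence)
with the true selectivity, and with an estimate that uses statistics
gathered only over the rows a hypothetical partial index on
transaction_type = 'ATM' would cover.

```python
import random


def make_rows(n=100_000, seed=0):
    """Synthetic (transaction_type, atm_machine_id) rows; made-up distribution."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        if rng.random() < 0.2:
            # ATM transactions carry a machine id
            rows.append(("ATM", rng.randrange(50_000)))
        else:
            # every other transaction type leaves the column NULL
            rows.append(("POS", None))
    return rows


def selectivity(rows, pred):
    """Fraction of rows satisfying pred."""
    return sum(1 for r in rows if pred(r)) / len(rows)


def is_atm(row):
    return row[0] == "ATM"


def low_id(row):
    # NULL never satisfies "atm_machine_id < 10000"
    return row[1] is not None and row[1] < 10_000


rows = make_rows()

sel_type = selectivity(rows, is_atm)
sel_id = selectivity(rows, low_id)

# Estimate under the independence assumption: multiply the two selectivities.
independent = sel_type * sel_id

# What the combined clause really selects.
actual = selectivity(rows, lambda r: is_atm(r) and low_id(r))

# Statistics restricted to the rows the hypothetical partial index covers.
atm_rows = [r for r in rows if is_atm(r)]
conditional = sel_type * selectivity(atm_rows, low_id)

print(f"independent={independent:.4f} actual={actual:.4f} "
      f"conditional={conditional:.4f}")
```

In this toy data the independence estimate is several times too low,
because the first clause is implied by the second, while the estimate
built from the partial-index rows matches the true value.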

It might be worthwhile for someone to trawl through the archives looking
for examples where we estimate the correlation wrong.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
