From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: proposal : cross-column stats |
Date: | 2010-12-13 02:00:37 |
Message-ID: | AANLkTimWyk6qso=_BethgFSYN4FBZa-syAVjtUxjssuo@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Dec 12, 2010 at 8:46 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> On 13.12.2010 01:05, Robert Haas wrote:
>> This is a good idea, but I guess the question is what you do next. If
>> you know that the "applicability" is 100%, you can disregard the
>> restriction clause on the implied column. And if it has no
>> implicatory power, then you just do what we do now. But what if it
>> has some intermediate degree of implicability?
>
> Well, I think you've missed the e-mail from Florian Pflug - he actually
> pointed out that the 'implicativeness' Heikki mentioned is called
> conditional probability. And conditional probability can be used to
> express the "AND" probability we are looking for (selectiveness).
>
> For two columns, this is actually pretty straightforward - as Florian
> wrote, the equation is
>
> P(A and B) = P(A|B) * P(B) = P(B|A) * P(A)
Well, the question is what data you are actually storing. It's
appealing to store a measure of the extent to which a constraint on
column X constrains column Y, because you'd only need to store
O(ncolumns^2) values, which would be reasonably compact and would
potentially handle the zip code problem - a classic "hard case" - rather
neatly. But that wouldn't be sufficient to use the above equation,
because there A and B need to be things like "column X has value x",
and it's not going to be practical to store a complete set of MCVs for
column X for each possible value that could appear in column Y.
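
To make the gap concrete, here is a minimal Python sketch on a made-up
zip code / city table. The rows, the function names, and the choice of
predicates are all assumptions for illustration and have nothing to do
with how the planner actually stores statistics. It shows that
P(A and B) = P(A|B) * P(B) gives the right answer when A and B are
"column has value x" events, that the independence assumption
underestimates, and how many per-value conditional entries you would
need compared with one number per column pair.

```python
# Toy table of (zip_code, city) rows; each zip code determines its city.
rows = [
    ("10001", "New York"),
    ("10001", "New York"),
    ("10002", "New York"),
    ("60601", "Chicago"),
    ("60601", "Chicago"),
    ("60602", "Chicago"),
    ("94101", "San Francisco"),
    ("94102", "San Francisco"),
]


def selectivity(rows, pred):
    """Fraction of rows that satisfy pred."""
    return sum(1 for r in rows if pred(r)) / len(rows)


def joint_via_conditional(rows, zip_code, city):
    """P(A and B) = P(A|B) * P(B), with A: zip = zip_code, B: city = city."""
    b_rows = [r for r in rows if r[1] == city]
    if not b_rows:
        return 0.0
    p_b = len(b_rows) / len(rows)
    p_a_given_b = sum(1 for r in b_rows if r[0] == zip_code) / len(b_rows)
    return p_a_given_b * p_b


def joint_independent(rows, zip_code, city):
    """Estimate that treats the two clauses as independent: P(A) * P(B)."""
    p_a = selectivity(rows, lambda r: r[0] == zip_code)
    p_b = selectivity(rows, lambda r: r[1] == city)
    return p_a * p_b


def per_value_entries(rows):
    """Count of P(zip = x | city = y) entries covering every value pair."""
    zips = {r[0] for r in rows}
    cities = {r[1] for r in rows}
    return len(zips) * len(cities)


actual = selectivity(rows, lambda r: r == ("10001", "New York"))
est_cond = joint_via_conditional(rows, "10001", "New York")
est_indep = joint_independent(rows, "10001", "New York")

ncolumns = 2
print("actual selectivity:        ", actual)
print("via P(A|B) * P(B):         ", est_cond)
print("independence assumption:   ", est_indep)
print("one value per column pair: ", ncolumns ** 2)
print("per-value conditionals:    ", per_value_entries(rows))
```

Even on eight rows the per-value table is several times larger than
the per-column-pair summary, and it grows with the number of distinct
values rather than the number of columns.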
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company