From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Joshua Tolley" <eggyknap(at)gmail(dot)com> |
Cc: | josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, "Martijn van Oosterhout" <kleptog(at)svana(dot)org> |
Subject: | Re: Cross-column statistics revisited |
Date: | 2008-10-17 00:32:38 |
Message-ID: | 4073.1224203558@sss.pgh.pa.us |
Lists: | pgsql-hackers |
"Joshua Tolley" <eggyknap(at)gmail(dot)com> writes:
> Most of the comments on this thread have centered around the questions
> of "what we'd store" and "how we'd use it", which might be better
> phrased as, "The database assumes columns are independent, but we know
> that's not always true. Does this cause enough problems to make it
> worth fixing? How might we fix it?" I have to admit an inability to
> show that it causes problems,

Any small amount of trolling in our archives will turn up plenty of
examples.

It appears to me that a lot of people in this thread are confusing
correlation in the sense of statistical correlation between two
variables with correlation in the sense of how well a column is
physically ordered. (The latter is actually the same kind of animal, but
always taking one of the two variables to be physical position.)
A bad estimate for physical-position correlation has only limited
impact, as Josh B said upthread; but the other case leads to very
bad rowcount estimates which have *huge* impact on plan choices.

regards, tom lane
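
Editorial illustration (not part of the original message): the rowcount problem described above can be sketched with a small simulation. The function name `estimate_rows`, the table size, and the value range are all hypothetical choices made for this example. The code only mimics the general idea of multiplying per-column selectivities under an independence assumption, and is not a model of the actual PostgreSQL planner code.

```python
import random


def estimate_rows(n_rows=100_000, n_values=100, target=7, seed=0):
    """Compare an independence-based row estimate with the true row count.

    Hypothetical setup: column b always equals column a, so the two
    columns are perfectly (statistically) correlated.
    """
    rng = random.Random(seed)
    rows = [(a, a) for a in (rng.randrange(n_values) for _ in range(n_rows))]

    # Per-column selectivity of the predicates a = target and b = target.
    sel_a = sum(1 for a, _ in rows if a == target) / n_rows
    sel_b = sum(1 for _, b in rows if b == target) / n_rows

    # Independence assumption: multiply the two selectivities.
    independent_estimate = sel_a * sel_b * n_rows

    # True number of rows matching both predicates.
    actual = sum(1 for a, b in rows if a == target and b == target)
    return independent_estimate, actual


if __name__ == "__main__":
    estimate, actual = estimate_rows()
    print(f"independence estimate: {estimate:.1f} rows, actual: {actual} rows")
```

With roughly 100 distinct values, each predicate alone matches about 1% of the rows, so the independence-based estimate comes out near 0.01% of the table while the true answer stays near 1%. An error of that size in a rowcount is the kind that can change which join method or scan a planner picks.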