Re: Cross-column statistics revisited

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Joshua Tolley" <eggyknap(at)gmail(dot)com>
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Subject:	Re: Cross-column statistics revisited
Date:	2008-10-17 00:32:38
Message-ID:	4073.1224203558@sss.pgh.pa.us
Lists:	pgsql-hackers

"Joshua Tolley" <eggyknap(at)gmail(dot)com> writes:
> Most of the comments on this thread have centered around the questions
> of "what we'd store" and "how we'd use it", which might be better
> phrased as, "The database assumes columns are independent, but we know
> that's not always true. Does this cause enough problems to make it
> worth fixing? How might we fix it?" I have to admit an inability to
> show that it causes problems,

Any small amount of trolling in our archives will turn up plenty of
examples.

It appears to me that a lot of people in this thread are confusing
correlation in the sense of statistical correlation between two
variables with correlation in the sense of how well a column is
physically ordered. (The latter is actually the same kind of animal, but
always taking one of the two variables to be physical position.)
A bad estimate for physical-position correlation has only limited
impact, as Josh B said upthread; but the other case leads to very
bad rowcount estimates which have *huge* impact on plan choices.
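
[Editorial illustration, not part of the original message.] The rowcount
problem described above can be sketched with a toy table. This is only a
sketch under stated assumptions: the table, the helper names build_rows,
selectivity and estimates, and the sizes are all hypothetical, and the
arithmetic is a simplified stand-in for the independence assumption, not
PostgreSQL's actual estimator code.

```python
# Toy illustration only: hypothetical helpers, not PostgreSQL's planner.
from fractions import Fraction


def build_rows(n=1000, groups=10):
    # Column b always equals column a, so the two columns are
    # perfectly correlated in the statistical (cross-column) sense.
    return [(i % groups, i % groups) for i in range(n)]


def selectivity(rows, pred):
    # Fraction of rows that satisfy a single-column predicate.
    return Fraction(sum(1 for r in rows if pred(r)), len(rows))


def estimates(rows, a_val, b_val):
    # Independence assumption: multiply the per-column selectivities.
    sel_a = selectivity(rows, lambda r: r[0] == a_val)
    sel_b = selectivity(rows, lambda r: r[1] == b_val)
    independent = sel_a * sel_b * len(rows)
    # Ground truth: count rows matching both predicates at once.
    actual = sum(1 for r in rows if r[0] == a_val and r[1] == b_val)
    return independent, actual


rows = build_rows()
est, actual = estimates(rows, 3, 3)
print(f"independence estimate: {float(est):.0f} rows, actual: {actual} rows")
```

With these made-up numbers the independence estimate is 10 rows while
100 rows actually match, a tenfold underestimate; an error of that kind
is what can push a planner toward the wrong join or scan strategy.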

regards, tom lane

In response to

Re: Cross-column statistics revisited at 2008-10-16 21:34:26 from Joshua Tolley

Responses

Re: Cross-column statistics revisited at 2008-10-17 01:30:43 from Joshua Tolley
Re: Cross-column statistics revisited at 2008-10-17 17:28:10 from Ron Mayer
