Re: [PERFORM] Bad n_distinct estimation; hacks suggested?
- From: Josh Berkus <josh(at)agliodbs(dot)com>
- To: Andrew Dunstan <andrew(at)dunslane(dot)net>
- Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Marko Ristola <marko(dot)ristola(at)kolumbus(dot)fi>, pgsql-perform <pgsql-performance(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
- Subject: Re: [PERFORM] Bad n_distinct estimation; hacks suggested?
- Date: Sun, 24 Apr 2005 12:08:15 -0700
- Message-id: <200504241208(dot)15437(dot)josh(at)agliodbs(dot)com>
Folks,
> I wonder if this paper has anything that might help:
> http://www.stat.washington.edu/www/research/reports/1999/tr355.ps - if I
> were more of a statistician I might be able to answer :-)
Actually, that paper looks *really* promising. Does anyone here have enough
math to solve for D_Md on page 6? I'd like to test it on samples of less
than 0.01%.
Tom, how does our heuristic sampling work? Is it pure random sampling, or
page sampling?
--
Josh Berkus
Aglio Database Solutions
San Francisco