Re: [PERFORM] Bad n_distinct estimation; hacks suggested?


  • From: Josh Berkus <josh(at)agliodbs(dot)com>
  • To: Andrew Dunstan <andrew(at)dunslane(dot)net>
  • Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Marko Ristola <marko(dot)ristola(at)kolumbus(dot)fi>, pgsql-perform <pgsql-performance(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
  • Subject: Re: [PERFORM] Bad n_distinct estimation; hacks suggested?
  • Date: Sun, 24 Apr 2005 12:08:15 -0700
  • Message-id: <200504241208(dot)15437(dot)josh(at)agliodbs(dot)com>

Folks,

> I wonder if this paper has anything that might help:
> http://www.stat.washington.edu/www/research/reports/1999/tr355.ps - if I
> were more of a statistician I might be able to answer :-)

Actually, that paper looks *really* promising. Does anyone here have enough
math to solve for D(sub)Md on page 6? I'd like to test it on samples of
< 0.01%.

Tom, how does our heuristic sampling work? Is it pure random sampling, or
page sampling?
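
To make the distinction concrete, here is a minimal sketch of why the answer matters for n_distinct. It is a toy model, not how ANALYZE actually samples: the table layout, the function names (make_table, row_sample, page_sample) and all the sizes are assumptions chosen for illustration. It counts the distinct values seen by a row-level random sample and by a whole-page sample of the same size, on a column whose equal values are stored next to each other.

```python
import random

def make_table(n_pages, rows_per_page, n_distinct, clustered, rng):
    """Hypothetical table: a list of pages, each page a list of column values."""
    n_rows = n_pages * rows_per_page
    values = [i % n_distinct for i in range(n_rows)]
    if clustered:
        values.sort()          # equal values end up adjacent on disk
    else:
        rng.shuffle(values)    # values scattered uniformly across pages
    return [values[p * rows_per_page:(p + 1) * rows_per_page]
            for p in range(n_pages)]

def row_sample(table, k, rng):
    """Pure random sampling: every row is equally likely, whatever its page."""
    rows = [v for page in table for v in page]
    return rng.sample(rows, k)

def page_sample(table, k, rng):
    """Page sampling: pick whole pages at random and keep every row on them."""
    rows_per_page = len(table[0])
    pages = rng.sample(table, k // rows_per_page)
    return [v for page in pages for v in page]

rng = random.Random(0)  # fixed seed so repeated runs agree

# Assumed sizes: 100,000 rows, 5,000 distinct values, 1% sample (1,000 rows).
table = make_table(n_pages=1000, rows_per_page=100, n_distinct=5000,
                   clustered=True, rng=rng)

by_row = len(set(row_sample(table, 1000, rng)))
by_page = len(set(page_sample(table, 1000, rng)))

print("distinct values seen, row sample: ", by_row)
print("distinct values seen, page sample:", by_page)
```

In this clustered layout each page holds only 5 distinct values, so 10 sampled pages can show at most 50 of the 5,000, while the row sample spreads over the whole table and sees far more. Any estimator fed the page sample starts from a much smaller observed count, which is why the sampling method has to be known before testing the paper's formula on very small samples.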

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


