Re: COUNT(*) and index-only scans

From:	Greg Stark <stark(at)mit(dot)edu>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: COUNT(*) and index-only scans
Date:	2011-10-12 15:04:22
Message-ID:	CAM-w4HMft5Mx-_dmff5Xk_icyXMgEHYV2_+MWd49MajzqJgqxA@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Oct 12, 2011 at 3:29 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> What I suggest as a first cut for that is: simply derate the visibility fraction as the fraction
>>> of the table expected to be scanned gets smaller.
>
>> I think there's a statistically more rigorous way of accomplishing the
>> same thing. If you treat the pages we estimate we're going to read as
>> a random sample of the population of pages then your expected value is
>> the fraction of the overall population that is all-visible but your
>> 95th percentile confidence interval will be, uh, a simple formula we
>> can compute but I don't recall off-hand.
>
> The problem is precisely that the pages a query is going to read are
> likely to *not* be a random sample, but to be correlated with
> recently-dirtied pages.

Sure, but I was suggesting aiming for the nth percentile rather than a
linear factor, which as far as I can tell has no concrete meaning.

--
greg
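
For illustration only, here is a minimal sketch of what "aiming for the nth percentile" could look like. It is not taken from the thread or from the PostgreSQL source: the function name, its parameters, the normal approximation, and the finite-population correction are all assumptions made for this example. It also assumes the pages read are a random sample of the table, which is exactly the assumption Tom objects to above.

```python
import math
from statistics import NormalDist


def pessimistic_visible_fraction(all_visible_frac, pages_to_read,
                                 total_pages, confidence=0.95):
    """Hypothetical helper: a one-sided lower bound on the fraction of
    all-visible pages among the pages a scan is expected to read.

    Assumes the pages read are a simple random sample (without
    replacement) of the table's pages and uses a normal approximation.
    """
    if total_pages <= 0 or pages_to_read <= 0:
        raise ValueError("page counts must be positive")
    if not 0.0 <= all_visible_frac <= 1.0:
        raise ValueError("all_visible_frac must be in [0, 1]")

    n = min(pages_to_read, total_pages)
    p = all_visible_frac
    if total_pages == 1:
        return p  # only one page, nothing to sample

    # Finite-population correction: the uncertainty shrinks to zero as
    # the scan approaches the whole table.
    fpc = (total_pages - n) / (total_pages - 1)
    std_err = math.sqrt(p * (1.0 - p) / n * fpc)

    # z-score for the requested one-sided confidence level.
    z = NormalDist().inv_cdf(confidence)
    return max(0.0, p - z * std_err)


if __name__ == "__main__":
    # Table-wide all-visible fraction of 0.9, table of 10000 pages.
    for pages in (10, 100, 1000, 10000):
        est = pessimistic_visible_fraction(0.9, pages, 10000)
        print(f"{pages:>6} pages read -> assume {est:.3f} all-visible")
```

Like the linear derating proposed earlier in the thread, this discounts the visibility fraction more heavily for small scans and not at all for a full-table scan, but here the size of the discount is tied to a stated confidence level.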

In response to

Re: COUNT(*) and index-only scans at 2011-10-12 14:29:39 from Tom Lane

Responses

Re: COUNT(*) and index-only scans at 2011-10-12 15:18:03 from Tom Lane
