Re: ANALYZE sampling is too good

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE sampling is too good
Date: 2013-12-10 19:54:43
Message-ID: CA+U5nMKQrTZ=SF93rY=uXYwcXDBtHjXWsP+X1THQnqSQLG57Yg@mail.gmail.com
Lists: pgsql-hackers

On 10 December 2013 19:49, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Tue, Dec 10, 2013 at 11:23 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> However, these things presume that we need to continue scanning most
>> of the blocks of the table, which I don't think needs to be the case.
>> There is a better way.
>
> Do they? I think it's one opportunistic way of ameliorating the cost.
>
>> Back in 2005/6, I advocated a block sampling method, as described by
>> Chaudhuri et al. (ref?)
>
> I don't think that anyone believes that not doing block sampling is
> tenable, fwiw. Clearly some type of block sampling would be preferable
> for most or all purposes.

If we have one way of reducing the cost of ANALYZE, I'd suggest we don't
need two, especially if the second involves interaction between
otherwise not fully related parts of the code.

Or, to put it clearly, let's go with block sampling and then see whether
that needs even more work.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
