Re: ANALYZE sampling is too good

From: Greg Stark <stark(at)mit(dot)edu>
To: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jim Nasby <jim(at)nasby(dot)net>
Subject: Re: ANALYZE sampling is too good
Date: 2013-12-11 22:33:40
Message-ID: CAM-w4HN__RLfYhvtyBGDYFnmRZsyQXey4kZE-NUHggnC4NT0BA@mail.gmail.com
Lists: pgsql-hackers

I think we're all wet here. I don't see any bias towards larger or smaller
rows. Larger tuples will be spread over a larger number of pages, but there
will be fewer of them on any one page; the two effects cancel, so the
average inclusion probability is the same.

Smaller values might have a higher variance with block-based sampling than
larger values. But that actually *is* the kind of thing that Simon's
approach of compensating with later samples can deal with.
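The variance point is easy to see with the same made-up setup (again just a
sketch, not the real code): small rows arrive in big per-page clumps, so
the same number of sampled pages gives a noisier count of them.

import random, statistics

PAGE_SIZE = 8192
SMALL, LARGE = 64, 1024

def build_pages():
    pages = []
    for width in (SMALL, LARGE):
        per_page = PAGE_SIZE // width
        remaining = 100000
        while remaining > 0:
            n = min(per_page, remaining)
            pages.append([width] * n)
            remaining -= n
    random.shuffle(pages)
    return pages

pages = build_pages()

def sample_counts():
    chosen = random.sample(pages, len(pages) // 100)
    widths = [w for page in chosen for w in page]
    return widths.count(SMALL), widths.count(LARGE)

trials = [sample_counts() for _ in range(1000)]
print("stdev of small-row count:", statistics.stdev(t[0] for t in trials))
print("stdev of large-row count:", statistics.stdev(t[1] for t in trials))

The small-row count swings by hundreds of rows between samples while the
large-row count swings by tens, even though both have the same mean. That's
the kind of sampling noise that taking further samples can average out.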

--
greg
