Re: ANALYZE sampling is too good

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE sampling is too good
Date: 2013-12-09 23:46:38
Message-ID: CAGTBQpbgcTOGwh0SbyV0mpJq3RDq+6YzCBqZ8Wsr6NkNG1CpSw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 9, 2013 at 8:45 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>>On Mon, Dec 9, 2013 at 8:14 PM, Heikki Linnakangas
>><hlinnakangas(at)vmware(dot)com> wrote:
>>> I took a stab at using posix_fadvise() in ANALYZE. It turned out to be very
>>> easy, patch attached. Your mileage may vary, but I'm seeing a nice gain from
>>> this on my laptop. Taking a 30000 page sample of a table with 717717 pages
>>> (ie. slightly larger than RAM), ANALYZE takes about 6 seconds without the
>>> patch, and less than a second with the patch, with
>>> effective_io_concurrency=10. If anyone with a good test data set loaded
>>> would like to test this and post some numbers, that would be great.
>>
>>Kernel version?
>
> 3.12, from Debian experimental. With an ssd drive and btrfs filesystem.
> Admittedly not your average database server setup, so it would be nice to
> get more reports from others.

Yeah, read-ahead isn't relevant for SSD.
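
For readers following the thread, here is a minimal sketch of the general
technique being discussed: hinting the kernel about the sampled blocks with
posix_fadvise(POSIX_FADV_WILLNEED) before reading them. It is not the code
from the attached patch. The helper name prefetch_blocks, its parameters, and
the idea of passing the sample as a plain array of block numbers are all
assumptions made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes posix_fadvise() */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the patch): ask the kernel to start
 * reading each sampled block in the background.
 *
 * fd          open file descriptor of the relation file
 * blocks      block numbers chosen by the sampler (assumed input)
 * nblocks     number of entries in blocks
 * block_size  size of one block in bytes (PostgreSQL's default is 8192)
 *
 * Returns how many hints the kernel accepted.
 */
static size_t
prefetch_blocks(int fd, const long *blocks, size_t nblocks, size_t block_size)
{
    size_t hinted = 0;

    for (size_t i = 0; i < nblocks; i++)
    {
        off_t offset = (off_t) blocks[i] * (off_t) block_size;

        /* posix_fadvise() returns 0 on success, an error number otherwise */
        if (posix_fadvise(fd, offset, (off_t) block_size,
                          POSIX_FADV_WILLNEED) == 0)
            hinted++;
    }
    return hinted;
}
```

The hints are advisory only: a caller would still read each block normally
afterwards, and how many hints to issue ahead of the reads is a tuning choice
(the role effective_io_concurrency plays in the numbers quoted above).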
