Re: ANALYZE sampling is too good

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE sampling is too good
Date: 2013-12-09 23:45:33
Message-ID: 2a291e05-737e-4270-80c5-4df55a57c470@email.android.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>On Mon, Dec 9, 2013 at 8:14 PM, Heikki Linnakangas
><hlinnakangas(at)vmware(dot)com> wrote:
>> I took a stab at using posix_fadvise() in ANALYZE. It turned out to
>be very
>> easy, patch attached. Your mileage may vary, but I'm seeing a nice
>gain from
>> this on my laptop. Taking a 30000 page sample of a table with 717717
>pages
>> (ie. slightly larger than RAM), ANALYZE takes about 6 seconds without
>the
>> patch, and less than a second with the patch, with
>> effective_io_concurrency=10. If anyone with a good test data set
>loaded
>> would like to test this and post some numbers, that would be great.
>
>Kernel version?

3.12, from Debian experimental. With an ssd drive and btrfs filesystem. Admittedly not your average database server setup, so it would be nice to get more reports from others.

>I raised this issue on LKML, and, while I got no news on this front,
>they might have silently fixed it. I'd have to check the sources
>again.

Maybe. Or maybe the heuristic read ahead isn't significant/helpful, when you're prefetching with posix_fadvise anyway.
I

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Claudio Freire 2013-12-09 23:46:38 Re: ANALYZE sampling is too good
Previous Message Claudio Freire 2013-12-09 23:29:50 Re: ANALYZE sampling is too good