Re: [PoC] pgstattuple2: block sampling to reduce physical read

From: Satoshi Nagayasu <snaga(at)uptime(dot)jp>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] pgstattuple2: block sampling to reduce physical read
Date: 2013-09-14 07:18:50
Message-ID: 52340DDA.9090606@uptime.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2013/07/23 20:02), Greg Smith wrote:
> On 7/23/13 2:16 AM, Satoshi Nagayasu wrote:
>> I've been working on new pgstattuple function to allow
>> block sampling [1] in order to reduce block reads while
>> scanning a table. A PoC patch is attached.
>
> Take a look at all of the messages linked in
> https://commitfest.postgresql.org/action/patch_view?id=778
>
> Jaime and I tried to do what you're working on then, including a random
> block sampling mechanism modeled on the stats_target mechanism. We
> didn't do that as part of pgstattuple though, which was a mistake.
>
> Noah created some test cases as part of his thorough review that were
> not computing the correct results. Getting the results correct for all
> of the various types of PostgreSQL tables and indexes ended up being
> much harder than the sampling part. See
> http://www.postgresql.org/message-id/20120222052747.GE8592@tornado.leadboat.com
> in particular for that.

Thanks for the info. I have read the previous discussion.

I'm looking forward to seeing more feedback on this approach,
in terms of design and performance improvement.
So, I have submitted this for the next CF.

>> This new function, pgstattuple2(), samples only 3,000 blocks
>> (which accounts 24MB) from the table randomly, and estimates
>> several parameters of the entire table.
>
> There should be an input parameter to the function for how much sampling
> to do, and if it's possible to make the scale for it to look like
> ANALYZE that's helpful too.
>
> I have a project for this summer that includes reviving this topic and
> making sure it works on some real-world systems. If you want to work on
> this too, I can easily combine that project into what you're doing.

Yeah, I'm interested in that. Something can be shared?

Regards,
--
Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Uptime Technologies, LLC. http://www.uptime.jp

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-09-14 07:22:42 Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Previous Message Amit Kapila 2013-09-14 07:03:23 Re: Review: Patch to compute Max LSN of Data Pages