Re: [PERFORM] encouraging index-only scans

From: Jim Nasby <jim(at)nasby(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] encouraging index-only scans
Date: 2013-09-18 20:28:53
Message-ID: 523A0D05.70504@nasby.net
Lists: pgsql-hackers pgsql-performance

On 9/17/13 6:10 PM, Andres Freund wrote:
>> What if we maintained XID stats for ranges of pages in a separate
>> fork? Call it the XidStats fork. Presumably the interesting pieces
>> would be min(xmin) and max(xmax) for pages that aren't all visible. If
>> we did that at a granularity of, say, 1MB worth of pages[1] we're
>> talking 8 bytes per MB, or 1 XidStats page per GB of heap. (Worst case
>> alignment bumps that up to 2 XidStats pages per GB of heap.)

> Yes, I have thought about similar ideas as well, but I came to the
> conclusion that it's not worth it. If you want to make the boundaries
> precise and the xidstats fork small, you're introducing new contention
> points because every DML will need to make sure it's correct.

Actually, that's not true... the XidStats only need to be "relatively" precise, i.e. within a few hundred or thousand XIDs.

So for example, you'd only need to attempt an update if the XID already stored was more than a few hundred/thousand/whatever XIDs away from your XID. If it's any closer, don't even bother to update.
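
A minimal sketch of that check, assuming 32-bit XIDs compared modulo 2^32. The names xid_distance and xidstats_should_update, and the slack parameter, are hypothetical and only here to illustrate the idea.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;   /* assumption: 32-bit XIDs that wrap around */

/* Distance between two XIDs on the 32-bit circle, whichever
 * direction is shorter. */
static uint32_t xid_distance(Xid a, Xid b)
{
    uint32_t d = (uint32_t)(a - b);              /* wraps modulo 2^32 */
    return d <= UINT32_MAX / 2 ? d : (uint32_t)(0u - d);
}

/* True only when the stored value is more than `slack` XIDs away
 * from ours; otherwise the caller skips the update entirely. */
bool xidstats_should_update(Xid stored, Xid mine, uint32_t slack)
{
    return xid_distance(stored, mine) > slack;
}
```

With a slack of 1000, a stored value of 1000 and an XID of 1500 is left alone, while an XID of 2500 triggers an update attempt. Whoever reads the stats would have to treat the stored values as off by up to the slack.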

That still leaves potential for thundering herd on the fork buffer lock if you've got a ton of DML on one table across a bunch of backends, but there might be other ways around that. For example, if you know you can update the XID with a CPU-atomic instruction, you don't need to lock the page.
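
As a rough sketch of the lock-free variant, here is what advancing a max(xmax) slot could look like with C11 atomics. This is not PostgreSQL code: the function name, the slack parameter, and the assumption that the slot is a naturally aligned 32-bit value are all mine.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Try to advance a shared max(xmax) slot to my_xid without taking a
 * lock. Returns true if this call stored my_xid, false if the slot
 * was already at or past my_xid, or within `slack` XIDs behind it.
 * Readers must therefore treat the slot as up to `slack` too low. */
bool xidstats_advance_max(_Atomic uint32_t *slot, uint32_t my_xid, uint32_t slack)
{
    uint32_t cur = atomic_load_explicit(slot, memory_order_relaxed);

    for (;;) {
        /* How far my_xid is ahead of cur, modulo 2^32. */
        uint32_t ahead = (uint32_t)(my_xid - cur);

        /* More than half the circle "ahead" really means behind. */
        if (ahead > UINT32_MAX / 2 || ahead <= slack)
            return false;

        /* On failure, cur is reloaded with the current slot value
         * and the checks above run again. */
        if (atomic_compare_exchange_weak(slot, &cur, my_xid))
            return true;
    }
}
```

Most callers return after a single load, so contention would be limited to the cases where the slot actually moves.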

> Also, the amount of code that would require seems to be bigger than
> justified by the increase of precision when to vacuum.

That's very possibly true. I haven't had a chance to see how much the VM bits help reduce vacuum overhead yet, so I don't have anything to add on this front. Perhaps others might.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
