Re: show Heap Fetches in EXPLAIN for index-only scans

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-13 15:21:22
Message-ID: CA+TgmoYwVhYkGJfNWz7DF3P5VPw-8qv97o8dwdhfpxf6O1tYfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I think that people who are using index-only scans are going to want
some way to find out the degree to which the scans are in fact
index-only.

So here's a 5-line patch that adds the number of heap fetches to the
EXPLAIN ANALYZE output. This might not be all the instrumentation
we'll ever want here, but I think we at least want this much.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
explain-heap-fetches.patch application/octet-stream 1.5 KB

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-13 15:29:28
Message-ID: CABUevEw8189hPjRoq70TVG0OzfjQwWoyit=h9XhOZFCxeuen9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 13, 2012 at 16:21, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think that people who are using index-only scans are going to want
> some way to find out the degree to which the scans are in fact
> index-only.
>
> So here's a 5-line patch that adds the number of heap fetches to the
> EXPLAIN ANALYZE output.  This might not be all the instrumentation
> we'll ever want here, but I think we at least want this much.

Agreed.

Would also be good to have counter sin pg_stat_* for this, since you'd
usually want to look at this kind of data over time as well. In your
plans? ;)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-13 15:31:19
Message-ID: CAEYLb_Uy1dnSWA9g+SH7CAzN5g=Mx63WzL9pS50iGh1S+AjG6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13 January 2012 15:21, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think that people who are using index-only scans are going to want
> some way to find out the degree to which the scans are in fact
> index-only.
>
> So here's a 5-line patch that adds the number of heap fetches to the
> EXPLAIN ANALYZE output.  This might not be all the instrumentation
> we'll ever want here, but I think we at least want this much.

Good idea. The fact that that information wasn't available did bother me.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-13 15:31:42
Message-ID: CA+TgmobfkaFay-4cd-d6AqJ1rEtNFAmBhky7KT54d8xWe3gyVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 13, 2012 at 10:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Fri, Jan 13, 2012 at 16:21, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I think that people who are using index-only scans are going to want
>> some way to find out the degree to which the scans are in fact
>> index-only.
>>
>> So here's a 5-line patch that adds the number of heap fetches to the
>> EXPLAIN ANALYZE output.  This might not be all the instrumentation
>> we'll ever want here, but I think we at least want this much.
>
> Agreed.
>
> Would also be good to have counter sin pg_stat_* for this, since you'd
> usually want to look at this kind of data over time as well. In your
> plans? ;)

Not really. I don't have a clear enough idea about what that should
look like, and I expect a vigorous debate over the distributed cost of
another counter. But I'm happy to have someone who feels more
strongly about it than I do take up the cause.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-13 15:41:45
Message-ID: CAEYLb_VgER4ED9SNOJDHvtYZwU6EsO-mXV6xrx4fss6XN4MnbA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13 January 2012 15:31, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jan 13, 2012 at 10:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> Would also be good to have counter sin pg_stat_* for this, since you'd
>> usually want to look at this kind of data over time as well. In your
>> plans? ;)
>
> Not really.  I don't have a clear enough idea about what that should
> look like, and I expect a vigorous debate over the distributed cost of
> another counter.  But I'm happy to have someone who feels more
> strongly about it than I do take up the cause.

Maybe the the ioss_HeapFetches field should be a uint32? That's all
it's going to be on some platforms anyway.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-13 16:08:16
Message-ID: CA+TgmoYxhDn4RDXkPAEWG6Diy7jB2D0-i+cP41WGhRyew5cTCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 13, 2012 at 10:41 AM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> On 13 January 2012 15:31, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Fri, Jan 13, 2012 at 10:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>> Would also be good to have counter sin pg_stat_* for this, since you'd
>>> usually want to look at this kind of data over time as well. In your
>>> plans? ;)
>>
>> Not really.  I don't have a clear enough idea about what that should
>> look like, and I expect a vigorous debate over the distributed cost of
>> another counter.  But I'm happy to have someone who feels more
>> strongly about it than I do take up the cause.
>
> Maybe the the ioss_HeapFetches field should be a uint32? That's all
> it's going to be on some platforms anyway.

I originally had it that way, but then it didn't match what
ExplainPropertyLong() was looking for. I didn't think it was worth
further complicating explain.c for this...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-13 16:24:29
Message-ID: 22897.1326471869@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> So here's a 5-line patch that adds the number of heap fetches to the
> EXPLAIN ANALYZE output. This might not be all the instrumentation
> we'll ever want here, but I think we at least want this much.

Cosmetic gripes:

1. Please initialize the counter in ExecInitIndexOnlyScan. We don't
generally rely on node fields to init as zeroes.

2. Project style is to use foo++, not ++foo, in contexts where
it doesn't actually matter which is used.

regards, tom lane


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-22 02:50:31
Message-ID: CAMkU=1xYPd_i8G9AAMgC-SY5T1Ofg0AkHTn9n9ZUM1u1j51jUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 13, 2012 at 7:21 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think that people who are using index-only scans are going to want
> some way to find out the degree to which the scans are in fact
> index-only.
>
> So here's a 5-line patch that adds the number of heap fetches to the
> EXPLAIN ANALYZE output.  This might not be all the instrumentation
> we'll ever want here, but I think we at least want this much.

A review:

It is not in context diff format.

It applies, builds, and runs cleanly, including under --enable-cassert

No docs, but the output format of other EXPLAIN ANALYZE fields
generally aren't documented either. But it is fairly self-explanatory
(provided one knows what an index-only scan is in the first place).

No regression tests, but that too seems appropriate.

It does what it says. We (or I, anyway) want what it does. As
mentioned, we might want counts also exposed via the stats collector,
but that is another matter.

I don't see how it could possibly cause a meaningful slow-down, and
tests on all-memory index-only scans cannot detect a slow down.

On my machine, using EXPLAIN ANALYZE causes a 16 fold slow down on an
in memory -i -s100 "select count(*) from pgbench_accounts" conducted
after a run of pgbench -T30, so that there are some heap fetches
needed. Compared to that ANALYZE slowdown, any additional slow down
from this patch is ridiculously slow. It would be nice if you could
get the output of this patch (and of the BUFFERS option to EXPLAIN)
without incurring the timing overhead of ANALYZE. But again, that is
not the subject of this patch.

I wondered about the type of ioss_HeapFetches. Other members of
execnodes.h structs tend to be int64, not long. But those others are
not reported in EXPLAIN. Other things reported in explain.c tend to
be long. This seems to be a foot in both worlds, and your response to
Peter is convincing.

I agree with Tom on the pre increment versus post increment of ioss_HeapFetches.

I also wondered if ioss_HeapFetches ought to be initialized to zero.
I looked for analogous code but didn't find enough analogy to convince
me one way or the other.

All the other members of IndexOnlyScanState have entries in the big
comment block preceding the typedef, so I would expect
ioss_HeapFetches should have one as well.

These are all minor issues, and since you are a committer I'm going to
mark it ready for committer.

As a side-note, I noticed that I needed to run vacuum twice in a row
to get the Heap Fetches to drop to zero. I vaguely recall that only
one vacuum was needed when ios first went in (and I had instrumented
it to elog heap-fetches). Does anyone know if this the expected
consequence of one of the recent changes we made to vacuum?

Thanks,

Jeff


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-26 01:42:05
Message-ID: CA+TgmoYt1bD8H31MMAGy8NkEhkea2HFwe4TE4NvCV1R-srs1Rg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Jan 21, 2012 at 9:50 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> A review:
>
> [ review ]

Thanks. Committed with hopefully-appropriate revisions.

> As a side-note, I noticed that I needed to run vacuum twice in a row
> to get the Heap Fetches to drop to zero.  I vaguely recall that only
> one vacuum was needed when ios first went in (and I had instrumented
> it to elog heap-fetches).  Does anyone know if this the expected
> consequence of one of the recent changes we made to vacuum?

No, that's not expected. The change we made to vacuum was to skip
pages that are busy - but it shouldn't be randomly skipping pages for
no reason. I can reproduce what you're observing, though:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)

After updating a row in the table and checkpointing, the page the rows
was on is marked full and the page that gets the new version becomes
not-all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
TLI: 0x0001 Prune XID: 0x000003fb Flags: 0x0003 (HAS_FREE_LINES|PAGE_FULL)
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0001 (HAS_FREE_LINES)

Now I vacuum the relation and checkpoint, and the page the *new*
relation is on becomes all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0001 (HAS_FREE_LINES)
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)

Now I vacuum it again and checkpoint, and now the old page also
becomes all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)

But it seems to me that this is expected (if non-optimal) behavior.
Only the first pass of vacuum knows how to mark pages all-visible.
After the update, the first pass of the first vacuum sees a dead tuple
on the old page and truncates it to a dead line pointer. When it
comes to the new page, it observes that the page is now all-visible
and marks it so. It then does index vacuuming and returns to the
first page, marking the dead line pointer unused. But during this
second visit to the old page, there's no possibility of marking the
page as all-visible, because the code doesn't know how to do that.
The next vacuum's first pass, however, can do so, because there are no
longer any dead tuples on the page.

We could fix this by modifying lazy_vacuum_page(): since we have to
dirty the buffer anyway, we could recheck whether all the remaining
tuples on the page are now all-visible, and if so set the visibility
map bit. This is probably desirable even apart from index-only scans,
because it will frequently save the next vacuum the cost of reading,
dirtying, and writing extra pages. There will be some incremental CPU
cost, but that seems likely to be more than repaid by the I/O savings.

Thoughts? Should we do this at all? If so, should we squeeze it into
9.2 or leave it for 9.3?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-02-02 23:51:28
Message-ID: 20120202235128.GC21463@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 25, 2012 at 08:42:05PM -0500, Robert Haas wrote:
> Only the first pass of vacuum knows how to mark pages all-visible.
> After the update, the first pass of the first vacuum sees a dead tuple
> on the old page and truncates it to a dead line pointer. When it
> comes to the new page, it observes that the page is now all-visible
> and marks it so. It then does index vacuuming and returns to the
> first page, marking the dead line pointer unused. But during this
> second visit to the old page, there's no possibility of marking the
> page as all-visible, because the code doesn't know how to do that.
> The next vacuum's first pass, however, can do so, because there are no
> longer any dead tuples on the page.
>
> We could fix this by modifying lazy_vacuum_page(): since we have to
> dirty the buffer anyway, we could recheck whether all the remaining
> tuples on the page are now all-visible, and if so set the visibility
> map bit. This is probably desirable even apart from index-only scans,
> because it will frequently save the next vacuum the cost of reading,
> dirtying, and writing extra pages. There will be some incremental CPU
> cost, but that seems likely to be more than repaid by the I/O savings.
>
> Thoughts? Should we do this at all? If so, should we squeeze it into
> 9.2 or leave it for 9.3?

Sounds like a good idea. It has bothered me that two consecutive VACUUMs of a
table, with no intervening activity, can dirty a page twice. Making that less
frequent is a good thing. I'd hold the change for 9.3, but that's probably an
unusually-conservative viewpoint.