Re: Intermittent regression test failure from index-only scans patch

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Intermittent regression test failure from index-only scans patch
Date: 2011-10-08 15:04:56
Message-ID: 10290.1318086296@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I notice that several members of the buildfarm have been consistently
showing the same regression test failure since the index-only scans
patch went in:

*** /home/pgbuildfarm/workdir/HEAD/pgsql.27150/src/test/regress/expected/stats.out Sat Oct 8 03:20:05 2011
--- /home/pgbuildfarm/workdir/HEAD/pgsql.27150/src/test/regress/results/stats.out Sat Oct 8 12:30:55 2011
***************
*** 94,100 ****
WHERE st.relname='tenk2' AND cl.relname='tenk2';
?column? | ?column? | ?column? | ?column?
----------+----------+----------+----------
! t | t | t | t
(1 row)

SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,
--- 94,100 ----
WHERE st.relname='tenk2' AND cl.relname='tenk2';
?column? | ?column? | ?column? | ?column?
----------+----------+----------+----------
! t | t | t | f
(1 row)

SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,

The diff indicates that the idx_scan count advanced but idx_tup_fetch
did not, which is not so surprising here because tenk2 hasn't been
modified in some time. If the autovacuum daemon managed to mark it
all-visible before the stats test runs, then an index-only scan will
happen, and bingo, no idx_tup_fetch increment (because indeed no heap
tuple was fetched).

I'm inclined to fix this by changing the test to examine idx_tup_read
not idx_tup_fetch. Alternatively, we could have the test force
enable_indexonlyscan off. Thoughts?

regards, tom lane


From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Intermittent regression test failure from index-only scans patch
Date: 2011-10-08 15:28:55
Message-ID: CAF6yO=3YzCCW4XTF0bvC9mkE+8Cj8+Uy76kA7vzini7Bh67PKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/10/8 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> I notice that several members of the buildfarm have been consistently
> showing the same regression test failure since the index-only scans
> patch went in:
>
> *** /home/pgbuildfarm/workdir/HEAD/pgsql.27150/src/test/regress/expected/stats.out      Sat Oct  8 03:20:05 2011
> --- /home/pgbuildfarm/workdir/HEAD/pgsql.27150/src/test/regress/results/stats.out       Sat Oct  8 12:30:55 2011
> ***************
> *** 94,100 ****
>   WHERE st.relname='tenk2' AND cl.relname='tenk2';
>   ?column? | ?column? | ?column? | ?column?
>  ----------+----------+----------+----------
> !  t        | t        | t        | t
>  (1 row)
>
>  SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,
> --- 94,100 ----
>   WHERE st.relname='tenk2' AND cl.relname='tenk2';
>   ?column? | ?column? | ?column? | ?column?
>  ----------+----------+----------+----------
> !  t        | t        | t        | f
>  (1 row)
>
>  SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,
>
>
> The diff indicates that the idx_scan count advanced but idx_tup_fetch
> did not, which is not so surprising here because tenk2 hasn't been
> modified in some time.  If the autovacuum daemon managed to mark it
> all-visible before the stats test runs, then an index-only scan will
> happen, and bingo, no idx_tup_fetch increment (because indeed no heap
> tuple was fetched).
>
> I'm inclined to fix this by changing the test to examine idx_tup_read
> not idx_tup_fetch.  Alternatively, we could have the test force
> enable_indexonlyscan off.  Thoughts?

No preferences, but is it interesting to add a "vacuum freeze"
somewhere and check expected result after index-only scan ? (for both
idx_tup_read and idx_tup_fetch)

>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Intermittent regression test failure from index-only scans patch
Date: 2011-10-08 15:57:37
Message-ID: 10942.1318089457@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

=?ISO-8859-1?Q?C=E9dric_Villemain?= <cedric(dot)villemain(dot)debian(at)gmail(dot)com> writes:
> 2011/10/8 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> The diff indicates that the idx_scan count advanced but idx_tup_fetch
>> did not, which is not so surprising here because tenk2 hasn't been
>> modified in some time. If the autovacuum daemon managed to mark it
>> all-visible before the stats test runs, then an index-only scan will
>> happen, and bingo, no idx_tup_fetch increment (because indeed no heap
>> tuple was fetched).
>>
>> I'm inclined to fix this by changing the test to examine idx_tup_read
>> not idx_tup_fetch. Alternatively, we could have the test force
>> enable_indexonlyscan off. Thoughts?

> No preferences, but is it interesting to add a "vacuum freeze"
> somewhere and check expected result after index-only scan ? (for both
> idx_tup_read and idx_tup_fetch)

This test is only trying to make sure that the stats collection
machinery is working. I don't think that we should try to coerce things
so that it can check something as context-sensitive as whether an
index-only scan happened. It's too fragile already --- we've seen
non-reproducible failures here many times before.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Intermittent regression test failure from index-only scans patch
Date: 2011-10-08 20:04:23
Message-ID: 60921B0F-2D8D-473C-90C5-56E3A6E0AC99@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Oct 8, 2011, at 11:04 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'm inclined to fix this by changing the test to examine idx_tup_read
> not idx_tup_fetch. Alternatively, we could have the test force
> enable_indexonlyscan off. Thoughts?

No preference.

Should we have another counter for heap fetches avoided? Seems like that could be useful to know.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Intermittent regression test failure from index-only scans patch
Date: 2011-10-09 04:34:26
Message-ID: 27341.1318134866@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Oct 8, 2011, at 11:04 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm inclined to fix this by changing the test to examine idx_tup_read
>> not idx_tup_fetch. Alternatively, we could have the test force
>> enable_indexonlyscan off. Thoughts?

> No preference.

I ended up doing it the second way (ie enable_indexonlyscan = off)
because it turns out that pg_stat_user_tables doesn't have the
idx_tup_read column --- we track that count per index, not per table.
I could have complicated the test's stats queries some more, but it
seemed quite not relevant to the goals of the test.

> Should we have another counter for heap fetches avoided? Seems like that could be useful to know.

Hm. I'm hesitant to add another per-table (or per index?) statistics
counter because of the resultant bloat in the stats file. But it
wouldn't be a bad idea for somebody to take two steps back and rethink
what we're counting in this area. The current counter definitions are
mostly backwards-compatible with pre-8.1 behavior, and it seems like the
goalposts have moved enough that maybe it's time to break compatibility.

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Intermittent regression test failure from index-only scans patch
Date: 2011-10-11 19:38:54
Message-ID: CABUevEzwwPjZHtOA+ToOOusXmosTxwU5D+Z5z5rqScgFkE0b4w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Oct 9, 2011 at 06:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Oct 8, 2011, at 11:04 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I'm inclined to fix this by changing the test to examine idx_tup_read
>>> not idx_tup_fetch.  Alternatively, we could have the test force
>>> enable_indexonlyscan off.  Thoughts?
>
>> No preference.
>
> I ended up doing it the second way (ie enable_indexonlyscan = off)
> because it turns out that pg_stat_user_tables doesn't have the
> idx_tup_read column --- we track that count per index, not per table.
> I could have complicated the test's stats queries some more, but it
> seemed quite not relevant to the goals of the test.
>
>> Should we have another counter for heap fetches avoided?  Seems like that could be useful to know.
>
> Hm.  I'm hesitant to add another per-table (or per index?) statistics
> counter because of the resultant bloat in the stats file.  But it
> wouldn't be a bad idea for somebody to take two steps back and rethink
> what we're counting in this area.  The current counter definitions are
> mostly backwards-compatible with pre-8.1 behavior, and it seems like the
> goalposts have moved enough that maybe it's time to break compatibility.

We certainly need *some* way to figure out if this has been used,
IMHO. So yeah, if the current way doesn't scale enough, we need to
think of some other way. But I'm not sure one more counter would
really bloat it that bad? OTOH, repeating that reasoning enough time
will eventually make it enough to care about...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Intermittent regression test failure from index-only scans patch
Date: 2011-10-11 19:46:28
Message-ID: 2760.1318362388@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Sun, Oct 9, 2011 at 06:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Should we have another counter for heap fetches avoided? Seems like that could be useful to know.

>> Hm. I'm hesitant to add another per-table (or per index?) statistics
>> counter because of the resultant bloat in the stats file.

> We certainly need *some* way to figure out if this has been used,
> IMHO. So yeah, if the current way doesn't scale enough, we need to
> think of some other way. But I'm not sure one more counter would
> really bloat it that bad? OTOH, repeating that reasoning enough time
> will eventually make it enough to care about...

You can already tell whether it's happening by comparing idx_tup_read
versus idx_tup_fetch. Now that measure does conflate some things, like
whether the tuple was not read at all or was read and rejected as not
visible, but I'm not at all convinced that another counter is worth its
weight. If invisible tuples are a significant part of the table then
index-only scanning isn't going to be very useful to you anyway.

regards, tom lane