Occasional failures on buildfarm member eukaryote

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mark Wong" <markwkm(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Occasional failures on buildfarm member eukaryote
Date: 2009-07-31 15:25:01
Message-ID: 20110.1249053901@sss.pgh.pa.us

I've noticed that every so often eukaryote reports a regression failure
with just this one diff:

*** /data/markwkm/local/pgfarmbuild-cell/HEAD/pgsql.4654/src/test/regress/expected/plpgsql.out Fri Jul 31 04:00:51 2009
--- /data/markwkm/local/pgfarmbuild-cell/HEAD/pgsql.4654/src/test/regress/results/plpgsql.out Fri Jul 31 04:22:38 2009
***************
*** 2041,2046 ****
--- 2041,2047 ----
  (1 row)

  reset statement_timeout;
+ ERROR: canceling statement due to statement timeout
  select * from foo;
  f1
  ----

The latest example is at
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=eukaryote&dt=2009-07-31%2004:00:02
but there are quite a few earlier cases. It's repeatable enough that
I think we should try to fix it, if only to reduce noise in the
buildfarm reports.

One possibility is that eukaryote is just remarkably heavily loaded
and increasing the statement_timeout value being used in this test
(currently 2 seconds) would make the failure go away. I'm not sure
I believe this theory though, because there are lots of fairly slow
machines in the buildfarm, but eukaryote seems to be the only one
that's showing this type of failure.
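
To make the load theory concrete, here is a minimal sketch of the kind of
race it describes. It is not PostgreSQL's code: it assumes a one-shot
interval timer that delivers SIGALRM, and every name in it
(run_with_timeout, StatementTimeout, the sleep standing in for a
statement) is hypothetical. It needs a Unix-like system and must run in
the main thread.

```python
import signal
import time


class StatementTimeout(Exception):
    """Raised when the timer fires while the stand-in statement is running."""


def _on_alarm(signum, frame):
    # Hypothetical handler: turn the timer signal into a cancel request.
    raise StatementTimeout("canceling statement due to statement timeout")


def run_with_timeout(work_seconds, timeout_seconds):
    """Run a stand-in statement (a sleep) under a one-shot interval timer.

    Returns "ok" if the work finishes before the timer fires,
    and "timeout" if the timer fires first.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    try:
        # Arm the timer, roughly analogous to setting a statement timeout.
        signal.setitimer(signal.ITIMER_REAL, timeout_seconds)
        try:
            time.sleep(work_seconds)  # a loaded machine stretches this step
            return "ok"
        except StatementTimeout:
            return "timeout"
        finally:
            # Disarm the timer, roughly analogous to resetting the timeout.
            signal.setitimer(signal.ITIMER_REAL, 0)
    finally:
        # Restore whatever SIGALRM handler was installed before.
        signal.signal(signal.SIGALRM, previous)
```

With a comfortable margin the outcome is stable in either direction. As
the work time approaches the timeout, the result depends on scheduling,
which is the behavior a heavily loaded machine would be expected to show.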

If it's not that, I think this must indicate some weird
platform-specific issue with timeout handling; but I can't think what,
since it seems to be running stock Fedora 8.

Thoughts? Can you make this machine available for closer examination?

regards, tom lane