Re: Tracing down buildfarm "postmaster does not shut down" failures

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Tracing down buildfarm "postmaster does not shut down" failures
Date: 2016-02-09 16:03:29
Message-ID: 56BA0DD1.8030901@dunslane.net
Lists: pgsql-hackers

On 02/08/2016 10:55 PM, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
>> On Mon, Feb 08, 2016 at 02:15:48PM -0500, Tom Lane wrote:
>>> We've seen variants
>>> on this theme on half a dozen machines just in the past week --- and it
>>> seems to mostly happen in 9.5 and HEAD, which is fishy.
>> It has been affecting only the four AIX animals, which do share hardware.
>> (Back in 2015 and once in 2016-01, it did affect axolotl and shearwater.)
> Certainly your AIX critters have shown this a bunch, but here's another
> current example:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2016-02-08%2014%3A49%3A23
>
>> That's reasonable. If you would like higher-fidelity data, I can run loops of
>> "pg_ctl -w start; make installcheck; pg_ctl -t900 -w stop", and I could run
>> that for HEAD and 9.2 simultaneously. A day of logs from that should show
>> clearly if HEAD is systematically worse than 9.2.
> That sounds like a fine plan, please do it.
>
>> So, I wish to raise the timeout for those animals. Using an environment
>> variable was a good idea; it's one less thing for test authors to remember.
>> Since the variable affects a performance-related fudge factor rather than
>> change behavior per se, I'm less skittish than usual about unintended
>> consequences of dynamic scope. (With said unintended consequences in mind, I
>> made "pg_ctl register" ignore PGCTLTIMEOUT rather than embed its value into
>> the service created.)
> While this isn't necessarily a bad idea in isolation, the current
> buildfarm scripts explicitly specify a -t value to pg_ctl stop, which
> I would not expect an environment variable to override. So we need
> to fix the buildfarm script to allow the timeout to be configurable.
> I'm not sure if there would be any value-add in having pg_ctl answer
> to an environment variable once we've done that.

The failure on axolotl was for the ECPG tests, which don't use the
buildfarm's startup/stop db code. They don't honour TEMP_CONFIG either,
which they probably should - then we might get better log traces.
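
To make the precedence Tom describes above concrete: an explicit -t on the
pg_ctl command line should win over an environment variable, which in turn
should win over the built-in default. Below is a minimal Python sketch of
that rule, not the buildfarm's actual code. PGCTLTIMEOUT is the variable name
from Noah's patch; the function names and the 60-second default are
assumptions made for illustration.

```python
import os

# Assumed fallback when nothing else is specified, in seconds.
DEFAULT_TIMEOUT = 60


def effective_timeout(cli_timeout=None, environ=None):
    """Pick the wait timeout: explicit -t value, then PGCTLTIMEOUT, then default.

    Both function names here are hypothetical helpers for this sketch.
    """
    if environ is None:
        environ = os.environ
    if cli_timeout is not None:
        # An explicit -t value always wins over the environment.
        return int(cli_timeout)
    env_value = environ.get("PGCTLTIMEOUT")
    if env_value:
        return int(env_value)
    return DEFAULT_TIMEOUT


def stop_command(cli_timeout=None, environ=None):
    """Build the argument list for a waiting 'pg_ctl stop' call."""
    timeout = effective_timeout(cli_timeout, environ)
    return ["pg_ctl", "-t", str(timeout), "-w", "stop"]
```

With this rule, a script that hard-codes -t never sees the environment
setting, which is why the script itself would need a configurable timeout.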

cheers

andrew
