Re: Buildfarm feature request: some way to track/classify failures

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Buildfarm feature request: some way to track/classify failures
Date: 2007-03-19 13:45:15
Message-ID: 45FE93EB.7030006@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:
> BTW, before I forget, this little project turned up a couple of
> small improvements for the current buildfarm infrastructure:
>
> 1. There are half a dozen entries with obviously bogus timestamps:
>
> bfarm=# select sysname,snapshot,branch from mfailures where snapshot < '2004-01-01';
>   sysname   |      snapshot       | branch
> ------------+---------------------+--------
>  corgi      | 1997-10-14 14:20:10 | HEAD
>  kookaburra | 1970-01-01 01:23:00 | HEAD
>  corgi      | 1997-09-30 11:47:08 | HEAD
>  corgi      | 1997-10-17 14:20:11 | HEAD
>  corgi      | 1997-12-21 15:20:11 | HEAD
>  corgi      | 1997-10-15 14:20:10 | HEAD
>  corgi      | 1997-09-28 11:47:09 | HEAD
>  corgi      | 1997-09-28 11:47:08 | HEAD
> (8 rows)
>
> indicating wrong system clock settings on these buildfarm machines.
> (Indeed, IIRC these failures were actually caused by the ridiculous
> clock settings --- we have at least one regression test that checks
> century >= 21 ...) Perhaps the buildfarm server should bounce
> reports with timestamps more than a day in the past or a few minutes in
> the future. I think though that a more useful answer would be to
> include "time of receipt of report" in the permanent record, and then
> subsequent analysis could make its own decisions about whether to
> believe the snapshot timestamp --- plus we could track elapsed times for
> builds, which could be interesting in itself.
>

We actually do timestamp the reports - I just didn't include that in the
extract. I will alter the view it's based on. We started doing this in
Nov 2005, so I'm going to restrict the view to cases where the
report_time is not null - I doubt we're interested in ancient history.

A revised extract is available at
http://www.pgbuildfarm.org/mfailures2.dump

We already reject snapshot times that are in the future.

Use of NTP is highly recommended to buildfarm members, but I'm reluctant
to make it mandatory, as they might not have it available. I think we
can do this: alter the client script to report its idea of current time
at the time it makes the web transaction. If it's off from the server
time by more than some small value (say 60 secs), adjust the snapshot
time accordingly. If they don't report it then we can reject insane
dates (more than 24 hours ago seems about right).
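
The proposal above could be sketched roughly as follows. This is only an illustration in Python, not the buildfarm server's code; the function name normalize_snapshot and the constants SKEW_TOLERANCE and MAX_SNAPSHOT_AGE are hypothetical, and the thresholds are simply the "60 secs" and "24 hours" figures suggested in the message.

```python
from datetime import datetime, timedelta
from typing import Optional

# Thresholds taken from the figures floated in the message; both are assumptions.
SKEW_TOLERANCE = timedelta(seconds=60)
MAX_SNAPSHOT_AGE = timedelta(hours=24)


def normalize_snapshot(
    snapshot: datetime,
    server_now: datetime,
    client_now: Optional[datetime] = None,
) -> Optional[datetime]:
    """Return the snapshot time to store, or None if the report should be rejected.

    Hypothetical helper: all three timestamps are assumed to be in the same
    time zone (for example, all UTC).
    """
    if client_now is not None:
        # The client told us what it thinks the time is, so we can measure
        # how far its clock is off and shift the snapshot by the same amount.
        skew = server_now - client_now
        if abs(skew) > SKEW_TOLERANCE:
            snapshot = snapshot + skew
    elif server_now - snapshot > MAX_SNAPSHOT_AGE:
        # No client clock reading: all we can do is reject insane dates.
        return None

    # Snapshot times in the future are rejected in every case.
    if snapshot > server_now:
        return None
    return snapshot
```

With a client clock stuck in 1997, a snapshot taken ten minutes before the upload would be stored as ten minutes before the server's time of receipt, while the same report without a client clock reading would be rejected.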

So I agree with both your suggestions ;-)

> 2. I was annoyed repeatedly that some buildfarm members weren't
> reporting log_archive_filenames entries, which forced going the long
> way round in the process I was using. Seems like we need some more
> proactive means for getting buildfarm owners to keep their script
> versions up-to-date. Not sure what that should look like exactly,
> as long as it's not "you can run an ancient version as long as you
> please".
>

Modern clients report the versions of the two scripts involved (see
script_version and web_script_version in reported config) so we could
easily enforce a minimum version on these.
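
A minimum-version check of this kind might look something like the sketch below. It assumes the reported versions are dotted integer strings and that the reported config is available as a plain dictionary; the minimum values shown are placeholders, and only the key names script_version and web_script_version come from the message.

```python
from typing import Dict, List, Tuple

# Placeholder minimums for illustration only; real values would be set by the
# buildfarm administrators.
MINIMUM_VERSIONS: Dict[str, str] = {
    "script_version": "1.0",
    "web_script_version": "1.0",
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.10' into (1, 10) so it compares numerically."""
    return tuple(int(part) for part in text.split("."))


def outdated_scripts(
    config: Dict[str, str],
    minimums: Dict[str, str] = MINIMUM_VERSIONS,
) -> List[str]:
    """List the scripts whose version is missing or older than the minimum."""
    stale = []
    for key, minimum in minimums.items():
        reported = config.get(key)
        if reported is None or parse_version(reported) < parse_version(minimum):
            stale.append(key)
    return stale
```

A report would be accepted only when outdated_scripts returns an empty list; a client too old to report its versions at all is treated as outdated.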

cheers

andrew
