uh-oh, dugong failing again (was Re: Pgbuildfarm-status-green Digest, Vol 28, Issue 24)

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org, math(at)sai(dot)msu(dot)ru
Subject: uh-oh, dugong failing again (was Re: Pgbuildfarm-status-green Digest, Vol 28, Issue 24)
Date: 2007-09-26 15:04:52
Message-ID: 21277.1190819092@sss.pgh.pa.us

> The PGBuildfarm member dugong had the following event on branch HEAD:
> Status changed from OK to ContribCheck failure
> The snapshot timestamp for the build that triggered this notification is: 2007-09-25 20:05:01

This seems to be exactly what we saw two weeks ago, and I just noticed
that in the JIT bgwriter patch, I put an Assert into ForwardFsyncRequest
in exactly the place where one was removed to make icc happy two weeks
ago. This one is less cosmetic and so I'm not as willing to just take
it out. I think we need to look closer. Can we confirm that
ForwardFsyncRequest somehow becomes a no-op when icc compiles it with an
Assert right there?

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>, <math(at)sai(dot)msu(dot)ru>
Subject: Re: uh-oh, dugong failing again
Date: 2007-10-04 14:53:48
Message-ID: 878x6jylqb.fsf@oxford.xeocode.com

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>> The PGBuildfarm member dugong had the following event on branch HEAD:
>> Status changed from OK to ContribCheck failure
>> The snapshot timestamp for the build that triggered this notification is: 2007-09-25 20:05:01
>
> This seems to be exactly what we saw two weeks ago, and I just noticed
> that in the JIT bgwriter patch, I put an Assert into ForwardFsyncRequest
> in exactly the place where one was removed to make icc happy two weeks
> ago. This one is less cosmetic and so I'm not as willing to just take
> it out. I think we need to look closer. Can we confirm that
> ForwardFsyncRequest somehow becomes a no-op when icc compiles it with an
> Assert right there?

It seems to work with icc on my 32-bit Intel CPU. Earlier you speculated that
the struct might be getting padded out which would cause hash failures. But
surely using a different padding from other compilers would be a compiler bug
since it would be an incompatible ABI change. I find it hard to believe
Intel's compiler would get the ia64 ABI wrong. And hard to believe nobody's
noticed an incompatible ABI from gcc-generated binaries.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
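
The padding concern mentioned above can be illustrated with a minimal C sketch.
Everything in it is an assumption made for illustration: DemoKey is a
hypothetical key type, not the struct PostgreSQL actually hashes, and
hash_bytes is a plain FNV-1a loop standing in for any hash function that reads
the raw bytes of a struct, padding included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key type; NOT the struct PostgreSQL actually hashes. */
typedef struct DemoKey
{
    uint32_t relid;             /* 4 bytes */
    uint16_t segno;             /* 2 bytes; most ABIs add 2 bytes of tail padding */
} DemoKey;

/* FNV-1a over raw bytes: a stand-in for any hash that reads the whole struct. */
uint32_t
hash_bytes(const void *p, size_t len)
{
    const unsigned char *b = (const unsigned char *) p;
    uint32_t    h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= b[i];
        h *= 16777619u;
    }
    return h;
}

/* Bytes of the struct that belong to no named field (ABI-dependent). */
size_t
padding_bytes(void)
{
    return sizeof(DemoKey) - sizeof(uint32_t) - sizeof(uint16_t);
}

/* Zero the whole struct first so padding holds known bytes, then set fields. */
void
make_key(DemoKey *k, uint32_t relid, uint16_t segno)
{
    assert(k != NULL);
    memset(k, 0, sizeof(*k));
    k->relid = relid;
    k->segno = segno;
}
```

If the memset in make_key is left out, the padding bytes are indeterminate, so
two keys with identical fields may hash differently under hash_bytes. A
compiler that chose a different padding than expected would change how many
such bytes exist, which is the scenario being doubted above.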


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, math(at)sai(dot)msu(dot)ru
Subject: Re: uh-oh, dugong failing again
Date: 2007-10-04 17:47:28
Message-ID: 25195.1191520048@sss.pgh.pa.us

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> This seems to be exactly what we saw two weeks ago, and I just noticed
>> that in the JIT bgwriter patch, I put an Assert into ForwardFsyncRequest
>> in exactly the place where one was removed to make icc happy two weeks
>> ago. This one is less cosmetic and so I'm not as willing to just take
>> it out. I think we need to look closer. Can we confirm that
>> ForwardFsyncRequest somehow becomes a no-op when icc compiles it with an
>> Assert right there?

> It seems to work with icc on my 32-bit Intel CPU. Earlier you speculated that
> the struct might be getting padded out which would cause hash failures. But
> surely using a different padding from other compilers would be a compiler bug
> since it would be an incompatible ABI change. I find it hard to believe
> Intel's compiler would get the ia64 ABI wrong. And hard to believe nobody's
> noticed an incompatible ABI from gcc-generated binaries.

Well, I changed the Assert() to an explicit if-test-and-elog, and the
failure seems to have gone away. So I'd say that makes it absolutely
certainly an icc bug. Not clear what difference icc sees between an
enabled Assert and an if/elog, but evidently there is one.

regards, tom lane
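
The change described above, replacing an assertion with an explicit test plus
an error report, can be sketched in standalone C. This is only an illustration
under stated assumptions: DEMO_ASSERT, forward_request_assert,
forward_request_checked, the in_writer condition, and the queue are all
hypothetical, fprintf stands in for elog, and none of this is the actual
ForwardFsyncRequest code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Set to 0 to compile the assertion away, as a non-assert build would. */
#define DEMO_ASSERT_CHECKING 1

#if DEMO_ASSERT_CHECKING
#define DEMO_ASSERT(cond) assert(cond)
#else
#define DEMO_ASSERT(cond) ((void) 0)
#endif

#define QUEUE_MAX 4

/* Hypothetical request queue shared by both variants. */
int         request_queue[QUEUE_MAX];
int         queue_len = 0;

/* Variant A: the precondition is an assertion; a violation aborts. */
bool
forward_request_assert(bool in_writer, int req)
{
    DEMO_ASSERT(!in_writer);
    if (queue_len >= QUEUE_MAX)
        return false;           /* queue full, caller must handle it */
    request_queue[queue_len++] = req;
    return true;
}

/* Variant B: the same precondition as an ordinary runtime test. */
bool
forward_request_checked(bool in_writer, int req)
{
    if (in_writer)
    {
        /* stand-in for an elog(ERROR, ...) call; here we just report and bail */
        fprintf(stderr, "ERROR: request must not be forwarded from the writer\n");
        return false;
    }
    if (queue_len >= QUEUE_MAX)
        return false;           /* queue full, caller must handle it */
    request_queue[queue_len++] = req;
    return true;
}
```

When the precondition holds, both variants should behave identically, which is
why a difference in generated code between them points at the compiler rather
than at the logic. One caveat of the sketch: a real elog at ERROR level does
not return to the caller, whereas Variant B simply returns false.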