Re: horo(r)logy test fail on solaris (again and solved)

Lists: pgsql-hackers
From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Match(dot)Grun(at)thomson(dot)com
Subject: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-26 15:31:48
Message-ID: 451947E4.2030504@sun.com

I ran the regression tests with Postgres Beta and the horology test failed. See
the attached log. The same failure was reported a few months ago - see
http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
I used Sun Studio 11 with the -fast flag on the SPARC platform.

I played a little with the cc flags, and the following flags work fine for me:

export CFLAGS="-fast"
export LDFLAGS="-lm -fast"

The -fast switch on the link line is very important too, because it links
the "fast" library.

Could anybody confirm that this works on their machine?

But the question is whether the "-fast" flag is good for postgres. The -fast
flag enables "brutal" floating point optimization, and some operations may
have less precision. Is it possible to verify that floating point operations
work correctly?

I read the postgres documentation on floating point datatypes, which says
that the implementation is platform specific and that the developer must take
care of the discrepancies. But are there other parts of the postgres code where
the "-fast" switch could produce computation defects, that is, places where the
result must be platform independent?

The cc flags are described at
http://docs.sun.com/source/819-3688/cc_ops.app.html.

Zdenek

Attachment Content-Type Size
regression.diffs text/plain 1.7 KB
regression.out text/plain 4.3 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-26 15:37:22
Message-ID: 26191.1159285042@sss.pgh.pa.us

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> But the question is whether the "-fast" flag is good for postgres. The -fast
> flag enables "brutal" floating point optimization, and some operations may
> have less precision. Is it possible to verify that floating point operations
> work correctly?

That's a pretty good way to guarantee that you'll break the datetime
code.

It might be acceptable if you use --enable-integer-datetimes.

regards, tom lane
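
The reason integer datetimes sidestep the problem can be sketched in a few lines. This is a minimal illustration, not PostgreSQL code: it assumes a time value is held either as floating point seconds or as integer microseconds, and the variable names are made up for the example. Accumulating ten steps of 0.1 second in binary floating point lands just below 1.0, so truncation loses a whole second, while the integer form stays exact.

```python
# Hypothetical sketch: the same elapsed time accumulated two ways.
# Names and the 0.1 s step are illustrative assumptions, not PostgreSQL internals.

float_seconds = 0.0
int_microseconds = 0
for _ in range(10):
    float_seconds += 0.1          # 0.1 has no exact binary representation
    int_microseconds += 100_000   # 0.1 s expressed as exact integer microseconds

# Truncating to whole seconds exposes the accumulated rounding error.
float_whole_seconds = int(float_seconds)
int_whole_seconds = int_microseconds // 1_000_000

print(repr(float_seconds), float_whole_seconds)
print(int_microseconds, int_whole_seconds)
```

Integer arithmetic gives the same answer regardless of how the compiler treats floating point, which is why the float-optimization flags matter less for that code path.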


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and
Date: 2006-09-26 15:48:10
Message-ID: 200609261548.k8QFmAf25622@momjian.us

Zdenek Kotala wrote:
> I ran the regression tests with Postgres Beta and the horology test failed. See
> the attached log. The same failure was reported a few months ago - see
> http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
> I used Sun Studio 11 with the -fast flag on the SPARC platform.

Are you looking for ways to contort Solaris to make PostgreSQL fail?
That doesn't prove much about PostgreSQL, but rather about Solaris.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-26 15:55:52
Message-ID: 45194D88.2080703@dunslane.net

Tom Lane wrote:
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
>
>> But the question is whether the "-fast" flag is good for postgres. The -fast
>> flag enables "brutal" floating point optimization, and some operations may
>> have less precision. Is it possible to verify that floating point operations
>> work correctly?
>>
>
> That's a pretty good way to guarantee that you'll break the datetime
> code.
>
>

! | @ 6 years | @ 5 years 12 mons 5 days 6 hours

Doesn't this look odd regardless of what bad results come back from the
FP library?

cheers

andrew


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-26 15:57:18
Message-ID: 45194DDE.7090800@sun.com

Tom Lane wrote:
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
>> But the question is whether the "-fast" flag is good for postgres. The -fast
>> flag enables "brutal" floating point optimization, and some operations may
>> have less precision. Is it possible to verify that floating point operations
>> work correctly?
>
> That's a pretty good way to guarantee that you'll break the datetime
> code.
>
> It might be acceptable if you use --enable-integer-datetimes.

I suggest removing the mention of the -fast flag from FAQ.Solaris, or adding a
warning about its use.

Josh, do you have any suggestions for cc flags?

regards, Zdenek


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-26 16:03:09
Message-ID: 45194F3D.3080107@sun.com

Bruce Momjian wrote:
> Zdenek Kotala wrote:
>> I ran the regression tests with Postgres Beta and the horology test failed. See
>> the attached log. The same failure was reported a few months ago - see
>> http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
>> I used Sun Studio 11 with the -fast flag on the SPARC platform.
>
> Are you looking for ways to contort Solaris to make PostgreSQL fail?
> That doesn't prove much about PostgreSQL, but rather about Solaris.
>

It is not about Solaris; it is about the recommended setting for Sun Studio
in FAQ.Solaris.

regards Zdenek


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-26 16:15:40
Message-ID: 26677.1159287340@sss.pgh.pa.us

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours

> Doesn't this look odd regardless of what bad results come back from the
> FP library?

It looks exactly like the sort of platform-dependent rounding issue that
Bruce and Michael Glaesemann spent a lot of time on recently. It might
be interesting to see if CVS HEAD works any better under these
conditions ... but if it doesn't, that doesn't mean I'll be interested
in fixing it. Getting the float datetime code to work is hard enough
without having a compiler that thinks it can take shortcuts.

regards, tom lane
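
The diff line quoted above ("@ 6 years" expected, "@ 5 years 12 mons 5 days 6 hours" obtained) has the shape one would get if a quotient landed just below 6 and was then truncated. The following is only a sketch under stated assumptions: a 365.25-day year, a 30-day month, and a hypothetical `decompose` helper. It is not PostgreSQL's interval code, and the one-ULP error is injected by hand with `math.nextafter` (Python 3.9+) to stand in for a slightly less precise division.

```python
import math

DAYS_PER_YEAR = 365.25   # assumed convention for this sketch
DAYS_PER_MONTH = 30      # assumed convention for this sketch


def decompose(total_days, quotient=None):
    """Split a span in days into (years, months, days, hours) by truncation.

    `quotient` lets the caller substitute a perturbed value for
    total_days / DAYS_PER_YEAR to mimic a less precise division.
    """
    if quotient is None:
        quotient = total_days / DAYS_PER_YEAR
    years = int(quotient)                        # truncate toward zero
    rest = total_days - years * DAYS_PER_YEAR    # days left after whole years
    months = int(rest // DAYS_PER_MONTH)
    rest -= months * DAYS_PER_MONTH
    days = int(rest)
    hours = round((rest - days) * 24)
    return years, months, days, hours


six_years = 6 * DAYS_PER_YEAR                    # 2191.5 days, exactly representable

exact = decompose(six_years)
# Same input, but the year quotient is one ULP below 6.0.
off_by_one_ulp = decompose(six_years, math.nextafter(6.0, 0.0))

print(exact)
print(off_by_one_ulp)
```

A single-ULP difference in one intermediate result is enough to change every field of the output, which is why code that truncates floating point values is so sensitive to compiler shortcuts.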


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-26 17:06:04
Message-ID: 45195DFC.3030907@agliodbs.com

Zdenek,

>> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
>>> But the question is whether the "-fast" flag is good for postgres. The
>>> -fast flag enables "brutal" floating point optimization, and some
>>> operations may have less precision. Is it possible to verify that
>>> floating point operations work correctly?
>>
>> That's a pretty good way to guarantee that you'll break the datetime
>> code.
>>
>> It might be acceptable if you use --enable-integer-datetimes.
>
> I suggest removing the mention of the -fast flag from FAQ.Solaris, or adding a
> warning about its use.
>
> Josh, do you have any suggestions for cc flags?

Using Sun Studio? I'm hardly the expert. Maybe Jignesh?

--Josh Berkus


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Zdenek Kotala" <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and
Date: 2006-09-26 17:18:33
Message-ID: C13EAEF9.2AAA%llonergan@greenplum.com

Tom,

On 9/26/06 9:15 AM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours
>
>> Doesn't this look odd regardless of what bad results come back from the
>> FP library?
>
> It looks exactly like the sort of platform-dependent rounding issue that
> Bruce and Michael Glaesemann spent a lot of time on recently. It might
> be interesting to see if CVS HEAD works any better under these
> conditions ... but if it doesn't, that doesn't mean I'll be interested
> in fixing it. Getting the float datetime code to work is hard enough
> without having a compiler that thinks it can take shortcuts.

How about fixing the compilation so that the routines in adt that are
sensitive to FP optimizations are isolated from aggressive optimization?

- Luke


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-27 00:11:06
Message-ID: 200609261711.08206.josh@agliodbs.com

Zdenek,

Hmmm ... we're not using the -fast option for the standard PostgreSQL
packages. Where did you start using it?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: josh(at)agliodbs(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-27 07:31:28
Message-ID: 451A28D0.9020106@sun.com

Josh Berkus wrote:
> Zdenek,
>
> Hmmm ... we're not using the -fast option for the standard PostgreSQL
> packages. Where did you start using it?

Yes, I know. The -fast option generates architecture-dependent code, so
it cannot be used in common packages. I came across this option when
I analyzed BUG #2651. I ran the regression tests and they failed. I found
that the same problem was described by Match Grun a few months ago, and the
-fast option is mentioned in FAQ.Solaris for performance tuning.

That is all.

regards Zdenek


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-27 13:40:09
Message-ID: 451A7F39.1000101@sun.com

Andrew Dunstan wrote:
>
>
> Tom Lane wrote:
>> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
>>
>>> But the question is whether the "-fast" flag is good for postgres. The
>>> -fast flag enables "brutal" floating point optimization, and some
>>> operations may have less precision. Is it possible to verify that
>>> floating point operations work correctly?
>>>
>>
>> That's a pretty good way to guarantee that you'll break the datetime
>> code.
>>
>>
>
> ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours
>
>
>
> Doesn't this look odd regardless of what bad results come back from the
> FP library?

The problem arose because the -fast option was set only for the
compiler and not for the linker, so the linker picked up the wrong version of
the libraries. If -fast is set for both, the horology test passes, but my
question was whether the floating point optimization could cause problems.

regards, Zdenek


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-27 13:51:37
Message-ID: 8453.1159365097@sss.pgh.pa.us

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> The problem arose because the -fast option was set only for the
> compiler and not for the linker, so the linker picked up the wrong version of
> the libraries. If -fast is set for both, the horology test passes, but my
> question was whether the floating point optimization could cause problems.

So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
LDFLAGS?

regards, tom lane


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-27 14:09:18
Message-ID: 451A860E.8080604@sun.com

Tom Lane wrote:
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
>> The problem arose because the -fast option was set only for the
>> compiler and not for the linker, so the linker picked up the wrong version of
>> the libraries. If -fast is set for both, the horology test passes, but my
>> question was whether the floating point optimization could cause problems.
>
> So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
> LDFLAGS?

Exactly, but I want to be sure that the floating point optimization is safe and
should be applied to postgres, because -fast breaks the IEEE 754 standard. If
it is OK I will adjust FAQ_Solaris.

Zdenek
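
One way to get a feel for what "breaking IEEE 754" can mean is to look at two properties that strict IEEE arithmetic has and that aggressive optimization modes commonly relax: the order of additions matters, and very small results underflow gradually to subnormal values rather than to zero. The sketch below is only an illustration with hypothetical helper names. Whether -fast changes either behaviour on a given Sun Studio release is an assumption to be checked against Sun's documentation, not something this code establishes.

```python
import sys


def reassociation_changes_result(a, b, c):
    """True if grouping the additions differently changes the double result."""
    return (a + b) + c != a + (b + c)


def gradual_underflow_preserved():
    """True if halving the smallest normal double yields a nonzero subnormal.

    A runtime that flushes subnormals to zero would return False here.
    """
    return sys.float_info.min / 2 > 0.0


# Under strict IEEE 754 double arithmetic both checks report True,
# so a compiler that regroups sums or flushes subnormals can change results.
print(reassociation_changes_result(0.1, 0.2, 0.3))
print(gradual_underflow_preserved())
```

Checks like these only show that such transformations are observable. They do not show which parts of a large code base depend on them, which is the harder question raised in this thread.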


From: Kenneth Marshall <ktm(at)it(dot)is(dot)rice(dot)edu>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and solved)
Date: 2006-09-27 14:26:08
Message-ID: 20060927142608.GN22081@it.is.rice.edu

On Wed, Sep 27, 2006 at 04:09:18PM +0200, Zdenek Kotala wrote:
> Tom Lane wrote:
> >Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> >>The problem arose because the -fast option was set only for the
> >>compiler and not for the linker, so the linker picked up the wrong version
> >>of the libraries. If -fast is set for both, the horology test passes, but my
> >>question was whether the floating point optimization could cause problems.
> >
> >So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
> >LDFLAGS?
>
> Exactly, but I want to be sure that the floating point optimization is safe and
> should be applied to postgres, because -fast breaks the IEEE 754 standard. If
> it is OK I will adjust FAQ_Solaris.
>
> Zdenek
>
Unless the packager understands the floating point usage of every
piece and module included and the effect that the -fast option will
have on them, please do not recommend it for anything but extremely
well tested dedicated use-cases. When it causes problems, it can
be terrible if the problems are not detected immediately. Massive
data corruption could occur.

Given these caveats, in a well tested use-case the -fast option can
squeeze a bit more from the CPU and could be used. I have had to
debug the fallout from the -fast option in other software in the
past. Let's just say, backups are a good thing.

I would vote not to recommend it without very strong cautions similar
to what Sun includes in the compiler manual pages.

Ken


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Kenneth Marshall <ktm(at)is(dot)rice(dot)edu>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Match(dot)Grun(at)thomson(dot)com
Subject: Re: horo(r)logy test fail on solaris (again and
Date: 2006-10-02 23:01:46
Message-ID: 200610022301.k92N1ke15798@momjian.us


Thanks for the analysis. I have removed mention of the -fast option
from the Solaris FAQ.

---------------------------------------------------------------------------

Kenneth Marshall wrote:
> On Wed, Sep 27, 2006 at 04:09:18PM +0200, Zdenek Kotala wrote:
> > Tom Lane wrote:
> > >Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> > >>The problem arose because the -fast option was set only for the
> > >>compiler and not for the linker, so the linker picked up the wrong version
> > >>of the libraries. If -fast is set for both, the horology test passes, but my
> > >>question was whether the floating point optimization could cause problems.
> > >
> > >So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
> > >LDFLAGS?
> >
> > Exactly, but I want to be sure that the floating point optimization is safe and
> > should be applied to postgres, because -fast breaks the IEEE 754 standard. If
> > it is OK I will adjust FAQ_Solaris.
> >
> > Zdenek
> >
> Unless the packager understands the floating point usage of every
> piece and module included and the effect that the -fast option will
> have on them, please do not recommend it for anything but extremely
> well tested dedicated use-cases. When it causes problems, it can
> be terrible if the problems are not detected immediately. Massive
> data corruption could occur.
>
> Given these caveats, in a well tested use-case the -fast option can
> squeeze a bit more from the CPU and could be used. I have had to
> debug the fallout from the -fast option in other software in the
> past. Let's just say, backups are a good thing.
>
> I would vote not to recommend it without very strong cautions similar
> to what Sun includes in the compiler manual pages.
>
> Ken

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +