Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX

Lists: pgsql-hackers
From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-06 02:32:19
Message-ID: CAB7nPqSoQ8ExG2sKU47ZqoNuAyved7Dr83K9EF76W5SaLPkSTg@mail.gmail.com

Hi,

This morning while running make check-world on my OSX Mavericks laptop, I
found the following failure:
test pgtypeslib/dt_test2 ... stderr FAILED (test process was terminated by signal 6: Abort trap)

(lldb) bt
* thread #1: tid = 0x0000, 0x00007fff8052c866 libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff8052c866 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff83cb035c libsystem_pthread.dylib`pthread_kill + 92
    frame #2: 0x00007fff81899bba libsystem_c.dylib`__abort + 145
    frame #3: 0x00007fff8189a46d libsystem_c.dylib`__stack_chk_fail + 196
    frame #4: 0x000000010f7cb3bb libpgtypes.3.dylib`PGTYPESdate_from_asc(str=0x000000010f6a2d6c, endptr=0x00007fff5055e488) + 635 at datetime.c:104
    frame #5: 0x000000010f6a260f dt_test2`main + 255 at dt_test2.pgc:91
    frame #6: 0x00007fff87acc5fd libdyld.dylib`start + 1
    frame #7: 0x00007fff87acc5fd libdyld.dylib`start + 1

Bisecting shows that this failure was introduced by 4318dae, and it is
reproducible on all the active branches, down to REL9_0_STABLE.

Note that this problem has been introduced after discussing a separate
issue here:
http://www.postgresql.org/message-id/1399399313.27807.28.camel@sussancws0025
Regards,
--
Michael


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-06 03:21:09
Message-ID: 1657.1412565669@sss.pgh.pa.us

Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> This morning while running make check-world on my OSX Mavericks laptop, I
> found the following failure:

[ scratches head... ] Doesn't reproduce on my OSX Mavericks laptop,
either with or without --disable-integer-datetimes. What compiler
are you using exactly? Any special build options?

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-06 04:07:11
Message-ID: CAB7nPqREioLT511h=8uBm7rVPEF++cx073FLJxCPMoVdTVdJiQ@mail.gmail.com

On Mon, Oct 6, 2014 at 12:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > This morning while running make check-world on my OSX Mavericks laptop, I
> > found the following failure:
>
> [ scratches head... ] Doesn't reproduce on my OSX Mavericks laptop,
> either with or without --disable-integer-datetimes.
> What compiler are you using exactly?
>
clang from developer tools 6.0 of September 2014, even if configure points
to "gcc" in /usr/bin/:
$ which gcc
/usr/bin/gcc
$ gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr
--with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.51) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix
$ clang --version
Apple LLVM version 6.0 (clang-600.0.51) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix

> Any special build options?
>
Nothing really fancy:
$ ./configure --enable-depend --enable-debug --disable-rpath
--enable-cassert --prefix=/to/path/bin/pgsql --with-libxml
CFLAGS=
I am attaching config.log in case. Btw that's 10.9.5, and I have been able
to reproduce it on a second machine running 10.9.5 as well.
Regards,
--
Michael

Attachment      Content-Type          Size
config.tar.gz   application/x-gzip    22.3 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-06 04:15:03
Message-ID: 2807.1412568903@sss.pgh.pa.us

Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> On Mon, Oct 6, 2014 at 12:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> [ scratches head... ] Doesn't reproduce on my OSX Mavericks laptop,
>> either with or without --disable-integer-datetimes.
>> What compiler are you using exactly?

> clang from developer tools 6.0 of September 2014, even if configure points
> to "gcc" in /usr/bin/:
> $ which gcc
> /usr/bin/gcc
> $ gcc --version
> Configured with: --prefix=/Library/Developer/CommandLineTools/usr
> --with-gxx-include-dir=/usr/include/c++/4.2.1
> Apple LLVM version 6.0 (clang-600.0.51) (based on LLVM 3.5svn)
> Target: x86_64-apple-darwin13.4.0
> Thread model: posix

Exact same here, so that's not it. (I think ... my Xcode says it's 6.0.1,
but the compiler --version report is just the same as you show.)

>> Any special build options?

> Nothing really fancy:
> $ ./configure --enable-depend --enable-debug --disable-rpath
> --enable-cassert --prefix=/to/path/bin/pgsql --with-libxml

That looks about like mine too, though I'm not using --disable-rpath
... what's the reason for that?

> I am attaching config.log in case. Btw that's 10.9.5, and I have been able
> to reproduce it on a second machine running 10.9.5 as well.

10.9.5 here as well. We're running out of explanations ...

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-06 05:07:22
Message-ID: CAB7nPqRGnSsu29Mqzzw82c_2Om7TSscAqt0uTgN_XOx=L-MYOQ@mail.gmail.com

On Mon, Oct 6, 2014 at 1:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > Nothing really fancy:
> > $ ./configure --enable-depend --enable-debug --disable-rpath
> > --enable-cassert --prefix=/to/path/bin/pgsql --with-libxml
>
> That looks about like mine too, though I'm not using --disable-rpath
> ... what's the reason for that?
>
No real reason. That was only some old remnant in a build script that was
here for ages :)
--
Michael


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-06 13:45:15
Message-ID: 29057.1412603115@sss.pgh.pa.us

Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> On Mon, Oct 6, 2014 at 1:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> That looks about like mine too, though I'm not using --disable-rpath
>> ... what's the reason for that?

> No real reason. That was only some old remnant in a build script that was
> here for ages :)

Hm. Grasping at straws here ... what's your locale environment?

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-06 23:14:40
Message-ID: CAB7nPqQJqz=YTcBC68rAaQ-pYxqUPiQQ9E6ZGqcfJWpJgLtKmw@mail.gmail.com

On Mon, Oct 6, 2014 at 10:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > On Mon, Oct 6, 2014 at 1:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> That looks about like mine too, though I'm not using --disable-rpath
> >> ... what's the reason for that?
>
> > No real reason. That was only some old remnant in a build script that was
> > here for ages :)
>
> Hm. Grasping at straws here ... what's your locale environment?
>

The system locales have nothing really special...
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
But now that you mention it, I also have this:
$ defaults read -g AppleLocale
en_JP
--
Michael


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-06 23:50:23
Message-ID: CAB7nPqQt=Gj-mMZfDbJbavJv0QDsXiuZHExi6CHa4=6x9L+0Xg@mail.gmail.com

On Tue, Oct 7, 2014 at 8:14 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> The system locales have nothing really special...
> $ locale
> LANG=
> LC_COLLATE="C"
> LC_CTYPE="UTF-8"
> LC_MESSAGES="C"
> LC_MONETARY="C"
> LC_NUMERIC="C"
> LC_TIME="C"
> LC_ALL=
> But now that you mention it, I also have this:
> $ defaults read -g AppleLocale
> en_JP
>
Hm... I have tried changing the system locales (to en_US for example) and
time format but I can still trigger the issue all the time. I'll try to
have a closer look.. It looks like this test does not like some settings at
the OS level.
--
Michael


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-07 00:57:54
Message-ID: 29432.1412643474@sss.pgh.pa.us

Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> Hm... I have tried changing the system locales (to en_US for example) and
> time format but I can still trigger the issue all the time. I'll try to
> have a closer look.. It looks like this test does not like some settings at
> the OS level.

I eventually realized that the critical difference was you'd added
"CFLAGS=" to the configure call. On this platform that has the net
effect of removing -O2 from the compiler flags, and apparently that
shifts around the stack layout enough to expose the clobber.

The fix is simple enough: ecpg's version of ParseDateTime is failing
to check for overrun of the field[] array until *after* it's already
clobbered the stack:

*** a/src/interfaces/ecpg/pgtypeslib/dt_common.c
--- b/src/interfaces/ecpg/pgtypeslib/dt_common.c
*************** ParseDateTime(char *timestr, char *lowst
*** 1695,1703 ****
      while (*(*endstr) != '\0')
      {
          /* Record start of current field */
-         field[nf] = lp;
          if (nf >= MAXDATEFIELDS)
              return -1;

          /* leading digit? then date or time */
          if (isdigit((unsigned char) *(*endstr)))
--- 1695,1703 ----
      while (*(*endstr) != '\0')
      {
          /* Record start of current field */
          if (nf >= MAXDATEFIELDS)
              return -1;
+         field[nf] = lp;

          /* leading digit? then date or time */
          if (isdigit((unsigned char) *(*endstr)))
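
To show the ordering issue in isolation, here is a minimal sketch of the
check-before-store pattern the patch restores. It is not the ecpg code:
split_fields() and MAXFIELDS are hypothetical names made up for this
illustration, and the tokenizing is simplified to splitting on spaces.

```c
#include <assert.h>
#include <stddef.h>

#define MAXFIELDS 4		/* hypothetical limit, stands in for MAXDATEFIELDS */

/*
 * Split str on spaces in place, storing the start of each token in field[].
 * Returns the number of fields found, or -1 if str has more than MAXFIELDS
 * tokens.  The bounds check comes before the store, so field[MAXFIELDS]
 * is never written.
 */
int
split_fields(char *str, char *field[MAXFIELDS])
{
	int			nf = 0;
	char	   *p = str;

	assert(str != NULL);

	while (*p != '\0')
	{
		/* skip separators */
		while (*p == ' ')
			p++;
		if (*p == '\0')
			break;

		/* check first, then record start of current field */
		if (nf >= MAXFIELDS)
			return -1;
		field[nf] = p;
		nf++;

		/* advance to the end of the token and terminate it */
		while (*p != '\0' && *p != ' ')
			p++;
		if (*p == ' ')
			*p++ = '\0';
	}
	return nf;
}
```

With the two statements in the other order, a fifth token would be stored
one slot past the end of the caller's array before the function got around
to returning -1, which is the kind of write a stack protector can catch.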

Kind of astonishing that nobody else has reported this, given that
there's been a regression test specifically meant to catch such a
problem since 4318dae. The stack layout in PGTYPESdate_from_asc
must happen to avoid the issue on practically all platforms.

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-07 01:35:04
Message-ID: CAB7nPqQ-fgu+p-Z5N=juuSjuHhTckhQCgjbE_Exw3+MzJjBM9A@mail.gmail.com

On Tue, Oct 7, 2014 at 9:57 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > Hm... I have tried changing the system locales (to en_US for example) and
> > time format but I can still trigger the issue all the time. I'll try to
> > have a closer look.. It looks like this test does not like some settings at
> > the OS level.
>
> I eventually realized that the critical difference was you'd added
> "CFLAGS=" to the configure call. On this platform that has the net
> effect of removing -O2 from the compiler flags, and apparently that
> shifts around the stack layout enough to expose the clobber.
>

At least my scripts are weird enough to trigger such behaviors. The funny
part is that it's really a coincidence: CFLAGS was being set from a variable
that is now empty, because that variable was removed from this script some
time ago.

> The fix is simple enough: ecpg's version of ParseDateTime is failing
> to check for overrun of the field[] array until *after* it's already
> clobbered the stack:
> Kind of astonishing that nobody else has reported this, given that
> there's been a regression test specifically meant to catch such a
> problem since 4318dae. The stack layout in PGTYPESdate_from_asc
> must happen to avoid the issue on practically all platforms.
>
Yes, thanks. That's it. At least I am not going crazy.
Regards,
--
Michael


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Date: 2014-10-08 06:00:27
Message-ID: 20141008060027.GA348882@tornado.leadboat.com

On Mon, Oct 06, 2014 at 08:57:54PM -0400, Tom Lane wrote:
> I eventually realized that the critical difference was you'd added
> "CFLAGS=" to the configure call. On this platform that has the net
> effect of removing -O2 from the compiler flags, and apparently that
> shifts around the stack layout enough to expose the clobber.
>
> The fix is simple enough: ecpg's version of ParseDateTime is failing
> to check for overrun of the field[] array until *after* it's already
> clobbered the stack:

Thanks for tracking that down. Oops.