Re: pg_test_fsync crashes on systems with POSIX signal handling

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: pg_test_fsync crashes on systems with POSIX signal handling
Date: 2013-03-15 19:05:54
Message-ID: 25651.1363374354@sss.pgh.pa.us

On my old HPUX box:

$ ./pg_test_fsync
2 seconds per test
Direct I/O is not supported on this platform.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 165.122 ops/sec ( 6056 microsecs/op)
fdatasync Alarm call
$ echo $?
142 -- that's SIGALRM

The reason it's failing is that according to the traditional (not BSD)
definition of signal(2), the signal handler is reset to SIG_DFL when the
signal is delivered. So the second occurrence of SIGALRM doesn't call
the signal handler but just crashes the process.
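To illustrate the reset-on-delivery behavior described above, here is a minimal sketch in C. It does not depend on the platform's signal() semantics: it asks for the one-shot behavior explicitly with sigaction() and SA_RESETHAND, runs the experiment in a forked child, and reports how the child ended. The names demo_on_alarm, demo_alarms_seen and demo_second_alarm_fate are hypothetical and are not taken from pg_test_fsync.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for sigaction() under strict -std modes */

#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical counter, only touched by the handler in the child process. */
static volatile sig_atomic_t demo_alarms_seen = 0;

static void
demo_on_alarm(int signo)
{
    (void) signo;
    demo_alarms_seen++;
}

/*
 * Fork a child that installs demo_on_alarm for SIGALRM and then raises
 * SIGALRM twice.  If one_shot is nonzero the handler is installed with
 * SA_RESETHAND, which mimics the traditional signal() behavior of
 * resetting the disposition to SIG_DFL on delivery.
 *
 * Returns the number of the signal that killed the child, 0 if the child
 * exited on its own, or -1 if fork/waitpid failed.
 */
int
demo_second_alarm_fate(int one_shot)
{
    pid_t pid = fork();
    int   status;

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        struct sigaction act;

        memset(&act, 0, sizeof(act));
        act.sa_handler = demo_on_alarm;
        sigemptyset(&act.sa_mask);
        act.sa_flags = one_shot ? SA_RESETHAND : 0;
        if (sigaction(SIGALRM, &act, NULL) != 0)
            _exit(2);

        raise(SIGALRM);   /* first delivery: handler runs */
        raise(SIGALRM);   /* second delivery: default action if one_shot */
        _exit(demo_alarms_seen == 2 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}
```

Calling demo_second_alarm_fate(1) should report that the child was killed by SIGALRM, while demo_second_alarm_fate(0) should report a normal exit. A shell typically reports death by signal as 128 plus the signal number, which would account for the 142 above on a platform where SIGALRM is 14.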

The quick-and-dirty fix for this is to just copy pqsignal() into
pg_test_fsync, and use that instead of calling signal() directly.
I wonder though if we shouldn't move that function into libpgport.
Thoughts?
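For readers unfamiliar with the idea, a wrapper of this kind can be sketched roughly as follows. This is not the PostgreSQL source for pqsignal(); it is a minimal illustration, assuming a POSIX system with sigaction(), of how a signal()-like function can give a handler that stays installed across deliveries. The names sketch_pqsignal, sketch_sigfunc and sketch_count_alarms are hypothetical, and the choice to skip SA_RESTART for SIGALRM is an assumption made here so that a timer can still interrupt a blocking system call.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for sigaction() under strict -std modes */

#include <signal.h>
#include <string.h>

typedef void (*sketch_sigfunc) (int);

/*
 * signal()-like wrapper built on sigaction().  Because SA_RESETHAND is
 * not set, the handler remains installed after each delivery.
 * Returns the previous handler, or SIG_ERR on failure.
 */
sketch_sigfunc
sketch_pqsignal(int signo, sketch_sigfunc func)
{
    struct sigaction act;
    struct sigaction oact;

    memset(&act, 0, sizeof(act));
    act.sa_handler = func;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
#ifdef SA_RESTART
    /* Assumption of this sketch: let SIGALRM interrupt system calls. */
    if (signo != SIGALRM)
        act.sa_flags |= SA_RESTART;
#endif
    if (sigaction(signo, &act, &oact) < 0)
        return SIG_ERR;
    return oact.sa_handler;
}

/* Hypothetical helpers used to exercise the wrapper. */
static volatile sig_atomic_t sketch_alarm_count = 0;

static void
sketch_count_handler(int signo)
{
    (void) signo;
    sketch_alarm_count++;
}

/*
 * Install the counting handler once, raise SIGALRM n times, and return
 * how many deliveries reached the handler (or -1 if installation failed).
 */
int
sketch_count_alarms(int n)
{
    int i;

    sketch_alarm_count = 0;
    if (sketch_pqsignal(SIGALRM, sketch_count_handler) == SIG_ERR)
        return -1;
    for (i = 0; i < n; i++)
        raise(SIGALRM);
    return (int) sketch_alarm_count;
}
```

With the handler installed this way, sketch_count_alarms(2) should see both deliveries and the process survives the second SIGALRM, which is the behavior pg_test_fsync needs when its timer fires once per test.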

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_test_fsync crashes on systems with POSIX signal handling
Date: 2013-03-15 21:15:19
Message-ID: 20130315211519.GA12845@momjian.us

On Fri, Mar 15, 2013 at 03:05:54PM -0400, Tom Lane wrote:
> On my old HPUX box:
>
> $ ./pg_test_fsync
> 2 seconds per test
> Direct I/O is not supported on this platform.
>
> Compare file sync methods using one 8kB write:
> (in wal_sync_method preference order, except fdatasync
> is Linux's default)
> open_datasync 165.122 ops/sec ( 6056 microsecs/op)
> fdatasync Alarm call
> $ echo $?
> 142 -- that's SIGALRM
>
> The reason it's failing is that according to the traditional (not BSD)
> definition of signal(2), the signal handler is reset to SIG_DFL when the
> signal is delivered. So the second occurrence of SIGALRM doesn't call
> the signal handler but just crashes the process.
>
> The quick-and-dirty fix for this is to just copy pqsignal() into
> pg_test_fsync, and use that instead of calling signal() directly.
> I wonder though if we shouldn't move that function into libpgport.
> Thoughts?

Well, the Win32 signal handler is already in port, so moving the Unix
one seems to make sense, i.e. the comment above pqsignal says:

/* Win32 signal handling is in backend/port/win32/signal.c */

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_test_fsync crashes on systems with POSIX signal handling
Date: 2013-03-18 03:10:48
Message-ID: 22388.1363576248@sss.pgh.pa.us

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Fri, Mar 15, 2013 at 03:05:54PM -0400, Tom Lane wrote:
>> The quick-and-dirty fix for this is to just copy pqsignal() into
>> pg_test_fsync, and use that instead of calling signal() directly.
>> I wonder though if we shouldn't move that function into libpgport.
>> Thoughts?

> Well, the Win32 signal handler is already in port, so moving the Unix
> one seems to make sense, i.e. the comment above pqsignal says:
> /* Win32 signal handling is in backend/port/win32/signal.c */

Done, though it was a bit more painful than I expected --- I seem to
have guessed completely wrong about where the portability hazards were.
Good thing we have a buildfarm.

regards, tom lane