Re: Strange Windows problem, lock_timeout test request

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-17 08:00:50
Message-ID: 51457832.9050107@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2013-03-17 04:48 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> [ 2-lock_timeout-v37.patch ]
> Applied after a fair amount of additional hacking.

Thank you, thank you, thank you! :-)

> I was disappointed to find that the patch introduced a new race
> condition into timeout.c, or at least broke a safety factor that had
> been there. The argument why enable_timeout() could skip disabling
> the timer interrupt on entry, if num_active_timeouts is zero, doesn't
> work for enable_timeouts(): as soon as you've incremented
> num_active_timeouts for the first new timeout, the signal handler would
> mess around with the data structure if it were to fire. What I did
> about that was to modify disable_alarm() to forcibly disable the
> interrupt if we're adding multiple timeouts in enable_timeouts(), even
> if we think no interrupt is pending. This might be overly paranoid,
> but since all of this is new code for 9.3 and hasn't been through any
> beta testing, I felt it best to preserve that safety feature. We can
> revisit it later if it proves to be an issue. (It's conceivable for
> instance that we could avoid incrementing the stored value of
> num_active_timeouts until we're done adding all the new timeouts;
> but that seemed pretty messy.)

Your reasoning seems to be correct. However, if we take it to the
extreme, enable_timeout_at/enable_timeout_after/enable_timeouts
should also disable the interrupt handler at the beginning of the
function and enabled at the end with pqsignal(). An evil soul may
send SIGALRM externally to the backend processes at the proper
moment and create a race condition while enable_timeout*() is in
progress and the itimer is about to trigger at the same time (but
doesn't since it was disabled).

> For the current usage pattern I'm not
> too sure that it matters anyway: a savings is only possible if you
> have enabled lock_timeout but not statement_timeout, and I'm doubtful
> that that will be a common use-case.

I am not too sure about it. With lock_timeout in place, code migrated
from Informix, MSSQL, etc. can have the same semantic behaviour.

>
> I also rearranged the handling of the LOCK_TIMEOUT interrupt some more,
> since I didn't see a need for it to be different from STATEMENT_TIMEOUT,
> and got rid of some non-C89 coding practices that didn't seem to me to
> be improving readability anyway.

Thanks for that, too. I was too blind to see that choice, even after
thinking about why the statement_timeout cannot be done from
SIGALRM context and why does the code need to also send SIGINT
to the process group. (To kill children, too, like scripts executed via
system()...)

Thanks again,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2013-03-17 12:24:16 Re: Support for REINDEX CONCURRENTLY
Previous Message Tom Lane 2013-03-17 04:41:43 Re: Should array_length() Return NULL