Lists: | pgsql-hackers |
---|
From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | pg_sleep() doesn't work well with recovery conflict interrupts. |
Date: | 2014-05-28 15:23:31 |
Message-ID: | 20140528152331.GB25431@alap3.anarazel.de |
Hi,
Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses
WaitLatch() to wait. That's fine in itself. But
procsignal_sigusr1_handler, which is used e.g. when resolving recovery
conflicts, doesn't unconditionally do a SetLatch().
That means that we'll currently not be able to cancel conflicting
backends during recovery for 10min. Now, I don't think that'll happen
too often in practice, but it's still annoying.
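
[Editorial sketch, not part of the original message.] The stall can be illustrated with a toy model in plain C. Nothing here is PostgreSQL code: `ToyBackend`, `toy_wait_latch`, `toy_sleep` and `noticed_after` are hypothetical names, the clock is simulated, and the 600-second cap on each wait is an assumption taken from the "10min" figure above. The sleep loop only checks for a pending interrupt between waits, so a handler that records the interrupt without setting the latch leaves the sleeper blocked until the current wait times out.

```c
#include <stdbool.h>

/* Toy stand-ins only; none of these types or functions exist in PostgreSQL. */
typedef struct
{
    double now;                 /* simulated clock, in seconds */
    double signal_at;           /* simulated arrival time of the conflict signal */
    bool   handler_sets_latch;  /* does the handler also set the latch? */
    bool   signal_delivered;
    bool   interrupt_pending;
    bool   latch_is_set;
} ToyBackend;

/* Simulated signal handler: always records the interrupt, optionally sets the latch. */
void toy_conflict_handler(ToyBackend *b)
{
    b->interrupt_pending = true;
    if (b->handler_sets_latch)
        b->latch_is_set = true;
}

/* Simulated latch wait: returns when the latch is set or the timeout elapses. */
void toy_wait_latch(ToyBackend *b, double timeout)
{
    double deadline = b->now + timeout;

    if (b->latch_is_set)
        return;
    if (!b->signal_delivered && b->signal_at >= b->now && b->signal_at < deadline)
    {
        b->now = b->signal_at;
        b->signal_delivered = true;
        toy_conflict_handler(b);
        if (b->latch_is_set)
            return;             /* woken early by the latch */
    }
    b->now = deadline;          /* nothing woke us: sleep out the full timeout */
}

/*
 * Simulated sleep loop. Returns the simulated time at which the pending
 * interrupt is noticed, or -1 if the sleep finishes first.
 */
double toy_sleep(ToyBackend *b, double secs)
{
    double end = b->now + secs;

    for (;;)
    {
        double delay;

        if (b->interrupt_pending)   /* stand-in for an interrupt check */
            return b->now;
        delay = end - b->now;
        if (delay <= 0.0)
            return -1.0;
        if (delay > 600.0)          /* assumed 10-minute cap per wait */
            delay = 600.0;
        toy_wait_latch(b, delay);
        b->latch_is_set = false;    /* stand-in for resetting the latch */
    }
}

/* One-hour sleep with the conflict signal arriving 5 simulated seconds in. */
double noticed_after(bool handler_sets_latch)
{
    ToyBackend b = { .now = 0.0, .signal_at = 5.0,
                     .handler_sets_latch = handler_sets_latch };

    return toy_sleep(&b, 3600.0);
}
```

In this model `noticed_after(false)` returns 600 and `noticed_after(true)` returns 5: the interrupt is recorded at the same moment in both cases, but only the latch wakeup lets the loop see it before the timeout.
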
As an alternative to doing the PG_TRY/save set_latch_on_sigusr1/set
set_latch_on_sigusr1/PG_CATCH/reset set_latch_on_sigusr1/ dance in
pg_sleep() we could also have RecoveryConflictInterrupt() do an
unconditional SetLatch()?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: pg_sleep() doesn't work well with recovery conflict interrupts. |
Date: | 2014-05-30 05:00:42 |
Message-ID: | CAA4eK1JZP+RGgG2ZnrYYXQRVKbWfVkfUnrGv5=_3hEsz_k=yAA@mail.gmail.com |
On Wed, May 28, 2014 at 8:53 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
wrote:
> Hi,
>
> Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses
> WaitLatch() to wait. That's fine in itself. But
> procsignal_sigusr1_handler, which is used e.g. when resolving recovery
> conflicts, doesn't unconditionally do a SetLatch().
> That means that we'll currently not be able to cancel conflicting
> backends during recovery for 10min. Now, I don't think that'll happen
> too often in practice, but it's still annoying.
How would such a situation occur? Aren't we using pg_usleep() in the
recovery conflict functions
(e.g. in ResolveRecoveryConflictWithVirtualXIDs)?
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: pg_sleep() doesn't work well with recovery conflict interrupts. |
Date: | 2014-06-01 07:35:27 |
Message-ID: | 20140601073527.GG4286@awork2.anarazel.de |
On 2014-05-30 10:30:42 +0530, Amit Kapila wrote:
> On Wed, May 28, 2014 at 8:53 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
> wrote:
> > Hi,
> >
> > Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses
> > WaitLatch() to wait. That's fine in itself. But
> > procsignal_sigusr1_handler, which is used e.g. when resolving recovery
> > conflicts, doesn't unconditionally do a SetLatch().
> > > That means that we'll currently not be able to cancel conflicting
> > backends during recovery for 10min. Now, I don't think that'll happen
> > too often in practice, but it's still annoying.
>
> How will such a situation occur, aren't we using pg_usleep during
> RecoveryConflict functions
> (ex. in ResolveRecoveryConflictWithVirtualXIDs)?
I am not sure what you mean. pg_sleep() is the SQL callable function, a
different thing to pg_usleep(). The latter isn't interruptible on all
platforms, but the sleep times should be short enough for that not to
matter.
I am pretty sure by now that the sane fix for this is to add a
SetLatch() call to RecoveryConflictInterrupt(). All the signal handlers
that deal with query cancelation et al. do so, so it seems right that
RecoveryConflictInterrupt() does so as well.
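
[Editorial sketch, not part of the original message.] The proposed pattern, a handler that both records the interrupt and sets the latch, can be shown as a small standalone POSIX program fragment. This is not PostgreSQL's implementation: every `toy_*` name and `conflict_pending` are hypothetical, and the self-pipe wakeup is only an assumption about how a latch can be built on Unix. The handler sets a flag and writes one byte to a pipe, so a waiter blocked in `poll()` wakes up even if the signal arrived before the wait began.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names throughout; this is a toy latch, not PostgreSQL's. */
volatile sig_atomic_t conflict_pending = 0;
static volatile sig_atomic_t latch_is_set = 0;
static int latch_pipe[2] = {-1, -1};

/* Set the latch; only uses async-signal-safe calls, so a handler may call it. */
void toy_set_latch(void)
{
    int save_errno = errno;

    if (!latch_is_set)
    {
        char c = 0;

        latch_is_set = 1;
        if (write(latch_pipe[1], &c, 1) < 0)
        {
            /* Pipe full: a wakeup byte is already queued, nothing to do. */
        }
    }
    errno = save_errno;
}

/* Handler in the proposed style: record the interrupt AND set the latch. */
void toy_conflict_handler(int signo)
{
    (void) signo;
    conflict_pending = 1;
    toy_set_latch();
}

/* Create the non-blocking self-pipe and install the SIGUSR1 handler. */
int toy_latch_init(void)
{
    struct sigaction sa;

    if (pipe(latch_pipe) != 0)
        return -1;
    if (fcntl(latch_pipe[0], F_SETFL, O_NONBLOCK) != 0 ||
        fcntl(latch_pipe[1], F_SETFL, O_NONBLOCK) != 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = toy_conflict_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Clear the latch; stale wakeup bytes are drained by the next wait. */
void toy_reset_latch(void)
{
    latch_is_set = 0;
}

/*
 * Wait until the latch is set (returns true) or the timeout expires
 * (returns false). Simplification: the timeout restarts after a wakeup.
 */
bool toy_wait_latch(int timeout_ms)
{
    struct pollfd pfd = { .fd = latch_pipe[0], .events = POLLIN };

    for (;;)
    {
        int rc;

        if (latch_is_set)
            return true;
        rc = poll(&pfd, 1, timeout_ms);
        if (rc == 0)
            return false;               /* timed out */
        if (rc < 0 && errno != EINTR)
            return false;               /* unexpected error */
        if (rc > 0)
        {
            char buf[16];

            /* Drain wakeup bytes, then re-check the flag. */
            while (read(latch_pipe[0], buf, sizeof buf) > 0)
                ;
        }
    }
}
```

After `toy_latch_init()`, a `raise(SIGUSR1)` followed by `toy_wait_latch(10000)` returns true without waiting out the ten seconds. If the handler only set `conflict_pending`, the same call would block until the timeout whenever the signal arrived before `poll()` was entered.
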
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_sleep() doesn't work well with recovery conflict interrupts. |
Date: | 2014-06-01 16:26:58 |
Message-ID: | 14901.1401640018@sss.pgh.pa.us |
Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> I am pretty sure by now that the sane fix for this is to add a
> SetLatch() call to RecoveryConflictInterrupt(). All the signal handlers
> that deal with query cancelation et al. do so, so it seems right that
> RecoveryConflictInterrupt() does so as well.
+1
regards, tom lane
From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: pg_sleep() doesn't work well with recovery conflict interrupts. |
Date: | 2014-06-02 03:11:40 |
Message-ID: | CAA4eK1+Y9ghvZPgN=sgDuhyvMb+j61zkAmpnMi30EFA9ckY4aA@mail.gmail.com |
On Sun, Jun 1, 2014 at 1:05 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
wrote:
> On 2014-05-30 10:30:42 +0530, Amit Kapila wrote:
> > On Wed, May 28, 2014 at 8:53 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
> > > Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses
> > > WaitLatch() to wait. That's fine in itself. But
> > > procsignal_sigusr1_handler, which is used e.g. when resolving recovery
> > > conflicts, doesn't unconditionally do a SetLatch().
> > > That means that we'll currently not be able to cancel conflicting
> > > backends during recovery for 10min. Now, I don't think that'll happen
> > > too often in practice, but it's still annoying.
> >
> > How will such a situation occur, aren't we using pg_usleep during
> > RecoveryConflict functions
> > (ex. in ResolveRecoveryConflictWithVirtualXIDs)?
>
> I am not sure what you mean. pg_sleep() is the SQL callable function, a
> different thing to pg_usleep().
It was not clear to me how such a situation could occur, but after
looking at it a bit more carefully, I think I understand now: any backend
calling pg_sleep() during recovery conflict resolution can face this
situation.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com