Quick Links

Re: Review: Hot standby

Lists:	pgsql-hackers

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Review: Hot standby
Date:	2008-11-21 07:52:12
Message-ID:	2e78013d0811202352s7ca6f43fs4c146df296636f9b@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I think we should avoid the #define's like below which uses a local
variable. I guess the same #define is used elsewhere in the code as well. If
the code is rearranged or the variable name is changed, the code would
break.

The use of malloc should also be avoided (unless the memory subsystem is not
up yet; I haven't checked).

***************
*** 706,713 ****
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of memory")));
Assert(snapshot->subxip == NULL);
snapshot->subxip = (TransactionId *)
! malloc(arrayP->maxProcs * PGPROC_MAX_CACHED_SUBXIDS *
sizeof(TransactionId));
if (snapshot->subxip == NULL)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
--- 1003,1011 ----
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of memory")));
Assert(snapshot->subxip == NULL);
+ #define maxNumSubXids (arrayP->maxProcs * PGPROC_MAX_CACHED_SUBXIDS)
snapshot->subxip = (TransactionId *)
! malloc(maxNumSubXids * sizeof(TransactionId));
if (snapshot->subxip == NULL)
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-21 08:47:27
Message-ID:	2e78013d0811210047t1309ee1fr44e726ad793f50ad@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wonder if there is corner case below where there are no WAL records to
replay during standby recovery. Specifically, that may cause
IsRunningXactDataValid() to return false since latestObservedXid remains set
to InvalidTransactionId and that prevents postmaster from serving read-only
clients.

I don't have a test case, but I recall seeing the issue while experimenting.
Though that could be because of the WALInsertLock issue reported earlier.

/*
* Can we signal Postmaster to enter consistent recovery
mode?
*
* There are two points in the log that we must pass. The
first
* is minRecoveryPoint, which is the LSN at the time the
* base backup was taken that we are about to rollfoward
from.
* If recovery has ever crashed or was stopped there is also
* another point also: minSafeStartPoint, which we know the
* latest LSN that recovery could have reached prior to
crash.
*
* We must also have assembled sufficient information about
* transaction state to allow valid snapshots to be taken.
*/
if (!reachedSafeStartPoint &&
IsRunningXactDataValid() &&
XLByteLE(ControlFile->minSafeStartPoint, EndRecPtr) &&
XLByteLE(ControlFile->minRecoveryPoint, EndRecPtr))
{
reachedSafeStartPoint = true;
if (InArchiveRecovery)
{
ereport(LOG,
(errmsg("database has now reached consistent
state at %X/%X",
EndRecPtr.xlogid, EndRecPtr.xrecoff)));
InitRecoveryTransactionEnvironment();
StartCleanupDelayStats();
if (IsUnderPostmaster)
SendPostmasterSignal(PMSIGNAL_RECOVERY_START);
}
}

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-21 09:51:55
Message-ID:	2e78013d0811210151i1ef5909fqad475489a049bdd4@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

+ /*
+ * Release locks, if any.
+ */
+ RelationReleaseRecoveryLocks(xlrec->slotId);

If I am reading the patch correctly, AccessExclusiveLocks acquired by a
transaction will be released when the transaction is committed or aborted.
If the transaction errors out with FATAL, the locks will be released when
the next transaction occupying the same slot is committed/aborted.

I smell some sort of deadlock condition here. What if the following events
happen:

- transaction A (slot 1) starts and acquires AEL lock on relation
- transaction A errors out with FATAL error
- transaction B (slot 1) starts and requests AEL lock on the same relation

Won't B deadlock with A ? Since B hasn't yet committed/aborted, the lock
held by A is not released and
relation_redo_lock() would indefinitely wait for the lock.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-21 13:29:14
Message-ID:	1227274154.7015.74.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, 2008-11-21 at 13:22 +0530, Pavan Deolasee wrote:
>
> I think we should avoid the #define's like below which uses a local
> variable. I guess the same #define is used elsewhere in the code as
> well. If the code is rearranged or the variable name is changed, the
> code would break.

I use a #define because same value is used elsewhere in code.

> The use of malloc should also be avoided (unless the memory subsystem
> is not up yet; I haven't checked).

The malloc was part of the existing code, explained by comments.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-21 13:35:35
Message-ID:	1227274535.7015.80.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, 2008-11-21 at 15:21 +0530, Pavan Deolasee wrote:
>
> + /*
> + * Release locks, if any.
> + */
> + RelationReleaseRecoveryLocks(xlrec->slotId);
>
>
> If I am reading the patch correctly, AccessExclusiveLocks acquired by
> a transaction will be released when the transaction is committed or
> aborted. If the transaction errors out with FATAL, the locks will be
> released when the next transaction occupying the same slot is
> committed/aborted.
>
> I smell some sort of deadlock condition here. What if the following
> events happen:
>
> - transaction A (slot 1) starts and acquires AEL lock on relation
> - transaction A errors out with FATAL error
> - transaction B (slot 1) starts and requests AEL lock on the same
> relation
>
> Won't B deadlock with A ? Since B hasn't yet committed/aborted, the
> lock held by A is not released and
> relation_redo_lock() would indefinitely wait for the lock.

This won't happen because the lock is held by the startup process on
behalf of slot 1. Explained in comments in inval.c code.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-22 09:44:40
Message-ID:	2e78013d0811220144k718e4bbfy98d123051d2d0c21@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Nov 21, 2008 at 6:59 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

>
>
> The malloc was part of the existing code, explained by comments.
>
>
Oh I see. But I don't see any explanations for using malloc instead of
palloc. Not that the current patch is responsible for this, I am wondering
why its done that way and if we are freeing the malloced memory at all ?

malloc is used at another place in a new code. Although it seems that the
allocation happens just once, please check if its better to use palloc
there.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-22 11:08:40
Message-ID:	1227352120.7015.133.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, 2008-11-22 at 15:14 +0530, Pavan Deolasee wrote:

>
> The malloc was part of the existing code, explained by
> comments.
>
>
> Oh I see. But I don't see any explanations for using malloc instead of
> palloc. Not that the current patch is responsible for this, I am
> wondering why its done that way and if we are freeing the malloced
> memory at all ?
>
> malloc is used at another place in a new code. Although it seems that
> the allocation happens just once, please check if its better to use
> palloc there.

Thanks for your comments. OK, here's my thoughts after checking.

The malloc in the current code occurs within GetSnapshotData where it
exists to optimise memory allocation. The memory is allocated once and
never released, so malloc is appropriate.

The patch adds another use of malloc in GetRunningTransactionData()
which is similar to GetSnapshotData in many ways. It only ever runs
within bgwriter, and again, once allocated the memory is never released.
So malloc is appropriate there also.

I will change the #define as you suggested earlier.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-22 14:14:25
Message-ID:	20081122141425.GA3813@alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Pavan Deolasee escribió:
> On Fri, Nov 21, 2008 at 6:59 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> > The malloc was part of the existing code, explained by comments.
>
> Oh I see. But I don't see any explanations for using malloc instead of
> palloc. Not that the current patch is responsible for this, I am wondering
> why its done that way and if we are freeing the malloced memory at all ?

It's an optimization. We don't ever free it -- we alloc it once (the
first time the snapshot is taken) and then the allocated space is reused
until the backend dies. The reason for not using palloc is that if
you're not going to do any context-related management, what would be the
point? We save the palloc overhead this way (admittedly not a lot).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-22 21:42:19
Message-ID:	1227390139.7015.173.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, 2008-11-21 at 14:17 +0530, Pavan Deolasee wrote:

> I wonder if there is corner case

I remain on the lookout for these, so thanks for thinking of this.

> below where there are no WAL records to replay during standby
> recovery. Specifically, that may cause IsRunningXactDataValid() to
> return false since latestObservedXid remains set to
> InvalidTransactionId and that prevents postmaster from serving
> read-only clients.

I don't think so, because this section of code is only called if there
is redo to perform. If no redo, we never get here.

There certainly is no guarantee that the correct conditions will ever
exist to allow read-only access. If snapshots overflow consistently over
a long period then the correct state will never be reached. That is very
unlikely, but possible. I could prevent the delay, but then the code
would so seldom run that it would probably have bugs, so it didn't seem
worth trying to plug the gap.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-24 07:40:14
Message-ID:	2e78013d0811232340x44ebe2cex8c0c883de729efb8@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, Nov 23, 2008 at 3:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

>
>
>
> I don't think so, because this section of code is only called if there
> is redo to perform. If no redo, we never get here.
>
>
OK. But why not open up the database for read-only access when there is no
redo action to be done ? Or am I missing something trivial ?

Anyways, I will wait for your latest version to continue with the
test/review.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-25 09:26:11
Message-ID:	2e78013d0811250126k5d8ddb07s61543f42d4dda4f3@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

The following sequence of commands causes server crash.

postgres=# BEGIN TRANSACTION READ WRITE ;
BEGIN
postgres=# SELECT * FROM pg_class FOR UPDATE;
FATAL: cannot assign TransactionIds during recovery
STATEMENT: SELECT * FROM pg_class FOR UPDATE;
FATAL: cannot assign TransactionIds during recovery
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

I think we should disallow starting a read-write transaction during
recovery. The patch attempts to do that by setting transaction mode to
read-only, but not enough to prevent somebody to explicitly mark the
transaction as read-write.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-25 13:25:24
Message-ID:	14797.1227619524@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> I think we should disallow starting a read-write transaction during
> recovery. The patch attempts to do that by setting transaction mode to
> read-only, but not enough to prevent somebody to explicitly mark the
> transaction as read-write.

Huh? The "read only" transaction mode is not hard read-only anyway,
so if that's the only step being taken, it's entirely useless.

regards, tom lane

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-25 13:32:57
Message-ID:	2e78013d0811250532k53807762o592b99081d931381@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Nov 25, 2008 at 6:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
>
> Huh? The "read only" transaction mode is not hard read-only anyway,
> so if that's the only step being taken, it's entirely useless.
>
>
I think there are explicit checks for some utility statements (like VACUUM),
but I haven't checked if all necessary code paths are covered or not.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-25 19:13:07
Message-ID:	1227640387.14213.12.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, 2008-11-25 at 19:02 +0530, Pavan Deolasee wrote:

> On Tue, Nov 25, 2008 at 6:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Huh? The "read only" transaction mode is not hard read-only
> anyway,
> so if that's the only step being taken, it's entirely useless.
>
>
> I think there are explicit checks for some utility statements (like
> VACUUM), but I haven't checked if all necessary code paths are covered
> or not.

The commands that need protecting have been explicitly identified in the
notes and there are 7 files changed that were specifically identified
with protective changes.

You've identified a way of breaking part the first line of defence, but
the command was caught by the second line of defence in the patch.

Problem, yes. Major issue, no. Will fix.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-26 10:22:35
Message-ID:	2e78013d0811260222p710effc6j59c0b4f10ae76fa6@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

ISTM that the redo conflict resolution is not working as intended. I did the
following test and it throws some surprises.

On standby:

postgres=# BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
BEGIN
postgres=# SELECT * from test;
a | b
-----+---
102 |
103 |
(2 rows)

On primary:

postgres=# SELECT * from test;
a | b
-----+---
102 |
103 |
(2 rows)

postgres=#
postgres=# UPDATE test SET a = a + 100;
UPDATE 2
postgres=# VACUUM test;
VACUUM
postgres=# SELECT pg_switch_xlog();
pg_switch_xlog
----------------
0/2D000288
(1 row)

On standby (server log):

LOG: restored log file "00000001000000000000002D" from archive
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
<same message repeated>

The open transaction (see above) on the standby is not still not aborted and
if I query the table in the same transaction, I get:

(Note: the transaction is still open)
postgres=#
postgres=# SELECT * from test;
a | b
---+---
(0 rows)

I think whats happening is that ResolveRecoveryConflictWithVirtualXIDs() is
failing to abort the open transaction and it keeps trying for that,
everytime doubling the sleep time, so the LOG messages come less frequently
later, but they are never ending. Soon the sleep becomes exponentially
large.

Even though the standby has a open transaction, its obvious that the
cleanup_redo has also failed to abort the transaction. Thats why the tuples
have disappeared from the standby (most likely because they are cleaned up
by VACUUM).

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-26 12:32:07
Message-ID:	2e78013d0811260432h3b42110fy1130ac6978a8cf54@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Nov 26, 2008 at 3:52 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>wrote:

>
>
>
> I think whats happening is that ResolveRecoveryConflictWithVirtualXIDs() is
> failing to abort the open transaction

Btw, ISTM that SIGINT works only for statement cancellation. So if the
transaction is in idle state, SIGINT has nothing to cancel and hence also
fails to abort the transaction.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-27 17:14:37
Message-ID:	1227806077.20796.73.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, 2008-11-26 at 18:02 +0530, Pavan Deolasee wrote:

> I think whats happening is that
> ResolveRecoveryConflictWithVirtualXIDs() is failing to abort
> the open transaction
>
>
> Btw, ISTM that SIGINT works only for statement cancellation. So if the
> transaction is in idle state, SIGINT has nothing to cancel and hence
> also fails to abort the transaction.

[If I read this correctly this second post is the cause of the first
problem, so we have one problem, rather than two.]

Understood; yes that seems to be a problem.

I will propose a solution later today. (I've been tied up with a few
things over last few days, but I'm free of that now).

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 10:01:02
Message-ID:	1227866462.20796.133.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, 2008-11-27 at 17:14 +0000, Simon Riggs wrote:
> On Wed, 2008-11-26 at 18:02 +0530, Pavan Deolasee wrote:
>
> > I think whats happening is that
> > ResolveRecoveryConflictWithVirtualXIDs() is failing to abort
> > the open transaction
> >
> >
> > Btw, ISTM that SIGINT works only for statement cancellation. So if the
> > transaction is in idle state, SIGINT has nothing to cancel and hence
> > also fails to abort the transaction.
>
> [If I read this correctly this second post is the cause of the first
> problem, so we have one problem, rather than two.]
>
> Understood; yes that seems to be a problem.

After some thought, the way I would handle this is by sending a slightly
different kind of signal.

We can send a shared invalidation message which means "end the
transaction, whether or not you are currently running a statement". We
would then send the backend a SIGUSR1 to ensure that it reads the shared
invalidation message as soon as possible. This is easily possible with
the new sinval processing for 8.4.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 16:14:43
Message-ID:	26392.1227888883@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> After some thought, the way I would handle this is by sending a slightly
> different kind of signal.

> We can send a shared invalidation message which means "end the
> transaction, whether or not you are currently running a statement".

No, a thousand times no. The sinval queue is an *utterly* inappropriate
mechanism for such a thing.

regards, tom lane

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 16:31:45
Message-ID:	1227889905.20796.196.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, 2008-11-28 at 11:14 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > After some thought, the way I would handle this is by sending a slightly
> > different kind of signal.
>
> > We can send a shared invalidation message which means "end the
> > transaction, whether or not you are currently running a statement".
>
> No, a thousand times no.

So you're against it? ;-)

> The sinval queue is an *utterly* inappropriate
> mechanism for such a thing.

To be honest, it did seem quite a neat solution. Any particular
direction of thought you'd like me to pursue instead?

Asking the backend to kill itself is much cleaner than the other ways I
imagined. So my other thoughts steer towards hijacking the SIGUSR1
signal somehow for my nefarious purposes. Would that way sound OK?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 16:44:36
Message-ID:	26861.1227890676@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Fri, 2008-11-28 at 11:14 -0500, Tom Lane wrote:
>> The sinval queue is an *utterly* inappropriate
>> mechanism for such a thing.

> To be honest, it did seem quite a neat solution. Any particular
> direction of thought you'd like me to pursue instead?

I hadn't been following the discussion closely enough to know what the
problem is. But "cancel the current transaction" is far outside the
bounds of what sinval events are supposed to do. If you try to do that
we'll forever be fighting bugs in code that expected
AcceptInvalidationMessages to do no more than invalidate cache entries.

regards, tom lane

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 17:31:35
Message-ID:	1227893495.20796.218.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, 2008-11-28 at 11:44 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On Fri, 2008-11-28 at 11:14 -0500, Tom Lane wrote:
> >> The sinval queue is an *utterly* inappropriate
> >> mechanism for such a thing.
>
> > To be honest, it did seem quite a neat solution. Any particular
> > direction of thought you'd like me to pursue instead?
>
> I hadn't been following the discussion closely enough to know what the
> problem is.

When we replay an AccessExclusiveLock on the standby we need to kick off
any current lock holders, after a configurable grace period. Current
lock holders may include some read-only backends that are
idle-in-transaction. SIGINT, which is what the current patch uses, is
not sufficient to dislodge the idle backends.

So we need to send a signal to the idle backends and then have them
react. We could use a multi-meaning approach for SIGUSR1 as we do for
pmsignal, or ...

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 17:45:37
Message-ID:	27671.1227894337@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Fri, 2008-11-28 at 11:44 -0500, Tom Lane wrote:
>> I hadn't been following the discussion closely enough to know what the
>> problem is.

> When we replay an AccessExclusiveLock on the standby we need to kick off
> any current lock holders, after a configurable grace period. Current
> lock holders may include some read-only backends that are
> idle-in-transaction. SIGINT, which is what the current patch uses, is
> not sufficient to dislodge the idle backends.

Hm. People have complained of that fact from time to time in normal
usage as well. Should we simply change SIGINT handling to allow it to
cancel an idle transaction?

regards, tom lane

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 18:00:39
Message-ID:	1227895239.20796.232.camel@hp_dx2400_1
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, 2008-11-28 at 12:45 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On Fri, 2008-11-28 at 11:44 -0500, Tom Lane wrote:
> >> I hadn't been following the discussion closely enough to know what the
> >> problem is.
>
> > When we replay an AccessExclusiveLock on the standby we need to kick off
> > any current lock holders, after a configurable grace period. Current
> > lock holders may include some read-only backends that are
> > idle-in-transaction. SIGINT, which is what the current patch uses, is
> > not sufficient to dislodge the idle backends.
>
> Hm. People have complained of that fact from time to time in normal
> usage as well. Should we simply change SIGINT handling to allow it to
> cancel an idle transaction?

Yes, that is by far the best solution. ISTM many people will be happy.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	"Guillaume Smet" <guillaume(dot)smet(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 18:03:50
Message-ID:	1d4e0c10811281003r7c74a9b3l975eef0e974901c8@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Nov 28, 2008 at 6:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Hm. People have complained of that fact from time to time in normal
> usage as well. Should we simply change SIGINT handling to allow it to
> cancel an idle transaction?

If this means that we would be able to able to easily rollback the
current transaction of a backend stuck in "idle in transaction"
status, a big +1.

--
Guillaume

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Guillaume Smet <guillaume(dot)smet(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-11-28 18:34:52
Message-ID:	493039CC.7090507@dunslane.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Guillaume Smet wrote:
> On Fri, Nov 28, 2008 at 6:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Hm. People have complained of that fact from time to time in normal
>> usage as well. Should we simply change SIGINT handling to allow it to
>> cancel an idle transaction?
>>
>
> If this means that we would be able to able to easily rollback the
> current transaction of a backend stuck in "idle in transaction"
> status, a big +1.
>
>

Yes, I have had clients caught by this regularly (last time less than a
week ago), it would be a significant enhancement.

cheers

andrew

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: Hot standby
Date:	2008-12-16 18:03:38
Message-ID:	1229450618.8673.577.camel@ebony.2ndQuadrant
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I'm looking at allowing SIGINT cancel an idle transaction.

Just edit StatementCancelHandler() in postgres.c, so that it doesn't
ignore a signal when DoingCommandRead at line 2577.

This patch does actually do what I wanted, but it has some unintended
consequences as well. These mask the fact that it does actually work,
which is confusing and has taken me a while to understand.

The backend accepts the signal and throws an error which then cancels
the transaction. The ERROR appears in the log immediately. However, a
psql client does not respond in any way when this occurs and only when a
new request is sent do we then generate the ERROR message on the client.
pg_stat_activity continues to show "<IDLE> in transaction", even after
global xmin is higher than the xid of the cancelled backend.

Then afterwards the client gets out of sync with the server and starts
putting replies on the wrong messages. Wow.

I'm not sure why recv() doesn't return with EINTR, but I guess I'm about
to find out. Hopefully?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

Attachment	Content-Type	Size
idle_sigint.patch	text/x-patch	1.3 KB