Re: Snapshot synchronization, again...

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Snapshot synchronization, again...
Date: 2011-01-19 05:12:39
Message-ID: AANLkTinvmf9yaXQp7BijTkodChUQ5Nb3wvx5riD5BxBM@mail.gmail.com
Lists: pgsql-hackers

Noah, thank you for this excellent review. I have adopted most (almost
all) of your suggestions (except for the documentation, which is still
missing). Please check the updated & attached patch for details.

On Fri, Jan 14, 2011 at 10:13 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> "00000000000000000000" is a valid md5 message digest.  To avoid always accepting
> a snapshot with that digest, we would need to separately store a flag indicating
> whether a given syncSnapshotChksums member is currently valid.  Maybe that hole
> is acceptable, though.

In the end I decided to store the md5 checksums as printable characters in
shmem. We now need a few more bytes per checksum, but whenever a byte is NUL
we know that the entry does not hold a valid checksum.

>> +
>> +     if (!XactReadOnly)
>> +             elog(WARNING, "A snapshot exporting function should be readonly.");
>
> There are legitimate use cases for copying the snapshot of a read-write
> transaction.  Consider a task like calculating some summaries for insert into a
> table.  To speed this up, you might notionally partition the source data and
> have each of several workers calculate each partition.  I'd vote for removing
> this warning entirely.

Warning removed; mentioning this only in the documentation is fine with me.

> InvalidBackendId is -1.  Is InvalidOid (0) in fact also invalid as a backend ID?

Yes, there is a check in utils/init/postinit.c that makes sure that 0
is never a valid backendId. But I have now made this more explicit by
adding a dedicated parse routine for integers.
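
A rough sketch of what such a routine can look like (parseIntFromText and its
exact signature are illustrative, not necessarily what the patch uses):

    #include <stdlib.h>
    #include <string.h>

    /*
     * Read "<key><integer>" at *s, advance *s past the number and return the
     * value.  The caller rejects values it considers invalid, e.g. a
     * backendId of 0 or below.
     */
    static int
    parseIntFromText(const char **s, const char *key)
    {
        size_t  keylen = strlen(key);
        char   *endp;
        long    val;

        if (strncmp(*s, key, keylen) != 0)
            return -1;              /* key missing: let the caller error out */

        val = strtol(*s + keylen, &endp, 10);
        *s = endp;

        return (int) val;
    }

That way the importing side checks explicitly that the parsed backendId is
greater than zero instead of relying on InvalidOid happening to be unusable.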

>> +     while ((xid = parseXactFromText(&s, "xip:")) != InvalidTransactionId)
>> +     {
>> +             if (xid == GetTopTransactionIdIfAny())
>> +                     continue;
>> +             snapshot->xip[i++] = xid;
>
> This works on a virgin GetSnapshotData() snapshot, which always has enough space
> (sized according to max_connections).  A snapshot from CopySnapshot(), however,
> has just enough space for the original usage.  I get a SIGSEGV at this line with
> the attached test script.  If I change things around so xip starts non-NULL, I
> get messages like these as the code scribbles past the end of the allocation:

This has been fixed. xip/subxip are now allocated separately if necessary.
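
The importing code now has to guarantee enough room before filling the
arrays; the idea is roughly the following (the helper name and allocation
details are mine, the patch may do this differently):

    #include "postgres.h"
    #include "storage/procarray.h"      /* GetMaxSnapshotXidCount() and friends */
    #include "utils/snapshot.h"

    /*
     * Make sure a snapshot that came from CopySnapshot() has room for full
     * xip/subxip arrays before imported xids are written into it.
     */
    static void
    EnsureXipSpace(Snapshot snapshot)
    {
        if (snapshot->xip == NULL)
            snapshot->xip = (TransactionId *)
                palloc(GetMaxSnapshotXidCount() * sizeof(TransactionId));
        if (snapshot->subxip == NULL)
            snapshot->subxip = (TransactionId *)
                palloc(GetMaxSnapshotSubxidCount() * sizeof(TransactionId));
    }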

> Though unlikely, the other backend may not even exist by now.  Indeed, that
> could have happened and RecentGlobalXmin advanced since we compared the
> checksum.  Not sure what the right locking is here, but something is needed.

Good catch. What I have done now is to add a second check at the end of
ImportSnapshot(). At that point we have set up the new snapshot and
adjusted MyProc->xmin. If that check succeeds, we are fine; if not, we
throw an error and invalidate the whole transaction.
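
In rough terms the final step looks like this (RecheckExportedSnapshot and
exporterChksumSlot are made-up names for this sketch, and the locking needed
to read the exporter's slot safely is left out):

    #include "postgres.h"

    /*
     * Called at the very end of the import, after the snapshot has been
     * installed and MyProc->xmin has been set: re-read the exporter's
     * checksum slot in shmem and compare it against the imported checksum.
     * If the exporting transaction is gone, abort instead of silently using
     * a snapshot that is no longer protected.
     */
    static void
    RecheckExportedSnapshot(const char *exporterChksumSlot, const char *chksum)
    {
        if (strcmp(exporterChksumSlot, chksum) != 0)
            ereport(ERROR,
                    (errmsg("exported snapshot is no longer valid")));
    }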

>> +      * Instead we must check to not go forward (we might have opened a cursor
>> +      * in this transaction and still have its snapshot registered)
>> +      */
>
> I'm thinking this case should yield an ERROR.  Otherwise, our net result would
> be to silently adopt a snapshot that is neither our old self nor the target.
> Maybe there's a use case for that, but none comes to my mind.

This can happen when you do:

DECLARE foo CURSOR FOR SELECT * FROM bar;
import snapshot...
FETCH 1 FROM foo;

> I guess the use case for pg_import_snapshot from READ COMMITTED would be
> something like "DO $$BEGIN PERFORM pg_import_snapshot('...'); stuff; END$$;".
> First off, would that work ("stuff" use the imported snapshot)?  If it does
> work, we should take the precedent of SET LOCAL and issue no warning.  If it
> doesn't work, then we should document that the snapshot does take effect until
> the next command and probably keep this warning or something similar.

No, this will also give you a new snapshot for every query, no matter
whether it is executed inside or outside of a DO block.

>> + Datum
>> + pg_import_snapshot(PG_FUNCTION_ARGS)
>> + {
>> +     bytea  *snapshotData = PG_GETARG_BYTEA_P(0);
>> +     bool    ret = true;
>> +
>> +     if (ActiveSnapshotSet())
>> +             ret = ImportSnapshot(snapshotData, GetActiveSnapshot());
>> +
>> +     ret &= ImportSnapshot(snapshotData, GetTransactionSnapshot());
>
> Is it valid to scribble directly on snapshots like this?

I figured that previously executed code still holds references to the
existing snapshots, so we cannot simply push a new snapshot on top; we
really need to modify the memory those references point to.

I am also attaching a test script that shows the difference between READ
COMMITTED and SERIALIZABLE in an importing transaction inside a DO block.
It is based on the script you sent.

So thanks again and I'm looking forward to your next review... :-)

Joachim

Attachment Content-Type Size
syncsnapshots.2.diff text/x-patch 16.8 KB
test3.pl text/x-perl 1.5 KB
