[WIP] Keeping track of snapshots

Lists: pgsql-patches
From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pg Patches <pgsql-patches(at)postgresql(dot)org>
Subject: [WIP] Keeping track of snapshots
Date: 2008-03-28 14:06:06
Message-ID: 20080328140606.GL7464@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

This is the patch I am currently playing with, to keep track of the
live snapshots within a transaction.

There are some obvious bugs left, and it open certain questions. The
most obvious bug is that it is not tracking SerializableSnapshot in a
serializable transaction.

The main question I currently have is how to deal with ActiveSnapshot.
Currently, on most places it is just being registered just like any
snapshot, and then unregistered. For the most part this works fine, but
I wonder if it's not just a bug waiting to happen -- since it is a
global variable and other places in the code feel free to assign to it,
it is possible that some code sets it, then calls some other code that
sets it to a different snapshot and then return -- on exit, the first
setter would try to unregister a snapshot that it didn't register.

So the question is, should we deal with ActiveSnapshot in some different
way? For example snapmgr could know explicitely about it; keeping a
stack perhaps, and have callers call a routine in snapmgr "I need this
snap to be ActiveSnapshot" (push), and later "I no longer need
ActiveSnapshot" (pop). On error, we can just forget about the part of
the stack that corresponds to the current subxact. This would allow us
to remove some PG_TRY blocks.

The other question is about CommitTransactionCommand. Currently my
EOXact routine barfs for every snapshot not unregistered on main
transaction commit -- a leak. I see this as a good thing, however it
forced me to be more meticulous about not having ActiveSnapshot be set
in commands that have multiple transactions like VACUUM, multitable
CLUSTER and CREATE INDEX CONCURRENTLY.

(Hmm, maybe this last point is also saying that we should be dealing
with ActiveSnapshot explicitely on snapmgr.)

And finally, the last question is about some callers like COPY or
EXPLAIN that get a snapshot (ActiveSnapshot or a different one), then
modify Snapshot->curcid. I'm not sure this is a problem, really, but
just to be careful I made them create a copy and register the copy.
The problem I have with it is that the snapshots are allocated only once
when they are registered -- if somebody registers the same snapshot in
the future, we will only increment the reference count and return the
pointer to the previously stored snapshot. So if COPY reuses a snapshot
that has been previously stored, it would mangle the copy and the other
user could get an invalid view of the database. I'm not sure this is
really a problem, because how could COPY get the same snapshot as
anybody else? Still, it gives me a bad feeling.

(Hmm ... it just occured to me that perhaps it would be better to
improve the interaction between GetSnapshotData and RegisterSnapshot, to
avoid some palloc traffic.)

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment Content-Type Size
snapshot-4.patch text/x-diff 42.8 KB

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: "Pg Patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [WIP] Keeping track of snapshots
Date: 2008-03-28 14:58:02
Message-ID: 877ifmzyit.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Alvaro Herrera" <alvherre(at)commandprompt(dot)com> writes:

> The other question is about CommitTransactionCommand. Currently my
> EOXact routine barfs for every snapshot not unregistered on main
> transaction commit -- a leak. I see this as a good thing, however it
> forced me to be more meticulous about not having ActiveSnapshot be set
> in commands that have multiple transactions like VACUUM, multitable
> CLUSTER and CREATE INDEX CONCURRENTLY.

I believe ActiveSnapshot has to be set during CREATE INDEX CONCURRENTLY if
it's an expression index which calls a function which needs a snapshot...

AFAICT VACUUM had better not ever need a snapshot because its xmin isn't
included in other vacuum commands' globalxmin so there's no guarantee that if
it had a snapshot that the tuples visible in that snapshot wouldn't disappear
out from under it.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Pg Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [WIP] Keeping track of snapshots
Date: 2008-03-28 15:03:49
Message-ID: 20080328150349.GN7464@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Gregory Stark wrote:
> "Alvaro Herrera" <alvherre(at)commandprompt(dot)com> writes:
>
> > The other question is about CommitTransactionCommand. Currently my
> > EOXact routine barfs for every snapshot not unregistered on main
> > transaction commit -- a leak. I see this as a good thing, however it
> > forced me to be more meticulous about not having ActiveSnapshot be set
> > in commands that have multiple transactions like VACUUM, multitable
> > CLUSTER and CREATE INDEX CONCURRENTLY.
>
> I believe ActiveSnapshot has to be set during CREATE INDEX CONCURRENTLY if
> it's an expression index which calls a function which needs a snapshot...

Yeah, it is set at that point -- it is unset just before
CommitTransactionCommand is called. Same for multitable CLUSTER -- it
certainly does need a snapshot, but it needs to unregister it just
before committing at the end of processing each table.

> AFAICT VACUUM had better not ever need a snapshot because its xmin isn't
> included in other vacuum commands' globalxmin so there's no guarantee that if
> it had a snapshot that the tuples visible in that snapshot wouldn't disappear
> out from under it.

Yeah, I neglected to mention that this is just for VACUUM FULL (which
does have its xmin included in snapshots).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.