Re: logical changeset generation v6.2

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical changeset generation v6.2
Date: 2013-10-29 15:43:26
Message-ID: 20131029154326.GD21284@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-10-29 11:28:44 -0400, Robert Haas wrote:
> On Tue, Oct 29, 2013 at 10:47 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2013-10-28 11:54:31 -0400, Robert Haas wrote:
> >> > There's one snag I can currently see, namely that we actually need to
> >> > prevent a formerly dropped relfilenode from being reused. Not
> >> > entirely sure what the best way to do that is.
> >>
> >> I'm not sure in detail, but it seems to me that this is all part of the
> >> same picture. If you're tracking changed relfilenodes, you'd better
> >> track dropped ones as well.
> >
> > What I am thinking about is the way GetNewRelFileNode() checks for
> > preexisting relfilenodes. It uses SnapshotDirty to scan for existing
> > relfilenodes for a newly created oid. Which means already dropped
> > relations could be reused.
> > I guess it could be as simple as using SatisfiesAny (or even better a
> > wrapper around SatisfiesVacuum that knows about recently dead tuples).
>
> I think modifying GetNewRelFileNode() is attacking the problem from
> the wrong end. The point is that when a table is dropped, that fact
> can be communicated to the same machinery that's been tracking
> the CTID->CTID mappings. Instead of saying "hey, the tuples that were
> in relfilenode 12345 are now in relfilenode 67890 in these new
> positions", it can say "hey, the tuples that were in relfilenode 12345
> are now GONE".

Unfortunately I don't understand what you're suggesting. What I am
worried about is something like:

<- decoding is here
VACUUM FULL pg_class; -- rewrites filenode 1 to 2
VACUUM FULL pg_class; -- rewrites filenode 2 to 3
VACUUM FULL pg_class; -- rewrites filenode 3 to 1
<- now decode up to here

In this case there are two possible (cmin,cmax) values for a specific
tuple: one from the original filenode 1 and one from the new filenode 1
produced by rewriting filenode 3.
That will only happen after an oid wraparound, which hopefully
shouldn't happen very often, but I'd like to not rely on that.
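
To make the ambiguity concrete, here is a minimal toy sketch in Python. It is not PostgreSQL code: the record layout, the function names and the cmin/cmax numbers are all made up for illustration. It only assumes that decoding looks up (cmin,cmax) by (relfilenode, ctid), and shows that once filenode 1 is reused the same key carries two different values.

```python
from collections import defaultdict

# Hypothetical records: (relfilenode, ctid, cmin, cmax).
# The cmin/cmax numbers are invented purely for illustration.
records = [
    (1, (0, 1), 10, 0),  # tuple as stored in the original filenode 1
    (2, (0, 1), 11, 0),  # after the first VACUUM FULL (1 -> 2)
    (3, (0, 1), 12, 0),  # after the second VACUUM FULL (2 -> 3)
    (1, (0, 1), 13, 0),  # after the third VACUUM FULL (3 -> 1, filenode reused)
]

def collect_mappings(recs):
    """Group the observed (cmin, cmax) pairs by their (relfilenode, ctid) key."""
    seen = defaultdict(set)
    for filenode, ctid, cmin, cmax in recs:
        seen[(filenode, ctid)].add((cmin, cmax))
    return seen

def ambiguous_keys(seen):
    """Return the keys that map to more than one (cmin, cmax) pair."""
    return sorted(key for key, values in seen.items() if len(values) > 1)

if __name__ == "__main__":
    print(ambiguous_keys(collect_mappings(records)))
```

Only the key for filenode 1 is reported as ambiguous; filenodes 2 and 3 each appear once, so their lookups stay unique.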

> >> Completely aside from this issue, what
> >> keeps a relation from being dropped before we've decoded all of the
> >> changes made to its data before the point at which it was dropped? (I
> >> hope the answer isn't "nothing".)
> >
> > Nothing. But there's no need to prevent it, it'll still be in the
> > catalog and we don't ever access a non-catalog relation's data during
> > decoding.
>
> Oh, right. But what about a drop of a user-catalog table?

Currently nothing prevents that. I am not sure it's worth worrying
about; do you think we should?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
