Re: logical changeset generation v4

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: logical changeset generation v4
Date: 2013-01-18 17:32:53
Message-ID: 20130118173253.GL29501@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-01-18 11:48:43 -0500, Robert Haas wrote:
> On Fri, Jan 18, 2013 at 11:33 AM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Andres Freund wrote:
> >
> >> [09] Adjust all *Satisfies routines to take a HeapTuple instead of a HeapTupleHeader
> >>
> >> For timetravel access to the catalog we need to be able to lookup (cmin,
> >> cmax) pairs of catalog rows when were 'inside' that TX. This patch just
> >> adapts the signature of the *Satisfies routines to expect a HeapTuple
> >> instead of a HeapTupleHeader. The amount of changes for that is fairly
> >> low as the HeapTupleSatisfiesVisibility macro already expected the
> >> former.
> >>
> >> It also makes sure the HeapTuple fields are setup in the few places that
> >> didn't already do so.
> >
> > I had a look at this part. Running the regression tests unveiled a case
> > where the tableOid wasn't being set (and thus caused an assertion to
> > fail), so I added that. I also noticed that the additions to
> > pruneheap.c are sometimes filling a tuple before it's strictly
> > necessary, leading to wasted work. Moved those too.
> >
> > Looks good to me as attached.
>
> I took a quick look at this and am just curious why we're adding the
> requirement that t_tableOid has to be initialized?

It's a stepping stone for catalog timetravel. I separated it into a different
patch because it makes the real patch easier to review without having
to deal with all those unrelated hunks.

The reason we need t_tableOid and a valid ItemPointer is that during
catalog timetravel (so we can decode the heap tuples in WAL) we need to
see tuples in the catalog that have been changed in the transaction we
travelled to. That means we need to look up cmin/cmax values, which aren't
stored separately anymore.

My first approach was to build support for logging allocated combocids
(only for catalog tables) and use the existing combocid infrastructure
to look them up.
Turns out that's not a correct solution. Consider this:
* T100: INSERT (xmin: 100, xmax: Invalid, (cmin|cmax): 3)
* T101: UPDATE (xmin: 100, xmax: 101, (cmin|cmax): 10)

If you now travel to T100 and you want to decide whether that tuple is
visible when in CommandId = 5 you have the problem that the original
cmin value has been overwritten by the cmax from T101. Note that in this
scenario no ComboCids have been generated!
The problematic part is that the information about what happened is
only available in T101.
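
To illustrate the scenario above, here is a toy sketch. It is not PostgreSQL
code: ToyTupleHeader and the toy_* functions are made-up names, and the only
thing borrowed from the real tuple header is that cmin and cmax share a single
field. It shows the naive "was this inserted before command N?" check giving
the wrong answer once T101 has stamped its cmax over T100's cmin.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t CommandId;

#define InvalidTransactionId ((TransactionId) 0)

/* Toy tuple header (hypothetical): cmin and cmax share one field. */
typedef struct ToyTupleHeader
{
    TransactionId xmin;
    TransactionId xmax;
    CommandId     cid;      /* cmin after insert, cmax after update/delete */
} ToyTupleHeader;

/* INSERT by transaction xid at command cid: the shared field holds cmin. */
static void
toy_insert(ToyTupleHeader *tup, TransactionId xid, CommandId cid)
{
    tup->xmin = xid;
    tup->xmax = InvalidTransactionId;
    tup->cid = cid;
}

/*
 * UPDATE/DELETE by a different transaction: the shared field now holds
 * cmax. No combocid is involved, so the old cmin is simply gone.
 */
static void
toy_update_by_other_xact(ToyTupleHeader *tup, TransactionId xid, CommandId cid)
{
    tup->xmax = xid;
    tup->cid = cid;
}

/*
 * Naive check while "inside" the inserting transaction: treat the shared
 * field as cmin and ask whether the insert happened before command curcid.
 */
static bool
toy_inserted_before(const ToyTupleHeader *tup, CommandId curcid)
{
    return tup->cid < curcid;
}
```

With toy_insert(&tup, 100, 3), toy_inserted_before(&tup, 5) is true. After
toy_update_by_other_xact(&tup, 101, 10) the same call returns false, although
nothing about T100's insert has changed.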

I resorted to doing something similar to what the heap rewrite code
uses to track update chains. Every time a catalog tuple is
inserted/updated/deleted, (filenode, ctid, cmin, cmax) is WAL-logged (if
wal_level=logical), and while traveling to a transaction all of those are
put into a hash table so they can be looked up if we need the
respective cmin/cmax values. As we do that for all modifications of
catalog tuples in that transaction, we only ever need that mapping when
inspecting that specific transaction.
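
A minimal sketch of such a mapping, to make the idea concrete. This is not the
patch's code: CidKey, CidMap and the cid_map_* functions are hypothetical
names, the fixed-size open-addressing table is only a stand-in for a real hash
table, and letting a later record for the same key overwrite the earlier one
is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t CommandId;
#define InvalidCommandId (~(CommandId) 0)

/* Hypothetical key: the physical tuple, i.e. relation file plus ctid. */
typedef struct CidKey
{
    uint32_t relfilenode;
    uint32_t block;         /* ctid block number */
    uint16_t offset;        /* ctid offset number */
} CidKey;

typedef struct CidEntry
{
    bool      used;
    CidKey    key;
    CommandId cmin;
    CommandId cmax;
} CidEntry;

#define CID_MAP_SLOTS 64    /* toy capacity, must be a power of two */

typedef struct CidMap
{
    CidEntry slots[CID_MAP_SLOTS];
} CidMap;

static bool
cid_key_equal(const CidKey *a, const CidKey *b)
{
    return a->relfilenode == b->relfilenode &&
           a->block == b->block &&
           a->offset == b->offset;
}

static uint32_t
cid_key_hash(const CidKey *k)
{
    uint32_t h = k->relfilenode * 2654435761u;

    h ^= k->block * 40503u;
    h ^= k->offset;
    return h & (CID_MAP_SLOTS - 1);
}

static void
cid_map_init(CidMap *map)
{
    memset(map, 0, sizeof(*map));
}

/* Store one logged record; returns false if the toy table is full. */
static bool
cid_map_put(CidMap *map, CidKey key, CommandId cmin, CommandId cmax)
{
    uint32_t start = cid_key_hash(&key);

    for (uint32_t i = 0; i < CID_MAP_SLOTS; i++)
    {
        CidEntry *e = &map->slots[(start + i) & (CID_MAP_SLOTS - 1)];

        if (!e->used || cid_key_equal(&e->key, &key))
        {
            e->used = true;
            e->key = key;
            e->cmin = cmin;
            e->cmax = cmax;
            return true;
        }
    }
    return false;
}

/* Look up cmin/cmax for a tuple; returns false if nothing was logged. */
static bool
cid_map_get(const CidMap *map, CidKey key, CommandId *cmin, CommandId *cmax)
{
    uint32_t start = cid_key_hash(&key);

    for (uint32_t i = 0; i < CID_MAP_SLOTS; i++)
    {
        const CidEntry *e = &map->slots[(start + i) & (CID_MAP_SLOTS - 1)];

        if (!e->used)
            return false;   /* linear probing hit an empty slot */
        if (cid_key_equal(&e->key, &key))
        {
            *cmin = e->cmin;
            *cmax = e->cmax;
            return true;
        }
    }
    return false;
}
```

For the T100 example, the map built while traveling to T100 would hold
cmin = 3 for that tuple, regardless of what T101 later wrote into the tuple
header.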

Seems to work very nicely. I have run quite a few tests with it and I
know of no failure cases.

To be able to make that lookup we need to get the relfilenode & item
pointer of the tuple we're looking up. That's why I changed the
signature to pass a HeapTuple instead of a HeapTupleHeader. We get the
relfilenode from the buffer that has been passed, *not* from the passed
table oid.
So a valid table oid isn't strictly required as long as the
item pointer is valid, but it has made debugging noticeably easier.

Does that make sense?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
