Re: changeset generation v5-01 - Patches & git tree

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: changeset generation v5-01 - Patches & git tree
Date: 2013-06-28 07:32:06
Message-ID: 20130628073206.GA11757@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-06-27 18:18:50 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > I'm looking at the combined patches 0003-0005, which are essentially all
> > about adding a function to obtain relation OID from (tablespace,
> > filenode). It takes care to look through the relation mapper, and uses
> > a new syscache underneath for performance.
>
> > One question about this patch, originally, was about the usage of
> > that relfilenode syscache. It is questionable because it would be the
> > only syscache to apply on top of a non-unique index.
>
> ... which, I assume, is on top of a pg_class index that doesn't exist
> today. Exactly what is the argument that says performance of this
> function is sufficiently critical to justify adding both the maintenance
> overhead of a new pg_class index, *and* a broken-by-design syscache?

Ok, so this requires some context. When we do changeset extraction we
build, for every heap WAL record, an MVCC snapshot that is consistent
with one taken at the time the record was inserted. Once we've built
that snapshot, we can use it to turn heap WAL records into the
representation the user wants:

For that we first need to know which table a change comes from, since
otherwise we obviously cannot interpret the HeapTuple that's essentially
contained in the WAL record. Since we have a correct MVCC snapshot we
can query pg_class for (tablespace, relfilenode) to get back the
relation. Once we know the relation, we can build a TupleDesc, and the
user (i.e. the output plugin) can use normal backend code to transform
the HeapTuple into the target representation, e.g. SQL. Since the
syscaches are synchronized with the built snapshot, normal output
functions can be used.
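
To make the flow above concrete, here is a minimal sketch in Python
(not backend code). Every name in it (Relation, HeapChange,
decode_insert, the catalog dict) is hypothetical and only stands in for
pg_class, the TupleDesc and an output plugin; the OIDs are illustrative
values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    oid: int
    name: str
    columns: tuple  # column names, standing in for a TupleDesc


@dataclass(frozen=True)
class HeapChange:
    tablespace: int
    relfilenode: int
    values: tuple  # already-decoded column values, standing in for a HeapTuple


def decode_insert(change, catalog):
    """Render one insert as SQL text, using the catalog as seen by the snapshot."""
    # Step 1: (tablespace, relfilenode) -> relation, the lookup discussed above.
    rel = catalog.get((change.tablespace, change.relfilenode))
    if rel is None:
        raise LookupError("no relation for this (tablespace, relfilenode)")
    # Step 2: with the relation known, its column list lets us interpret the tuple.
    cols = ", ".join(rel.columns)
    # repr() is a stand-in for real output functions, not proper SQL quoting.
    vals = ", ".join(repr(v) for v in change.values)
    return f"INSERT INTO {rel.name} ({cols}) VALUES ({vals});"


# Hypothetical catalog contents, keyed by (tablespace, relfilenode).
catalog = {(1663, 16385): Relation(16384, "accounts", ("id", "owner"))}
print(decode_insert(HeapChange(1663, 16385, (1, "alice")), catalog))
```

Without step 1 the tuple is just bytes, which is why the lookup sits on
the path of every decoded heap record.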

What that means is that for every heap record in the WAL that belongs
to the target database we need to query pg_class to turn the
relfilenode into a pg_class.oid. So, we can easily replace syscache.c
with some custom caching code, but I don't think it's realistic to get
rid of that index. Otherwise we would need to cache the entire pg_class
in memory, which doesn't sound enticing.
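
As a rough illustration of what "custom caching code" in front of the
index could look like, here is a small bounded cache sketched in Python.
It is an assumption for illustration, not the patch: the class name,
the index_lookup callable, the LRU eviction policy and the
invalidate() hook are all hypothetical.

```python
from collections import OrderedDict


class RelfilenodeCache:
    """Bounded (tablespace, relfilenode) -> pg_class.oid cache over an index lookup."""

    def __init__(self, index_lookup, capacity=1024):
        self._lookup = index_lookup      # hypothetical: does the indexed pg_class scan
        self._capacity = capacity
        self._entries = OrderedDict()    # least recently used entry first
        self.misses = 0

    def get_oid(self, tablespace, relfilenode):
        key = (tablespace, relfilenode)
        if key in self._entries:
            self._entries.move_to_end(key)   # mark as most recently used
            return self._entries[key]
        # Miss: fall back to the index, then remember the answer.
        self.misses += 1
        oid = self._lookup(tablespace, relfilenode)
        self._entries[key] = oid
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return oid

    def invalidate(self):
        """Drop everything, e.g. when the catalog contents may have changed."""
        self._entries.clear()


# Hypothetical stand-in for the pg_class index.
pg_class_index = {(1663, 16385): 16384, (1663, 16390): 16388}
cache = RelfilenodeCache(lambda ts, rfn: pg_class_index[(ts, rfn)], capacity=1)
cache.get_oid(1663, 16385)
cache.get_oid(1663, 16385)
print(cache.misses)  # only the first call reaches the index
```

The point of the sketch is that memory stays bounded only because a
miss can fall back to the index; without the index every miss would
need a full scan, or the whole of pg_class would have to be kept
resident.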

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
