Re: changeset generation v5-01 - Patches & git tree

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: changeset generation v5-01 - Patches & git tree
Date: 2013-06-28 14:49:26
Message-ID: 8422.1372430966@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2013-06-28 08:41:46 -0400, Robert Haas wrote:
>> The alternative I previously proposed was to make the WAL records
>> carry the relation OID. There are a few problems with that: one is
>> that it's a waste of space when logical replication is turned off, and
>> it might not be easy to only do it when logical replication is on.
>> Also, even when logical replication is turned on, things that make WAL
>> bigger aren't wonderful. On the other hand, it does avoid the
>> overhead of another index on pg_class.

> I personally favor making catalog modifications a bit more
> expensive instead of increasing the WAL volume during routine
> operations.

This argument is nonsense, since it conveniently ignores the added WAL
entries created as a result of additional pg_class index manipulations.

Robert's idea sounds fairly reasonable to me; another 4 bytes per
insert/update/delete WAL entry isn't that big a deal, and it would
probably ease many debugging tasks as well as what you want to do.
So I'd vote for including the rel OID all the time, not conditionally.

The real performance argument against the patch as you have it is that
it saddles every PG installation with extra overhead for pg_class
updates whether or not that installation ever has or ever will make use
of changeset generation --- unlike including rel OIDs in WAL entries,
which might be merely difficult to handle conditionally, it's flat-out
impossible to turn such an index on or off. Moreover, even if one is
using changeset generation, the overhead is being imposed at the wrong
place, i.e., the master, not the slave doing changeset extraction.

But that's not the only problem, nor even the worst one IMO. I said
before that a syscache with a non-unique key is broken by design, and
I stand by that estimate. Even assuming that this usage doesn't create
bugs in the code as it stands, it might well foreclose future changes or
optimizations that we'd like to make in the catcache code.

If you don't want to change WAL contents, what I think you should do
is create a new cache mechanism (perhaps by extending the relmapper)
that caches relfilenode to OID lookups and acts entirely inside the
changeset-generating slave. Hacking up the catcache instead of doing
that is an expedient kluge that will come back to bite us.

regards, tom lane
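
As a rough illustration of the cache proposed in the last paragraph (not
code from this thread or from PostgreSQL), the sketch below shows a
lookup-through cache from relfilenode to relation OID that lives entirely
on the decoding side. Everything in it is hypothetical: the class name
RelfilenodeMap, the `lookup` callback standing in for a catalog scan, the
choice of (tablespace OID, relfilenode) as the key, and the invalidation
policy are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple

# Assumed key: (tablespace OID, relfilenode) within a single database.
FileNodeKey = Tuple[int, int]


class RelfilenodeMap:
    """Hypothetical cache from relfilenode to relation OID."""

    def __init__(self, lookup: Callable[[FileNodeKey], Optional[int]]) -> None:
        # `lookup` stands in for a slow catalog scan done by the decoder.
        self._lookup = lookup
        self._cache: Dict[FileNodeKey, Optional[int]] = {}

    def get_oid(self, tablespace: int, relfilenode: int) -> Optional[int]:
        key = (tablespace, relfilenode)
        if key not in self._cache:
            # Misses (None) are cached too, so unknown filenodes
            # do not trigger a scan on every WAL record.
            self._cache[key] = self._lookup(key)
        return self._cache[key]

    def invalidate(self, oid: Optional[int] = None) -> None:
        """Drop entries for one relation OID, or everything if oid is None."""
        if oid is None:
            self._cache.clear()
            return
        # Negative entries are dropped as well: a filenode that was unknown
        # before may now belong to the relation that changed.
        stale = [k for k, v in self._cache.items() if v == oid or v is None]
        for k in stale:
            del self._cache[k]
```

The point of the sketch is only that such a map needs no unique-key
guarantees from the catcache and imposes no cost on the master; a real
implementation would have to hook into the existing invalidation
machinery rather than rely on an explicit invalidate() call.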
