Re: [PATCH 03/16] Add a new syscache to fetch a pg_class entry via its relfilenode

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH 03/16] Add a new syscache to fetch a pg_class entry via its relfilenode
Date: 2012-06-14 18:50:51
Message-ID: CA+Tgmobs1DcPAY=-szCeRiGwe-hn9ADBjZXxy_S48jDLLD0v9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> This patch is problematic because formally indexes used by syscaches needs to
> be unique, this one is not though because of 0/InvalidOids entries for
> nailed/shared catalog entries. Those values aren't allowed to be queried though.

That's not the only reason it's not unique. Take a look at
GetRelFileNode(). We really only guarantee that <database OID,
tablespace OID, relfilenode, backend-ID>, taken as a four-tuple, is
unique. You could have the same relfilenode in different tablespaces,
or even within the same tablespace with different backend-IDs. The
latter might not matter for you because you're presumably disregarding
temp tables, but the former probably does. It's an uncommon scenario
because we normally set relid = relfilenode, and of course relid is
unique across the database, but the table gets rewritten then you end
up with relid != relfilenode, and I don't think there's anything at
that point that will prevent the new relfilenode from being chosen as
some other relations relfilenode, as long as it's in a different
tablespace.

I think the solution may be to create a specialized cache for this,
rather than relying on the general syscache infrastructure. You might
look at, e.g., attoptcache.c for an example. That would allow you to
build a cache that is aware of things like the relmapper
infrastructure, and the fact that temp tables are ignorable for your
purposes. But I think you will need to include at least the
tablespace OID in the key along with the relfilenode to make it
bullet-proof.

I haven't read through the patch series far enough to know what this
is being used for yet, but my fear is that you're using it to handle
mapping a relflenode extracted from the WAL stream back to a relation
OID. The problem with that is that relfilenode assignments obey
transaction semantics. So, if someone begins a transaction, truncates
a table, inserts a tuple, and commits, the heap_insert record is going
to refer to a relfilenode that, according to the system catalogs,
doesn't exist. This is similar to one of the worries in my other
email, so I won't belabor the point too much more here...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-06-14 19:10:02 Re: Allow WAL information to recover corrupted pg_controldata
Previous Message Tom Lane 2012-06-14 18:43:04 Re: libpq compression