Re: [PATCH] Largeobject access controls

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Largeobject access controls
Date: 2009-10-15 00:43:21
Message-ID: 603c8f070910141743o7c856c3bqa0eb80f226d7a304@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 14, 2009 at 12:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com> writes:
>> Tom Lane wrote:
>>> The most serious problem is that you ripped out myLargeObjectExists(),
>>> apparently because you didn't understand what it's there for.  The reason
>>> it's there is to ensure that pg_dump runs don't fail.  pg_dump is expected
>>> to produce a consistent dump of all large objects that existed at the
>>> time of its transaction snapshot.  With this code, pg_dump would get a
>>> "large object doesn't exist" error on any LO that is deleted between the
>>> time of the snapshot and the time pg_dump actually tries to dump it ---
>>> which could be quite a large window in a large database.
>
>> The origin of this matter is the basis of existence was changed.
>> Our current basis is whether pg_largeobject has one or more data chunk for
>> the given loid in the correct snapshot, or not.
>> The newer one is whether pg_largeobject_metadata has the entry for the given
>> loid in the SnapshotNow, or not.
>
> Which is wrong.  You can certainly switch your attention as to which
> catalog to look in, but you can't change the snapshot that the test is
> referenced to.
>
>> The newer basis is same as other database objects, such as functions.
>> But why pg_dump performs correctly for these database objects?
>
> The reason pg_dump works reasonably well is that it takes locks on
> tables, and the other objects it dumps (such as functions) have
> indivisible one-row representations in the catalogs.  They're either
> there when pg_dump looks, or they're not.  What you would have here
> is that pg_dump would see LO data chunks when it looks into
> pg_largeobject using its transaction snapshot, and then its attempts to
> open those LO OIDs would fail because the metadata is not there anymore
> according to SnapshotNow.
>
> There are certainly plenty of corner-case issues in pg_dump; I've
> complained before that it's generally a bad idea to be migrating pg_dump
> functionality into internal backend routines that tend to rely on
> SnapshotNow.  But if we change LO dumping like this, it's not going to
> be a corner case --- it's going to be a common race-condition failure
> with a window measured in many minutes if not hours.  That's more than
> sufficient reason to reject this patch, because the added functionality
> isn't worth it.  And pg_dump isn't even the only client that depends
> on the assumption that a read-only open LO is static.
>
>>> Moving on to lesser but still significant problems, you probably already
>>> guessed my next one: there's no pg_dump support.
>
>> Hmm. I planed to add support to the pg_dump next to the serve-side enhancement.
>
> We do not commit system changes that lack necessary pg_dump support.
> There are some things you can leave till later, but pg_dump is not
> negotiable.  (Otherwise, testers would be left with databases they
> can't dump/reload across the next forced initdb.)

As part of closing out this CommitFest, I am marking this patch as
Returned with Feedback. Because the amount of work that remains to be
done is so substantial, that might have been a good idea anyway, but
since the clock is expiring the point is moot.

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2009-10-15 00:46:39 Re: Buffer usage in EXPLAIN and pg_stat_statements (review)
Previous Message Alvaro Herrera 2009-10-14 23:36:58 Re: Inappropriate failure conditions in foreign_data regression test