Re: [PATCH] Largeobject access controls

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Largeobject access controls
Date: 2009-10-15 00:54:06
Message-ID: 4AD672AE.4080208@ak.jp.nec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
>> The newer basis is same as other database objects, such as functions.
>> But why pg_dump performs correctly for these database objects?
>
> The reason pg_dump works reasonably well is that it takes locks on
> tables, and the other objects it dumps (such as functions) have
> indivisible one-row representations in the catalogs. They're either
> there when pg_dump looks, or they're not. What you would have here
> is that pg_dump would see LO data chunks when it looks into
> pg_largeobject using its transaction snapshot, and then its attempts to
> open those LO OIDs would fail because the metadata is not there anymore
> according to SnapshotNow.

It needs to break down the matter more simple.

This patch adds the pg_largeobject_metadata catalog, and a certain entry
within the catalog is refered by multiple entries within pg_largeobject.
At the viewpoint of data model, it is equivalent that any pg_largeobject
tuples which share same LOID always have common copy of the metadata.
(If we tries to implement this actually, it needs to keep consistency of
the part of metadata for each writer operations!)

In the current design, when we access a certain large object with read-only
mode, query's snapshot is used, not always SnapshotNow.
If we assume any data chunk has its metadata part, the metadata should be
also refered from the query's snapshot. In fact, this patch stores the
part of metadata within pg_largeobject_metadata. But it seems to me the
principle is that same snapshot which used for data chunks should be
applied to scan the metadata.

So, I can agree the following approach:

> So I'm kind of inclined to say that the least evil solution is to apply
> the permissions check using the query-snapshot-time metadata row.
> It's definitely a debatable question though. We'd also want to make sure
> that the aclcheck code doesn't fail if it finds a nonexistent role ID
> in the ACL.

In this approach, we cannot apply the current pg_largeobject_aclcheck()
because it does not have an argument to deliver the preferable snapshot.
So, we need to extract acl field from the tuple being visible from the
appropriate snapshot, then calls aclmask() routine.

The aclmask() just compares identifiers in numeris representation,
so it seems to me this routine does not raise an error if it finds
a nonexistent role ID from the viewpoint of SnapshotNow.

aclmask(const Acl *acl, Oid roleid, Oid ownerId,
AclMode mask, AclMaskHow how)

It requires five arguments. The acl and ownerId can be fetched from the
metadata with query's snapshot. The roleid can be given by GetUserId().
The mask and how are constant.

An I missing something?

> We do not commit system changes that lack necessary pg_dump support.
> There are some things you can leave till later, but pg_dump is not
> negotiable. (Otherwise, testers would be left with databases they
> can't dump/reload across the next forced initdb.)

OK, I'll add support pg_dump soon.

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2009-10-15 00:54:46 Re: Review of "SQLDA support for ECPG"
Previous Message Robert Haas 2009-10-15 00:46:39 Re: Buffer usage in EXPLAIN and pg_stat_statements (review)