Re: [PATCH] [v8.5] Security checks on largeobjects

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Joshua Tolley <eggyknap(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bernd Helmle <mailings(at)oopsware(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] [v8.5] Security checks on largeobjects
Date: 2009-07-16 04:49:14
Message-ID: 4A5EB14A.1030302@ak.jp.nec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Joshua,

I found your name as a reviewer at the commitfest.postgresql.org.

However, I don't think the initial proposal of the largeobject
security is now on the state to be reviewed seriously.

In my preference, it should be reworked based on the TOASTing
approach as discussed in this thread previously, if we can
find a reasonable solution for the issue I raised before.

KaiGai Kohei wrote:
> I could find one more issue when we apply largeobject-style interfaces
> on generic toasted varlena data.
>
> When we fetch a toasted datum, it scans the pg_toast_%u with SnapshotToast,
> because it assumes any toasted chunks don't have multiple versions, and
> visibility of the toast pointer always means visibility of the toast chunks.
>
> However, if we provide largeobject-style interfaces which allow partial
> updates on toasted varlena, it seems to me this assumption will get being
> incorrect.
>
> Is there any good idea?

The largeobject interface uses SnapshotNow for writable accesses, and
GetActiveSnapshot() for read-only accesses, but toast_fetch_datum()
uses SnapshotToast to scan the toast relation.

It seems to me SnapshotToast depends on an assumption that tuples
within TOAST relation does not have any multiple versions.
When we update a toast value, TOAST mechanism inserts whole of
variable length datum with a new chunk_id, and older chunks are
removed at toast_delete_datum().
The TOAST pointer is updated to the new chunk_id, and its visibility
is under MVCC controls.

The source code comments at HeapTupleSatisfiesToast() says as follows:

/*
* HeapTupleSatisfiesToast
* True iff heap tuple is valid as a TOAST row.
*
* This is a simplified version that only checks for VACUUM moving conditions.
* It's appropriate for TOAST usage because TOAST really doesn't want to do
* its own time qual checks; if you can see the main table row that contains
* a TOAST reference, you should be able to see the TOASTed value. However,
* vacuuming a TOAST table is independent of the main table, and in case such
* a vacuum fails partway through, we'd better do this much checking.
*
* Among other things, this means you can't do UPDATEs of rows in a TOAST
* table.
*/

If largeobjects-like interface is available to update a part of TOAST
values, we cannot keep the assumption.

I would like to conclude the issue before reworking it.

Is there any good idea?
If we follow the current largeobject manner, what problems can be found?

Thanks,

> KaiGai Kohei wrote:
>> I concluded that the following issues should be solved when we apply
>> largeobject-like interfaces on the big toasted data within general
>> relations, not only pg_largeobject system catalog.
>>
>> At first, we need to add a new strategy to store the given varlena data
>> on the external toast relation.
>> If we try to seek and fetch a certain data chunk, it is necessary to be
>> computable what chunk stores the required data specified by offset and
>> length. So, the external chunks should be uncompressed always. It is a
>> common requirement for both of read and write operations.
>> If we try to update a part of the toasted data chunks, it should not be
>> inlined independent from length of the datum, because we need to update
>> whole the tuple which contains inlined toasted chunks in this case.
>> If we open the toasted varlena with read-only mode, inlined one does not
>> prevent anything. It is an issue for only write operation.
>>
>> I would like to add a new strategy on pg_type.typstorage with the following
>> characteristics:
>> 1. It always stores the given varlena data without any compression.
>> So, the given data is stored as a set of fixed-length chunks.
>> 2. It always stores the given varlena data on external toast relation.
>>
>> I suggest a new built-in type named BLOB which has an identical definition
>> to BYTEA type, expect for its attstorage.
>>
>> Next, a different version of lo_open() should be provided to accept
>> BLOB type as follows:
>>
>> SELECT pictname, lo_open(pictdata, x'20000'::int) FROM my_picture;
>>
>> It will allocate a largeobject descriptor for the given BLOB data,
>> and user can read and write using loread() and lowrite() interfaces.
>>
>> issue:
>> In this case, should it hold the relation handler and locks on the
>> "my_picture" relation, not only its toast relation?
>> issue:
>> Should the lo_open() with read-only mode be available on the existing
>> TEXT or BYTEA types? I could not find any reason to deny them.
>>
>> Next, pg_largeobject system catalog can be redefined using the BLOB
>> type as follows:
>>
>> CATALOG(pg_largeobject,2613)
>> {
>> Oid loowner; /* OID of the largeobject owner */
>> Oid lonsp; /* OID of the largeobject namespace */
>> aclitem loacl[1]; /* access permissions */
>> blob lodata; /* contents of the largeobject */
>> } FormData_pg_largeobject;
>>
>> The existing largeobject interfaces perform on pg_largeobject.lodata
>> specified by largeobject identifier.
>> Rest of metadata can be used for access control purpose.
>>
>> Thanks,
>>
>> KaiGai Kohei wrote:
>>> Tom Lane wrote:
>>>> Bernd Helmle <mailings(at)oopsware(dot)de> writes:
>>>>> It might be interesting to dig into your proposal deeper in conjunction
>>>>> with TOAST (you've already mentioned this TODO). Having serial access with
>>>>> a nice interface into TOAST would be eliminating the need for
>>>>> pg_largeobject completely (i'm not a big fan of this one-big-system-table
>>>>> approach the old LO interface currently is).
>>>> Yeah, it would be more useful probably to fix that than to add
>>>> decoration to the LO facility. Making LO more usable is just going to
>>>> encourage people to bump into its other limitations (32-bit OIDs,
>>>> 32-bit object size, finite maximum size of pg_largeobject, lack of
>>>> dead-object cleanup, etc etc).
>>> The reason why I tried to mention the named largeobject feature is
>>> that dac security checks on largeobject require them to belong to
>>> a certain schema, so I thought it is quite natural to have a string
>>> name. However, obviously, it is not a significant theme for me.
>>>
>>> I can also agree your opinion that largeobject interfaces should be
>>> redefined to access partial stuff of TOAST'ed verlena data structure,
>>> not only pg_largeobject.
>>>
>>> In this case, we will need a new pg_type.typstorage option which
>>> force to put the given verlena data on external relation without
>>> compression, because we cannot estimate the data offset in inlined
>>> or compressed external verlena data.
>>>
>>> I'll try to submit a design within a few days.
>>> Thanks,
>>
>
>

--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Brendan Jurd 2009-07-16 05:22:52 Re: WIP: generalized index constraints
Previous Message Bruce Momjian 2009-07-16 04:28:47 Re: [GENERAL] pg_migrator not setting values of sequences?