Re: [v9.3] Row-Level Security

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Florian Pflug <fgp(at)phlo(dot)org>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [v9.3] Row-Level Security
Date: 2012-11-15 21:07:27
Message-ID: CADyhKSUBRFuXpgy=NjMgX2H57aRh4pQ7cDhLcpn=ThCGCwaPgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The attached patch is a revised version of row-level security
feature.
According to Robert's suggestion, I reworked implementation
around ALTER command, and logic to disable RLS during
FK/PK constraint checks.

In addition, I moved the entrypoint to apply row-level security
policy on the query tree next to the expand_inherited_tables,
because it became clear my previous approach is not
a straight-forward way to support update / delete cases.

This patch performs to replace RangeTblEntry of tables with
RLS policy by sub-queries that simply references the original
table with configured RLS policy. Also, the sub-queries have
security_barrier flag to prevent non-leakproof functions being
pushed down from outside of the sub-query.

This sub-query has target-list that just references columns of
underlying table, and ordered according to column definition
of the original table. So, we don't need to adjust varattno of
Var-node that reference regular columns, even though the
RangeTblEntry was replaced.
On the other hand, system-column is problematic because
sub-query does not have these columns due to nature of them.
So, I inject a logic to adjust varattno of Var-node that references
system-column of the target tables being replaced.
It works fine as follows:

postgres=> ALTER TABLE t1 SET ROW LEVEL SECURITY (a % 2 = 0);
ALTER TABLE
postgres=> ALTER TABLE t2 SET ROW LEVEL SECURITY (a % 2 = 1);
ALTER TABLE
postgres=> EXPLAIN (costs off) SELECT tableoid, * FROM t1 WHERE b like '%';
QUERY PLAN
-------------------------------------------
Result
-> Append
-> Subquery Scan on t1
Filter: (t1.b ~~ '%'::text)
-> Seq Scan on t1 t1_1
Filter: ((a % 2) = 0)
-> Subquery Scan on t2
Filter: (t2.b ~~ '%'::text)
-> Seq Scan on t2 t2_1
Filter: ((a % 2) = 1)
-> Seq Scan on t3
Filter: (b ~~ '%'::text)
(12 rows)

postgres=> SELECT tableoid, * FROM t1 WHERE b like '%';
tableoid | a | b
----------+----+-----
16385 | 2 | bbb
16385 | 4 | ddd
16385 | 6 | fff
16391 | 11 | sss
16391 | 13 | uuu
16391 | 15 | yyy
16397 | 21 | xyz
16397 | 22 | yzx
16397 | 23 | zxy
(9 rows)

Also, UPDATE / DELETE statement

postgres=> EXPLAIN (costs off) UPDATE t1 SET b = b || '_updt' WHERE b like '%';
QUERY PLAN
-------------------------------------
Update on t1
-> Subquery Scan on t1
Filter: (t1.b ~~ '%'::text)
-> Seq Scan on t1 t1_1
Filter: ((a % 2) = 0)
-> Subquery Scan on t2
Filter: (t2.b ~~ '%'::text)
-> Seq Scan on t2 t2_1
Filter: ((a % 2) = 1)
-> Seq Scan on t3
Filter: (b ~~ '%'::text)
(11 rows)

postgres=> UPDATE t1 SET b = b || '_updt' WHERE b like '%';
UPDATE 9

However, UPDATE / DELETE support is not perfect right now.
In case when we try to update / delete a table with inherited
children and RETURNING clause was added, is loses right
references to the pseudo columns, even though it works fine
without inherited children.

postgres=> UPDATE only t1 SET b = b || '_updt' WHERE b like '%' RETURNING *;
a | b
---+----------
2 | bbb_updt
4 | ddd_updt
6 | fff_updt
(3 rows)

UPDATE 3
postgres=> UPDATE t1 SET b = b || '_updt' WHERE b like '%' RETURNING *;
ERROR: variable not found in subplan target lists

I'm still under investigation of this behavior. Any comments
will be helpful to solve this problem.

Thanks,

2012/10/22 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> 2012/10/22 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Thu, Oct 18, 2012 at 2:19 PM, Alvaro Herrera
>> <alvherre(at)2ndquadrant(dot)com> wrote:
>>> Kohei KaiGai escribió:
>>>> The revised patch fixes the problem that Daen pointed out.
>>>
>>> Robert, would you be able to give this latest version of the patch a
>>> look?
>>
>> Yeah, sorry I've been completely sidelined this CommitFest. It's been
>> a crazy couple of months. Prognosis for future craziness reduction
>> uncertain. Comments:
>>
>> The documentation lists several documented limitations that I would
>> like to analyze a little bit. First, it says that row-level security
>> policies are not applied on UPDATE or DELETE. That sounds downright
>> dangerous to me. Is there some really compelling reason we're not
>> doing it?
>
> It intends to simplify the patch to avoid doing everything within a single
> patch. I'll submit the patch supporting UPDATE and DELETE for CF-Nov
> in addition to the base one.
>
>> Second, it says that row-level security policies are not
>> currently applied on INSERT, so you should use a trigger, but implies
>> that this will change in the future. I don't think we should change
>> that in the future; I think relying on triggers for that case is just
>> fine. Note that it could be an issue with the post-image for UPDATES,
>> as well, and I think the trigger solution is similarly adequate to
>> cover that case.
>
> Hmm. I should not have written this in section of the current limitation.
> It may give impression the behavior will be changed future.
> OK, I'll try to revise the documentation stuff.
>
>> With respect to the documented limitation regarding
>> DECLARE/FETCH, what exactly will happen? Can we describe this a bit
>> more clearly rather than just saying the behavior will be
>> unpredictable?
>>
> In case when user-id was switched after declaration of a cursor that
> contains qualifier depending on current_user, its results set contains
> rows with old user-id and rows with new user-id.
>
> Here is one other option rather than documentation fix.
> As we had a discussion on the upthread, it can be solved if we restore
> the user-id associated with the portal to be run, however, a problem is
> some commands switches user-id inside of the portal.
> http://archives.postgresql.org/pgsql-hackers/2012-07/msg00055.php
>
> Is there some good idea to avoid the problem?
>
>> It looks suspiciously as if the row-level security mode needs to be
>> saved and restored in all the same places we call save and restore the
>> user ID and security context. Is there some reason the
>> row-level-security-enabled flag shouldn't just become another bit in
>> the security context? Then we'd get all of this save/restore logic
>> mostly for free.
>>
> It seems to me a good idea, but I didn't find out this.
>
>> ATExecSetRowLevelSecurity() calls SetRowLevelSecurity() or
>> ResetRowLevelSecurity() to update pg_rowlevelsec, but does the
>> pg_class update itself. I think that all of this logic should be
>> moved into a single function, or at least functions in the same file,
>> with the one that only updates pg_rowlevelsec being static and
>> therefore not able to be called from outside the file. We always need
>> the pg_class update and the pg_rowlevelsec update to happen together,
>> so it's not good to have an exposed function that does one of those
>> updates but not the other. I think the simplest thing is just to move
>> ATExecSetRowLevelSecurity to pg_rowlevelsec.c and rename it to
>> SetRowLevelSecurity() and then give it two static helper functions,
>> say InsertPolicyRow() and DeletePolicyRow().
>>
> OK, I'll rework the code.
>
>> I think it would be good if Tom could review the query-rewriting parts
>> of this (viz rowlevelsec.c) as I am not terribly familiar with this
>> machinery, and of course anything we get wrong here will have security
>> consequences. At first blush, I'm somewhat concerned about the fact
>> that we're trying to do this after query rewriting; that seems like it
>> could break things. I know KaiGai mentioned upthread that the
>> rewriter won't be rerun if the plan is invalidated, but (1) why is
>> that OK now? and (2) if it is OK now, then why is it OK to do
>> rewriting of the RLS qual - only - after rewriting if all of the rest
>> of the rewriting needs to happen earlier?
>>
> I just follow the existing behavior of plan invalidation; that does not
> re-run the query rewriter. So, if we have no particular reason why
> we should not run the rewriter again to handle RLS quals, it might
> be an option to handle RLS as a part of rewriter.
>
> At least, here is two problems. 1) System column is problematic
> when SELECT statement is replaced by sub-query. 2) It makes
> infinite recursion when a certain table has SELECT INSTEAD
> rule with a sub-query on the same table.
>
> Thanks,
> --
> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment Content-Type Size
pgsql-v9.3-row-level-security.rw.v6.patch application/octet-stream 134.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2012-11-15 21:09:06 Re: pg_ctl reload -o "...."
Previous Message Dimitri Fontaine 2012-11-15 20:57:43 Re: Further pg_upgrade analysis for many tables