Re: RLS Design

Lists: pgsql-hackers
From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Smith <greg(at)2ndQuadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-03-05 05:55:11
Message-ID: 5316BC3F.4040908@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all

One of the remaining issues with row security is how to pass plan
invalidation information generated in the rewriter back into the planner.

With row security, it's necessary to set a field in PlannerGlobal,
tracking the user ID of the user the query was planned for if row
security was applied. It is also necessary to add a PlanInvalItem for
the user ID.

Currently the rewriter has no way to pass this information to the
planner. QueryRewrite returns just a Query*.

We use Query structs throughout the rewriter and planner; it doesn't
make sense to add a List* field for PlanInvalItem nodes and an Oid field
for the user ID to the Query node when it's only ever going to get used
for the top level Query node returned by the rewriter, and only for long
enough to copy the data into PlannerGlobal.

The alternative seems to be changing the return type of QueryRewrite,
introducing a new node type, say:

struct RewriteResult {
Query *productQuery;
Oid planUserId;
List* planInvalItems;
}

This seems cleaner, and more extensible, but it means changing a fair
bit of API, including:

pg_plan_query
planner
standard_planner
planner_hook_type
QueryRewrite

and probably the plan cache infrastructure too. So it'd be fairly
invasive, and I know that creates concerns about backpatching and
extensions.

I can't just polymorphically subclass Query as some kind of "TopQuery" -
no true polymorphism in C, would need a new NodeType for it, and then
need to teach everything that knows about T_Query about T_TopQuery too.
So that won't work.

So, I'm looking for advice before I embark on this change. I need _some_
way to pass invalidation information from the rewriter into the planner
when it's collected by row security code during rewriting.

Any advice/comments?

I'm inclined to bite the bullet and make the API change. It'll be a
pain, but I can see future uses for passing global info out of the
rewriter rather than shoving it into per-Query structures. I'd define a
RewriteResult and pass that down into all the rewriter internal
functions, then return the outer query wrapped in it.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Smith <greg(at)2ndQuadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-03-05 17:43:44
Message-ID: 20140305174344.GM4759@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig Ringer escribió:

> One of the remaining issues with row security is how to pass plan
> invalidation information generated in the rewriter back into the planner.

I think I already asked this, but would it work to extract this info by
walking the rewritten list of queries instead; and in case it would,
would that be any easier than the API change you're proposing?

> We use Query structs throughout the rewriter and planner; it doesn't
> make sense to add a List* field for PlanInvalItem nodes and an Oid field
> for the user ID to the Query node when it's only ever going to get used
> for the top level Query node returned by the rewriter, and only for long
> enough to copy the data into PlannerGlobal.

So there is an assumption that you can't have a subquery that uses a
different role ID than the main query. That sounds fine, and anyway I
don't think we're prepared to deal with differing userids for
subqueries, so the proposal that it belongs only on the top-level node
is acceptable. And from there, it seems that not putting the info in
Query (which would be a waste everywhere else than the toplevel query
node) is sensible.

> The alternative seems to be changing the return type of QueryRewrite,
> introducing a new node type, say:
>
> struct RewriteResult {
> Query *productQuery;
> Oid planUserId;
> List* planInvalItems;
> }
>
> This seems cleaner, and more extensible, but it means changing a fair
> bit of API, including:
>
> pg_plan_query
> planner
> standard_planner
> planner_hook_type
> QueryRewrite

I think we should just bite the bullet and do the change (a new struct,
I assume, not a new node). It will cause an incompatibility to anyone
that has written planner hooks, but probably the number of such hooks is
not very large anyway.

I don't think we should base decisions on the amount of backpatching
pain we cause, for patches that involve new functionality such as this
one. We commit patches that will cause future merge conflicts all the
time.

> I'm inclined to bite the bullet and make the API change. It'll be a
> pain, but I can see future uses for passing global info out of the
> rewriter rather than shoving it into per-Query structures. I'd define a
> RewriteResult and pass that down into all the rewriter internal
> functions, then return the outer query wrapped in it.

Is there already something in Query that could be a toplevel struct
member only?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-03-05 18:58:13
Message-ID: 17164.1394045893@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com> writes:
> One of the remaining issues with row security is how to pass plan
> invalidation information generated in the rewriter back into the planner.

> With row security, it's necessary to set a field in PlannerGlobal,
> tracking the user ID of the user the query was planned for if row
> security was applied. It is also necessary to add a PlanInvalItem for
> the user ID.

TBH I'd just add a user OID field in struct Query and not hack up a bunch
of existing function APIs. It's not much worse than the existing
constraintDeps field.

The PlanInvalItem could perfectly well be generated by the planner,
no, if it has the user OID? But I'm not real sure why you need it.
I don't see the reason for an invalidation triggered by user ID.
What exactly about the *user*, and not something else, would trigger
plan invalidation?

What we do need is a notion that a plan cache entry might only be
valid for a specific calling user ID; but that's a matter for cache
entry lookup not for subsequent invalidation.

regards, tom lane


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-03-07 06:18:46
Message-ID: 531964C6.4050309@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/06/2014 02:58 AM, Tom Lane wrote:
> Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com> writes:
>> One of the remaining issues with row security is how to pass plan
>> invalidation information generated in the rewriter back into the planner.
>
>> With row security, it's necessary to set a field in PlannerGlobal,
>> tracking the user ID of the user the query was planned for if row
>> security was applied. It is also necessary to add a PlanInvalItem for
>> the user ID.
>
> TBH I'd just add a user OID field in struct Query and not hack up a bunch
> of existing function APIs. It's not much worse than the existing
> constraintDeps field.

If you're happy with that, I certainly won't complain. It's much simpler
and less intrusive.

I should be able to post an update using this later today.

> The PlanInvalItem could perfectly well be generated by the planner,
> no, if it has the user OID? But I'm not real sure why you need it.
> I don't see the reason for an invalidation triggered by user ID.
> What exactly about the *user*, and not something else, would trigger
> plan invalidation?

It's only that the plan depends on the user ID. There's no point keeping
the plan around if the user no longer exists.

You're quite right that this can be done in the planner when a
dependency on the user ID is found, though. So there's no need to pass a
PlanInvalItem down, which is a lot nicer.

> What we do need is a notion that a plan cache entry might only be
> valid for a specific calling user ID; but that's a matter for cache
> entry lookup not for subsequent invalidation.

Yes, that would be good, but is IMO more of a separate optimization. I'm
currently using KaiGai's code to invalidate and re-plan when a user ID
change is detected.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-03-07 17:30:15
Message-ID: 18430.1394213415@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> On 03/06/2014 02:58 AM, Tom Lane wrote:
>> The PlanInvalItem could perfectly well be generated by the planner,
>> no, if it has the user OID? But I'm not real sure why you need it.
>> I don't see the reason for an invalidation triggered by user ID.
>> What exactly about the *user*, and not something else, would trigger
>> plan invalidation?

> It's only that the plan depends on the user ID. There's no point keeping
> the plan around if the user no longer exists.

[ shrug... ] Leaving such a plan cached would be harmless, though.
Furthermore, the user ID we'd be talking about is either the owner
of the current session, or the owner of some view or security-definer
function that the plan is already dependent on, so it's fairly hard
to credit that the plan would survive long enough for the issue to
arise.

Even if there is a scenario where invalidating by user ID is actually
useful, I think adding infrastructure to cause invalidation in such a case
is optimizing for the wrong thing. You're adding cycles to every query to
benefit a case that is going to be quite infrequent in practice.

>> What we do need is a notion that a plan cache entry might only be
>> valid for a specific calling user ID; but that's a matter for cache
>> entry lookup not for subsequent invalidation.

> Yes, that would be good, but is IMO more of a separate optimization. I'm
> currently using KaiGai's code to invalidate and re-plan when a user ID
> change is detected.

I'm unlikely to accept a patch that does that; wouldn't it be catastrophic
for performance in the presence of security-definer functions? You can't
just trash the whole plan cache when a user ID switch occurs.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-04-15 02:06:12
Message-ID: 20140415020612.GA2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig, Tom, all,

I've been through the RLS code over the past couple of days which I
pulled from Craig's repo and have a bunch of minor updates. In general,
the patch seems pretty reasonable- except for the issues discussed
below. Quite a bit of this patch is tied up in plan invalidation and
tracking if the security quals depend on the current user, all of which
seems pretty grotty and the wrong way around to me.

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> > It's only that the plan depends on the user ID. There's no point keeping
> > the plan around if the user no longer exists.
>
> [ shrug... ] Leaving such a plan cached would be harmless, though.

Agreed.

> Furthermore, the user ID we'd be talking about is either the owner
> of the current session, or the owner of some view or security-definer
> function that the plan is already dependent on, so it's fairly hard
> to credit that the plan would survive long enough for the issue to
> arise.

I don't entirely follow which 'issue' is being referred to here, but we
need to consider that 'set role' changes should also cause a new plan.

> Even if there is a scenario where invalidating by user ID is actually
> useful, I think adding infrastructure to cause invalidation in such a case
> is optimizing for the wrong thing. You're adding cycles to every query to
> benefit a case that is going to be quite infrequent in practice.

Yeah, I have a hard time seeing that there's an issue w/ keeping the
cached plans around even if the session never goes back to being under
the user ID for which those older plans were built. Also, wouldn't a
'RESET ALL' clear any of them anyway?

> > Yes, that would be good, but is IMO more of a separate optimization. I'm
> > currently using KaiGai's code to invalidate and re-plan when a user ID
> > change is detected.
>
> I'm unlikely to accept a patch that does that; wouldn't it be catastrophic
> for performance in the presence of security-definer functions? You can't
> just trash the whole plan cache when a user ID switch occurs.

Yeah, this doesn't seem like the right approach. Adding the user ID to
the cache key definitely strikes me as the right way to fix this.

I've uploaded the latest patch, rebased against master, with my changes
to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't
believe it'd clear the mailing list (it's 29k).

I'll take a look at changing the cache key to include user ID and
ripping out the plan invalidation logic from the current patch tomorrow
but I seriously doubt I'll be able to get all of that done in the next
day or two. If anyone else is able to help out, it'd certainly be
appreciated; I really think that's the main hurdle to address at this
point with this patch- without the plan invalidation complexity, the
the patch is really just building out the catalog, the SQL-level
statements for managing it, and the bit of code required to add the
conditional to statements involving RLS-enabled tables.

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-04-15 02:15:25
Message-ID: 12181.1397528125@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> I've uploaded the latest patch, rebased against master, with my changes
> to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't
> believe it'd clear the mailing list (it's 29k).

Please actually post it, for the archives' sake. 29k is far below the
list limit. (Which I don't know exactly what it is ... but certainly
in the hundreds of KB.)

> I'll take a look at changing the cache key to include user ID and
> ripping out the plan invalidation logic from the current patch tomorrow
> but I seriously doubt I'll be able to get all of that done in the next
> day or two.

TBH I think we are up against the deadline. April 15 was the agreed-to
drop dead date for pushing new features into 9.4.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-04-15 02:23:24
Message-ID: 20140415022324.GB2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > I've uploaded the latest patch, rebased against master, with my changes
> > to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz as I don't
> > believe it'd clear the mailing list (it's 29k).
>
> Please actually post it, for the archives' sake. 29k is far below the
> list limit. (Which I don't know exactly what it is ... but certainly
> in the hundreds of KB.)

Huh, thought it was more like 25k. Well, here goes then...

> > I'll take a look at changing the cache key to include user ID and
> > ripping out the plan invalidation logic from the current patch tomorrow
> > but I seriously doubt I'll be able to get all of that done in the next
> > day or two.
>
> TBH I think we are up against the deadline. April 15 was the agreed-to
> drop dead date for pushing new features into 9.4.

Yeah. :/ May be for the best anyway, this should be able to go in early
in the 9.5 cycle and get more testing and refinement. Still stinks
though as I feel like this patch didn't get the attention it should have
due to a simple misunderstanding, but we do need to stop at some point
to get a release together.

Thanks,

Stephen

Attachment Content-Type Size
rls_ringerc_sf.patch.gz application/octet-stream 28.6 KB

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-04-25 00:51:28
Message-ID: 5359B190.3010908@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/15/2014 10:06 AM, Stephen Frost wrote:
> I've uploaded the latest patch, rebased against master, with my
> changes to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz
> as I don't believe it'd clear the mailing list (it's 29k).

Does this exist in the form of an accessible git branch, too?

I was trying to maintain the patch as a series of distinct changes to
make it easier to see what each part is doing, and it'd be nice to
preserve that if possible. It also makes seeing what's changed a lot
easier.

- --
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTWbGNAAoJELBXNkqjr+S28W4H/R49CJfz4Y3TMbvwxhrwkjL2
WEv80qY4GDCzG5CGKROn3kT9H5xePvL9eadSjr+CPsilerHrPkHmXnU5w+K2LnKV
MCL/A2969b4ng1cUK9eHEFVx9BLLQmiVI6DbJ2OA2oWUs/Y7Zne5h6q0fNnnnTSq
XEU6r3tVkUp5ipbhHi+aJ+mfckirdcMR0U5X+2fgGpLZ3D+8j9azvuXvQjSOekVB
3+EVVI0UXhhvw4It4/1CjieHvScdxnsz9bOpKGiEeePUB3CGC0iPtBgIGtE0n2OK
cqKryuwZ3++LZih74M8z+Rn6yao5f4ElJrO3gz5q8axKzH/bHkEYElwEUhVfbSE=
=AKzL
-----END PGP SIGNATURE-----


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-04-25 01:20:45
Message-ID: 20140425012045.GB2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Craig Ringer (craig(at)2ndquadrant(dot)com) wrote:
> On 04/15/2014 10:06 AM, Stephen Frost wrote:
> > I've uploaded the latest patch, rebased against master, with my
> > changes to here: http://snowman.net/~sfrost/rls_ringerc_sf.patch.gz
> > as I don't believe it'd clear the mailing list (it's 29k).
>
> Does this exist in the form of an accessible git branch, too?

Eh, no.

> I was trying to maintain the patch as a series of distinct changes to
> make it easier to see what each part is doing, and it'd be nice to
> preserve that if possible. It also makes seeing what's changed a lot
> easier.

Yeah, I almost just posted a patch against your tree. I'll look at
doing that tomorrow.

Thanks,

Stephen


From: Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-10 17:46:30
Message-ID: CAC+8xRLUi33F7Rmo6BNMkxUb1UPeuzt+UtEpvbXPZdgG79HDCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all,

This is my first post to the mailing list and I am looking forward to
working with everyone in the community.

With that said...

I'll take a look at changing the cache key to include user ID and
> ripping out the plan invalidation logic from the current patch tomorrow
> but I seriously doubt I'll be able to get all of that done in the next
> day or two. If anyone else is able to help out, it'd certainly be
> appreciated; I really think that's the main hurdle to address at this
> point with this patch- without the plan invalidation complexity, the
> the patch is really just building out the catalog, the SQL-level
> statements for managing it, and the bit of code required to add the
> conditional to statements involving RLS-enabled tables.
>

I have been collaborating with Stephen on addressing this particular item
with RLS.

As a basis, I have been working with Craig's 'rls-9.4-upd-sb-views' branch
rebased against master around 9.4beta1.

Through this effort, we have concluded that for RLS the case of
invalidating a plan is only necessary when switching between a superuser
and a non-superuser. Obviously, re-planning on every role change would be
too costly, but this approach should help minimize that cost. As well,
there were not any cases outside of this one that were immediately apparent
with respect to RLS that would require re-planning on a per userid basis.

I have tested this approach with the following patch.

https://github.com/abrightwell/postgres/commit/4c959e63f7a89b24ebbd46575a31a629d24efa75

Does this sound like a sane approach? Thoughts? Recommendations?

Thanks,
Adam


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-10 18:19:41
Message-ID: 2089.1402424381@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com> writes:
> Through this effort, we have concluded that for RLS the case of
> invalidating a plan is only necessary when switching between a superuser
> and a non-superuser. Obviously, re-planning on every role change would be
> too costly, but this approach should help minimize that cost. As well,
> there were not any cases outside of this one that were immediately apparent
> with respect to RLS that would require re-planning on a per userid basis.

Hm ... I'm not following why we'd need a special case for superusers and
not anyone else? Seems like any useful RLS scheme is going to require
more privilege levels than just superuser and not-superuser.

Could we put the "if superuser then ok" test into the RLS condition test
and thereby not need more than one plan at all?

regards, tom lane


From: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-10 20:28:24
Message-ID: CAKRt6CTcLGEjcfb5Ahg2nmjNBTZL1kHC-rEx5tX30WTZzwdftg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hey Tom,

> Hm ... I'm not following why we'd need a special case for superusers and
> not anyone else? Seems like any useful RLS scheme is going to require
> more privilege levels than just superuser and not-superuser.
>

As it stands right now, superuser is the only case where RLS policies
should not be applied/completely ignored. I suppose it is possible to
create RLS policies that are related to other privilege levels, but those
would still need to be applied despite user id, excepting superuser. I'll
defer to Stephen or Craig on the usefulness of this scheme.

Could we put the "if superuser then ok" test into the RLS condition test
> and thereby not need more than one plan at all?
>

As I understand it, the application of RLS policies occurs in the rewriter.
Therefore, when switching back and forth between superuser and
not-superuser the query must be rewritten, which would ultimately result in
the need for a new plan correct? If that is the case, then I am not sure
how one plan is possible. However, again, I'll have to defer to Stephen or
Craig on this one.

Thanks,
Adam


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-10 23:18:39
Message-ID: 5397924F.5070904@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/11/2014 02:19 AM, Tom Lane wrote:
> Hm ... I'm not following why we'd need a special case for superusers and
> not anyone else? Seems like any useful RLS scheme is going to require
> more privilege levels than just superuser and not-superuser.

What it really needs is to invalidate plans when switching between
RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS
exempt" right or mode sooner rather than later, so I'm against tying
this explicitly to superuser as such.

I wouldn't be surprised to see

SET ROW SECURITY ON|OFF

down the track, with a right controlling whether you can or not. Or at
least, a right that directly exempts a user from row security.

> Could we put the "if superuser then ok" test into the RLS condition test
> and thereby not need more than one plan at all?

Only if we put it in another level of security barrier subquery, because
otherwise the planner might execute the other quals (including possible
user defined functions) before the superuser test. Which was the whole
reason for the superuser test in the first place.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-10 23:24:11
Message-ID: 7872.1402442651@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> On 06/11/2014 02:19 AM, Tom Lane wrote:
>> Could we put the "if superuser then ok" test into the RLS condition test
>> and thereby not need more than one plan at all?

> Only if we put it in another level of security barrier subquery, because
> otherwise the planner might execute the other quals (including possible
> user defined functions) before the superuser test. Which was the whole
> reason for the superuser test in the first place.

Is the point of that that the table owner might have put trojan-horse
functions into the RLS qual? If so, why are we only concerned about
defending the superuser and not other users? Seems like the right fix
would be to insist that functions in the RLS qual run as the table owner.
Granted, that might be painful to do. But it still seems like "we only
need to do this for superusers" is designing with blinkers on.

regards, tom lane


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-10 23:26:15
Message-ID: 53979417.4030808@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/11/2014 07:24 AM, Tom Lane wrote:
> Is the point of that that the table owner might have put trojan-horse
> functions into the RLS qual? If so, why are we only concerned about
> defending the superuser and not other users? Seems like the right fix
> would be to insist that functions in the RLS qual run as the table owner.
> Granted, that might be painful to do. But it still seems like "we only
> need to do this for superusers" is designing with blinkers on.

I agree, and now that the urgency of trying to deliver this for 9.4 is
over it's worth seeing if we can just run as table owner.

Failing that, we could take the approach a certain other RDBMS does and
make the ability to define row security quals a GRANTable right
initially held only by the superuser.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-10 23:32:07
Message-ID: 8024.1402443127@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> On 06/11/2014 07:24 AM, Tom Lane wrote:
>> Is the point of that that the table owner might have put trojan-horse
>> functions into the RLS qual? If so, why are we only concerned about
>> defending the superuser and not other users? Seems like the right fix
>> would be to insist that functions in the RLS qual run as the table owner.
>> Granted, that might be painful to do. But it still seems like "we only
>> need to do this for superusers" is designing with blinkers on.

> I agree, and now that the urgency of trying to deliver this for 9.4 is
> over it's worth seeing if we can just run as table owner.

> Failing that, we could take the approach a certain other RDBMS does and
> make the ability to define row security quals a GRANTable right
> initially held only by the superuser.

Hmm ... that might be a workable compromise. I think the main issue here
is whether we expect that RLS quals will be something that the planner
could optimize to any meaningful extent. If they're always (in effect)
wrapped in SECURITY DEFINER functions, I think that largely blocks any
optimizations; but maybe that wouldn't matter in practice.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-11 14:26:46
Message-ID: CA+TgmoYdK6k1QmOtcRg-qex7UF2nWbMfD3fONxogeZ32SwFerQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 10, 2014 at 7:18 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 06/11/2014 02:19 AM, Tom Lane wrote:
>> Hm ... I'm not following why we'd need a special case for superusers and
>> not anyone else? Seems like any useful RLS scheme is going to require
>> more privilege levels than just superuser and not-superuser.
>
> What it really needs is to invalidate plans when switching between
> RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS
> exempt" right or mode sooner rather than later, so I'm against tying
> this explicitly to superuser as such.
>
> I wouldn't be surprised to see
>
> SET ROW SECURITY ON|OFF
>
> down the track, with a right controlling whether you can or not. Or at
> least, a right that directly exempts a user from row security.

I'm really concerned about the security implications of this patch. I
think we're setting ourselves up for a whole lot of hurt for somewhat
unclear gain.

In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically
*is* row-level security: instead of applying a row-level security
policy to a table, just create a security-barrier view over the table
and grant access to the view. Forget that the table ever existed.
Done.

With this approach, there's a lot of stuff that we don't have to
reinvent. We've talked a lot about whether row-level security should
only be concerned with the rows it scans, or whether it should also
restrict the new rows that can be created. You can get either
behavior by choosing whether or not to use WITH CHECK OPTION. And
then there's this question of who should be RLS-exempt; that's
basically a question of to whom you grant privileges on the underlying
table. Note that this can be very fine-grained: for example, you can
allow someone to exempt themselves for selects but not for updates by
granting them SELECT privileges but not UPDATE privileges on the
underlying table. And potentially-exempt users can choose whether
they want a particular access to actually be exempt by targeting the
view when they don't want to be exempt and the table when they do.
That's mighty useful for debugging, at least IMHO. And, if you want
to have several row-level security policies for different classes of
users, just create more than one view and grant different privileges
on each.

By contrast, it seems to me that every design so far proposed for
something that is actually called row-level security - as opposed to
commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is*
row-level security, is extremely limited. Look back at all the things
listed in the previous paragraph; can you do those things easily with
the designs that have been proposed? As far as I can see, not really.
Your (Craig's) rls-9.4-upd-sb-views patch seems to have a rough
equivalent of WITH CHECK OPTION, probably because we've talked a lot
about that specific issue, but it doesn't line up exactly to what WITH
CHECK OPTION actually does. There's no independently-grantable
RLS-exemption privilege - and even when we talk about that, it's
usually some kind of global bit that applies to all tables and all
operations equally - whereas with the above approach it can be
per-table and per-operation and doesn't require superuser intervention
to flip the bit. There's no way for users who are RLS exempt to turn
off their exemption for testing purposes, let alone on a per-table
basis. There's no way to have multiple RLS policies on a single
table. All of those are things that we get "for free" in the
view-over-table model, and implementing formal RLS basically requires
us to either invent a new RLS-specific way of doing each of those
things, or suffer along with a subset of the functionality. Yuck.

But what's really awful about this whole design is that it breaks the
invariant that reading from a table doesn't run anybody else's code.
It's already the case that users need to be awfully careful about
modifying tables, because that might fire triggers that do bad things.
But at least you can SELECT from a table and it will either work, or
it will fail with a permission denied error. What it will not do is
unexpectedly run some code that you weren't expecting it to run. You
can't be so blithe about selecting from views, but reading a plain
table is always OK. Now, as soon as we introduce the concept that
selecting from a table might not really mean "read from the table" but
"read from the table after applying this owner-specified qual", we're
opening up a whole new set of attack surfaces. Every pg_dump is an
opportunity to hack somebody else's account, or at least audit their
activity. Protecting the superuser against everybody else is nice,
but I think it's just as important to protect non-superusers against
each other, and I think that's going to be hard -- because in the RLS
world, SELECT * FROM tab is now *fundamentally* ambiguous. Maybe it's
reading from the table, and maybe it's really clandestinely reading
from a view over the table, and the user has no way of being really
clear about which behavior they want. From a security point of view,
that seems very bad.

To recap:

1. Reinventing RLS-specific ways to do all of the things that can
already be done in the view-over-table model is a lot of work.
2. There's a danger that the functionality available in the two models
will diverge, so that certain things can only be done in one world or
the other.
3. On the whole, it seems likely that the RLS-specific world will
remain impoverished compared to the view-over-table model.
4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-11 15:40:09
Message-ID: 20140611154009.GX2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Craig Ringer (craig(at)2ndquadrant(dot)com) wrote:
> On 06/11/2014 02:19 AM, Tom Lane wrote:
> > Hm ... I'm not following why we'd need a special case for superusers and
> > not anyone else? Seems like any useful RLS scheme is going to require
> > more privilege levels than just superuser and not-superuser.
>
> What it really needs is to invalidate plans when switching between
> RLS-enabled and RLS-exempt users, yes. I'm sure we'll want an "RLS
> exempt" right or mode sooner rather than later, so I'm against tying
> this explicitly to superuser as such.

That certainly sounds reasonable to me, but the point is we're just
looking to see if the current role executing the plan should or should
not have RLS applied and, if it's changing, we need to re-plan. We
don't need to actually track an independent plan for each and every user
executing the plan, which means that the plan cache can be largely left
alone.

> I wouldn't be surprised to see
>
> SET ROW SECURITY ON|OFF
>
> down the track, with a right controlling whether you can or not. Or at
> least, a right that directly exempts a user from row security.

Agreed, but doing a re-planning in that case seems reasonable to me. I
find it pretty unlikely that there will be a lot of critical path cases
of the same plan flipping back and forth between a role for which RLS is
applied and a role where it shouldn't be.

> > Could we put the "if superuser then ok" test into the RLS condition test
> > and thereby not need more than one plan at all?
>
> Only if we put it in another level of security barrier subquery, because
> otherwise the planner might execute the other quals (including possible
> user defined functions) before the superuser test. Which was the whole
> reason for the superuser test in the first place.

Yeah, I'm not a big fan of this and it certainly seems a simpler
approach to just force a re-plan. We're talking about a query which
has been prepared and then is being executed by different roles, some
of which are RLS enabled and some which are RLS exempt. That just
strikes me as pretty unlikely to happen and if it does become an issue,
a user could work around it by having two different plans prepared and
making sure that they are called from the appropriate roles to avoid the
replanning.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-11 15:48:34
Message-ID: 20140611154833.GY2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Craig Ringer (craig(at)2ndquadrant(dot)com) wrote:
> On 06/11/2014 07:24 AM, Tom Lane wrote:
> > Is the point of that that the table owner might have put trojan-horse
> > functions into the RLS qual? If so, why are we only concerned about
> > defending the superuser and not other users? Seems like the right fix
> > would be to insist that functions in the RLS qual run as the table owner.
> > Granted, that might be painful to do. But it still seems like "we only
> > need to do this for superusers" is designing with blinkers on.
>
> I agree, and now that the urgency of trying to deliver this for 9.4 is
> over it's worth seeing if we can just run as table owner.

We'll need to work out how to ensure that things like current_user()
still returns the calling user in that case, otherwise it won't make any
sense. In general, I agree that having the RLS quals run as the table
owner is a good approach and would love to hear suggestions about how we
can make that happen.

> Failing that, we could take the approach a certain other RDBMS does and
> make the ability to define row security quals a GRANTable right
> initially held only by the superuser.

I don't particularly like this idea- it's akin, to me anyway, to making
the ability to control other permissions on a table (SELECT, INSERT,
etc) something which a user would have to be granted- and it doesn't
really address the issue.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-11 16:23:17
Message-ID: 20140611162317.GA2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> I'm really concerned about the security implications of this patch. I
> think we're setting ourselves up for a whole lot of hurt for somewhat
> unclear gain.

I'm certainly of a different opinion and, for the most part, I feel that
if there are security concerns then they need to be addressed- and
better by us than by asking users to use some other mechanism to
implement RLS.

> In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically
> *is* row-level security: instead of applying a row-level security
> policy to a table, just create a security-barrier view over the table
> and grant access to the view. Forget that the table ever existed.
> Done.

This argument could have been made for column-level privileges also, no?
Yet I don't hear any calls for that to be ripped out now that you could
implement it through updatable security-barrier views. That commit was
the ground-work to allow us to finally get proper RLS and I'm very
disappointed to hear that the mechanical pieces around making RLS easy
for users to use (and getting that check-box taken care of in a wide
variety of fields that we are being exposed to now, see the PGConf.NYC
keynote speakers...) is receiving such push-back.

> With this approach, there's a lot of stuff that we don't have to
> reinvent. We've talked a lot about whether row-level security should
> only be concerned with the rows it scans, or whether it should also
> restrict the new rows that can be created. You can get either
> behavior by choosing whether or not to use WITH CHECK OPTION. And
> then there's this question of who should be RLS-exempt; that's
> basically a question of to whom you grant privileges on the underlying
> table. Note that this can be very fine-grained: for example, you can
> allow someone to exempt themselves for selects but not for updates by
> granting them SELECT privileges but not UPDATE privileges on the
> underlying table. And potentially-exempt users can choose whether
> they want a particular access to actually be exempt by targeting the
> view when they don't want to be exempt and the table when they do.

I agree that views, or even security-definer functions, offer a great
deal of flexibility, and that may be necessary in some use-cases, but I
fail to see why that means we should avoid providing the mechanics to
achieve simple and usable RLS akin to what other major RDBMS's have.

> That's mighty useful for debugging, at least IMHO. And, if you want
> to have several row-level security policies for different classes of
> users, just create more than one view and grant different privileges
> on each.

I'm really not impressed with the idea that RLS should be done with
multiple different views of the same underlying table.

> By contrast, it seems to me that every design so far proposed for
> something that is actually called row-level security - as opposed to
> commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is*
> row-level security, is extremely limited. Look back at all the things
> listed in the previous paragraph; can you do those things easily with
> the designs that have been proposed? As far as I can see, not really.

I don't feel that RLS will, or even *should*, have the same level of
flexibility that you can achieve with views and/or security definer
functions. I expect that, over time, we will add more capabilities to
it, but it's never going to be able to redefine the contents of a column
as a view can, nor will it be able to add columns to a table as views
can. I don't see those as reasons against having support for RLS.

> Your (Craig's) rls-9.4-upd-sb-views patch seems to have a rough
> equivalent of WITH CHECK OPTION, probably because we've talked a lot
> about that specific issue, but it doesn't line up exactly to what WITH
> CHECK OPTION actually does. There's no independently-grantable
> RLS-exemption privilege - and even when we talk about that, it's
> usually some kind of global bit that applies to all tables and all
> operations equally - whereas with the above approach it can be
> per-table and per-operation and doesn't require superuser intervention
> to flip the bit.

I'm glad to hear your thoughts on the level of granularity which might
be nice to have with RLS. What would be great is to spend a bit more
time reviewing what other systems provide in this area and considering
what makes sense for us. This will also be a feature and an area which
we will be improving for a long time to come, but we do need this
capability and we have to start somewhere.

> There's no way for users who are RLS exempt to turn
> off their exemption for testing purposes, let alone on a per-table
> basis.

I don't follow this argument entirely- users can't turn off the existing
permissions system for testing either, unless an authorized user with
the correct permissions makes the change to allow it- or the user bumps
themselves up to superuser, or to a role which has broader permissions,
both of which would also be possible to do with RLS.

> There's no way to have multiple RLS policies on a single
> table. All of those are things that we get "for free" in the
> view-over-table model, and implementing formal RLS basically requires
> us to either invent a new RLS-specific way of doing each of those
> things, or suffer along with a subset of the functionality. Yuck.

What would probably be good is to review the use-cases which the current
patch already addresses- and we've had good responses from actual users
who are already playing with the patch and are hearing that it is
addressing their requirements.

> But what's really awful about this whole design is that it breaks the
> invariant that reading from a table doesn't run anybody else's code.

You're suggesting that we use views instead, which clearly could run
someone else's code. Perhaps the user will notice that they're
selecting from a view instead of a table, but I've never seen a security
design around making sure that what is being select'd from is a table
vs. a view. Have you seen applications which implement such a check
prior to running a query?

> It's already the case that users need to be awfully careful about
> modifying tables, because that might fire triggers that do bad things.
> But at least you can SELECT from a table and it will either work, or
> it will fail with a permission denied error. What it will not do is
> unexpectedly run some code that you weren't expecting it to run. You
> can't be so blithe about selecting from views, but reading a plain
> table is always OK. Now, as soon as we introduce the concept that
> selecting from a table might not really mean "read from the table" but
> "read from the table after applying this owner-specified qual", we're
> opening up a whole new set of attack surfaces.

With this, I agree, there is risk associated with the implementation
we're looking at for RLS. We could narrow the case by reducing the
capabilities of RLS in PG by only allowing certain functions to be used
in the definition of a RLS policy (eg: btree operators of known data
types, or something similar to our "leak-proof" attribute), but I don't
see that it really buys us much. There are a *lot* of ways in which an
individual who has the ability to create objects inside the database can
cause problems, but that comes with the flexibility we provide users
with. That will always be a balance but, I believe, we wouldn't have
the same level of success or have such an awesome system without that
flexibility.

> Every pg_dump is an
> opportunity to hack somebody else's account, or at least audit their
> activity. Protecting the superuser against everybody else is nice,
> but I think it's just as important to protect non-superusers against
> each other, and I think that's going to be hard -- because in the RLS
> world, SELECT * FROM tab is now *fundamentally* ambiguous. Maybe it's
> reading from the table, and maybe it's really clandestinely reading
> from a view over the table, and the user has no way of being really
> clear about which behavior they want. From a security point of view,
> that seems very bad.

I don't see this as being an insurmountable issue. I agree that having
a way for pg_dump to run safely is important and the superuser check
does address that, given that we don't have a "read-only (and
everything)" capability today. Once we do (and I surely hope that will
come sooner rather than later), such a role should also have the 'no
RLS' bit, as it wouldn't make any sense for such a role anyway. The
lack of that is not a strike against RLS though.

> To recap:
>
> 1. Reinventing RLS-specific ways to do all of the things that can
> already be done in the view-over-table model is a lot of work.

I agree that there's a fair bit of work involved, but I do not see
reimplementing views as RLS as the goal.

> 2. There's a danger that the functionality available in the two models
> will diverge, so that certain things can only be done in one world or
> the other.

They will always be distinct, intentionally so.

> 3. On the whole, it seems likely that the RLS-specific world will
> remain impoverished compared to the view-over-table model.

Agreed. As is the case with views vs. security definer functions.

> 4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield.

While I agree that we need to consider this, I don't think it will be a
"minefield", but rather something we need to document and educate our
users about. If you'd like a "disable-all-RLS" GUC, I'm all for it.

Tossing out any hope of having RLS in PG is tossing the baby out with
the bathwater though, imv.

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-11 18:47:10
Message-ID: 24977.1402512430@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> I'm really concerned about the security implications of this patch. I
>> think we're setting ourselves up for a whole lot of hurt for somewhat
>> unclear gain.

> I'm certainly of a different opinion and, for the most part, I feel that
> if there are security concerns then they need to be addressed- and
> better by us than by asking users to use some other mechanism to
> implement RLS.

TBH, I found Robert's argument pretty persuasive. The idea that
"SELECT * FROM table" might invoke arbitrary processing ought to scare
anyone who's concerned about security, because that's going to completely
break any assumptions about pg_dump being safe for instance, as well as
force top-to-bottom rethinking of many other security assumptions.

> ... That commit was
> the ground-work to allow us to finally get proper RLS and I'm very
> disappointed to hear that the mechanical pieces around making RLS easy
> for users to use (and getting that check-box taken care of in a wide
> variety of fields that we are being exposed to now, see the PGConf.NYC
> keynote speakers...) is receiving such push-back.

If this is being sold as merely "ease of use", then it is probably going
to get rejected. In order to get some extra ease of use for the minority
of users who need RLS, you are going to significantly complicate the lives
of all Postgres users. That's not a net win in any sane calculation of
ease of use.

Maybe the right thing to think about is how we can make it easier to set
up table + view combinations according to the pattern Robert described.
I wouldn't have a problem with some more-or-less-automated support for
doing that. (Consider SERIAL as a possible precedent here: it's basically
a table creation macro.)

> You're suggesting that we use views instead, which clearly could run
> someone else's code. Perhaps the user will notice that they're
> selecting from a view instead of a table, but I've never seen a security
> design around making sure that what is being select'd from is a table
> vs. a view.

pg_dump is a sufficient counterexample to that statement.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-11 19:26:54
Message-ID: CA+TgmoYP1mc6Lg2VtUP5Y-HccXNFkHdXOC+1YTTvTKOZ6pxfJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 11, 2014 at 12:23 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> In my view, commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5 basically
>> *is* row-level security: instead of applying a row-level security
>> policy to a table, just create a security-barrier view over the table
>> and grant access to the view. Forget that the table ever existed.
>> Done.
>
> This argument could have been made for column-level privileges also, no?

Not really. First of all, we didn't have security_barrier views at
that time, let alone security barrier views that are auto-updateable.
That's a really important piece of technology which makes filtering
access via views feasible in ways that really were not feasible in the
past. Secondly, column-level permissions - like every other
currently-existing type of permissions - are declarative. They are an
additional opportunity for the system to say "no" to something it
otherwise would have allowed, but no user-defined code is executed.
Row-level security is not a chance for the system to deny access; it's
a chance for user-defined code to take control and perform arbitrary
operations. So the scope of what we're contemplating for row-level
security is really far, far more invasive than what we did for
column-level privileges.

> I agree that views, or even security-definer functions, offer a great
> deal of flexibility, and that may be necessary in some use-cases, but I
> fail to see why that means we should avoid providing the mechanics to
> achieve simple and usable RLS akin to what other major RDBMS's have.

Because we don't have a good design.

I'm not categorically opposed to adding more RLS features to
PostgreSQL and never have been; in fact, I was deeply involved in the
original design of security barrier views and committed the original
patch to add that functionality to PostgreSQL, without which none of
what we're talking about here would be possible. But the
currently-proposed design is very unappealing to me, for the reasons
that I've explained. The right answer to "this feature doesn't
provide anything that we don't already have and will introduce major
new security exposures that haven't been adequate thought" is
debatable, but "well other people have this so we should too" is
definitely not it. Craig's patch really hasn't grappled with any of
these thorny definition and security issues; it's just about making
the basic functionality work. That's fine for a POC, but it's not
enough for a feature that the project would be committing to maintain
for the indefinite future.

>> That's mighty useful for debugging, at least IMHO. And, if you want
>> to have several row-level security policies for different classes of
>> users, just create more than one view and grant different privileges
>> on each.
>
> I'm really not impressed with the idea that RLS should be done with
> multiple different views of the same underlying table.

Are you equally unimpressed with the idea that RLS as proposed can't
support more than one security policy right now *at all*? Because it
seems to me that either you think multiple RLS policies on a single
table is important (in which case the current patch is inadequate) or
you think it's not important (in which case we need not argue about
whether doing it with multiple views over the same underlying table is
awkward).

>> By contrast, it seems to me that every design so far proposed for
>> something that is actually called row-level security - as opposed to
>> commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is*
>> row-level security, is extremely limited. Look back at all the things
>> listed in the previous paragraph; can you do those things easily with
>> the designs that have been proposed? As far as I can see, not really.
>
> I don't feel that RLS will, or even *should*, have the same level of
> flexibility that you can achieve with views and/or security definer
> functions. I expect that, over time, we will add more capabilities to
> it, but it's never going to be able to redefine the contents of a column
> as a view can, nor will it be able to add columns to a table as views
> can. I don't see those as reasons against having support for RLS.

What this patch is doing is basically allowing a table to really be a
view over itself. If we choose to support that, I think it is
absolutely inevitable that people are going to want all the same
options that they would have if they really made a separate view -
separate permissions, WITH CHECK OPTION, all of it. I find the
contrary argument - that people will only want X amount and no more -
simply not plausible. If it's valuable to have some of those
capabilities in an RLS framework, somebody's going to want all of
them. There's no bright line to divide the things that are valuable
in that context from those that aren't.

> I'm glad to hear your thoughts on the level of granularity which might
> be nice to have with RLS. What would be great is to spend a bit more
> time reviewing what other systems provide in this area and considering
> what makes sense for us. This will also be a feature and an area which
> we will be improving for a long time to come, but we do need this
> capability and we have to start somewhere.

I think this definitely important. I also think that we should be
careful to study the deficiencies in those other systems and to
clearly call out what value the capabilities we're thinking of adding
to PostgreSQL 9.5 have over the status quo in PostgreSQL 9.4. I'm not
so much arguing that we shouldn't have row-level security as that, in
every way that's really meaningful, we already do.

>> There's no way for users who are RLS exempt to turn
>> off their exemption for testing purposes, let alone on a per-table
>> basis.
>
> I don't follow this argument entirely- users can't turn off the existing
> permissions system for testing either, unless an authorized user with
> the correct permissions makes the change to allow it- or the user bumps
> themselves up to superuser, or to a role which has broader permissions,
> both of which would also be possible to do with RLS.

Sure, but in the existing system, the query either returns the same
results for everybody, or it fails outright with an error. It's
certainly possible to screw up the existing permissions, but this new
thing that's being proposed is much more complicated, because it's not
just whether it works that's at issue, but what results you actually
get.

>> There's no way to have multiple RLS policies on a single
>> table. All of those are things that we get "for free" in the
>> view-over-table model, and implementing formal RLS basically requires
>> us to either invent a new RLS-specific way of doing each of those
>> things, or suffer along with a subset of the functionality. Yuck.
>
> What would probably be good is to review the use-cases which the current
> patch already addresses- and we've had good responses from actual users
> who are already playing with the patch and are hearing that it is
> addressing their requirements.

Yes. And in particular, I think we should have a much clearer
statement than we currently do about the use cases in which it falls
short.

>> But what's really awful about this whole design is that it breaks the
>> invariant that reading from a table doesn't run anybody else's code.
>
> You're suggesting that we use views instead, which clearly could run
> someone else's code. Perhaps the user will notice that they're
> selecting from a view instead of a table, but I've never seen a security
> design around making sure that what is being select'd from is a table
> vs. a view. Have you seen applications which implement such a check
> prior to running a query?

Yes. pg_dump, to name one really important one. I wouldn't be
surprised if graphical clients did something similar - display the
table data for a table, or the view definition for a view. But I
admit to not having checked that. More than that, if I were a DBA,
I'd certainly be darn careful about selecting from untrusted views,
but I expect to be able to read a table, or run pg_dump, without
getting my account hacked.

> With this, I agree, there is risk associated with the implementation
> we're looking at for RLS. We could narrow the case by reducing the
> capabilities of RLS in PG by only allowing certain functions to be used
> in the definition of a RLS policy (eg: btree operators of known data
> types, or something similar to our "leak-proof" attribute), but I don't
> see that it really buys us much. There are a *lot* of ways in which an
> individual who has the ability to create objects inside the database can
> cause problems, but that comes with the flexibility we provide users
> with. That will always be a balance but, I believe, we wouldn't have
> the same level of success or have such an awesome system without that
> flexibility.

I don't think restricting what can go into an RLS policy is the right
answer; that to me misses the point. What needs to be restricted is
the possibility that a user will inadvertently run code they didn't
mean to run.

>> Every pg_dump is an
>> opportunity to hack somebody else's account, or at least audit their
>> activity. Protecting the superuser against everybody else is nice,
>> but I think it's just as important to protect non-superusers against
>> each other, and I think that's going to be hard -- because in the RLS
>> world, SELECT * FROM tab is now *fundamentally* ambiguous. Maybe it's
>> reading from the table, and maybe it's really clandestinely reading
>> from a view over the table, and the user has no way of being really
>> clear about which behavior they want. From a security point of view,
>> that seems very bad.
>
> I don't see this as being an insurmountable issue. I agree that having
> a way for pg_dump to run safely is important and the superuser check
> does address that, given that we don't have a "read-only (and
> everything)" capability today. Once we do (and I surely hope that will
> come sooner rather than later), such a role should also have the 'no
> RLS' bit, as it wouldn't make any sense for such a role anyway. The
> lack of that is not a strike against RLS though.

It addresses running pg_dump *as the superuser*, but not as a database
owner or just a regular users. If unprivileged user A runs pg_dump -t
some_table_owned_by_user_b, and falls victim to a Trojan horse, that
is going to get reported as a security defect in PostgreSQL. Telling
the person who reports that issue that it's design behavior is not
going to make them happy, or result in good press coverage for
PostgreSQL.

>> 2. There's a danger that the functionality available in the two models
>> will diverge, so that certain things can only be done in one world or
>> the other.
>
> They will always be distinct, intentionally so.

I think that's an absolutely terrible idea. We do not want to be in
the business of having two parallel systems with slightly different
capabilities and syntax that are providing the same fundamental
functionality. And they are: the proposal for RLS is to make it work
just like a security_barrier view, sharing a common implementation.

>> 4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield.
>
> While I agree that we need to consider this, I don't think it will be a
> "minefield", but rather something we need to document and educate our
> users about. If you'd like a "disable-all-RLS" GUC, I'm all for it.

I would definitely like that. I have proposed it in the past.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-11 23:46:56
Message-ID: 20140611234656.GC2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> >> I'm really concerned about the security implications of this patch. I
> >> think we're setting ourselves up for a whole lot of hurt for somewhat
> >> unclear gain.
>
> > I'm certainly of a different opinion and, for the most part, I feel that
> > if there are security concerns then they need to be addressed- and
> > better by us than by asking users to use some other mechanism to
> > implement RLS.
>
> TBH, I found Robert's argument pretty persuasive. The idea that
> "SELECT * FROM table" might invoke arbitrary processing ought to scare
> anyone who's concerned about security, because that's going to completely
> break any assumptions about pg_dump being safe for instance, as well as
> force top-to-bottom rethinking of many other security assumptions.

SELECT triggers for a wide variety of use-cases are pretty commonly
asked for here and are something I'd like to see us support also. There
are also quite a few ways in which a select can end up executing code.
Today it requires more than 'select * from table;', but not very
much.. I agree that it'd be good if we had a way to address that but I
continue to view that as an independent issue.

What I haven't heard any comments on, yet found interesting, was the
idea of having the RLS quals run as the owner of the table. Would that
address these concerns? I seem to recall wondering why we didn't do
that for views in the first place, though I doubt we could change it
now even if we wanted to (and I'm guessing the spec has something to say
about this, though I haven't gone and looked and don't remember
offhand). It's certainly rather curious that functions called under a
view are run as the calling user while permissions checks on relations
referred to by the view are as the view owner.

Hopefully that will make the rest of this discussion less relevant, but
I'll respond with my feelings anyway.

> > ... That commit was
> > the ground-work to allow us to finally get proper RLS and I'm very
> > disappointed to hear that the mechanical pieces around making RLS easy
> > for users to use (and getting that check-box taken care of in a wide
> > variety of fields that we are being exposed to now, see the PGConf.NYC
> > keynote speakers...) is receiving such push-back.
>
> If this is being sold as merely "ease of use", then it is probably going
> to get rejected. In order to get some extra ease of use for the minority
> of users who need RLS, you are going to significantly complicate the lives
> of all Postgres users. That's not a net win in any sane calculation of
> ease of use.

I don't view this as being at all accurate- how is this complicating the
lives of all Postgres users? If they are worried about running user
defined code then they *already* have a lot to worry about.

While the users of RLS might be less than 50% and therefore the
minority, I expect it will have quite a bit of up-take in certain
industries and I know that our lack of any RLS is currently preventing
use of Postgres in some rather important cases.

As for it being ease-of-use, again, there are ways in which column level
privileges could have been dealt with using views, rules, security
definer functions, etc, but that doesn't mean we don't want that
feature. I certainly view RLS (and have for quite some time..) as a much
needed capability, even if it can be done today using a bunch of user
written code that must be security audited.

> Maybe the right thing to think about is how we can make it easier to set
> up table + view combinations according to the pattern Robert described.

While this sounds interesting, I don't see adding columns or redefining
them as being in the perview of RLS. The current approach of
allowing a boolean expression to be defined is both extremely flexible
while also being simple when the requirement is simple. Having to
create, manage, update, etc, an independent object would add unnecessary
complexity.

Perhaps having it be a boolean expression is too much flexibility but
the alternatives that I can think of aren't terribly attractive to me
and the boolean expression approach is what folks coming from other
RDBMS's will be familiar with and understand how to build their
applications around. We may need to provide some additional pieces
around this (perhaps a trigger-like function type which also gets
information about the object being queried, etc) but the point is to
have a straight-forward and simply reasoned about way of limiting what
data is returned.

> I wouldn't have a problem with some more-or-less-automated support for
> doing that. (Consider SERIAL as a possible precedent here: it's basically
> a table creation macro.)

Perhaps there's a way to make that work, but personally it looks like a
whole bunch more work and I don't see the gain. How would adding RLS to
an existing table work? It's worse than the SERIAL case as at least
a default clause can be added later without impacting the application
code. Would the functions referenced through such a view run as the
user of the view?

> > You're suggesting that we use views instead, which clearly could run
> > someone else's code. Perhaps the user will notice that they're
> > selecting from a view instead of a table, but I've never seen a security
> > design around making sure that what is being select'd from is a table
> > vs. a view.
>
> pg_dump is a sufficient counterexample to that statement.

No, it isn't. pg_dump's defined purpose is explicitly to pull out the
data contents underneath, or the definition of the object, which means
it happens to issue explicit select * from table's (or COPY commands)
for tables and pull the view definition for views.

There's no way to even ask it to dump out the contents of a view (rather
than the definition of it). I don't consider that a security design
which checks if the object *that we're asking to select the contents of*
is checked to see if it's a view or a table, in order to avoid calling
user-defined code.

I agree that pg_dump takes many precautions to avoid running user code
in a way which could be dangerous, both to avoid security issues and
because its goal is to reproduce the system exactly as it was, and
running user code would likely cause problems for that.

I still do not buy this argument that individuals or applications pay
much more attention to selecting from views than they do selecting from
tables, or generally go out of their way to try and avoid running user
defined code (indeed, much of the point is to be able to add such things
without having to change the application around..).

We care about these issues a great deal in pg_dump, rightfully, but
psql, pgAdmin3, Perl DBD/DBI, libpq-using application, etc, etc, have no
mechanism to say "give me just the data and only the data and don't run
any user-defined code". Adding that capability might be interesting if
we can figure out how exactly to define it but it's still an orthogonal
issue to RLS, imv.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-12 00:59:17
Message-ID: 20140612005917.GD2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Jun 11, 2014 at 12:23 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > This argument could have been made for column-level privileges also, no?
>
> Not really. First of all, we didn't have security_barrier views at
> that time, let alone security barrier views that are auto-updateable.

We had security definer functions, even set-returning ones, along with
rules and triggers.

> That's a really important piece of technology which makes filtering
> access via views feasible in ways that really were not feasible in the
> past. Secondly, column-level permissions - like every other
> currently-existing type of permissions - are declarative. They are an
> additional opportunity for the system to say "no" to something it
> otherwise would have allowed, but no user-defined code is executed.

We could try to avoid calling user-defined code for RLS, but it'd add a
whole lot of complexity and as far as I can see and your proposed
solution isn't avoiding the user-defined code anyway, so I'm not sure
why this solution should be required to meet that.

> Row-level security is not a chance for the system to deny access; it's
> a chance for user-defined code to take control and perform arbitrary
> operations. So the scope of what we're contemplating for row-level
> security is really far, far more invasive than what we did for
> column-level privileges.

In this case the user-defined code needs to return a boolean. We don't
currently do anything to prevent it from having side-effects, no, but
the same is true with views which incorporate functions. I agree that
it makes a difference when compared to column-level privileges, but my
point was that we have provided easier ways to do things which were
possible using more complicated methods before. Perhaps the risk with
RLS is higher but these issues look managable to me and the level of
doubt about our ability to provide this feature in a reasonable and
principled way that our users will understand surprises me.

> > I agree that views, or even security-definer functions, offer a great
> > deal of flexibility, and that may be necessary in some use-cases, but I
> > fail to see why that means we should avoid providing the mechanics to
> > achieve simple and usable RLS akin to what other major RDBMS's have.
>
> Because we don't have a good design.

We're using a design that's found in multiple other RDBMS's and used
extensively by certain industries which use those RDBMS's today. I'm
certainly open to improving what is found in other systems for PG but I
have a hard time seeing this approach as a bad design. Perhaps you're
referring to our implementation, in which case I might agree and things
like running the quals as the table owner is something which should be
considered (I don't know how the other RDBMS's operate in this regard
offhand- it'd be good to find out).

> I'm not categorically opposed to adding more RLS features to
> PostgreSQL and never have been; in fact, I was deeply involved in the
> original design of security barrier views and committed the original
> patch to add that functionality to PostgreSQL, without which none of
> what we're talking about here would be possible. But the
> currently-proposed design is very unappealing to me, for the reasons
> that I've explained. The right answer to "this feature doesn't
> provide anything that we don't already have and will introduce major
> new security exposures that haven't been adequate thought" is
> debatable, but "well other people have this so we should too" is
> definitely not it.

How about "it's in high demand by our user base"? In particular, it's
being asked for by a *highly* technical section of our user base who
uses these capabilities today, with this design, in those other
databases.

> Craig's patch really hasn't grappled with any of
> these thorny definition and security issues; it's just about making
> the basic functionality work. That's fine for a POC, but it's not
> enough for a feature that the project would be committing to maintain
> for the indefinite future.

Improving the patch is exactly what I'd like to do, but throwing out the
notion that RLS can't be allowed to execute user-defined code is cutting
the legs out of the feature completely- particularly with our system
where users can create all manner of objects in the system with their
own code being run.

> >> That's mighty useful for debugging, at least IMHO. And, if you want
> >> to have several row-level security policies for different classes of
> >> users, just create more than one view and grant different privileges
> >> on each.
> >
> > I'm really not impressed with the idea that RLS should be done with
> > multiple different views of the same underlying table.
>
> Are you equally unimpressed with the idea that RLS as proposed can't
> support more than one security policy right now *at all*? Because it
> seems to me that either you think multiple RLS policies on a single
> table is important (in which case the current patch is inadequate) or
> you think it's not important (in which case we need not argue about
> whether doing it with multiple views over the same underlying table is
> awkward).

The current approach allows a nearly unlimited level of flexibility,
should the user wish it, by being able to run user-defined code.
Perhaps that would be considered 'one policy', but it could certainly
take under consideration the calling user, the object being queried
(if a function is defined per table, or if we provide a way to get
that information in the function), etc. What it wouldn't require is
the same object to be queried through different object names, which is
what I was principally objecting to. What would it mean to have
mutliple RLS policies for a given object? There would have to be some
criteria to distinguish which one would be applied, yet that can be
handled with the existing design by the user already, if they wish to.

Were we to preclude users from being able to have user-defined functions
called, then there's quite a bit of additional complexity we'd need to
replicate. Per-user policies, per-role policies, a definition of which
one applies when, per-source-IP, per-connection-type (SSL vs. non-SSL),
per-security-label, etc..

> >> By contrast, it seems to me that every design so far proposed for
> >> something that is actually called row-level security - as opposed to
> >> commit 842faa714c0454d67e523f5a0b6df6500e9bc1a5, which *really is*
> >> row-level security, is extremely limited. Look back at all the things
> >> listed in the previous paragraph; can you do those things easily with
> >> the designs that have been proposed? As far as I can see, not really.
> >
> > I don't feel that RLS will, or even *should*, have the same level of
> > flexibility that you can achieve with views and/or security definer
> > functions. I expect that, over time, we will add more capabilities to
> > it, but it's never going to be able to redefine the contents of a column
> > as a view can, nor will it be able to add columns to a table as views
> > can. I don't see those as reasons against having support for RLS.
>
> What this patch is doing is basically allowing a table to really be a
> view over itself.

I don't agree with this characterization. This patch specifically
allows filtering the rows returned from the table, and it intentionally
does not allow changing the data.

> If we choose to support that, I think it is
> absolutely inevitable that people are going to want all the same
> options that they would have if they really made a separate view -
> separate permissions, WITH CHECK OPTION, all of it.

We are already looking at WITH CHECK OPTION-style support, but I
disagree that separate permissions or data changing will ever be a part
of RLS because then it's no longer RLS.

> I find the
> contrary argument - that people will only want X amount and no more -
> simply not plausible.

I'm not sure where you are seeing the requests for this feature from,
but where I have heard them it's been to match what exists in other
RDBMS's which do not have the capabilities that you're describing users
will want- yet RLS is heavily used in those organizations. For the use
cases that I've had in the past, RLS-as-defined would be the feature
that I want for most tables, with views for joins and data-changing
operations.

> If it's valuable to have some of those
> capabilities in an RLS framework, somebody's going to want all of
> them. There's no bright line to divide the things that are valuable
> in that context from those that aren't.

I see the line quite clearly- RLS is about having a filtering mechanism
and that's it. If it isn't filtering the rows (meaning giving back a
'true' or 'false' result for each row) then it's beyond RLS.

> > I'm glad to hear your thoughts on the level of granularity which might
> > be nice to have with RLS. What would be great is to spend a bit more
> > time reviewing what other systems provide in this area and considering
> > what makes sense for us. This will also be a feature and an area which
> > we will be improving for a long time to come, but we do need this
> > capability and we have to start somewhere.
>
> I think this definitely important. I also think that we should be
> careful to study the deficiencies in those other systems and to
> clearly call out what value the capabilities we're thinking of adding
> to PostgreSQL 9.5 have over the status quo in PostgreSQL 9.4. I'm not
> so much arguing that we shouldn't have row-level security as that, in
> every way that's really meaningful, we already do.

This is not the feeling that the users which I have been working with
have, nor does it match my feelings about this. As mentioned in my
email to Tom just now, having another object to deal with adds
unnecessary complexity and will require application changes potentially
to implement over existing tables.

> >> There's no way for users who are RLS exempt to turn
> >> off their exemption for testing purposes, let alone on a per-table
> >> basis.
> >
> > I don't follow this argument entirely- users can't turn off the existing
> > permissions system for testing either, unless an authorized user with
> > the correct permissions makes the change to allow it- or the user bumps
> > themselves up to superuser, or to a role which has broader permissions,
> > both of which would also be possible to do with RLS.
>
> Sure, but in the existing system, the query either returns the same
> results for everybody, or it fails outright with an error. It's
> certainly possible to screw up the existing permissions, but this new
> thing that's being proposed is much more complicated, because it's not
> just whether it works that's at issue, but what results you actually
> get.

I agree that we'll need to make sure we return the correct answer.
There is complexity there, but hopefully we've addressed much or all of
that with what we have in 9.4 and this is just adding a simpler and
often requested way to use that capability without the need to create
and manage another object in the system.

> >> There's no way to have multiple RLS policies on a single
> >> table. All of those are things that we get "for free" in the
> >> view-over-table model, and implementing formal RLS basically requires
> >> us to either invent a new RLS-specific way of doing each of those
> >> things, or suffer along with a subset of the functionality. Yuck.
> >
> > What would probably be good is to review the use-cases which the current
> > patch already addresses- and we've had good responses from actual users
> > who are already playing with the patch and are hearing that it is
> > addressing their requirements.
>
> Yes. And in particular, I think we should have a much clearer
> statement than we currently do about the use cases in which it falls
> short.

I'm happy to have that discussion with the users who are asking for this
but in the conversations that I've had to date, updatable s.b. views are
not RLS to them and I have to agree- having to maintain twice as many
objects in the system which have to be named differently and have
permissions which can be distinct from each other (which is something
that could be a *problem* if it isn't intended), must both be updated
when adding or removing columns, etc, makes that solution quite
unappealing.

> > You're suggesting that we use views instead, which clearly could run
> > someone else's code. Perhaps the user will notice that they're
> > selecting from a view instead of a table, but I've never seen a security
> > design around making sure that what is being select'd from is a table
> > vs. a view. Have you seen applications which implement such a check
> > prior to running a query?
>
> Yes. pg_dump, to name one really important one. I wouldn't be
> surprised if graphical clients did something similar - display the
> table data for a table, or the view definition for a view.

I'm quite sure you can select back the data from a view in every
graphical client that exists- and without any warning popping up that
you might be running code that someone else wrote. Yes, you can also
get the definition of the view in many cases and you can tell if what
you're selecting is a view or a table but that doesn't mean people are
actively being paranoid about that distinction or worrying about the
other cases where user-defined code might be run, even when selecting
from a table, in general.

> But I
> admit to not having checked that. More than that, if I were a DBA,
> I'd certainly be darn careful about selecting from untrusted views,
> but I expect to be able to read a table, or run pg_dump, without
> getting my account hacked.

I'd love to hear how you decide which views are trusted and which are
not. Last I checked, most serious attacks still come from internal
individuals rather than external ones. Don't get me wrong- we
definitely have an issue here that it'd be great to find a solution to,
as has been discussed extensively, but I don't see RLS as making that
problem particularly worse, and really, excluding superusers and having
the option for other users to be excluded goes above what we've done to
date in other areas.

> I don't think restricting what can go into an RLS policy is the right
> answer; that to me misses the point. What needs to be restricted is
> the possibility that a user will inadvertently run code they didn't
> mean to run.

I'm glad that you agree that restricting the RLS policy isn't the right
answer. I agree that we want to come up with a way to prevent users
from running code that isn't safe or isn't intended. I still don't see
RLS as making that particularly worse. The system is really nearly
unusable in any interactive way if you restrict yourself to operations
which can't possibly run any user-defined code today. There have been
discussions about ways to possibly improve that, and those ways would
need to address the RLS case in addition to the other already existing
cases but I don't see that as a signifigant increase in the amount of
work required to address that problem (which is already quite large..).

> > I don't see this as being an insurmountable issue. I agree that having
> > a way for pg_dump to run safely is important and the superuser check
> > does address that, given that we don't have a "read-only (and
> > everything)" capability today. Once we do (and I surely hope that will
> > come sooner rather than later), such a role should also have the 'no
> > RLS' bit, as it wouldn't make any sense for such a role anyway. The
> > lack of that is not a strike against RLS though.
>
> It addresses running pg_dump *as the superuser*, but not as a database
> owner or just a regular users. If unprivileged user A runs pg_dump -t
> some_table_owned_by_user_b, and falls victim to a Trojan horse, that
> is going to get reported as a security defect in PostgreSQL. Telling
> the person who reports that issue that it's design behavior is not
> going to make them happy, or result in good press coverage for
> PostgreSQL.

We have this problem with psql today, as has been discussed. The fact
that pg_dump doesn't happen to have this problem is great but it's no
true solution for the problem at hand.

> >> 2. There's a danger that the functionality available in the two models
> >> will diverge, so that certain things can only be done in one world or
> >> the other.
> >
> > They will always be distinct, intentionally so.
>
> I think that's an absolutely terrible idea. We do not want to be in
> the business of having two parallel systems with slightly different
> capabilities and syntax that are providing the same fundamental
> functionality. And they are: the proposal for RLS is to make it work
> just like a security_barrier view, sharing a common implementation.

While RLS could be viewed as providing a subset of what updatable sb
views provide, I can see a clear line between the two and, for my part,
we should allow users to make their own decision about if they want the
complexity involved with maintaining another object in the system to
provide the filtering or if they want to implement the filtering and the
data manipulation, joins, etc, independently.

That's really another big point to be made here- there's value in
separating these concerns. Security is a big enough concern and a big
enough issue that being able to address it explicitly and with a simple
syntax is extremely valuable. RLS as we've been discussing it allows
that, while having to include it in more complicated view definitions
could make it much more difficult to reason about. I suppose one could
define a view for just the filtering and then another view for the data
manipulation and joining over top of the other views, but, again, that
adds another level of complexity that isn't needed- and you can't be
100% sure that the only thing the supposedly filtering view is doing is
*just* filtering unless you audit it regularly.

> >> 4. Making SELECT * FROM tab ambiguous seems likely to be a security minefield.
> >
> > While I agree that we need to consider this, I don't think it will be a
> > "minefield", but rather something we need to document and educate our
> > users about. If you'd like a "disable-all-RLS" GUC, I'm all for it.
>
> I would definitely like that. I have proposed it in the past.

Great.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-12 02:37:36
Message-ID: CA+TgmoZu76cRBQEKNNHKRMmGhMfns0VWvD4HX+KHP-PgpVvmfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 11, 2014 at 8:59 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> Row-level security is not a chance for the system to deny access; it's
>> a chance for user-defined code to take control and perform arbitrary
>> operations. So the scope of what we're contemplating for row-level
>> security is really far, far more invasive than what we did for
>> column-level privileges.
>
> In this case the user-defined code needs to return a boolean. We don't
> currently do anything to prevent it from having side-effects, no, but
> the same is true with views which incorporate functions. I agree that
> it makes a difference when compared to column-level privileges, but my
> point was that we have provided easier ways to do things which were
> possible using more complicated methods before. Perhaps the risk with
> RLS is higher but these issues look managable to me and the level of
> doubt about our ability to provide this feature in a reasonable and
> principled way that our users will understand surprises me.

I'm glad the issues look manageable to you, but you haven't really
explained how to manage them. The way to dispel doubt is to come up
with specific technical proposals that address the technical issues
that have been raised. I accept that you are surprised that someone
might not think we are on the right course here, but it's entirely
appropriate for me to express my doubts about this or any other patch,
much as many people do in regards to many patches that are posted here
- generally for good and valid reasons.

For my part, I'm mildly surprised that anyone thinks it's a good idea
to have SELECT * FROM tab to mean different things depending on who is
typing it. To me, that seems very confusing; how does an unprivileged
user with no ability to assume some other role validate that the row
security policy they've configured works at all and exposes precisely
the intended set of rows? Even aside from security exposures, how
does a non-superuser who runs pg_dump know whether they've got a
complete backup or a filtered dump that's missing some rows? A
filtered dump might not even be restorable if foreign keys are
involved. I think those are serious issues that deserve serious
thought and consideration, not just a vague assurance that the issues
are probably manageable.

>> Because we don't have a good design.
>
> We're using a design that's found in multiple other RDBMS's and used
> extensively by certain industries which use those RDBMS's today. I'm
> certainly open to improving what is found in other systems for PG but I
> have a hard time seeing this approach as a bad design. Perhaps you're
> referring to our implementation, in which case I might agree and things
> like running the quals as the table owner is something which should be
> considered (I don't know how the other RDBMS's operate in this regard
> offhand- it'd be good to find out).

I'm not referring to the proposed implementation particularly; or at
least not that aspect of it. I don't think trying to run the view
quals as the defining user is likely to be very appealing, because I
think it's going to hurt performance, for example by preventing
function inlining and requiring lots of user-ID switches. But I'm not
gonna complain if someone wants to mull it over and make a proposal
for how to make it work. Rather, my concern is that all we've got is
what might be called the core of the feature; the actual guts of it.
There are a lot of ancillary details that seem to me to be not worked
out at all yet, or only half-baked.

>> I'm not categorically opposed to adding more RLS features to
>> PostgreSQL and never have been; in fact, I was deeply involved in the
>> original design of security barrier views and committed the original
>> patch to add that functionality to PostgreSQL, without which none of
>> what we're talking about here would be possible. But the
>> currently-proposed design is very unappealing to me, for the reasons
>> that I've explained. The right answer to "this feature doesn't
>> provide anything that we don't already have and will introduce major
>> new security exposures that haven't been adequate thought" is
>> debatable, but "well other people have this so we should too" is
>> definitely not it.
>
> How about "it's in high demand by our user base"? In particular, it's
> being asked for by a *highly* technical section of our user base who
> uses these capabilities today, with this design, in those other
> databases.

Sure, that's a valid reason for considering any feature. But it's not
an excuse to overlook whatever design problems may exist.

>> Are you equally unimpressed with the idea that RLS as proposed can't
>> support more than one security policy right now *at all*? Because it
>> seems to me that either you think multiple RLS policies on a single
>> table is important (in which case the current patch is inadequate) or
>> you think it's not important (in which case we need not argue about
>> whether doing it with multiple views over the same underlying table is
>> awkward).
>
> The current approach allows a nearly unlimited level of flexibility,
> should the user wish it, by being able to run user-defined code.
> Perhaps that would be considered 'one policy', but it could certainly
> take under consideration the calling user, the object being queried
> (if a function is defined per table, or if we provide a way to get
> that information in the function), etc.

In theory, that's true. But in practice, performance will suck unless
the security qual is easily optimizable. If your security qual is
WHERE somecomplexfunction() you're going to have to implement that by
sequential-scanning the table and evaluating the function for each
row.

For example, I once worked at a company where we had a table
containing information about our customers and potential customers.
Sales representatives were allowed to see their own accounts, and
partners were allowed to see accounts associated with that partner.
These things were independent. So for a sales rep, the security qual
was WHERE sales_rep_id = <something> and for a partner the security
qual was WHERE partner_id = <something>. Now, you could maybe write
this as a single qual, something like this:

WHERE sales_rep_id = (SELECT oid FROM pg_authid WHERE rolname =
current_user AND oid IN (SELECT id FROM person WHERE is_sales_rep)) OR
partner_id = (SELECT p.org_id FROM pg_authid a, person p WHERE
a.rolname = current_user and a.oid = p.id)

But that's probably not going to perform very well, because to match
an index on sales_rep_id, or an index on partner_id, that's going to
have to get simplified a whole lot, and that's probably not going to
happen. If we've only got one branch of the OR, I think we'll realize
we can evaluate the subquery as an InitPlan and then use an index, but
with two branches I think that will fail.

I don't want to overstate the importance of this particular case; but
I do think scenarios in which it's advantageous to have multiple
row-level security policies are plausible. Another, perhaps-simpler
example is that you might have a table containing unclassified data,
classified data, and secret data. You want to give access to the
unclassified data only to one category of users; access to the
unclassified data and the classified data to a second group of
more-trusted users; and access to all of the data to a third group of
very highly trusted users. If the table can only have one security
policy that applies to everyone who isn't exempt, how will you do
that? This sort of use case seems very plausible to me so I think we
need to give some real thought to what we will recommend to users who
want to do things like this. Can the proposed patch handle it? How?

>> > I don't feel that RLS will, or even *should*, have the same level of
>> > flexibility that you can achieve with views and/or security definer
>> > functions. I expect that, over time, we will add more capabilities to
>> > it, but it's never going to be able to redefine the contents of a column
>> > as a view can, nor will it be able to add columns to a table as views
>> > can. I don't see those as reasons against having support for RLS.
>>
>> What this patch is doing is basically allowing a table to really be a
>> view over itself.
>
> I don't agree with this characterization. This patch specifically
> allows filtering the rows returned from the table, and it intentionally
> does not allow changing the data.

I don't know what to say to this. What I said is, quite literally,
what the patch does. It wraps the patch in an subquery RTE that is
precisely the same thing you would get if you defined a
security_barrier view with the security qual in the WHERE clause.
This is not a question of opinion; the patch either does that or it
doesn't, and I think it does.

>> If we choose to support that, I think it is
>> absolutely inevitable that people are going to want all the same
>> options that they would have if they really made a separate view -
>> separate permissions, WITH CHECK OPTION, all of it.
>
> We are already looking at WITH CHECK OPTION-style support, but I
> disagree that separate permissions or data changing will ever be a part
> of RLS because then it's no longer RLS.

What do you mean by "data changing"? If you mean inserts, updates,
and deletes, I am very sure people are going to want to perform those
operations on RLS-enabled tables.

Do you find it implausible that someone will want to exempt a certain
role from RLS on only one table but not on other tables in the system?
Do you find it implausible that someone will want to allow a certain
table to bypass RLS when selecting rows, but not when updating or
deleting them? I find those scenarios very plausible.

>> It addresses running pg_dump *as the superuser*, but not as a database
>> owner or just a regular users. If unprivileged user A runs pg_dump -t
>> some_table_owned_by_user_b, and falls victim to a Trojan horse, that
>> is going to get reported as a security defect in PostgreSQL. Telling
>> the person who reports that issue that it's design behavior is not
>> going to make them happy, or result in good press coverage for
>> PostgreSQL.
>
> We have this problem with psql today, as has been discussed. The fact
> that pg_dump doesn't happen to have this problem is great but it's no
> true solution for the problem at hand.

It's true that users can break security by being incautious about the
queries they type into psql, and I'm all for having better tools to
manage that. But a feature that causes currently-safe uses of pg_dump
to become unsafe is, in my opinion, absolutely not OK.

I do agree with your argument that things like adding and removing
columns, or changing their data types, could be simpler with RLS than
in the view-over-table model - because in the view-over-table model,
we don't really know whether the user would like a new column to
cascade to the view, whereas in the RLS model, we can automatically do
the right thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-12 13:13:24
Message-ID: 1402578804.43282.YahooMailNeo@web122302.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Even aside from security exposures, how
> does a non-superuser who runs pg_dump know whether they've got a
> complete backup or a filtered dump that's missing some rows?

This seems to me to be a killer objection to the feature as
proposed, and points out a huge difference between column level
security and the proposed implementation of row level security.
(In fact it is a difference between just about any GRANTed
permission and row level security.)  If you try to SELECT * FROM
sometable and you don't have rights to all the columns, you get an
error.  A dump would always either work as expected or generate an
error.

test=# create user bob;
CREATE ROLE
test=# create user bill;
CREATE ROLE
test=# set role bob;
SET
test=> create table person (person_id int not null primary key,
name text not null, ssn text);
CREATE TABLE
test=> grant select (person_id, name) on table person to bill;
GRANT
test=> reset role;
RESET
test=# set role bill;
SET
test=> select person_id, name from person;
 person_id | name
-----------+------
(0 rows)

test=> select * from person;
ERROR:  permission denied for relation person

The proposed approach would leave the validity of any dump which
was not run as a superuser in doubt.  The last thing we need, in
terms of improving security, is another thing you can't do without
connecting as a superuser.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-12 22:33:20
Message-ID: 539A2AB0.1030806@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6/11/14, 10:26 AM, Robert Haas wrote:
> Now, as soon as we introduce the concept that selecting from a table
> might not really mean "read from the table" but "read from the table
> after applying this owner-specified qual", we're opening up a whole
> new set of attack surfaces. Every pg_dump is an opportunity to hack
> somebody else's account, or at least audit their activity.

I'm in full agreement we should clearly communicate the issues around
pg_dump in particular, because they can't necessarily be eliminated
altogether without some major work that's going to take a while to
finish. And if the work-around is some sort of GUC for killing RLS
altogether, that's ugly but not unacceptable to me as a short-term fix.

One of the difficult design requests in my inbox right now asks how
pg_dump might be changed both to reduce its overlap with superuser
permissions and to allow auditing of its activity. Those requests
aren't going away; their incoming frequency is actually rising quite
fast right now. They're both things people expect from serious SQL
oriented commercial database products, and I'd like to see PostgreSQL
continue to displace those as we reach feature parity in those areas.

Any way you implement finer grained user permissions and auditing
features will be considered a new attack vector when you use those
features. The way the proposed RLS feature inserts an arbitrary
function for reads has a similar new attack vector when you use that
feature.

I'm kind of surprised to see this turn into a hot button all of the
sudden though, because my thought on all that so far has been a giant so
what? This is what PostgreSQL does.

You wanna write your own C code and then link the thing right into the
server, so that bugs can expose data and crash the whole server? Not
only can you shoot yourself in the foot that way, we supply a sample gun
and bullets in contrib. How about writing arbitrary code in any one of
a dozen server-side languages of wildly varying quality, then hooking
that code so it runs as a trigger function whenever you change a row?
PostgreSQL is *on it*; we love letting people write some random thing,
and then running that random thing against your data as a side-effect of
doing an operation. And if you like that...just wait until you learn
about this half-assed rules feature we have too!

And when the database breaks because the functions people inserted were
garbage, that's their fault, not a cause for a CVE. And when someone
blindly installs adminpack because it sounded like a pgAdmin
requirement, lets a monitoring system run as root so it can watch
pg_stat_activity, and then discovers that pair of reasonable decisions
suddenly means any fool with monitoring access can call
pg_file_unlink...that's their fault too. These are powerful tools with
serious implications, and they're expected to be used by equally serious
users.

We as a development community do need to put a major amount of work into
refactoring all of these security mechanisms. There should be less of
these embarrassing incidents where bad software design really forced the
insecure thing to happen, which I'd argue is the case for that
pg_stat_activity example. And luckily so far development resources are
appearing for organizations I know of working in that direction
recently, as fast as the requirements are rising. I think there's a
good outcome at the end of that road.

But let's not act like RLS is a scary bogeyman because it introduces a
new way to hack the server or get surprising side-effects. That's
expected and possibly unavoidable behavior in a feature like this, and
there are much worse instances of arbitrary function risk throughout the
core code already.

--
Greg Smith greg(dot)smith(at)crunchydatasolutions(dot)com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-13 00:13:50
Message-ID: CAOuzzgqO7i7SjBegshf3KUBkL_KyZGgsWkgYgHAQ4hn=4Htkyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg, all,

I will reply to the emails in detail when I get a chance but am out of town
at a funeral, so it'll likely be delayed. I did want to echo my agreement
for the most part with Greg and in particular...

On Thursday, June 12, 2014, Gregory Smith <gregsmithpgsql(at)gmail(dot)com> wrote:

> On 6/11/14, 10:26 AM, Robert Haas wrote:
>
>> Now, as soon as we introduce the concept that selecting from a table
>> might not really mean "read from the table" but "read from the table after
>> applying this owner-specified qual", we're opening up a whole new set of
>> attack surfaces. Every pg_dump is an opportunity to hack somebody else's
>> account, or at least audit their activity.
>>
>
> I'm in full agreement we should clearly communicate the issues around
> pg_dump in particular, because they can't necessarily be eliminated
> altogether without some major work that's going to take a while to finish.
> And if the work-around is some sort of GUC for killing RLS altogether,
> that's ugly but not unacceptable to me as a short-term fix.

A GUC which is enable / disable / error-instead may work quiet well, with
error-instead for pg_dump default if people really want it (there would
have to be a way to disable that though, imv).

Note that enable is default in general, disable would be for superuser only
(or on start-up) to disable everything, and error-instead anyone could use
but it would error instead of implementing RLS when querying an RLS-enabled
table.

This approach was suggested by an existing user testing out this RLS
approach, to be fair, but it looks pretty sane to me as a way to address
some of these concerns. Certainly open to other ideas and thoughts though.

Thanks,

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-13 07:11:44
Message-ID: CAEZATCXT0h=4G_WTEtoy_g6PPxgckZsQLMOgt4vSTifRHMdCQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13 June 2014 01:13, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Greg, all,
>
> I will reply to the emails in detail when I get a chance but am out of town
> at a funeral, so it'll likely be delayed. I did want to echo my agreement
> for the most part with Greg and in particular...
>
> On Thursday, June 12, 2014, Gregory Smith <gregsmithpgsql(at)gmail(dot)com> wrote:
>>
>> On 6/11/14, 10:26 AM, Robert Haas wrote:
>>>
>>> Now, as soon as we introduce the concept that selecting from a table
>>> might not really mean "read from the table" but "read from the table after
>>> applying this owner-specified qual", we're opening up a whole new set of
>>> attack surfaces. Every pg_dump is an opportunity to hack somebody else's
>>> account, or at least audit their activity.
>>
>>
>> I'm in full agreement we should clearly communicate the issues around
>> pg_dump in particular, because they can't necessarily be eliminated
>> altogether without some major work that's going to take a while to finish.
>> And if the work-around is some sort of GUC for killing RLS altogether,
>> that's ugly but not unacceptable to me as a short-term fix.
>
>
> A GUC which is enable / disable / error-instead may work quiet well, with
> error-instead for pg_dump default if people really want it (there would have
> to be a way to disable that though, imv).
>
> Note that enable is default in general, disable would be for superuser only
> (or on start-up) to disable everything, and error-instead anyone could use
> but it would error instead of implementing RLS when querying an RLS-enabled
> table.
>
> This approach was suggested by an existing user testing out this RLS
> approach, to be fair, but it looks pretty sane to me as a way to address
> some of these concerns. Certainly open to other ideas and thoughts though.
>

Yeah, I was thinking something like this could work, but I would go
further. Suppose you had separate GRANTable privileges for direct
access to individual tables, bypassing RLS, e.g.

GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name

Combined with the GUC (direct_table_access, say) to request direct
access to all tables. Then with direct_table_access = true/required, a
SELECT from a table would error if the user hadn't been granted the
DIRECT SELECT privilege on all the tables referenced in the query.
Tools like pg_dump would require direct_table_access, but there might
be other levels of access that didn't error out.

I think if I were using RLS, I would definitely want/expect this level
of fine-grained control over permissions on a per-table basis, rather
than the superuser/non-superuser level of control, or having
RLS-exempt users.

Actually, given the fact that the majority of users won't be using
RLS, I would be tempted to invert the above logic and have the new
privilege be for LIMITED access (via RLS quals). So a user granted the
normal SELECT privilege would be able to bypass RLS, but a user only
granted LIMITED SELECT wouldn't.

Regards,
Dean


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Greg Smith <greg(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-16 04:24:16
Message-ID: 20140616042416.GJ2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com> writes:
> > Through this effort, we have concluded that for RLS the case of
> > invalidating a plan is only necessary when switching between a superuser
> > and a non-superuser. Obviously, re-planning on every role change would be
> > too costly, but this approach should help minimize that cost. As well,
> > there were not any cases outside of this one that were immediately apparent
> > with respect to RLS that would require re-planning on a per userid basis.
>
> Hm ... I'm not following why we'd need a special case for superusers and
> not anyone else? Seems like any useful RLS scheme is going to require
> more privilege levels than just superuser and not-superuser.

Just to clarify this- the proposal allows RLS to be implemented
essentially by any user-defined qual, where that qual can include the
current user, the IP the user is connecting from, or more-or-less
anything else, possibly even via a user-defined function or security
module. It is not superuser-or-not. This discussion is about how to
support users for which RLS should not be applied. I can see that being
useful at a more granular level than superuser-or-not, but even at that
level, RLS is still extremely useful.

> Could we put the "if superuser then ok" test into the RLS condition test
> and thereby not need more than one plan at all?

As discussed, that unfortunately doesn't quite work.

This discussion, in general, has been quite useful and I'll work on
adding documentation to the wiki pages which discusses the consideration
and suggestions for a GUC to disable-or-error when RLS is encountered,
along with a per-role capability to bypass RLS; that is in line with the
goal of avoiding adding superuser-specific capabilities.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-16 04:30:44
Message-ID: 20140616043044.GK2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> > I agree, and now that the urgency of trying to deliver this for 9.4 is
> > over it's worth seeing if we can just run as table owner.
>
> > Failing that, we could take the approach a certain other RDBMS does and
> > make the ability to define row security quals a GRANTable right
> > initially held only by the superuser.
>
> Hmm ... that might be a workable compromise. I think the main issue here
> is whether we expect that RLS quals will be something that the planner
> could optimize to any meaningful extent. If they're always (in effect)
> wrapped in SECURITY DEFINER functions, I think that largely blocks any
> optimizations; but maybe that wouldn't matter in practice.

From what I've heard from actual users with other RDBMS's who are coming
to PostgreSQL- the reality is that they're going to be using a security
module (eg: SELinux) whose responsibility it is to manage this whole
question of "can this user see this row", meaning there's zero chance of
optimization.

I'd certainly like to see the ability to optimize remain in cases where
the qual itself gives us a way to filter (eg: a table partitioned based
on some security level, where another table maps users to levels), but
that is, from a practical standpoint, not an immediate concern from real
users and I don't believe our approach paints us into a corner which
would prevent that. What that would require is better support for true
partitioning rather than constraint exclusions.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-16 05:15:30
Message-ID: 20140616051530.GL2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Jun 11, 2014 at 8:59 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > In this case the user-defined code needs to return a boolean. We don't
> > currently do anything to prevent it from having side-effects, no, but
> > the same is true with views which incorporate functions. I agree that
> > it makes a difference when compared to column-level privileges, but my
> > point was that we have provided easier ways to do things which were
> > possible using more complicated methods before. Perhaps the risk with
> > RLS is higher but these issues look managable to me and the level of
> > doubt about our ability to provide this feature in a reasonable and
> > principled way that our users will understand surprises me.
>
> I'm glad the issues look manageable to you, but you haven't really
> explained how to manage them.

There's been a number of suggestions made and it'd be great to get more
feedback on them- running the quals as the table owner, having a GUC
which can be set to either run 'as normal' or either ignore RLS (if the
user has that right) or error out if RLS would happen, and undoubtably
there are other ideas along those same lines to address the pg_dump and
other concerns.

> For my part, I'm mildly surprised that anyone thinks it's a good idea
> to have SELECT * FROM tab to mean different things depending on who is
> typing it.

Realistically, in the RDBMS realm in which we're in and that we're
working to break into- this is absolutely a given and expected. It's
new to PostgreSQL, certainly, but it's not uncommon or surprising at all
in our industry.

> To me, that seems very confusing; how does an unprivileged
> user with no ability to assume some other role validate that the row
> security policy they've configured works at all and exposes precisely
> the intended set of rows?

While I see what you're getting at, I'm not convinced it's really all
that different from being set up without access to some schema or table
which the administrator setting up accounts didn't include for you.
Sure, in the case of a schema or table, you can get an error back
instead of just not seeing the data, but if you're looking for specific
data, chances are pretty good you'll realize the lack of data quickly
and ask the same question regarding access.

To wit, I've certainly had users ask exactly that question of- "do I
have access to all the data in this table?" even when using PG where
it's a bit tricky to limit such access. Clearly, the same risk applies
when using views and so the question is understandable. Perhaps these
were users with more experience in other RDBMS's where it's more common
to have RLS, but there are at least a couple cases which I can think of
where that wouldn't apply.

> Even aside from security exposures, how
> does a non-superuser who runs pg_dump know whether they've got a
> complete backup or a filtered dump that's missing some rows?

This would be addressed with the GUC that's been proposed. As would the
previous paragraph, though I wanted to apply to that independently.

> I'm not referring to the proposed implementation particularly; or at
> least not that aspect of it. I don't think trying to run the view
> quals as the defining user is likely to be very appealing, because I
> think it's going to hurt performance, for example by preventing
> function inlining and requiring lots of user-ID switches.

I understand that there are performance implications. As mentioned to
Tom, realistically, there's zero way to optimized at least some of these
use-cases because they require a completely external module (eg:
SELlinux) to be involved in the decision about who can view what
records. If we can optimize that, it'd be by a completely different
approach whereby we pull up the qual higher because we know the whole
query only involves leakproof functions or similar, allowing us to only
apply the filter to the final set of records prior to them being
returned to the user. The point being that such optimizations would
happen independently and regardless of the quals or user-defined
functions involved. At the end of the day, I can't think of a better
optimization for such a case (where we have to ask an external security
module if a row is acceptable to return to the user) than that. Is
there something specific you're thinking about that we'd be missing out
on?

> But I'm not
> gonna complain if someone wants to mull it over and make a proposal
> for how to make it work. Rather, my concern is that all we've got is
> what might be called the core of the feature; the actual guts of it.
> There are a lot of ancillary details that seem to me to be not worked
> out at all yet, or only half-baked.

Perhaps it's just my experience, but I've been focused on the main core
feature for quite some time and it feels like we're really close to
having it there. I agree that a few additional bits would be nice to
have but these strike me as relatively straight-forward to implement
overtop of this general construct. I do see value in documenting these
concerns and will see about making that happen, along with what the
general viewpoints and thoughts are about how to address the concern.

> > How about "it's in high demand by our user base"? In particular, it's
> > being asked for by a *highly* technical section of our user base who
> > uses these capabilities today, with this design, in those other
> > databases.
>
> Sure, that's a valid reason for considering any feature. But it's not
> an excuse to overlook whatever design problems may exist.

Agreed- improvements in the design, provided it continues to meet the
expectations of the user-base, are absolutely welcome.

> > The current approach allows a nearly unlimited level of flexibility,
> > should the user wish it, by being able to run user-defined code.
> > Perhaps that would be considered 'one policy', but it could certainly
> > take under consideration the calling user, the object being queried
> > (if a function is defined per table, or if we provide a way to get
> > that information in the function), etc.
>
> In theory, that's true. But in practice, performance will suck unless
> the security qual is easily optimizable. If your security qual is
> WHERE somecomplexfunction() you're going to have to implement that by
> sequential-scanning the table and evaluating the function for each
> row.

That's not actualy true today, is it? Given our leak-proof attribute,
if the qual is "WHERE somecomplexfunction() AND leakprooffunctionx()"
then we would be able to push down the leak-proof function and not
necessairly run a straight sequential scan, no? Even so, though, we've
had users who have tested exactly what this patch implements and they've
been happy with their real-world use-cases. I'm certainly all for
optimization and would love to see us make this better for everyone, but
I don't view that as a reason to delay this particular feature which is
really just bringing us up to parity with other RDMBS's.

> For example, I once worked at a company where we had a table
> containing information about our customers and potential customers.
> Sales representatives were allowed to see their own accounts, and
> partners were allowed to see accounts associated with that partner.
> These things were independent. So for a sales rep, the security qual
> was WHERE sales_rep_id = <something> and for a partner the security
> qual was WHERE partner_id = <something>. Now, you could maybe write
> this as a single qual, something like this:
>
> WHERE sales_rep_id = (SELECT oid FROM pg_authid WHERE rolname =
> current_user AND oid IN (SELECT id FROM person WHERE is_sales_rep)) OR
> partner_id = (SELECT p.org_id FROM pg_authid a, person p WHERE
> a.rolname = current_user and a.oid = p.id)

That looks like it'd work, or a pl/pgsql function which did the same.

> But that's probably not going to perform very well, because to match
> an index on sales_rep_id, or an index on partner_id, that's going to
> have to get simplified a whole lot, and that's probably not going to
> happen. If we've only got one branch of the OR, I think we'll realize
> we can evaluate the subquery as an InitPlan and then use an index, but
> with two branches I think that will fail.

You're right- we could perform better in such a case.

What solution did you come up with for this case, which performed well
and was also secure..?

> I don't want to overstate the importance of this particular case; but
> I do think scenarios in which it's advantageous to have multiple
> row-level security policies are plausible.

I'm not against this in general. The question, in my mind, is what
level of granularity we would provide this at. As I tried to outline
previously, there's a huge number of combinations which we could come up
with to support this under and I'm not 100% sure that it'd actualy end
up being better than the simplicity of a single qual where the user gets
to define any kind of relationship they want between the various
policies; even programatically if they want.

> Another, perhaps-simpler
> example is that you might have a table containing unclassified data,
> classified data, and secret data. You want to give access to the
> unclassified data only to one category of users; access to the
> unclassified data and the classified data to a second group of
> more-trusted users; and access to all of the data to a third group of
> very highly trusted users. If the table can only have one security
> policy that applies to everyone who isn't exempt, how will you do
> that? This sort of use case seems very plausible to me so I think we
> need to give some real thought to what we will recommend to users who
> want to do things like this. Can the proposed patch handle it? How?

There are multiple ways this could be implemented- the first, basic, way
would be through a table which maps users to security levels via an enum
where more privileged levels are higher in value and therefore a simple
greater-than could be applied after a join which would implement this
particular policy.

The reality (which I've had discussions with users about..) is actually
much more complicated where an extermal security module makes the
decision about if a given user/connection can have access to a specific
bit of labeled data. The reason is that things are not classified so
simply as "unclass", "class" and "secret" but rather into much more
granular pieces- user X might have access to A and B, but not C, while
user Y can access B and C. The absolute levels described above may
exist for less sensetive data but for data beyond that (which I'd hazard
to guess is most of it...), more granularity and control is needed.

> >> What this patch is doing is basically allowing a table to really be a
> >> view over itself.
> >
> > I don't agree with this characterization. This patch specifically
> > allows filtering the rows returned from the table, and it intentionally
> > does not allow changing the data.
>
> I don't know what to say to this. What I said is, quite literally,
> what the patch does. It wraps the patch in an subquery RTE that is
> precisely the same thing you would get if you defined a
> security_barrier view with the security qual in the WHERE clause.

Exactly- it does *not* allow changing the SELECT clause, or adding in a
GROUP BY, or a JOIN, or tossing in a windowing function, etc.

> This is not a question of opinion; the patch either does that or it
> doesn't, and I think it does.

Apologies for not being clearer but my point was that only the WHERE
clause can be modified by this patch, which is quite intentional. This
separates the concerns of "can I access this data" from "modify the data
to represent it in X way".

> > We are already looking at WITH CHECK OPTION-style support, but I
> > disagree that separate permissions or data changing will ever be a part
> > of RLS because then it's no longer RLS.
>
> What do you mean by "data changing"? If you mean inserts, updates,
> and deletes, I am very sure people are going to want to perform those
> operations on RLS-enabled tables.

Yes, they'll want to support those operations. However, they will not
expect RLS to allow them to redefine a columns as "x+10" instead of "x",
which a view does allow.

> Do you find it implausible that someone will want to exempt a certain
> role from RLS on only one table but not on other tables in the system?

No- excempting certain roles from RLS makes sense as a capability.

> Do you find it implausible that someone will want to allow a certain
> table to bypass RLS when selecting rows, but not when updating or
> deleting them? I find those scenarios very plausible.

This is also plausible and something which we were anticipating while
developing this patch. Simon, KaiGai and I specifically discussed
addressing SELECT vs UPDATE/DELETE earlier this year, as I recall.
Providing that level of flexibility is absolutely on the road map, but I
don't know that it all has to exist in 9.5; it may, which would be
great, but I don't view it as required.

> > We have this problem with psql today, as has been discussed. The fact
> > that pg_dump doesn't happen to have this problem is great but it's no
> > true solution for the problem at hand.
>
> It's true that users can break security by being incautious about the
> queries they type into psql, and I'm all for having better tools to
> manage that. But a feature that causes currently-safe uses of pg_dump
> to become unsafe is, in my opinion, absolutely not OK.

I don't particularly like it, and would require a way to override it,
but a GUC which pg_dump sets by default that says "give me everything or
error" would work to address this. I'm open to other thoughts, of
course, but it does seem like a relatively simple solution (which is a
good thing when it comes to security concerns, imv).

> I do agree with your argument that things like adding and removing
> columns, or changing their data types, could be simpler with RLS than
> in the view-over-table model - because in the view-over-table model,
> we don't really know whether the user would like a new column to
> cascade to the view, whereas in the RLS model, we can automatically do
> the right thing.

Agreed.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-16 05:25:45
Message-ID: 20140616052545.GM2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin,

* Kevin Grittner (kgrittn(at)ymail(dot)com) wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > Even aside from security exposures, how
> > does a non-superuser who runs pg_dump know whether they've got a
> > complete backup or a filtered dump that's missing some rows?
>
> This seems to me to be a killer objection to the feature as
> proposed, and points out a huge difference between column level
> security and the proposed implementation of row level security.

I really hate this notion of "killer objection". It's been discussed
(perhaps not seen by all) at least one suggestion for how to address
this specific issue and there are other ways in which to address it
(having COPY have the same behavior as the GUC being discussed, instead
of having a GUC, though I feel like the GUC is a better approach..).

> (In fact it is a difference between just about any GRANTed
> permission and row level security.)  If you try to SELECT * FROM
> sometable and you don't have rights to all the columns, you get an
> error.  A dump would always either work as expected or generate an
> error.

Provided you know all of the tables and other objects which need to be
included in such a partial dump (as a full dump, today, must be run by a
superuser to be sure you're actually getting everything anyway...).

> The proposed approach would leave the validity of any dump which
> was not run as a superuser in doubt.  The last thing we need, in
> terms of improving security, is another thing you can't do without
> connecting as a superuser.

Any dump not run by a superuser is already in doubt, imv. That is a
problem we already have which really needs to be addressed, but I view
that as an independent issue.

I agree with avoiding adding another superuser-only capability; see the
other sub-thread about making this a per-user capability.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-16 14:12:58
Message-ID: 20140616141258.GA29243@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 13 June 2014 01:13, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > This approach was suggested by an existing user testing out this RLS
> > approach, to be fair, but it looks pretty sane to me as a way to address
> > some of these concerns. Certainly open to other ideas and thoughts though.
>
> Yeah, I was thinking something like this could work, but I would go
> further. Suppose you had separate GRANTable privileges for direct
> access to individual tables, bypassing RLS, e.g.
>
> GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name

This is certainly an interesting idea and I'm glad we're getting this
level of discussion early on in the 9.5 cycle as I'd really like to see
a good solution implemented for 9.5.

I've been going back-and-forth about this and what's really swaying me
right now is that it'd be nearly impossible to determine if a given RLS
qual actually allows full access to a table for a given user without
going through the entire table and testing the qual against each row.
With this GRANT ability, we'd be able to completely avoid calling the
RLS quals when the user is granted this right.

Not sure offhand how many bits we've got left at the per-table level
though; we added TRUNCATE rights not that long ago and this looks like
another good right to add, but there are only so many bits available..
At the same time, I do think this is something we could also add later,
perhaps after figuring out a good way to extend the set of bits
available for privileges on tables.

> Combined with the GUC (direct_table_access, say) to request direct
> access to all tables. Then with direct_table_access = true/required, a
> SELECT from a table would error if the user hadn't been granted the
> DIRECT SELECT privilege on all the tables referenced in the query.

I can see this working. One thing I'm curious about is if we would want
to support this inside of the SELECT statement (or perhaps COPY?)
directly, rather than making a user have to flip a GUC back and forth
while they're doing something. I can imagine, during testing, a session
looking like this:

select * from table;
@#(at)!$!
set direct_table_access = true;
select * from table;
select * from table where blah = x;
alter table set row level security blah = x;
select * from table;
select * from table;
select * from table;
@!#$!(at)#!
set direct_table_access = false;
select * from table;
...

Would 'select direct' or 'select * from DIRECT table' (or maybe 'ONLY'?)
be workable? There's certainly SQL standard concerns to be thought of
here which might precldue anything we do with SELECT, but we could
support something with COPY.

> Tools like pg_dump would require direct_table_access, but there might
> be other levels of access that didn't error out.

pg_dump would need an option to set direct_table_access or not. Having
it ask by default is acceptable to me, but I do think we need to be able
to tell it to *not* set that.

> I think if I were using RLS, I would definitely want/expect this level
> of fine-grained control over permissions on a per-table basis, rather
> than the superuser/non-superuser level of control, or having
> RLS-exempt users.

I agree that it'd be great to have- and we need to make sure we don't
paint ourselves into a corner with the initial versions. What I'm
worried about is that we're going to end up feature-creeping this to
death and ending up with nothing in 9.5. I'll try to get a wiki page
going to discuss these items (as mentioned up-thread) and we can look at
prioritizing them and looking at what dependencies exist on other parts
of the system and seeing what's required for the initial version.

> Actually, given the fact that the majority of users won't be using
> RLS, I would be tempted to invert the above logic and have the new
> privilege be for LIMITED access (via RLS quals). So a user granted the
> normal SELECT privilege would be able to bypass RLS, but a user only
> granted LIMITED SELECT wouldn't.

This I don't agree with- it goes against what is done on existing
systems afaik and part of the idea is that you can minimize changes to
the applications or users but still be able to curtail what they can
see. Making regular SELECTs start erroring if they haven't set some GUC
because RLS has been implemented on a given table would be quite
annoying, imv.

Now, that said, wouldn't the end user be able to control this for their
particular environment by setting the GUC accordingly in
postgresql.conf? I'd still argue that it should be defaulted to what I
view as the 'normal' case, where RLS is applied unless you asked for
your queries to error instead, but if a user wants to have it flipped
around the other way, they could update their postgresql.conf to make it
so.

Thanks,

Stephen


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-16 14:44:11
Message-ID: 1402929851.96221.YahooMailNeo@web122305.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Kevin Grittner (kgrittn(at)ymail(dot)com) wrote:

>> The proposed approach would leave the validity of any dump which
>> was not run as a superuser in doubt.  The last thing we need, in
>> terms of improving security, is another thing you can't do
>> without connecting as a superuser.
>
> Any dump not run by a superuser is already in doubt, imv.  That
> is a problem we already have which really needs to be addressed,
> but I view that as an independent issue.

I'm not seeing that.  If the user can't dump, you get an error and
pg_dump returns something other than SUCCESS.

test=# create user bob;
CREATE ROLE
test=# create user tom;
CREATE ROLE
test=# set role bob;
SET
test=> create table person(person_id int primary key, name text not null, ssn text);
CREATE TABLE
test=> insert into person values (1, 'Stephen Frost', '123-45-6789');
INSERT 0 1
test=> insert into person values (2, 'Kevin Grittner');
INSERT 0 1
test=> grant select (person_id, name) on person to tom;
GRANT
test=> \q
kgrittn(at)Kevin-Desktop:~/pg/master$ pg_dump -U bob test >bob-backup.sql
kgrittn(at)Kevin-Desktop:~/pg/master$ pg_dump -U tom test >tom-backup.sql
pg_dump: [archiver (db)] query failed: ERROR:  permission denied for relation person
pg_dump: [archiver (db)] query was: LOCK TABLE public.person IN ACCESS SHARE MODE
kgrittn(at)Kevin-Desktop:~/pg/master$ echo $?
1

> I agree with avoiding adding another superuser-only capability;
> see the other sub-thread about making this a per-user capability.

It should be possible to design something which does not have this
risk.  What I was saying was that what was being described at that
point wasn't it, and IMV was not acceptable.  I think that there
should never by any doubt that a pg_dump run which completes
without error copied all requested tables in their entirety, not a
subset of the rows in the tables.

A GUC which only caused an error on the attempt to actually read
specific rows which the user does not have permission to see would
leak too much information.  A GUC which caused a SELECT or COPY
from a table to throw an error if the user was not entitled to see
all rows in the table could work.  Another thing which could work,
if it can be coded, would be a GUC which would throw an error if
the there were not quals on the query to prohibit seeing rows which
the security conditions would prohibit, whether or not any matching
rows actually existed.  The latter would match the behavior of
column level security -- you get an error when trying to select a
prohibited column even if there are no rows in the table.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-16 15:28:35
Message-ID: 20140616152835.GD29243@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Kevin Grittner (kgrittn(at)ymail(dot)com) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Any dump not run by a superuser is already in doubt, imv.  That
> > is a problem we already have which really needs to be addressed,
> > but I view that as an independent issue.
>
> I'm not seeing that.  If the user can't dump, you get an error and
> pg_dump returns something other than SUCCESS.

We've outlined an approach with RLS which would do the same.

I'm still of the opinion that, today, we have a problem that only a
superuser-run dump has any chance of success (and even if you get it
working today it'll probably break tomorrow, and you had better be
paying attention). I'd like to fix that situation, but it's an
independent effort from this. We've had issues in the past with pg_dump
creating things that can't be restored and they're certainly bugs but
trying to make that work with a regular user as a whole system backup
strategy, today, is just asking for trouble.

> > I agree with avoiding adding another superuser-only capability;
> > see the other sub-thread about making this a per-user capability.
>
> It should be possible to design something which does not have this
> risk. 

The risk that pg_dump might create a dump which can't be restored?
Agreed, and I'd love to hear your thoughts on the proposal.

> What I was saying was that what was being described at that
> point wasn't it, and IMV was not acceptable.  I think that there
> should never by any doubt that a pg_dump run which completes
> without error copied all requested tables in their entirety, not a
> subset of the rows in the tables.

pg_dump needs to be able to have an option to go either way on this
case, as I can see value in running pg_dump in "RLS-enforcing" mode, but
it could default to "error-if-RLS".

> A GUC which only caused an error on the attempt to actually read
> specific rows which the user does not have permission to see would
> leak too much information.  A GUC which caused a SELECT or COPY
> from a table to throw an error if the user was not entitled to see
> all rows in the table could work.

Right- this would be the 'DIRECT SELECT' which would allow bypassing all
RLS and therefore mean that the user is allowed to see ALL rows of a
table. That's one of the reasons why I agree with Dean's approach,
because we really need to know at the outset if the calling user is
allowed to extract all rows from a table or not- we can't go looking
through the entire table testing each row before we start running the
query.

>   Another thing which could work,
> if it can be coded, would be a GUC which would throw an error if
> the there were not quals on the query to prohibit seeing rows which
> the security conditions would prohibit, whether or not any matching
> rows actually existed. 

If I'm following you correctly, this would be an optimization that
allows avoiding RLS in the case where some information about the user
causes the overall qual to always return 'true', correct? I'd certainly
like to see what happens in that case today and agree that it'd be great
to optimize for and perhaps even allow a user for which that is true to
not need the 'DIRECT SELECT' privilege, but in practice, I don't think
it'll be possible in most cases (certainly not in the case where an
external security module is deciding the access) and the optimization
may not be worth it.

> The latter would match the behavior of
> column level security -- you get an error when trying to select a
> prohibited column even if there are no rows in the table.

Agreed, but that would be a relaxation of the proposed approach and
therefore something which could be added later, if it's deemed
worthwhile.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-17 18:54:04
Message-ID: CA+TgmobKYTQA+MHCNi8v8Q7tVSgZjXJD3DhpaJ1m_42bqHocUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 12, 2014 at 6:33 PM, Gregory Smith <gregsmithpgsql(at)gmail(dot)com> wrote:
> I'm kind of surprised to see this turn into a hot button all of the sudden
> though, because my thought on all that so far has been a giant so what?
> This is what PostgreSQL does.
[...]
> But let's not act like RLS is a scary bogeyman because it introduces a new
> way to hack the server or get surprising side-effects. That's expected and
> possibly unavoidable behavior in a feature like this, and there are much
> worse instances of arbitrary function risk throughout the core code already.

I have some technical comments on later emails in this thread, but
first let me address this point. In the past, people have sometimes
complained that reviewers waited until very late in the cycle to
complain about issues which they found problematic. By the time the
issues were pointed out, insufficient time remained before feature
freeze to get those issues addressed, causing the patch to slip out of
the release and provoking developer frustration. It has therefore
been requested numerous times by numerous people that potential issues
be raised as early as possible.

The concerns that I have raised in this thread are not new; I have
raised them before. However, we are now at the beginning of a new
development cycle, and it seems fair to assume that the people who are
working on this patch are hoping very much that something will get
committed to 9.5. Since it seems to me that there are unaddressed
issues with the design of this patch, I felt that it was a good idea
to make sure that those concerns were on the table right from the
beginning of the process, rather than waiting until the patch was on
the verge of commit or, indeed, already committed. That is why, when
this thread was revived on June 10th, I decide that it was a good time
to again comment on the design points about which I was concerned.

After sending that one (1) email, I was promptly told that "I'm very
disappointed to hear that the mechanical pieces around making RLS easy
for users to use ... is receiving such push-back." The push-back, at
that point in time, consisted of one (1) email. Several more emails
have been sent that time, including the above-quoted text, seeming to
me to imply that the people who are concerned about this feature are
being unreasonable. I don't believe I am the only such person,
although I may be the main one right at the moment, and you may not be
entirely surprised to hear that I don't think I'm being unreasonable.

I will admit that my initial email may have contained just a touch of
hyperbole. But I won't admit to more than a touch, and frankly, I
think it was warranted. I perfectly well understand that people
really, really, really want this feature, and if I hadn't understood
that before, I certainly understand it now. However, I believe that
there has been a lack of focus in the development of the patch thus
far in a couple of key areas - first in terms of articulating how it
is different from and better than a writeable security barrier view,
and second on how to manage the security and operational aspects of
having a feature like this. I think that the discussion subsequent to
my June 10th email has let to some good discussion on both points,
which was my intent, but I still think much more time and thought
needs to be spent on those issues if we are to have a feature which is
up to our usual standards. I do apologize to anyone who interpreted
that initial as a pure rant, because it really wasn't intended that
way. Contrariwise, I hope that the people defending this patch will
admit that the issues I am raising are real and focus on whether and
how those concerns can be addressed.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-17 19:14:23
Message-ID: CA+TgmoYc+D=uLhf3f287gJ9XEKLY31WzT2NB99rO-67f2B9hKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 12, 2014 at 8:13 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> I'm in full agreement we should clearly communicate the issues around
>> pg_dump in particular, because they can't necessarily be eliminated
>> altogether without some major work that's going to take a while to finish.
>> And if the work-around is some sort of GUC for killing RLS altogether,
>> that's ugly but not unacceptable to me as a short-term fix.
>
> A GUC which is enable / disable / error-instead may work quiet well, with
> error-instead for pg_dump default if people really want it (there would have
> to be a way to disable that though, imv).
>
> Note that enable is default in general, disable would be for superuser only
> (or on start-up) to disable everything, and error-instead anyone could use
> but it would error instead of implementing RLS when querying an RLS-enabled
> table.
>
> This approach was suggested by an existing user testing out this RLS
> approach, to be fair, but it looks pretty sane to me as a way to address
> some of these concerns. Certainly open to other ideas and thoughts though.

In general, I agree that this is a good approach. I think it will be
difficult to have a GUC with three values, one of which is
superuser-only and the other two of which are not. I don't think
there's any precedent for something like that in the existing
framework, and I think it's likely we'll run into unpleasant corner
cases if we try to graft it in. Also, I think we need to separate
things: whether the system is willing to allow the user to access the
table without RLS, and whether the user is willing to accept RLS if
the system deems it necessary.

For the first one, two solutions have been proposed. The initial
proposal was to insist on RLS except for the superuser (and maybe the
table owner?). Having a separate grantable privilege, as Dean
suggests, may be better. I'll reply separately to that email also, as
I have a question about what he's proposing.

For the second one, I think the two most useful behaviors are "normal
mode" - i.e. allow access to the table, applying RLS predicates if
required and not applying them if I am exempt - and "error-instead"
mode - i.e. if my access to this table would be mediated by an RLS
predicate, then throw an error instead. There's a third mode which
might be useful as well, which is "even though I have the *right* to
bypass the RLS predicate on this table, please apply the predicate
anyway". This could be used by the table owner in testing, for
example. Here again, the level of granularity we want to provide is
an interesting question. Having a GUC (e.g. enable_row_level_security
= on, off, force) would be adequate for pg_dump, but allowing the
table name to be qualified in the query, as proposed downthread, would
be more granular, admittedly at some parser cost. I'm personally of
the view that we *at least* need the GUC, because that seems like the
best way to secure pg_dump, and perhaps other applications. We can
and should give pg_dump an--allow-row-level-security flag, I think,
but pg_dump's default behavior should be to configure the system in
such a way that the dump will fail rather than complete with a subset
of the data. I'm less sure whether we should have something that can
be used to qualify table names in particular queries. I think it
would be really useful, but I'm concerned that it will require
creating additional fully-reserved keywords, which are somewhat
painful for users.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-17 19:17:38
Message-ID: CAOuzzgorTJistbpi724+WpMVQX7Kmb+naG1jU7LK0kyuUUnCAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

On Tuesday, June 17, 2014, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> After sending that one (1) email, I was promptly told that "I'm very
> disappointed to hear that the mechanical pieces around making RLS easy
> for users to use ... is receiving such push-back." The push-back, at
> that point in time, consisted of one (1) email. Several more emails
> have been sent that time, including the above-quoted text, seeming to
> me to imply that the people who are concerned about this feature are
> being unreasonable. I don't believe I am the only such person,
> although I may be the main one right at the moment, and you may not be
> entirely surprised to hear that I don't think I'm being unreasonable.

I'm on my phone at the moment but that looks like a quote from me. My email
and concern there was regarding the specific suggestion that we could check
off the "RLS" capability which users have been asking us to provide nearly
since I started with PG by saying that they could use Updatable SB views. I
did not intend it as a comment regarding the specific technical concerns
raised and have been responding to and trying to address those
independently and openly.

I've expressed elsewhere on this thread my gratitude that the technical
concerns are being brought up now, near the beginning of the cycle, so we
can address them. I've been working with others who are interested in RLS
on a wiki page to outline and understand the options and identify
dependencies and priorities. Hopefully the link will be posted shortly
(again, not at a computer right now) and we can get comments back. There
are some very specific questions which really need to be addressed and
which I've mentioned before (in particular the question of what user the
functions in a view definition should run as, both for "normal" views, for
SB views, and for when an RLS qual is included and run through that
framework, and if doing so would address some of the concerns which have
been raised regarding selects running code).

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-17 19:19:39
Message-ID: CA+TgmobZTJNDisC8ikUWDjja+fnxgu-9zeTMDe-BEaFG6u8_2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> Yeah, I was thinking something like this could work, but I would go
> further. Suppose you had separate GRANTable privileges for direct
> access to individual tables, bypassing RLS, e.g.
>
> GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name

So, is this one new privilege (DIRECT) or four separate new privileges
that are variants of the existing privileges (DIRECT SELECT, DIRECT
INSERT, DIRECT UPDATE, DIRECT DELETE)?

> Actually, given the fact that the majority of users won't be using
> RLS, I would be tempted to invert the above logic and have the new
> privilege be for LIMITED access (via RLS quals). So a user granted the
> normal SELECT privilege would be able to bypass RLS, but a user only
> granted LIMITED SELECT wouldn't.

Well, for the people who are not using RLS, there's no difference
anyway. I think it matters more what users of RLS will expect from a
command like GRANT SELECT ... and I'm guessing they'll prefer that RLS
always apply unless they very specifically grant the right for RLS to
not apply. I might be wrong, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-17 19:41:04
Message-ID: CA+TgmoZc2VUDNuPTjG193SZMPV_+63+EhzveCh6MC+4gRK+wAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 16, 2014 at 1:15 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> I'm not referring to the proposed implementation particularly; or at
>> least not that aspect of it. I don't think trying to run the view
>> quals as the defining user is likely to be very appealing, because I
>> think it's going to hurt performance, for example by preventing
>> function inlining and requiring lots of user-ID switches.
>
> I understand that there are performance implications. As mentioned to
> Tom, realistically, there's zero way to optimized at least some of these
> use-cases because they require a completely external module (eg:
> SELlinux) to be involved in the decision about who can view what
> records. If we can optimize that, it'd be by a completely different
> approach whereby we pull up the qual higher because we know the whole
> query only involves leakproof functions or similar, allowing us to only
> apply the filter to the final set of records prior to them being
> returned to the user. The point being that such optimizations would
> happen independently and regardless of the quals or user-defined
> functions involved. At the end of the day, I can't think of a better
> optimization for such a case (where we have to ask an external security
> module if a row is acceptable to return to the user) than that. Is
> there something specific you're thinking about that we'd be missing out
> on?

Yeah, if we have to ask an external security module a question for
each row, there's little hope of any real optimization. However, I
think there will be a significant number of cases where people will
want filtering clauses that can be realized by doing an index scan
instead of a sequential scan, and if we end up forcing a sequential
scan anyway, the feature will be useless to those people.

>> But I'm not
>> gonna complain if someone wants to mull it over and make a proposal
>> for how to make it work. Rather, my concern is that all we've got is
>> what might be called the core of the feature; the actual guts of it.
>> There are a lot of ancillary details that seem to me to be not worked
>> out at all yet, or only half-baked.
>
> Perhaps it's just my experience, but I've been focused on the main core
> feature for quite some time and it feels like we're really close to
> having it there. I agree that a few additional bits would be nice to
> have but these strike me as relatively straight-forward to implement
> overtop of this general construct. I do see value in documenting these
> concerns and will see about making that happen, along with what the
> general viewpoints and thoughts are about how to address the concern.

I feel like there's quite a bit of work left to do around these
issues. The technical bits may not be too hard, but deciding what we
want will take some thought and discussion.

>> > The current approach allows a nearly unlimited level of flexibility,
>> > should the user wish it, by being able to run user-defined code.
>> > Perhaps that would be considered 'one policy', but it could certainly
>> > take under consideration the calling user, the object being queried
>> > (if a function is defined per table, or if we provide a way to get
>> > that information in the function), etc.
>>
>> In theory, that's true. But in practice, performance will suck unless
>> the security qual is easily optimizable. If your security qual is
>> WHERE somecomplexfunction() you're going to have to implement that by
>> sequential-scanning the table and evaluating the function for each
>> row.
>
> That's not actualy true today, is it? Given our leak-proof attribute,
> if the qual is "WHERE somecomplexfunction() AND leakprooffunctionx()"
> then we would be able to push down the leak-proof function and not
> necessairly run a straight sequential scan, no? Even so, though, we've
> had users who have tested exactly what this patch implements and they've
> been happy with their real-world use-cases. I'm certainly all for
> optimization and would love to see us make this better for everyone, but
> I don't view that as a reason to delay this particular feature which is
> really just bringing us up to parity with other RDMBS's.

I'm a bit confused here, because your example seems to be totally
different from my example. In my example, somecomplexfunction() will
get pushed down because it's the security qual; that needs to be
inside the security_barrier view, or a malicious user can subvert the
system by getting some other qual evaluated first. In your example,
you seem to be imagining WHERE somecomplexfunction() AND
leakprooffunctionx() as queries sent by the untrusted user, in which
case, yet, the leak-proof one will get pushed down and the other one
will not.

>> But that's probably not going to perform very well, because to match
>> an index on sales_rep_id, or an index on partner_id, that's going to
>> have to get simplified a whole lot, and that's probably not going to
>> happen. If we've only got one branch of the OR, I think we'll realize
>> we can evaluate the subquery as an InitPlan and then use an index, but
>> with two branches I think that will fail.
>
> You're right- we could perform better in such a case.
>
> What solution did you come up with for this case, which performed well
> and was also secure..?

I put the logic in the client. :-(

>> I don't want to overstate the importance of this particular case; but
>> I do think scenarios in which it's advantageous to have multiple
>> row-level security policies are plausible.
>
> I'm not against this in general. The question, in my mind, is what
> level of granularity we would provide this at. As I tried to outline
> previously, there's a huge number of combinations which we could come up
> with to support this under and I'm not 100% sure that it'd actualy end
> up being better than the simplicity of a single qual where the user gets
> to define any kind of relationship they want between the various
> policies; even programatically if they want.

I agree. That's why I think we need some more design work in this
area. Perhaps it's OK to allow only one RLS-qual per table at most,
and tell people that if you want more than that, you need to use
security-barrier views as wrappers instead. But I'm not sure; that
feels like it's giving something up that might be important. And I
think that the kinds of syntax we're discussing won't support leaving
that out of the initial version and adding it later, so if we commit
to this syntax, we're stuck with that behavior. To avoid that, we'd
need something like this:

ALTER TABLE tab ADD POLICY polname WHERE quals;
GRANT SELECT (polname) ON TABLE tab TO role;

>> Another, perhaps-simpler
>> example is that you might have a table containing unclassified data,
>> classified data, and secret data. You want to give access to the
>> unclassified data only to one category of users; access to the
>> unclassified data and the classified data to a second group of
>> more-trusted users; and access to all of the data to a third group of
>> very highly trusted users. If the table can only have one security
>> policy that applies to everyone who isn't exempt, how will you do
>> that? This sort of use case seems very plausible to me so I think we
>> need to give some real thought to what we will recommend to users who
>> want to do things like this. Can the proposed patch handle it? How?
>
> There are multiple ways this could be implemented- the first, basic, way
> would be through a table which maps users to security levels via an enum
> where more privileged levels are higher in value and therefore a simple
> greater-than could be applied after a join which would implement this
> particular policy.

Interesting.

>> What do you mean by "data changing"? If you mean inserts, updates,
>> and deletes, I am very sure people are going to want to perform those
>> operations on RLS-enabled tables.
>
> Yes, they'll want to support those operations. However, they will not
> expect RLS to allow them to redefine a columns as "x+10" instead of "x",
> which a view does allow.

Hmm, I think some users do want to do things like this. There are
previous discussions of wanting to fuzz a set of coordinates, for
example, or blank out a certain list of columns.

>> Do you find it implausible that someone will want to allow a certain
>> table to bypass RLS when selecting rows, but not when updating or
>> deleting them? I find those scenarios very plausible.
>
> This is also plausible and something which we were anticipating while
> developing this patch. Simon, KaiGai and I specifically discussed
> addressing SELECT vs UPDATE/DELETE earlier this year, as I recall.
> Providing that level of flexibility is absolutely on the road map, but I
> don't know that it all has to exist in 9.5; it may, which would be
> great, but I don't view it as required.

I think we at least need to have a clear design for it before
committing anything. Otherwise we may find that we've committed to
syntax which backs us into a corner.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 01:45:34
Message-ID: CAKRt6CQrUZ2GE6VOTWV8fVKqvHR7tWFmdWP+tYv-MjTK9RNe9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

However, I believe that
> there has been a lack of focus in the development of the patch thus
> far in a couple of key areas - first in terms of articulating how it
> is different from and better than a writeable security barrier view,
> and second on how to manage the security and operational aspects of
> having a feature like this. I think that the discussion subsequent to
> my June 10th email has let to some good discussion on both points,
> which was my intent, but I still think much more time and thought
> needs to be spent on those issues if we are to have a feature which is
> up to our usual standards. I do apologize to anyone who interpreted
> that initial as a pure rant, because it really wasn't intended that
> way. Contrariwise, I hope that the people defending this patch will
> admit that the issues I am raising are real and focus on whether and
> how those concerns can be addressed.

I absolutely appreciate all of the feedback that has been provided. It has
been educational. To your point above, I started putting together a wiki
page, as Stephen has spoken to, that is meant to capture these concerns and
considerations as well as to capture ideas around solutions.

https://wiki.postgresql.org/wiki/Row_Security_Considerations

This page is obviously not complete, but I think it is a good start.
Hopefully this document will help to continue the conversation and assist
in addressing all the concerns that have been brought to the table. As
well, I hope that this document serves to demonstrate our intent and that
we *are* taking these concerns seriously. I assure you that as one of the
individuals who is working towards the acceptance of this feature/patch, I
am very much concerned about meeting the expected standards of quality and
security.

Thanks,
Adam


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 02:02:10
Message-ID: 20140618020210.GF16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Thu, Jun 12, 2014 at 8:13 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > This approach was suggested by an existing user testing out this RLS
> > approach, to be fair, but it looks pretty sane to me as a way to address
> > some of these concerns. Certainly open to other ideas and thoughts though.
>
> In general, I agree that this is a good approach. I think it will be
> difficult to have a GUC with three values, one of which is
> superuser-only and the other two of which are not. I don't think
> there's any precedent for something like that in the existing
> framework, and I think it's likely we'll run into unpleasant corner
> cases if we try to graft it in. Also, I think we need to separate
> things: whether the system is willing to allow the user to access the
> table without RLS, and whether the user is willing to accept RLS if
> the system deems it necessary.

Good point- I agree that it's best to avoid having to support individual
superuser-only only options on a GUC. Also, addressing the issues
independently also makes sense to me.

> For the first one, two solutions have been proposed. The initial
> proposal was to insist on RLS except for the superuser (and maybe the
> table owner?). Having a separate grantable privilege, as Dean
> suggests, may be better. I'll reply separately to that email also, as
> I have a question about what he's proposing.

I like the idea of a grantable privilege as it allows the granularity
that some users may require (or be frustrated that we don't have it).

> For the second one, I think the two most useful behaviors are "normal
> mode" - i.e. allow access to the table, applying RLS predicates if
> required and not applying them if I am exempt - and "error-instead"
> mode - i.e. if my access to this table would be mediated by an RLS
> predicate, then throw an error instead.

Right, makes sense.

> There's a third mode which
> might be useful as well, which is "even though I have the *right* to
> bypass the RLS predicate on this table, please apply the predicate
> anyway". This could be used by the table owner in testing, for
> example.

Agreed, this sounds very useful too.

> Here again, the level of granularity we want to provide is
> an interesting question. Having a GUC (e.g. enable_row_level_security
> = on, off, force) would be adequate for pg_dump, but allowing the
> table name to be qualified in the query, as proposed downthread, would
> be more granular, admittedly at some parser cost. I'm personally of
> the view that we *at least* need the GUC, because that seems like the
> best way to secure pg_dump, and perhaps other applications. We can
> and should give pg_dump an--allow-row-level-security flag, I think,
> but pg_dump's default behavior should be to configure the system in
> such a way that the dump will fail rather than complete with a subset
> of the data.

This sounds good to me.

> I'm less sure whether we should have something that can
> be used to qualify table names in particular queries. I think it
> would be really useful, but I'm concerned that it will require
> creating additional fully-reserved keywords, which are somewhat
> painful for users.

I've been trying to think of the use-case for this. It certainly
*sounds* nice, but on reflection, the use-case for this seems to me to
be that you're trying to develop some application which will be
constrained by RLS totally and therefore want to flip back-and-forth
between "RLS on" and "RLS off" (for the tables involved). When would
you really need, in the same query, to have RLS enabled for table X but
disabled for table Y? I do like the idea of an *independent* option to
(just) COPY which says "give me all the data or error, independent of
the GUC for the same purpose". Would be curious to hear what others
think of that proposal.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 02:06:38
Message-ID: 20140618020638.GG16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> > Yeah, I was thinking something like this could work, but I would go
> > further. Suppose you had separate GRANTable privileges for direct
> > access to individual tables, bypassing RLS, e.g.
> >
> > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name
>
> So, is this one new privilege (DIRECT) or four separate new privileges
> that are variants of the existing privileges (DIRECT SELECT, DIRECT
> INSERT, DIRECT UPDATE, DIRECT DELETE)?

I had taken it to be a single privilege, but you're right, it could be
done for each of those.. I really don't think we have the bits for more
than one case here though (if that) without a fair bit of additional
rework. I'm not against that rework (and called for it wayyy back when
I proposed the TRUNCATE privilege, as I recall) but that's a whole
different challenge and no small bit of work..

> > Actually, given the fact that the majority of users won't be using
> > RLS, I would be tempted to invert the above logic and have the new
> > privilege be for LIMITED access (via RLS quals). So a user granted the
> > normal SELECT privilege would be able to bypass RLS, but a user only
> > granted LIMITED SELECT wouldn't.
>
> Well, for the people who are not using RLS, there's no difference
> anyway. I think it matters more what users of RLS will expect from a
> command like GRANT SELECT ... and I'm guessing they'll prefer that RLS
> always apply unless they very specifically grant the right for RLS to
> not apply. I might be wrong, though.

The preference from the folks using RLS that I've talked to is
absolutely that it be applied by default for all 'normal' (eg:
non-pg_dump) sessions.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 02:25:13
Message-ID: 20140618022513.GH16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Mon, Jun 16, 2014 at 1:15 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I understand that there are performance implications. As mentioned to
> > Tom, realistically, there's zero way to optimized at least some of these
> > use-cases because they require a completely external module (eg:
> > SELlinux) to be involved in the decision about who can view what
> > records. If we can optimize that, it'd be by a completely different
> > approach whereby we pull up the qual higher because we know the whole
> > query only involves leakproof functions or similar, allowing us to only
> > apply the filter to the final set of records prior to them being
> > returned to the user. The point being that such optimizations would
> > happen independently and regardless of the quals or user-defined
> > functions involved. At the end of the day, I can't think of a better
> > optimization for such a case (where we have to ask an external security
> > module if a row is acceptable to return to the user) than that. Is
> > there something specific you're thinking about that we'd be missing out
> > on?
>
> Yeah, if we have to ask an external security module a question for
> each row, there's little hope of any real optimization. However, I
> think there will be a significant number of cases where people will
> want filtering clauses that can be realized by doing an index scan
> instead of a sequential scan, and if we end up forcing a sequential
> scan anyway, the feature will be useless to those people.

I agree that we want to support that, if we can do so reasonably. What
I was trying to get at is simply this- don't we provide that already
with the leakproof attribute and functions? If we don't have enough
there to allow index scans then we should be looking to add more, I'm
thinking.

> > Perhaps it's just my experience, but I've been focused on the main core
> > feature for quite some time and it feels like we're really close to
> > having it there. I agree that a few additional bits would be nice to
> > have but these strike me as relatively straight-forward to implement
> > overtop of this general construct. I do see value in documenting these
> > concerns and will see about making that happen, along with what the
> > general viewpoints and thoughts are about how to address the concern.
>
> I feel like there's quite a bit of work left to do around these
> issues. The technical bits may not be too hard, but deciding what we
> want will take some thought and discussion.

I agree on this point, but I'm still hopeful that we'll be able to get a
good feature into 9.5. There are quite a few resources available for
the 'just programming' part, so the long pole in the tent here is
absolutely hashing out what we want and how it should function.

I'd be happy to host or participate in a conference call or similar if
that would be useful to move this along- or we can continue to
communicate via email. There's a bit of a lull in conferences to which
I'm going to right now, so in person is unlikely, unless folks want to
get together somewhere on the east coast (I'd be happy to travel to
Philly, Pittsburgh, NYC, etc, if it'd help..).

> > That's not actualy true today, is it? Given our leak-proof attribute,
> > if the qual is "WHERE somecomplexfunction() AND leakprooffunctionx()"
> > then we would be able to push down the leak-proof function and not
> > necessairly run a straight sequential scan, no? Even so, though, we've
> > had users who have tested exactly what this patch implements and they've
> > been happy with their real-world use-cases. I'm certainly all for
> > optimization and would love to see us make this better for everyone, but
> > I don't view that as a reason to delay this particular feature which is
> > really just bringing us up to parity with other RDMBS's.
>
> I'm a bit confused here, because your example seems to be totally
> different from my example. In my example, somecomplexfunction() will
> get pushed down because it's the security qual; that needs to be
> inside the security_barrier view, or a malicious user can subvert the
> system by getting some other qual evaluated first. In your example,
> you seem to be imagining WHERE somecomplexfunction() AND
> leakprooffunctionx() as queries sent by the untrusted user, in which
> case, yet, the leak-proof one will get pushed down and the other one
> will not.

Right- my point there was that the leakproof one might allow an index
scan to be run. This is all pretty hand-wavey, I admit, so I'll see if
I can get more details about how the currently-proposed patch is
performing for the users who are testing it and what kind of plans
they're seeing. If that falls through, I'll try and build up my own set
of realistic-looking (to myself and the users who are testing) example.

> > What solution did you come up with for this case, which performed well
> > and was also secure..?
>
> I put the logic in the client. :-(

Well, that's not helpful here. ;)

> >> I don't want to overstate the importance of this particular case; but
> >> I do think scenarios in which it's advantageous to have multiple
> >> row-level security policies are plausible.
> >
> > I'm not against this in general. The question, in my mind, is what
> > level of granularity we would provide this at. As I tried to outline
> > previously, there's a huge number of combinations which we could come up
> > with to support this under and I'm not 100% sure that it'd actualy end
> > up being better than the simplicity of a single qual where the user gets
> > to define any kind of relationship they want between the various
> > policies; even programatically if they want.
>
> I agree. That's why I think we need some more design work in this
> area. Perhaps it's OK to allow only one RLS-qual per table at most,
> and tell people that if you want more than that, you need to use
> security-barrier views as wrappers instead.

Note that my suggestion would be to simply put a pl/pgsql call (perhaps
a security definer one) into the RLS definition- not to say "use views".

> But I'm not sure; that
> feels like it's giving something up that might be important. And I
> think that the kinds of syntax we're discussing won't support leaving
> that out of the initial version and adding it later, so if we commit
> to this syntax, we're stuck with that behavior. To avoid that, we'd
> need something like this:
>
> ALTER TABLE tab ADD POLICY polname WHERE quals;
> GRANT SELECT (polname) ON TABLE tab TO role;

Right, if we were to support multiple policies on a given table then we
would have to support adding and removing them individually, as well as
specify when they are to be applied- and what if that "when" overlaps?
Do we apply both and only a row which passed them all gets sent to the
user? Essentially we'd be defining the RLS policies to be AND'd
together, right? Would we want to support both AND-based and OR-based,
and allow users to pick what set of conditionals they want applied to
their various overlapping RLS policies?

Sounds all rather painful and much better done programatically by the
user in a language which is suited to that task- eg: pl/pgsql, perl, C,
or something besides our ALTER syntax + catalog representation.

> >> What do you mean by "data changing"? If you mean inserts, updates,
> >> and deletes, I am very sure people are going to want to perform those
> >> operations on RLS-enabled tables.
> >
> > Yes, they'll want to support those operations. However, they will not
> > expect RLS to allow them to redefine a columns as "x+10" instead of "x",
> > which a view does allow.
>
> Hmm, I think some users do want to do things like this. There are
> previous discussions of wanting to fuzz a set of coordinates, for
> example, or blank out a certain list of columns.

Absolutely they'll want to be able to do this- but that's going to be a
case which I (and others, I think) feel comfortable going back and
saying "use views for that". I'm trying to draw that line in the ground
between what is RLS and what are views and keeping RLS to the WHERE
clause strikes me as a good line to draw (and one which matches up with
existing expectations in this space).

> > This is also plausible and something which we were anticipating while
> > developing this patch. Simon, KaiGai and I specifically discussed
> > addressing SELECT vs UPDATE/DELETE earlier this year, as I recall.
> > Providing that level of flexibility is absolutely on the road map, but I
> > don't know that it all has to exist in 9.5; it may, which would be
> > great, but I don't view it as required.
>
> I think we at least need to have a clear design for it before
> committing anything. Otherwise we may find that we've committed to
> syntax which backs us into a corner.

Fair enough. There was some support for this idea in the original patch
by Craig, but we can further develop this syntax (and what it may look
like for 9.5, if it ends up not covering all cases).

Thanks!

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)hobby(dot)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)hobby(dot)2ndquadrant(dot)com>, Andres Freund <andres(at)hobby(dot)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 13:25:37
Message-ID: CA+Tgmoa30KKuurSsQerbqG1u5YHcaqJf01RL33zro+rpex4Lwg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 17, 2014 at 9:45 PM, Brightwell, Adam
<adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:
> I absolutely appreciate all of the feedback that has been provided. It has
> been educational. To your point above, I started putting together a wiki
> page, as Stephen has spoken to, that is meant to capture these concerns and
> considerations as well as to capture ideas around solutions.
>
> https://wiki.postgresql.org/wiki/Row_Security_Considerations
>
> This page is obviously not complete, but I think it is a good start.
> Hopefully this document will help to continue the conversation and assist in
> addressing all the concerns that have been brought to the table. As well, I
> hope that this document serves to demonstrate our intent and that we *are*
> taking these concerns seriously. I assure you that as one of the
> individuals who is working towards the acceptance of this feature/patch, I
> am very much concerned about meeting the expected standards of quality and
> security.

Cool, thanks for weighing in. I think that page is a good start. An
item that I think should be added there is the potential overlap
between security_barrier views and row-level security. How can we
reuse code (and SQL syntax?) for existing features like WITH CHECK
OPTION instead of writing new code (and inventing new syntax) for very
similar concepts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 13:50:34
Message-ID: CA+TgmobSrp5hsLonz+XH+BS75TNZhP+DzBtJNGoN51f1-phTHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> > Yeah, I was thinking something like this could work, but I would go
>> > further. Suppose you had separate GRANTable privileges for direct
>> > access to individual tables, bypassing RLS, e.g.
>> >
>> > GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name
>>
>> So, is this one new privilege (DIRECT) or four separate new privileges
>> that are variants of the existing privileges (DIRECT SELECT, DIRECT
>> INSERT, DIRECT UPDATE, DIRECT DELETE)?
>
> I had taken it to be a single privilege, but you're right, it could be
> done for each of those.. I really don't think we have the bits for more
> than one case here though (if that) without a fair bit of additional
> rework. I'm not against that rework (and called for it wayyy back when
> I proposed the TRUNCATE privilege, as I recall) but that's a whole
> different challenge and no small bit of work..

Technically, there are 4 bits left, and that's what we need for
separate privileges. We last consumed bits in 2008 (for TRUNCATE) and
2006 (for GRANT ON DATABASE), so even if we used all of the remaining
bits it might be another 5 years before anyone has to do that
refactoring. But even if the refactoring needs to be done now for
some reason, it's only June, and the last CommitFest doesn't start
until February 15th. I think we're being way too quick to jump to
talking about what can and can't be done in time for 9.5. Let's start
by figuring out how we'd really like it to work and then, if it's too
ambitious, we can scale it back.

My main concern about using only one bit is that someone might want to
allow a user to bypass RLS on SELECT while still enforcing it for
data-modifying operations. That seems like a plausible use case to
me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 14:40:49
Message-ID: 20140618144049.GZ16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I had taken it to be a single privilege, but you're right, it could be
> > done for each of those.. I really don't think we have the bits for more
> > than one case here though (if that) without a fair bit of additional
> > rework. I'm not against that rework (and called for it wayyy back when
> > I proposed the TRUNCATE privilege, as I recall) but that's a whole
> > different challenge and no small bit of work..
>
> Technically, there are 4 bits left, and that's what we need for
> separate privileges.

I'd really hate to chew them all up..

> We last consumed bits in 2008 (for TRUNCATE) and
> 2006 (for GRANT ON DATABASE), so even if we used all of the remaining
> bits it might be another 5 years before anyone has to do that
> refactoring.

Perhaps, or we might come up with some new whiz-bang permission to add
next year. :/

> But even if the refactoring needs to be done now for
> some reason, it's only June, and the last CommitFest doesn't start
> until February 15th. I think we're being way too quick to jump to
> talking about what can and can't be done in time for 9.5. Let's start
> by figuring out how we'd really like it to work and then, if it's too
> ambitious, we can scale it back.

Alright- perhaps we can discuss what kind of refactoring would be needed
for such a change then, to get a better idea as to the scope of the
change and the level of effort required.

My thoughts on how to address this were to segregate the ACL bits by
object type. That is to say, the AclMode stored for databases might
only use bits 0-2 (create/connect/temporary), while tables would use
bits 0-7 (insert/select/update/delete/references/trigger). This would
allow us to more easily add more rights at the database and/or
tablespace level too.

> My main concern about using only one bit is that someone might want to
> allow a user to bypass RLS on SELECT while still enforcing it for
> data-modifying operations. That seems like a plausible use case to
> me.

I absolutely agree that it's a real use-case and one which we should
support, just trying to avoid biting off more than can be done between
now and February. Still, if we get things hammered out and more-or-less
agreement on the way forward, getting the code written may move quickly.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 14:43:16
Message-ID: CA+Tgmob80g1ciJ0tJ-8OfHYyqphYSq0BA3_cztke2rYdzvLtUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 17, 2014 at 10:25 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> Yeah, if we have to ask an external security module a question for
>> each row, there's little hope of any real optimization. However, I
>> think there will be a significant number of cases where people will
>> want filtering clauses that can be realized by doing an index scan
>> instead of a sequential scan, and if we end up forcing a sequential
>> scan anyway, the feature will be useless to those people.
>
> I agree that we want to support that, if we can do so reasonably. What
> I was trying to get at is simply this- don't we provide that already
> with the leakproof attribute and functions? If we don't have enough
> there to allow index scans then we should be looking to add more, I'm
> thinking.

So the reason why we got onto this particular topic was because of the
issue of multiple security policies for a single table. Of course,
multiple security policies can always be merged into a single
more-complex policy, but the resulting policy may be so complex that
the query-planner is no longer capable of doing a good job optimizing
it. I won't mention here exactly what a certain large commercial
database vendor has implemented here; suffice it to say, however, that
their design avoids this pitfall, and ours currently does not.

> I agree on this point, but I'm still hopeful that we'll be able to get a
> good feature into 9.5. There are quite a few resources available for
> the 'just programming' part, so the long pole in the tent here is
> absolutely hashing out what we want and how it should function.

Agreed.

> I'd be happy to host or participate in a conference call or similar if
> that would be useful to move this along- or we can continue to
> communicate via email. There's a bit of a lull in conferences to which
> I'm going to right now, so in person is unlikely, unless folks want to
> get together somewhere on the east coast (I'd be happy to travel to
> Philly, Pittsburgh, NYC, etc, if it'd help..).

For me, email is easiest; but there are other options, too.

>> > What solution did you come up with for this case, which performed well
>> > and was also secure..?
>>
>> I put the logic in the client. :-(
>
> Well, that's not helpful here. ;)

Sure. The reason I brought it up is to say - hey, look, I had this
come up in the real world. What would it take to be able to do
actually do it in the database server? And the answer is - something
that will handle multiple security policies cleanly.

>> But I'm not sure; that
>> feels like it's giving something up that might be important. And I
>> think that the kinds of syntax we're discussing won't support leaving
>> that out of the initial version and adding it later, so if we commit
>> to this syntax, we're stuck with that behavior. To avoid that, we'd
>> need something like this:
>>
>> ALTER TABLE tab ADD POLICY polname WHERE quals;
>> GRANT SELECT (polname) ON TABLE tab TO role;
>
> Right, if we were to support multiple policies on a given table then we
> would have to support adding and removing them individually, as well as
> specify when they are to be applied- and what if that "when" overlaps?
> Do we apply both and only a row which passed them all gets sent to the
> user? Essentially we'd be defining the RLS policies to be AND'd
> together, right? Would we want to support both AND-based and OR-based,
> and allow users to pick what set of conditionals they want applied to
> their various overlapping RLS policies?

AND is not a sensible policy; it would need to be OR. If you grant
someone access to two different subsets of the rows in a table, it
stands to reason that they will expect to have access to all of the
rows that are in at least one of those subsets. If you give someone
your car key and your house key, that means they can operate your car
or enter your house; it does not mean that they can operate your car
but only when it's inside your garage.

Alternatively, we could:

- Require the user to specify in some way which of the available
policies they want applied, and then apply only that one.
or
- Decide that such scenarios constitute misconfiguration. Throw an
error and make the table owner or other relevant local authority fix
it.

> Sounds all rather painful and much better done programatically by the
> user in a language which is suited to that task- eg: pl/pgsql, perl, C,
> or something besides our ALTER syntax + catalog representation.

I think exactly the opposite, for the query planning reasons
previously stated. I think the policies will quickly get so
complicated that they're no longer optimizable. Here's a simple
example:

- Policy 1 allows the user to access rows for which complexfunc() returns true.
- Policy 2 allows the user to access rows for which a = 1.

Most users have access only through policy 2, but some have access
through policy 1. Users who have access through policy 1 will always
get a sequential scan, but users who have access through policy 2 have
an excellent chance of getting an index scan if the selectivity of a =
1 is high. When you merge those two things into a single policy, no
matter how you do it, everyone gets sequential scans all the time.
That sucks.

>> Hmm, I think some users do want to do things like this. There are
>> previous discussions of wanting to fuzz a set of coordinates, for
>> example, or blank out a certain list of columns.
>
> Absolutely they'll want to be able to do this- but that's going to be a
> case which I (and others, I think) feel comfortable going back and
> saying "use views for that". I'm trying to draw that line in the ground
> between what is RLS and what are views and keeping RLS to the WHERE
> clause strikes me as a good line to draw (and one which matches up with
> existing expectations in this space).

Fair.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 14:50:43
Message-ID: CA+TgmoboPV8TjAWXnWbUHgrG2BcPpACuhUhfz6JH1K8UXhHjvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 18, 2014 at 10:40 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> On Tue, Jun 17, 2014 at 10:06 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > I had taken it to be a single privilege, but you're right, it could be
>> > done for each of those.. I really don't think we have the bits for more
>> > than one case here though (if that) without a fair bit of additional
>> > rework. I'm not against that rework (and called for it wayyy back when
>> > I proposed the TRUNCATE privilege, as I recall) but that's a whole
>> > different challenge and no small bit of work..
>>
>> Technically, there are 4 bits left, and that's what we need for
>> separate privileges.
>
> I'd really hate to chew them all up..

Usually it's the patch author who WANTS to chew up all the available
bit space and OTHER people who say no. :-)

>> We last consumed bits in 2008 (for TRUNCATE) and
>> 2006 (for GRANT ON DATABASE), so even if we used all of the remaining
>> bits it might be another 5 years before anyone has to do that
>> refactoring.
>
> Perhaps, or we might come up with some new whiz-bang permission to add
> next year. :/

Well, people proposed separate permissions for things like VACUUM and
ANALYZE around the time TRUNCATE was added, and those were rejected on
the grounds that they didn't add enough value to justify wasting bits
on them. I think we see whether there's a workable system that such
that marginal permissions (like TRUNCATE) that won't be checked often
don't have to consume bits.

>> But even if the refactoring needs to be done now for
>> some reason, it's only June, and the last CommitFest doesn't start
>> until February 15th. I think we're being way too quick to jump to
>> talking about what can and can't be done in time for 9.5. Let's start
>> by figuring out how we'd really like it to work and then, if it's too
>> ambitious, we can scale it back.
>
> Alright- perhaps we can discuss what kind of refactoring would be needed
> for such a change then, to get a better idea as to the scope of the
> change and the level of effort required.
>
> My thoughts on how to address this were to segregate the ACL bits by
> object type. That is to say, the AclMode stored for databases might
> only use bits 0-2 (create/connect/temporary), while tables would use
> bits 0-7 (insert/select/update/delete/references/trigger). This would
> allow us to more easily add more rights at the database and/or
> tablespace level too.

Yeah, that's another idea. But it really deserves its own thread.
I'm still not convinced we have to do this at all to meet this need,
but that should be argued back and forth on that other thread.

>> My main concern about using only one bit is that someone might want to
>> allow a user to bypass RLS on SELECT while still enforcing it for
>> data-modifying operations. That seems like a plausible use case to
>> me.
>
> I absolutely agree that it's a real use-case and one which we should
> support, just trying to avoid biting off more than can be done between
> now and February. Still, if we get things hammered out and more-or-less
> agreement on the way forward, getting the code written may move quickly.

Nifty.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-18 18:18:32
Message-ID: 20140618181832.GE16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Tue, Jun 17, 2014 at 10:25 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I agree that we want to support that, if we can do so reasonably. What
> > I was trying to get at is simply this- don't we provide that already
> > with the leakproof attribute and functions? If we don't have enough
> > there to allow index scans then we should be looking to add more, I'm
> > thinking.
>
> So the reason why we got onto this particular topic was because of the
> issue of multiple security policies for a single table. Of course,
> multiple security policies can always be merged into a single
> more-complex policy, but the resulting policy may be so complex that
> the query-planner is no longer capable of doing a good job optimizing
> it.

Yeah, I could see that happening with some use-cases.

> >> ALTER TABLE tab ADD POLICY polname WHERE quals;
> >> GRANT SELECT (polname) ON TABLE tab TO role;
> >
> > Right, if we were to support multiple policies on a given table then we
> > would have to support adding and removing them individually, as well as
> > specify when they are to be applied- and what if that "when" overlaps?
> > Do we apply both and only a row which passed them all gets sent to the
> > user? Essentially we'd be defining the RLS policies to be AND'd
> > together, right? Would we want to support both AND-based and OR-based,
> > and allow users to pick what set of conditionals they want applied to
> > their various overlapping RLS policies?
>
> AND is not a sensible policy; it would need to be OR. If you grant
> someone access to two different subsets of the rows in a table, it
> stands to reason that they will expect to have access to all of the
> rows that are in at least one of those subsets.

I think I can buy off on this. What that also means is that any
'short-circuiting' that we try to do here would be based on "stop once
we get back a 'true'". This could seriously change how we actually
implement RLS though as doing it all through query rewrites and making
this work with multiple security policies which are OR'd together and
yet keeping the optimization and qual push-down and index-based plans is
looking pretty daunting.

I'm also of the opinion that this isn't strictly necessary for the
initial RLS offering in PG- there's a clear way we could migrate
existing users to a multi-policy system from a single-policy system.
Sure, to get the performance and optimization benefits that we'd
presumably have in the multi-policy case they'd need to re-work their
RLS configuration, but for users who care, they'll likely be very happy
to do so to gain those benefits.

Perhaps the question here is- if we implement RLS one way for the single
case and then change the implementation all around for the multi case,
will we end up breaking the single case? Or destroying the performance
for it? I can't see either of those cases being allowed- if and when we
support multi, we must still support single and the whole point of multi
would be to allow more performant implementations and that solution will
require the single case to be at least as performant as what we're
proposing to do today, I believe.

Or are you thinking that we would never support calling user-defined
functions in any RLS scheme because we want to be able to do that
optimization? I don't see that being acceptable from a feature
standpoint.

> Alternatively, we could:
>
> - Require the user to specify in some way which of the available
> policies they want applied, and then apply only that one.

I'd want to at least see a way to apply an ordering to the policies
being applied, or have PG work out which one is "cheapest" and try that
one first.

> - Decide that such scenarios constitute misconfiguration. Throw an
> error and make the table owner or other relevant local authority fix
> it.

Having them all be OR'd together feels simpler and easier to work with
than trying to provide the user with all the knobs necessary to select
which subset of users they want the policy applied to when (user X from
IP range a.b.c.d/24 at time 1500). We could probably make it work with
exclusion constraints, range types, etc, and perhaps it'd be a reason to
bring btree_gist into core (which I'm all for) and make it work with
catalog tables, but... just 'yuck' all around, for my part.

> > Sounds all rather painful and much better done programatically by the
> > user in a language which is suited to that task- eg: pl/pgsql, perl, C,
> > or something besides our ALTER syntax + catalog representation.
>
> I think exactly the opposite, for the query planning reasons
> previously stated. I think the policies will quickly get so
> complicated that they're no longer optimizable. Here's a simple
> example:
>
> - Policy 1 allows the user to access rows for which complexfunc() returns true.
> - Policy 2 allows the user to access rows for which a = 1.
>
> Most users have access only through policy 2, but some have access
> through policy 1. Users who have access through policy 1 will always
> get a sequential scan,

This is the thing which I most object to- if the quals being provided at
any level are leakproof and would be able to reduce the returned set
sufficiently that an index scan is the best bet, we should be doing
that. I don't anticipate the RLS quals to be as selective as the
quals which the user is adding.

I agree that in cases where the user isn't using a leakproof function in
their quals and the policy is complex, a sequential scan would have to
be done over the table, but looking at the set of leakproof vs not
leakproof functions used by operators which return boolean, certainly
the most common of the index using cases are covered and we may be able
to add more leakproof functions, should we get user complaints that the
function they're using works fine with an index but isn't leakproof.

> but users who have access through policy 2 have
> an excellent chance of getting an index scan if the selectivity of a =
> 1 is high. When you merge those two things into a single policy, no
> matter how you do it, everyone gets sequential scans all the time.
> That sucks.

It just strikes me as unlikely that in such a simple policy the
selectivity of the RLS qual used will be high and this feels like a lot
of mechanism and complication to be adding for that use-case. If the
selectivity is actually high in terms of what the RLS qual will allow,
then it seems likely, to me at least, that it's going to need to depend
on another table or function, eg:

exists(select 1 from security_table
where (current_user(),a) = (sec_user,sec_label))

Still thinking about this approach in general. Having a good answer to
the question about granularity and how this multiple RLS-policy would
actually work would certainly help. Being able to pick a single policy
(rather than deal with overlapping policies that all have to be tested)
would definitely make this simpler, but I suppose we could build up "X
OR Y OR Z..." inside the query..

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-21 02:33:56
Message-ID: 20140621023356.GQ16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Jun 18, 2014 at 10:40 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> >> Technically, there are 4 bits left, and that's what we need for
> >> separate privileges.
> >
> > I'd really hate to chew them all up..
>
> Usually it's the patch author who WANTS to chew up all the available
> bit space and OTHER people who say no. :-)

Ah, well, technically I'm not the patch author here, though I would like
to see it happen. :) Still, have to balance these features and
capabilities against the future unknown options we might want to add and
it certainly doesn't seem terribly nice to chew up all that remain
rather than addressing the need to support more.

Still, perhaps we can put together a patch for this and then review the
implementation and, if we like it and that functionality, we can make
the decision about if it should be on this patch to make more bits
available.

> > Perhaps, or we might come up with some new whiz-bang permission to add
> > next year. :/
>
> Well, people proposed separate permissions for things like VACUUM and
> ANALYZE around the time TRUNCATE was added, and those were rejected on
> the grounds that they didn't add enough value to justify wasting bits
> on them. I think we see whether there's a workable system that such
> that marginal permissions (like TRUNCATE) that won't be checked often
> don't have to consume bits.

That's an interesting approach but I'm not sure that we need to go a
system where we segregate "often-used" bits from "less-used" ones.

> > My thoughts on how to address this were to segregate the ACL bits by
> > object type. That is to say, the AclMode stored for databases might
> > only use bits 0-2 (create/connect/temporary), while tables would use
> > bits 0-7 (insert/select/update/delete/references/trigger). This would
> > allow us to more easily add more rights at the database and/or
> > tablespace level too.
>
> Yeah, that's another idea. But it really deserves its own thread.
> I'm still not convinced we have to do this at all to meet this need,
> but that should be argued back and forth on that other thread.

Fair enough.

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-21 03:40:31
Message-ID: 30691.1403322031@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> [ counting bits in ACLs ]

Wouldn't it be fairly painless to widen AclMode from 32 bits to 64,
and thereby double the number of available bits?

That code was all written before we required platforms to have an int64
primitive type, but of course now we expect that.

In any case, I concur with the position that this feature patch should
be separate from a patch to make additional bitspace available.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-21 04:08:55
Message-ID: 20140621040855.GT16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> >> [ counting bits in ACLs ]
>
> Wouldn't it be fairly painless to widen AclMode from 32 bits to 64,
> and thereby double the number of available bits?

Thanks for commenting on this. I hadn't considered that but I don't see
any particular problem with it either..

> In any case, I concur with the position that this feature patch should
> be separate from a patch to make additional bitspace available.

Certainly. Thanks for your thoughts.

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-22 10:38:32
Message-ID: CAEZATCVjwbehgDgC=t7pk4TJ_9z0Ou=aPkXVdUfCax7hksKyXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17 June 2014 20:19, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> Yeah, I was thinking something like this could work, but I would go
>> further. Suppose you had separate GRANTable privileges for direct
>> access to individual tables, bypassing RLS, e.g.
>>
>> GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name
>
> So, is this one new privilege (DIRECT) or four separate new privileges
> that are variants of the existing privileges (DIRECT SELECT, DIRECT
> INSERT, DIRECT UPDATE, DIRECT DELETE)?
>

I was thinking it would be 4 new privileges, so that a user could for
example be granted DIRECT SELECT permission on a table, but not DIRECT
UPDATE.

On reflection though, I think I prefer the approach of allowing
multiple named security policies per table, because it gives the
planner more opportunity to optimize queries against specific RLS
quals, which won't work if the ACL logic is embedded in functions.
That seems like something that would have to be designed in now,
because it's difficult to see how you could add it later.

Managing policy names becomes an issue though, because if you have 2
tables each with 1 policy, but you give them different names, how can
the user querying the data specify that they want policy1 for table1
and policy2 for table2, possibly in the same query? I think that can
be made more manageable by making policies top-level objects that
exist independently of any particular tables. So you might do
something like:

\c - alice
CREATE POLICY policy1;
CREATE POLICY policy2;
ALTER TABLE t1 SET POLICY policy1 TO t1_quals;
ALTER TABLE t2 SET POLICY policy1 TO t2_quals;
...
GRANT SELECT ON TABLE t1, t2 TO bob USING policy1;
GRANT SELECT ON TABLE t1, t2 TO manager; -- Can use any policy, or
bypass all policies

Then a particular user would typically only have to set their policy
once per session, for accessing multiple tables:

\c - bob
SET rls_policy = policy1;
SELECT * FROM t1 JOIN t2; -- OK
SET rls_policy = policy2;
SELECT * FROM t1; -- ERROR: no permission to access t1 using policy2

or you'd be able to set a default policy for users, so that they
wouldn't need to explicitly choose one:

ALTER ROLE bob SET rls_policy = policy1;

Note that the syntax proposed elsewhere --- GRANT SELECT (polname) ON
TABLE tab TO role --- doesn't work because it conflicts with the
syntax for granting column privileges, so there needs to be a distinct
syntax for this, and I think it ought to ultimately allow things like

GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1;

Regards,
Dean


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-23 01:05:53
Message-ID: 20140623010553.GD16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 17 June 2014 20:19, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> >> Yeah, I was thinking something like this could work, but I would go
> >> further. Suppose you had separate GRANTable privileges for direct
> >> access to individual tables, bypassing RLS, e.g.
> >>
> >> GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name
> >
> > So, is this one new privilege (DIRECT) or four separate new privileges
> > that are variants of the existing privileges (DIRECT SELECT, DIRECT
> > INSERT, DIRECT UPDATE, DIRECT DELETE)?
>
> I was thinking it would be 4 new privileges, so that a user could for
> example be granted DIRECT SELECT permission on a table, but not DIRECT
> UPDATE.

Ok.

> On reflection though, I think I prefer the approach of allowing
> multiple named security policies per table, because it gives the
> planner more opportunity to optimize queries against specific RLS
> quals, which won't work if the ACL logic is embedded in functions.

Having more than one policy for the purpose of performance really
doesn't make a huge amount of sense to me. Perhaps someone could
explain the use-case with specific example applications where they would
benefit from this? Based on the discussion, they would have to be OR'd
together in the query as built with any result being marked as success.
One could build an SQL function which could be in-lined potentially
which does the same if their case is that simple.

Being able to define the policy based on some criteria may allow it to
be simpler (eg: policy 'a' applies for certain roles, while policy 'b'
applies for other roles), but I'm not enthusiastic about that approach
because there could be a huge number of permutations to allow.

How about another approach- what about having a function which is called
(as the table owner, I'm thinking..) that then returns the qual to be
included, instead of having to define a specific qual which is included
in the catalog? That function could take into consideration the user,
table, etc, and return a qual which includes constants to compare rows
against for planning purposes. This would have to be done early enough,
of course, which might be difficult. For my part, having that
capability would be neat, but nothing we're trying to do here would
preclude us from adding it later either.

> That seems like something that would have to be designed in now,
> because it's difficult to see how you could add it later.

I don't follow this at all. Going from supporting one qual to
supporting multiple seems like it'd be quite straight-forward to add
in later? Going the other way would be difficult.

> Managing policy names becomes an issue though, because if you have 2
> tables each with 1 policy, but you give them different names, how can
> the user querying the data specify that they want policy1 for table1
> and policy2 for table2, possibly in the same query?

From my experience, users don't pick the policy any more than they get
to pick which set of permissions get applied to them when querying
tables (modulo roles, of course, but that's a mechanism for changing
users, not for saying which set of permissions you get). All that you
describe could be done for regular permissions also, but we don't, and
I don't think we get complaints about that because we have roles-
which would work just the same for RLS (assuming the RLS policy defined
has a role component).

Having a function be able to be called to return a qual to be used would
be a way to have per-role RLS also, along with providing the flexibility
to have per-source-IP, per-connection-type, etc, RLS policies also.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-23 16:34:32
Message-ID: CA+Tgmobs-ZtyFRbTMV9PJbC_RabPE_26PPNukLnh4w-9QE1O4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> I'm also of the opinion that this isn't strictly necessary for the
> initial RLS offering in PG- there's a clear way we could migrate
> existing users to a multi-policy system from a single-policy system.
> Sure, to get the performance and optimization benefits that we'd
> presumably have in the multi-policy case they'd need to re-work their
> RLS configuration, but for users who care, they'll likely be very happy
> to do so to gain those benefits.

I think a lot depends on the syntax we choose. If we choose a syntax
that only makes sense in a single-policy framework, then I think
allowing upgrades to a multi-policy syntax is going to be really
difficult. On the other hand, if we choose a syntax that allows
multiple policies, I suspect we can support multiple policies from the
beginning without much extra effort.

>> - Require the user to specify in some way which of the available
>> policies they want applied, and then apply only that one.
>
> I'd want to at least see a way to apply an ordering to the policies
> being applied, or have PG work out which one is "cheapest" and try that
> one first.

Cost-based comparison of policies that return different results
doesn't seem sensible to me.

>> I think exactly the opposite, for the query planning reasons
>> previously stated. I think the policies will quickly get so
>> complicated that they're no longer optimizable. Here's a simple
>> example:
>>
>> - Policy 1 allows the user to access rows for which complexfunc() returns true.
>> - Policy 2 allows the user to access rows for which a = 1.
>>
>> Most users have access only through policy 2, but some have access
>> through policy 1. Users who have access through policy 1 will always
>> get a sequential scan,
>
> This is the thing which I most object to- if the quals being provided at
> any level are leakproof and would be able to reduce the returned set
> sufficiently that an index scan is the best bet, we should be doing
> that. I don't anticipate the RLS quals to be as selective as the
> quals which the user is adding.

I think it would be a VERY bad idea to design the system around the
assumption that the RLS quals will be much more or less selective than
the user-supplied quals. That's going to be different in different
environments.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-23 18:29:43
Message-ID: 20140623182943.GS16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I'm also of the opinion that this isn't strictly necessary for the
> > initial RLS offering in PG- there's a clear way we could migrate
> > existing users to a multi-policy system from a single-policy system.
> > Sure, to get the performance and optimization benefits that we'd
> > presumably have in the multi-policy case they'd need to re-work their
> > RLS configuration, but for users who care, they'll likely be very happy
> > to do so to gain those benefits.
>
> I think a lot depends on the syntax we choose. If we choose a syntax
> that only makes sense in a single-policy framework, then I think
> allowing upgrades to a multi-policy syntax is going to be really
> difficult. On the other hand, if we choose a syntax that allows
> multiple policies, I suspect we can support multiple policies from the
> beginning without much extra effort.

What are these policies going to depend on? Will they be allowed to
overlap? I don't see multi-policy support as being very easily added.

If there are specific ways to design the syntax which would make it
easier to support multiple policies in the future, I'm all for it. Have
any specific thoughts regarding that?

> >> - Require the user to specify in some way which of the available
> >> policies they want applied, and then apply only that one.
> >
> > I'd want to at least see a way to apply an ordering to the policies
> > being applied, or have PG work out which one is "cheapest" and try that
> > one first.
>
> Cost-based comparison of policies that return different results
> doesn't seem sensible to me.

I keep coming back to the thought that, really, having multiple
overlapping policies just adds unnecessary complication to the system
for not much gain in real functionality. Being able to specify a policy
per-role might be useful, but that's only one dimension and I can
imagine a lot of other dimensions that one might want to use to control
which policy is used.

> >> I think exactly the opposite, for the query planning reasons
> >> previously stated. I think the policies will quickly get so
> >> complicated that they're no longer optimizable. Here's a simple
> >> example:
> >>
> >> - Policy 1 allows the user to access rows for which complexfunc() returns true.
> >> - Policy 2 allows the user to access rows for which a = 1.
> >>
> >> Most users have access only through policy 2, but some have access
> >> through policy 1. Users who have access through policy 1 will always
> >> get a sequential scan,
> >
> > This is the thing which I most object to- if the quals being provided at
> > any level are leakproof and would be able to reduce the returned set
> > sufficiently that an index scan is the best bet, we should be doing
> > that. I don't anticipate the RLS quals to be as selective as the
> > quals which the user is adding.
>
> I think it would be a VERY bad idea to design the system around the
> assumption that the RLS quals will be much more or less selective than
> the user-supplied quals. That's going to be different in different
> environments.

Fine- but do you really see the query planner having a problem pushing
down whichever is the more selective qual, if the user-provided qual is
marked as leakproof?

I realize that you want multiple policies because you'd like a way for
the RLS qual to be made simpler for certain cases while also having more
complex quals for other cases. What I keep waiting to hear is exactly
how you want to specify which policy is used because that's where it
gets ugly and complicated. I still really don't like the idea of trying
to apply multiple policies inside of a single query execution.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-24 12:02:34
Message-ID: CA+Tgmob93444JSVj7-u9WcdNKDQ1O-VMt6tWXaK4QJY7gft=pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 23, 2014 at 2:29 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> What are these policies going to depend on? Will they be allowed to
> overlap? I don't see multi-policy support as being very easily added.

We discussed the point about overlap upthread, and I gave specific
examples. If there's something else you want me to provide here,
please be more clear about it.

> If there are specific ways to design the syntax which would make it
> easier to support multiple policies in the future, I'm all for it. Have
> any specific thoughts regarding that?

I did propose something already upthread, and then Dean said this:

# Note that the syntax proposed elsewhere --- GRANT SELECT (polname) ON
# TABLE tab TO role --- doesn't work because it conflicts with the
# syntax for granting column privileges, so there needs to be a distinct
# syntax for this, and I think it ought to ultimately allow things like
#
# GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1;

He's got a good point there. I don't know whether the policy should
be given inline (e.g. GRANT ... WHERE stuff()) or out-of-line (GRANT
... USING policy1) but it seems like specifying it as some sort of
GRANT modifier might make sense. I'm sure there are other ways also,
of course.

>> >> - Require the user to specify in some way which of the available
>> >> policies they want applied, and then apply only that one.
>> >
>> > I'd want to at least see a way to apply an ordering to the policies
>> > being applied, or have PG work out which one is "cheapest" and try that
>> > one first.
>>
>> Cost-based comparison of policies that return different results
>> doesn't seem sensible to me.
>
> I keep coming back to the thought that, really, having multiple
> overlapping policies just adds unnecessary complication to the system
> for not much gain in real functionality. Being able to specify a policy
> per-role might be useful, but that's only one dimension and I can
> imagine a lot of other dimensions that one might want to use to control
> which policy is used.

Well, I don't agree, and I've given examples upthread showing the
kinds of scenarios that I'm concerned about, which are drawn from real
experiences I've had. It may be that I'm the only one who has had
such experiences, of course; or that there aren't enough people who
have to justify catering to such use cases. But I'm not sure there's
much point in trying to have a conversation about how such a thing
could be made to work if you're just going to revert back to "well, we
don't really need this anyway" each time I make or refute a technical
point.

>> I think it would be a VERY bad idea to design the system around the
>> assumption that the RLS quals will be much more or less selective than
>> the user-supplied quals. That's going to be different in different
>> environments.
>
> Fine- but do you really see the query planner having a problem pushing
> down whichever is the more selective qual, if the user-provided qual is
> marked as leakproof?

I'm not quite sure I understand the scenario you're describing here.
Can you provide a tangible example? I expect that most of the things
the RLS-limited user might write in the WHERE clause will NOT get
pushed down because most functions are not leakproof. However, the
issue I'm actually concerned about is whether the *security* qual is
simple enough to permit an index-scan. Anything with an OR clause in
it probably won't be, and any function call definitely won't be.

> I realize that you want multiple policies because you'd like a way for
> the RLS qual to be made simpler for certain cases while also having more
> complex quals for other cases. What I keep waiting to hear is exactly
> how you want to specify which policy is used because that's where it
> gets ugly and complicated. I still really don't like the idea of trying
> to apply multiple policies inside of a single query execution.

See above comments.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-24 14:30:15
Message-ID: 20140624143015.GG5032@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas wrote:

> > Right, if we were to support multiple policies on a given table then we
> > would have to support adding and removing them individually, as well as
> > specify when they are to be applied- and what if that "when" overlaps?
> > Do we apply both and only a row which passed them all gets sent to the
> > user? Essentially we'd be defining the RLS policies to be AND'd
> > together, right? Would we want to support both AND-based and OR-based,
> > and allow users to pick what set of conditionals they want applied to
> > their various overlapping RLS policies?
>
> AND is not a sensible policy; it would need to be OR. If you grant
> someone access to two different subsets of the rows in a table, it
> stands to reason that they will expect to have access to all of the
> rows that are in at least one of those subsets.

I haven't been following this thread, but this bit caught my attention.
I'm not sure I agree that OR is always the right policy either.
There is a case for a policy that says "forbid these rows to these guys,
even if they have read permissions from elsewhere". If OR is the only
way to mix multiple policies there might not be a way to implement this.
So ISTM each policy must be able to indicate what to do -- sort of how
PAM config files allow you to specify "required", "optional" and so
forth for each module.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-24 15:05:49
Message-ID: CA+Tgmob2iAHEn5KeFwCd6AfXSc1bGQ7ivY2pzS9ypP0bTXPoUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 24, 2014 at 10:30 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Robert Haas wrote:
>> > Right, if we were to support multiple policies on a given table then we
>> > would have to support adding and removing them individually, as well as
>> > specify when they are to be applied- and what if that "when" overlaps?
>> > Do we apply both and only a row which passed them all gets sent to the
>> > user? Essentially we'd be defining the RLS policies to be AND'd
>> > together, right? Would we want to support both AND-based and OR-based,
>> > and allow users to pick what set of conditionals they want applied to
>> > their various overlapping RLS policies?
>>
>> AND is not a sensible policy; it would need to be OR. If you grant
>> someone access to two different subsets of the rows in a table, it
>> stands to reason that they will expect to have access to all of the
>> rows that are in at least one of those subsets.
>
> I haven't been following this thread, but this bit caught my attention.
> I'm not sure I agree that OR is always the right policy either.
> There is a case for a policy that says "forbid these rows to these guys,
> even if they have read permissions from elsewhere". If OR is the only
> way to mix multiple policies there might not be a way to implement this.
> So ISTM each policy must be able to indicate what to do -- sort of how
> PAM config files allow you to specify "required", "optional" and so
> forth for each module.

Hmm. Well, that could be useful, but I'm not sure I'd view it as
something we absolutely have to have...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-24 16:27:33
Message-ID: 20140624162733.GO16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

I feel like we are getting to the point of simply talking past each
other and so I'll try anew, and I'll include my understanding of how the
different approaches would address the specific use-case you outlined
up-thread.

Single policy
-------------
The current implementation approach only allows a single policy to be
included. The concern raised with this approach is that it won't be
very performant due to the qual complexity, which you outlined
(reformatted a bit) as:

WHERE
sales_rep_id = (SELECT oid FROM pg_roles
WHERE rolname = current_user
AND
oid IN (SELECT id FROM person WHERE is_sales_rep))
OR
partner_id = (SELECT p.org_id
FROM pg_roles a, person p
WHERE a.rolname = current_user
AND a.oid = p.id)

Which I take to mean there is a 'person' table which looks like:

id, is_sales_rep, org_id

and a table which has the RLS qual which looks like:

pk_id, sales_rep_id, partner_id

Then, if the individual is_sales_rep and it's their account by
sales_rep_id, or if the individual's org_id matches the partner_id, they
can see the record.

Using this example with security barrier views and indexes on person.id,
data.pk_id, data.sales_rep_id, and data.partner_id, we'll get a bitmap
heap scan across the 'data' table by having the two OR's run as
InitPlan 1 and InitPlan 2.

Does that address the concern you had around multi-branch OR policies?
This works with more than two OR branches also, though of course we need
appropriate indexes to make use of a Bitmap Heap Scan.

Even with per-user policies, we would define a policy along these lines,
for the "sfrost" role:

WHERE
sales_rep_id = 16384
OR partner_id = 1

Which also ends up doing a Bitmap Heap Scan across the data table.

For the case where a sales rep isn't also a partner, you could simplify
this to:

WHERE
sales_rep_id = 16384

but I'm not sure that really buys you much? With the bitmap heap
scan, if one side of the OR ends up not returning anything then it
doesn't contribute to the blocks which have to be scanned. The index
might still need to be scanned, although I think you could avoid even
that with an EXISTS check to see if the user is a partner at all.
That's not to say that a bitmap scan is equivilant to an index scan, but
it's certainly likely to be far better than a sequential scan.

Now, if the query is "select * from data_view with pk_id = 1002;", then
we get an indexed lookup on the data table based on the PK. That's what
I was trying to point out previously regarding leakproof functions
(which comprise about half of the boolean functions we provide, if I
recall my previous analysis correctly). We also get indexed lookups
with "pk_id < 10" or similar as those are also leakproof.

Multiple, Overlapping policies
------------------------------
Per discussion, these would generally be OR'd together.

Building up the overall qual which has to include an OR branch for each
individual policy qual(s) looks like a complicated bit of work and one
which might be better left to the user (and, as just pointed out, the
user may actually want AND instead of OR in some cases..).

Managing the plan cache in a sensible way is certainly made more
complicated by this and might mean that it can't be used at all, which
has already been raised as a show-stopper issue.

In the example which you provided, while we could represent that the two
policies exist (sales representatives vs partners) and that they are to
be OR'd together in the catalog, but I don't immediately see how that
would change the qual which ends up being added to the query in this
case or really improving the overall query plan; at least, not without
eliminating one of the OR branches somehow- which I discuss below.

Multiple, Non-overlapping policies
----------------------------------
Preventing the overlap of policies ends up being very complicated if
many dimensions are allowed. For the simple case, perhaps only the
'current role' dimension is useful. I expect that going down that
route would very quickly lead to requests for other dimensions (client
IP, etc) which is why I'm not a big fan of it, but if that's the
concensus then let's work out the syntax and update the patch and move
on.

Another option might be to have a qual for each policy which
the user can define that indicates if that policy is to be applied or
not and then simply pick the first policy for which that qual which
returns 'true'. We would require an ordering to be defined in this
case, which I believe was an issue up-thread. If we allow all policies
matching the quals then we run into the complications mentioned under
"Overlapping policies" above.

If we decide that per-role policies need to be supported, I very
quickly see the need to have "groups" of roles to which a policy is to
be applied. This would differ from roles today as they would not be
allowed to overlap (otherwise we are into overlapping policies again, or
having to figure out which of the overlapping policies should be applied
for each query; another option would be to error at run-time, but that
seems pretty ugly). In this case we would still need to support "all"
as an option, which is what I would expect to have implemented for 9.5,
or at least in the early part of 9.5 (I really don't want to wait until
the last CF or even the CF before that to get anything in as I suspect
it will have grown by that point to be large enough to be an issue..),
adding the per-role(s) option could be for 9.6/10.0.

In your example, if sales representatives have distinct roles from
partners, then the specific policy could be chosen and used based on
which role is running the query, which might lead to, perhaps only
maginal, improved performance in those specific cases, as discussed
above.

General multi-policy concerns
-----------------------------
Choosing which policy or policies to apply for a given query gets very
complicated very quickly if we're to do so in an automated way. Dean
suggests that the user would pick which policy to use, to which I argued
that roles could be used to manage that instead (a user could 'set role'
to a role which has the access requested). That mechanism would also
work in the existing single-policy approach by having a policy which
depends on the calling role (eg: by looking up that role in a table
which defines what access that role should have). It would also work in
the above proposal for multiple non-overlapping policies where the
policy to use is based on the current role.

Overall, while I'm interested in defining where this is going in a way
which allows us implement an initial RLS capability while avoiding
future upgrade issues, I am perfectly happy to say that the 9.5 RLS
implementation may not be exactly syntax-compatible with 9.6 or 10.0.
What I wish to avoid is a case where what's in 9.5 includes RLS
definitions which can't be implemented in 9.6/10.0 and as would cause
upgrade problems. As long as what's in 9.5 can be represented and
supported in 9.6/10.0, we can implement the necessary logic to migrate
from one to the other in pg_dump. We do not guarantee syntax
compatibility between major versions and we often warn users of newer
features that there may be some changes in subsequent releases which
they'll need to address when they upgrade (and, of course, these are
noted in the release notes).

Hopefully this will help us move the discussion forward to a point where
we have a long-term design as well as a short-term goal which is
actionable for 9.5. The current work is around adding the GUCs
discussed previously to the RLS patch and modifying pg_dump to use them,
to address the concerns raised previously about pg_dump running user
code and possibly not having a complete copy of the data.

Thanks,

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-24 21:17:05
Message-ID: CAEZATCW9L89E4Fpc_3OqH2bSbwW8v9jzOc3azCOwaPcn2sY6vw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 June 2014 17:27, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Single policy vs Multiple, Overlapping policies vs Multiple, Non-overlapping policies
>

What I was describing upthread was multiple non-overlapping policies.

I disagree that this will be more complicated to use. It's a strict
superset of the single policy functionality, so if you want to do it
all using a single policy then you can. But I think that once the ACLs
reach a certain level of complexity, you probably will want to break
it up into multiple policies, and I think doing so will make things
simpler, not more complicated.

Taking a specific, simplistic example, suppose you had 2 groups of
users - some are normal users who should only be able to access their
own records. For these users, you might have a policy like

WHERE person_id = current_user

which would be highly selective, and probably use an index scan. Then
there might be another group of users who are managers with access to
the records of, say, everyone in their department. This might then be
a more complex qual along the lines of

WHERE person_id IN (SELECT ... FROM person_department
WHERE mgr_id = current_user AND ...)

which might end up being a hash or merge join, depending on any
user-supplied quals.

You _could_ combine those into a single policy, but I think it would
be much better to have 2 distinct policies, since they're 2 very
different queries, for different use cases. Normal users would only be
granted permission to use the normal_user_policy. Managers might be
granted permission to use either the normal_user_policy or the
manager_policy (but not both at the same time).

That's a very simplified example. In more realistic situations there
are likely to be many more classes of users, and trying to enforce all
the logic in a single WHERE clause is likely to get unmanageable, or
inefficient if it involves lots of logic hidden away in functions.
Allowing multiple, non-overlapping policies allows the problem to be
broken up into more manageable pieces, which also makes the planner's
job easier, since only a single, simpler policy is in effect in any
given query.

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-24 21:18:39
Message-ID: CAEZATCUzjJOUgA5Rin83iuqrj3FrVysc-FGHtc4Vh+Q9EJmFtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thinking about the examples upthread, a separate issue occurs to me
--- when defining a RLS qual, I think that there has to be a syntax to
specify an alias for the main table, so that correlated subqueries can
refer to it. I'm not sure if that's been mentioned in any of the
discussions so far, but it might be quite hard to define certain quals
without it.

Regards,
Dean


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-24 21:25:15
Message-ID: 20140624212515.GZ16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> Thinking about the examples upthread, a separate issue occurs to me
> --- when defining a RLS qual, I think that there has to be a syntax to
> specify an alias for the main table, so that correlated subqueries can
> refer to it. I'm not sure if that's been mentioned in any of the
> discussions so far, but it might be quite hard to define certain quals
> without it.

Yeah, that thought had occured to me also. Have any suggestions about
how to approach that issue? The way triggers have OLD/NEW comes to mind
but I'm not sure how easily that'd work.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: RLS Design
Date: 2014-06-25 00:49:00
Message-ID: 20140625004900.GC16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean, all,

Changing the subject of this thread (though keeping it threaded) as
we've really moved on to a much broader discussion.

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 24 June 2014 17:27, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Single policy vs Multiple, Overlapping policies vs Multiple, Non-overlapping policies
>
> What I was describing upthread was multiple non-overlapping policies.

Ok.

> I disagree that this will be more complicated to use. It's a strict
> superset of the single policy functionality, so if you want to do it
> all using a single policy then you can. But I think that once the ACLs
> reach a certain level of complexity, you probably will want to break
> it up into multiple policies, and I think doing so will make things
> simpler, not more complicated.

If we keep it explicitly to per-role only, with only one policy ever
being applied, then perhaps it would be, but I'm not convinced..

> Taking a specific, simplistic example, suppose you had 2 groups of
> users - some are normal users who should only be able to access their
> own records. For these users, you might have a policy like
>
> WHERE person_id = current_user
>
> which would be highly selective, and probably use an index scan. Then
> there might be another group of users who are managers with access to
> the records of, say, everyone in their department. This might then be
> a more complex qual along the lines of
>
> WHERE person_id IN (SELECT ... FROM person_department
> WHERE mgr_id = current_user AND ...)
>
> which might end up being a hash or merge join, depending on any
> user-supplied quals.

Certainly my experience with such a setup is that it includes at least 4
levels (self, manager, director, officer). Now, officer you could
perhaps exclude as being simply RLS-exempt but with such a structure I
would think we'd just make that a special kind of policy (and not chew
up those last 4 bits). As for this example, it's quite naturally done
with a recursive query as it's a tree structure, but if you want to keep
the qual simple and fast, you'd materialize the results of such a query
and simply have:

WHERE EXISTS (SELECT 1 from org_chart
WHERE current_user = emp_id
AND person_id = org_chart.id)

> You _could_ combine those into a single policy, but I think it would
> be much better to have 2 distinct policies, since they're 2 very
> different queries, for different use cases. Normal users would only be
> granted permission to use the normal_user_policy. Managers might be
> granted permission to use either the normal_user_policy or the
> manager_policy (but not both at the same time).

I can't recall a system where managers have to request access to their
manager role. Having another way of changing the permissions which are
applied to a session (the existing one being 'set role') doesn't strike
me as a great idea either.

> That's a very simplified example. In more realistic situations there
> are likely to be many more classes of users, and trying to enforce all
> the logic in a single WHERE clause is likely to get unmanageable, or
> inefficient if it involves lots of logic hidden away in functions.

Functions and external security systems are exactly the real-world
use-case which users I've talked to are looking for. All of this
discussion is completely orthogonal to their requirements. I understand
that there are simpler use-cases than those and we may be able to
provide an approach which performs better for those.

> Allowing multiple, non-overlapping policies allows the problem to be
> broken up into more manageable pieces, which also makes the planner's
> job easier, since only a single, simpler policy is in effect in any
> given query.

Let's try to outline what this would look like then.

Taking your approach, we'd have:

CREATE POLICY p1;
CREATE POLICY p2;

ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;

GRANT SELECT ON TABLE t1 TO role1 USING p1;
GRANT SELECT ON TABLE t1 TO role2 USING p2;

I'm guessing we would need to further support:

GRANT INSERT ON TABLE t1 TO role1 USING p2;

as we've already discussed being able to support per-action (SELECT,
INSERT, UPDATE, DELETE) policies. I'm not quite sure how to address
that though.

Further, as you mention, users would be able to do:

SET rls_policy = whatever;

and things would appear fine, until they tried to access a table to
which they didn't have that policy for, at which point they'd get an
error.

You mention:

GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1;

but, to be clear, there would be no option for policies to be
column-specific, right? The policy would apply to the whole row and
just the SELECT/UPDATE privileges would be on the specific columns (as
exists today).

From this what I'm gathering is that we'd need catalog tables along
these lines:

rls_policy
oid, polname name, polowner oid, polnamespace oid, polacl aclitme[]
(oid, policy name, policy owner, policy namespace, ACL, eg: usage?)

rls_policy_table
ptblpolid oid, ptblrelid oid, ptblquals text(?), ptblacl aclitem[]?
(policy oid, table/relation oid, quals, ACL)

pg_class
relhasrls boolean ?

An extension to the existing ACLs which are for GRANT to include a
policy OID, eg:

typedef struct AclItem
{
Oid ai_grantee;
Oid ai_grantor;
AclMode ai_privs;
Oid rls_policy;
}

and further:

role1=r|p1/postgres
role2=r|p2/postgres

or even:

bob=|policy1/postgres

with no table-level privileges and only column-level privileges granted
to role3 for this table.

The plan cache would include what policy OID a given plan was run under
(with InvalidOid indicating an "everything-allowed" policy).

This doesn't address the concern raised about having different policies
depending on the action type (SELECT, INSERT, etc) though, as mentioned
above.. For that we may have to add "Oid rls_select_policy", etc, to
AclItem, which would be pretty painful. Other thoughts?

This certainly feels like quite a bit to try and bite off for 9.5 and,
as mentioned, this would be a strict superset of the current approach,
which could be implemented under this structure as:

CREATE POLICY t1_p1_policy;
ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
GRANT (user's rights) ON t1 TO user USING policy1;

Tha main downside here is that we'd have to create a policy for every
table in the system which had RLS applied, to avoid granting more than
should be. Perhaps the 9.4 approach could include the 'CREATE POLICY'
and 'ALTER TABLE' bits, but not the GRANT parts, meaning that we would,
for the 9.5 -> 9.6 upgrade, pg_dump:

GRANT (user's rights) ON t1 TO user USING policy1;

We would still need the GUCs for "rls_enable = on/off" and perhaps the
role-level "bypass_rls" attribute, but those wouldn't change with this.

Thoughts?

Thanks!

Stephen


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-25 01:31:36
Message-ID: 53AA2678.9010105@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/24/2014 10:30 PM, Alvaro Herrera wrote:
> I haven't been following this thread, but this bit caught my attention.
> I'm not sure I agree that OR is always the right policy either.
> There is a case for a policy that says "forbid these rows to these guys,
> even if they have read permissions from elsewhere".

That's generally considered a "DENY" policy, a concept borrowed from ACLs.

You have access to a resource if:

- You have at least one policy that gives you access AND
- You have no policies that deny you access

> If OR is the only
> way to mix multiple policies there might not be a way to implement this.

I think that's a "later" myself, but we shouldn't design ourselves into
a corner where we can't support deny rules either.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-25 01:33:48
Message-ID: 20140625013348.GE16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig,

* Craig Ringer (craig(at)2ndquadrant(dot)com) wrote:
> On 06/24/2014 10:30 PM, Alvaro Herrera wrote:
> > I haven't been following this thread, but this bit caught my attention.
> > I'm not sure I agree that OR is always the right policy either.
> > There is a case for a policy that says "forbid these rows to these guys,
> > even if they have read permissions from elsewhere".
>
> That's generally considered a "DENY" policy, a concept borrowed from ACLs.

Right.

> > If OR is the only
> > way to mix multiple policies there might not be a way to implement this.
>
> I think that's a "later" myself, but we shouldn't design ourselves into
> a corner where we can't support deny rules either.

Agreed, but I don't want to get so wrapped up in all of this that we end
up with a set of requirements so long that we'll never be able to
accomplish them all in a single release...

Thanks!

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-25 12:31:50
Message-ID: CAEZATCWzdatkkj4WSTV8B-HM0f=AJ8ksSPkHzHTYXZ3yFjk78w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 25 June 2014 01:49, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Dean, all,
>
> Changing the subject of this thread (though keeping it threaded) as
> we've really moved on to a much broader discussion.
>
> * Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
>> On 24 June 2014 17:27, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > Single policy vs Multiple, Overlapping policies vs Multiple, Non-overlapping policies
>>
>> What I was describing upthread was multiple non-overlapping policies.
>
> Ok.
>
>> I disagree that this will be more complicated to use. It's a strict
>> superset of the single policy functionality, so if you want to do it
>> all using a single policy then you can. But I think that once the ACLs
>> reach a certain level of complexity, you probably will want to break
>> it up into multiple policies, and I think doing so will make things
>> simpler, not more complicated.
>
> If we keep it explicitly to per-role only, with only one policy ever
> being applied, then perhaps it would be, but I'm not convinced..
>
>> Taking a specific, simplistic example, suppose you had 2 groups of
>> users - some are normal users who should only be able to access their
>> own records. For these users, you might have a policy like
>>
>> WHERE person_id = current_user
>>
>> which would be highly selective, and probably use an index scan. Then
>> there might be another group of users who are managers with access to
>> the records of, say, everyone in their department. This might then be
>> a more complex qual along the lines of
>>
>> WHERE person_id IN (SELECT ... FROM person_department
>> WHERE mgr_id = current_user AND ...)
>>
>> which might end up being a hash or merge join, depending on any
>> user-supplied quals.
>
> Certainly my experience with such a setup is that it includes at least 4
> levels (self, manager, director, officer). Now, officer you could
> perhaps exclude as being simply RLS-exempt but with such a structure I
> would think we'd just make that a special kind of policy (and not chew
> up those last 4 bits). As for this example, it's quite naturally done
> with a recursive query as it's a tree structure, but if you want to keep
> the qual simple and fast, you'd materialize the results of such a query
> and simply have:
>
> WHERE EXISTS (SELECT 1 from org_chart
> WHERE current_user = emp_id
> AND person_id = org_chart.id)
>
>> You _could_ combine those into a single policy, but I think it would
>> be much better to have 2 distinct policies, since they're 2 very
>> different queries, for different use cases. Normal users would only be
>> granted permission to use the normal_user_policy. Managers might be
>> granted permission to use either the normal_user_policy or the
>> manager_policy (but not both at the same time).
>
> I can't recall a system where managers have to request access to their
> manager role. Having another way of changing the permissions which are
> applied to a session (the existing one being 'set role') doesn't strike
> me as a great idea either.
>

Actually I think it's quite common to build applications where more
privileged users might want to initially log in with normal
privileges, and then only escalate to a higher privilege level if
needed (much like only being root on a machine when absolutely
necessary). But as you say, that can be done through 'set role' so I
don't think being able to choose between policies is as important as
being able to define different policies for different roles.

>> That's a very simplified example. In more realistic situations there
>> are likely to be many more classes of users, and trying to enforce all
>> the logic in a single WHERE clause is likely to get unmanageable, or
>> inefficient if it involves lots of logic hidden away in functions.
>
> Functions and external security systems are exactly the real-world
> use-case which users I've talked to are looking for. All of this
> discussion is completely orthogonal to their requirements. I understand
> that there are simpler use-cases than those and we may be able to
> provide an approach which performs better for those.
>

OK.

>> Allowing multiple, non-overlapping policies allows the problem to be
>> broken up into more manageable pieces, which also makes the planner's
>> job easier, since only a single, simpler policy is in effect in any
>> given query.
>
> Let's try to outline what this would look like then.
>
> Taking your approach, we'd have:
>
> CREATE POLICY p1;
> CREATE POLICY p2;
>
> ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;
>
> GRANT SELECT ON TABLE t1 TO role1 USING p1;
> GRANT SELECT ON TABLE t1 TO role2 USING p2;
>
> I'm guessing we would need to further support:
>
> GRANT INSERT ON TABLE t1 TO role1 USING p2;
>
> as we've already discussed being able to support per-action (SELECT,
> INSERT, UPDATE, DELETE) policies. I'm not quite sure how to address
> that though.
>
> Further, as you mention, users would be able to do:
>
> SET rls_policy = whatever;
>
> and things would appear fine, until they tried to access a table to
> which they didn't have that policy for, at which point they'd get an
> error.
>
> You mention:
>
> GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1;
>
> but, to be clear, there would be no option for policies to be
> column-specific, right? The policy would apply to the whole row and
> just the SELECT/UPDATE privileges would be on the specific columns (as
> exists today).
>

I think that would be OK for the first release. It could be extended
in a future release to support column-specific policy ACLs, as long as
we don't preclude that in the syntax we choose now. The syntax

GRANT <command> [,<command>] ON table TO role USING policy

works because columns can be added to it later.

> From this what I'm gathering is that we'd need catalog tables along
> these lines:
>
> rls_policy
> oid, polname name, polowner oid, polnamespace oid, polacl aclitme[]
> (oid, policy name, policy owner, policy namespace, ACL, eg: usage?)
>
> rls_policy_table
> ptblpolid oid, ptblrelid oid, ptblquals text(?), ptblacl aclitem[]?
> (policy oid, table/relation oid, quals, ACL)
>
> pg_class
> relhasrls boolean ?
>

Seems about right.

> An extension to the existing ACLs which are for GRANT to include a
> policy OID, eg:
>
> typedef struct AclItem
> {
> Oid ai_grantee;
> Oid ai_grantor;
> AclMode ai_privs;
> Oid rls_policy;
> }
>

Alternatively, use the ACLs on rls_policy_table - i.e., to SELECT from
a table using a particular policy, you would need to have the SELECT
bit assigned to you in the corresponding rls_policy_table entry's
ACLs. That seems like it would be a less invasive change, but I don't
know if there are other problems with that approach.

> and further:
>
> role1=r|p1/postgres
> role2=r|p2/postgres
>

Or just

table1:
role1=rw/grantor
table1 using policy1:
role2=rw/grantor

to avoid changing the privilege display pattern. That's also more in
keeping with the model of storing the per-policy ACLs in
rls_policy_table.

> or even:
>
> bob=|policy1/postgres
>
> with no table-level privileges and only column-level privileges granted
> to role3 for this table.
>

I don't get that last one. If there are no table-level privileges,
would it not just be empty?

> The plan cache would include what policy OID a given plan was run under
> (with InvalidOid indicating an "everything-allowed" policy).
>
> This doesn't address the concern raised about having different policies
> depending on the action type (SELECT, INSERT, etc) though, as mentioned
> above.. For that we may have to add "Oid rls_select_policy", etc, to
> AclItem, which would be pretty painful. Other thoughts?
>

Huh? Isn't it just another column in rls_policy_table to specify the
action type?

> This certainly feels like quite a bit to try and bite off for 9.5 and,
> as mentioned, this would be a strict superset of the current approach,
> which could be implemented under this structure as:
>
> CREATE POLICY t1_p1_policy;
> ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> GRANT (user's rights) ON t1 TO user USING policy1;
>
> Tha main downside here is that we'd have to create a policy for every
> table in the system which had RLS applied, to avoid granting more than
> should be. Perhaps the 9.4 approach could include the 'CREATE POLICY'
> and 'ALTER TABLE' bits, but not the GRANT parts, meaning that we would,
> for the 9.5 -> 9.6 upgrade, pg_dump:
>
> GRANT (user's rights) ON t1 TO user USING policy1;
>
> We would still need the GUCs for "rls_enable = on/off" and perhaps the
> role-level "bypass_rls" attribute, but those wouldn't change with this.
>
> Thoughts?
>

Well I think you'd have to flesh out the alternatives to a similar
level of detail to assess the relative effort involved, but I think
it's encouraging to see this level of design this early in the 9.5
cycle.

Regards,
Dean


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Date: 2014-06-25 13:13:29
Message-ID: CA+TgmoY9iAQMPkbLbJL-qebfyvfVWd8Xoaq_2a7_v_w4Rp5Qjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 24, 2014 at 12:27 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> I feel like we are getting to the point of simply talking past each
> other and so I'll try anew, and I'll include my understanding of how the
> different approaches would address the specific use-case you outlined
> up-thread.

Thanks, you're right, and this is a good write-up.

> Single policy
> -------------
> The current implementation approach only allows a single policy to be
> included.
> [...snip...]
> For the case where a sales rep isn't also a partner, you could simplify
> this to:
>
> WHERE
> sales_rep_id = 16384
>
> but I'm not sure that really buys you much? With the bitmap heap
> scan, if one side of the OR ends up not returning anything then it
> doesn't contribute to the blocks which have to be scanned. The index
> might still need to be scanned, although I think you could avoid even
> that with an EXISTS check to see if the user is a partner at all.
> That's not to say that a bitmap scan is equivilant to an index scan, but
> it's certainly likely to be far better than a sequential scan.

True, but the wins could be much larger if one policy is WHERE
sales_rep_id = (SELECT oid FROM pg_roles WHERE rolname = current_user)
and the other policy is WHERE complexfn(). I'll also throw out a +1
for Dean's comments on this topic.

> Multiple, Non-overlapping policies
> ----------------------------------
> Preventing the overlap of policies ends up being very complicated if
> many dimensions are allowed. For the simple case, perhaps only the
> 'current role' dimension is useful. I expect that going down that
> route would very quickly lead to requests for other dimensions (client
> IP, etc) which is why I'm not a big fan of it, but if that's the
> concensus then let's work out the syntax and update the patch and move
> on.

I think role is good enough. That's the primary identifier for all
access-control related decisions, so it should be good enough here,
too.

I don't really understand your concerns about overlapping policies
being complex. If you've got a couple of WHERE clauses, combining
them with OR is not hard. Now, the query optimizer may have trouble
with it, but on the whole I expect to win more than we lose, by
entirely excluding some branches of an OR for users for whom entirely
policies can be excluded.

> Overall, while I'm interested in defining where this is going in a way
> which allows us implement an initial RLS capability while avoiding
> future upgrade issues, I am perfectly happy to say that the 9.5 RLS
> implementation may not be exactly syntax-compatible with 9.6 or 10.0.

Again, I think that's completely non-viable. Are you going to tell
people they can't pg_upgrade, and they can't dump-and-reload either,
without manual fiddling? There's no way that's going to be accepted.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-25 13:26:04
Message-ID: 20140625132604.GR16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 25 June 2014 01:49, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I can't recall a system where managers have to request access to their
> > manager role. Having another way of changing the permissions which are
> > applied to a session (the existing one being 'set role') doesn't strike
> > me as a great idea either.
> >
>
> Actually I think it's quite common to build applications where more
> privileged users might want to initially log in with normal
> privileges, and then only escalate to a higher privilege level if
> needed (much like only being root on a machine when absolutely
> necessary). But as you say, that can be done through 'set role' so I
> don't think being able to choose between policies is as important as
> being able to define different policies for different roles.

For those kinds of applications (eg: sudo), yes. I was, perhaps,
looking at your example a bit too literally- I was thinking of HR
management type systems (timecard systems, etc).

> > You mention:
> >
> > GRANT SELECT (col1, col2), UPDATE (col1) ON t1 TO bob USING policy1;
> >
> > but, to be clear, there would be no option for policies to be
> > column-specific, right? The policy would apply to the whole row and
> > just the SELECT/UPDATE privileges would be on the specific columns (as
> > exists today).
> >
>
> I think that would be OK for the first release. It could be extended
> in a future release to support column-specific policy ACLs, as long as
> we don't preclude that in the syntax we choose now. The syntax
>
> GRANT <command> [,<command>] ON table TO role USING policy
>
> works because columns can be added to it later.

What would per-column RLS policies mean..? Would we have to work out
which columns are being updated vs. select'd on before being able to
choose the policy/quals to include? Seems like that's probably workable
but I've not thought about it very hard.

> > From this what I'm gathering is that we'd need catalog tables along
> > these lines:
> >
> > rls_policy
> > oid, polname name, polowner oid, polnamespace oid, polacl aclitme[]
> > (oid, policy name, policy owner, policy namespace, ACL, eg: usage?)
> >
> > rls_policy_table
> > ptblpolid oid, ptblrelid oid, ptblquals text(?), ptblacl aclitem[]?
> > (policy oid, table/relation oid, quals, ACL)
> >
> > pg_class
> > relhasrls boolean ?
>
> Seems about right.
>
> > An extension to the existing ACLs which are for GRANT to include a
> > policy OID, eg:
> >
> > typedef struct AclItem
> > {
> > Oid ai_grantee;
> > Oid ai_grantor;
> > AclMode ai_privs;
> > Oid rls_policy;
> > }
> >
>
> Alternatively, use the ACLs on rls_policy_table - i.e., to SELECT from
> a table using a particular policy, you would need to have the SELECT
> bit assigned to you in the corresponding rls_policy_table entry's
> ACLs. That seems like it would be a less invasive change, but I don't
> know if there are other problems with that approach.

Ah, that's a good thought. My original thinking for that column was
some kind of privilege structure around who is allowed to modify the
quals for a given policy+table, but using that as the definition of who
has what policies does make sense and means we can leave AclItem
more-or-less alone, which is very nice. The relhasrls boolean would
allow us to only query that catalog in cases where a policy exists,
hopefully minimizing the impact for users who are not using RLS.

> > and further:
> >
> > role1=r|p1/postgres
> > role2=r|p2/postgres
>
> Or just
>
> table1:
> role1=rw/grantor
> table1 using policy1:
> role2=rw/grantor
>
> to avoid changing the privilege display pattern. That's also more in
> keeping with the model of storing the per-policy ACLs in
> rls_policy_table.

I like that output, but do we expect any pushback from users who parse
out that field? Admittedly, they really shouldn't be doing that, but
I'm sure most actually do, and "table1 using policy1" isn't terribly
nice to parse.

> > or even:
> >
> > bob=|policy1/postgres
> >
> > with no table-level privileges and only column-level privileges granted
> > to role3 for this table.
>
> I don't get that last one. If there are no table-level privileges,
> would it not just be empty?

No, as there could be column-level privileges. Note that table-level
privileges get you access to all columns, and column level privileges
(as defined by SQL spec) give you access even if you don't have any
table-level privileges. As I was trying to exclude the notion of
column-level policies, I figured policies would always show up in the
"table" level ACL fields, but if there aren't any table-level rights,
what would that look like? With your proposal, it'd be:

table1 using policy1:
bob=/grantor

?

> > The plan cache would include what policy OID a given plan was run under
> > (with InvalidOid indicating an "everything-allowed" policy).
> >
> > This doesn't address the concern raised about having different policies
> > depending on the action type (SELECT, INSERT, etc) though, as mentioned
> > above.. For that we may have to add "Oid rls_select_policy", etc, to
> > AclItem, which would be pretty painful. Other thoughts?
> >
>
> Huh? Isn't it just another column in rls_policy_table to specify the
> action type?

I had been trying to fit it into the ACL structure somehow. What would
it look like to have multiple action types then? Here's one thought:

table1 using policy1 for INSERT:
bob=rw/grantor
table1 using policy1 for SELECT:
bob=r/grantor

Or how about:

table1|policy1/w:
bob=rw/grantor
table1|policy1/r:
bob=r/grantor

Another question is about showing what the actual quals are for a given
policy which is being applied to a table. Would we want that to show up
in \d, \d+, or only be available through querying the catalog..?

> > This certainly feels like quite a bit to try and bite off for 9.5 and,
> > as mentioned, this would be a strict superset of the current approach,
> > which could be implemented under this structure as:
> >
> > CREATE POLICY t1_p1_policy;
> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> > GRANT (user's rights) ON t1 TO user USING policy1;
> >
> > Tha main downside here is that we'd have to create a policy for every
> > table in the system which had RLS applied, to avoid granting more than
> > should be. Perhaps the 9.4 approach could include the 'CREATE POLICY'
> > and 'ALTER TABLE' bits, but not the GRANT parts, meaning that we would,
> > for the 9.5 -> 9.6 upgrade, pg_dump:
> >
> > GRANT (user's rights) ON t1 TO user USING policy1;
> >
> > We would still need the GUCs for "rls_enable = on/off" and perhaps the
> > role-level "bypass_rls" attribute, but those wouldn't change with this.
>
> Well I think you'd have to flesh out the alternatives to a similar
> level of detail to assess the relative effort involved, but I think
> it's encouraging to see this level of design this early in the 9.5
> cycle.

I'm not sure which other alternatives you're thinking about here- could
you be more specific..? I can try to flesh them out but I had actually
been hoping that this would be a good compromise position among the
alternatives.

This provides the per-role policy granularity which has been mentioned a
few times, but doesn't allow the policies to overlap. Overlapping
policies could be added to this general design, I believe, though we'd
have to make a few catalog changes and invent some new syntax to define
how the policies are to be combined (ANDs vs ORs, etc). I had brought
up the idea of ordering/prioritizing policies, but I didn't particularly
like the suggestion when I made it and I don't recall anyone else
voicing interest in that approach.

For my part, I don't see the GUCs as really being "alternatives" so much
as pre-requisites. Even with all the granularity and comprehensive set
of features which we're talking about here, we're going to need a way
for pg_dump to simply say "do not apply RLS to me and ERROR out if
that's an issue". I agree that it's great to get these design
discussions happening now but I really do not want this to become a
behemoth patch by the last CF and ends up bounced because of that.

What I'd like to work through is the minimal set which would be accepted
and get that in, in a way that doesn't prevent further improvements, and
then see what can be done to get those improvements and refinements in
during the 9.5 cycle and what gets bounced to the next release. To that
end, I've been trying to gauge interest in this and get some feel for
who is interested in helping push this forward- your help was
instrumental in getting updatable security barrier views into 9.4, would
you have time to help with this also..?

Thanks!

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Subject: Re: RLS Design the rewriter into the planner?
Date: 2014-06-25 14:34:22
Message-ID: 20140625143422.GS16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert, all,

Changing the thread topic to match the other one, and adding Dean in
explicitly since we're talking about the design discussed with him.

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> I think role is good enough. That's the primary identifier for all
> access-control related decisions, so it should be good enough here,
> too.

Alright. That works for me.

> I don't really understand your concerns about overlapping policies
> being complex. If you've got a couple of WHERE clauses, combining
> them with OR is not hard. Now, the query optimizer may have trouble
> with it, but on the whole I expect to win more than we lose, by
> entirely excluding some branches of an OR for users for whom entirely
> policies can be excluded.

On the thread with Dean we're proposing some specific catalog designs
and part of that included (fleshing it out a bit more) something like:

CREATE TABLE pg_relrlspolicy (-- relation RLS policy table
ptblrelid oid, -- Relation/table
ptblaction text, -- SELECT, INSERT, UPDATE, DELETE
ptblpolid oid, -- Policy
ptblquals text, -- Quals to add
ptblacl aclitem[], -- Rights to use this policy on the table

primary key (ptblrelid, ptblaction)
);

And note that I had expected aclitem to only include one entry per role.

To support overlapping policies, we could add 'ptblpolid' into the
primary key and then simply extract out all of the entries for the
relation and action that we're currently running and step through each
to find which of the policies apply to the current_role...?

If a role has policyA with 'INSERT' rights, but no rights to SELECT, but
they also have an entry for policyB with 'SELECT' rights, we would use
only policyB for a SELECT query? Does that approach mean we don't need
'ptblaction' after all? I'm thinking this approach would also toss out
the "pick your policy" concept that Dean had proposed up-thread.

How would these interact with the existing table-level rights? For
column-level rights, if you have access to SELECT the column then you
don't need any table-level rights (and the table-level rights mean you
can SELECT from any column), are we thinking the same would apply here,
such that having 'USING POLICY' rights means you can SELECT from the
table and the table-level rights end up being the 'DIRECT' rights which
had been discussed up-thread? Not sure that I like that approach,
though I understand some others might find it appealing.. As we're
integrating this with the GRANT command, perhaps it'd be alright.

> > Overall, while I'm interested in defining where this is going in a way
> > which allows us implement an initial RLS capability while avoiding
> > future upgrade issues, I am perfectly happy to say that the 9.5 RLS
> > implementation may not be exactly syntax-compatible with 9.6 or 10.0.
>
> Again, I think that's completely non-viable. Are you going to tell
> people they can't pg_upgrade, and they can't dump-and-reload either,
> without manual fiddling? There's no way that's going to be accepted.

I don't understand what you're getting at here. We dump the catalog
using the newer version of pg_dump for pg_upgrade and we could handle
any *syntax* change required during that process to ensure that the same
access is granted in the new cluster as existed in the old cluster.

We do the exact same thing every time we add a new reserved keyword-
anything which used that keyword before ends up getting double-quoted by
the new version of pg_dump and both pg_dump and pg_upgrade work just
fine. We routinly break some syntax compatibility between major
versions, address those changes in the newer version of pg_dump, and
move on.

I am not proposing that users won't be able to upgrade from 9.5 to 9.6
if they have RLS and agree that it'd be a non-starter.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-25 15:44:57
Message-ID: CA+TgmoYu1YOGyodfNscJBw5NrLHpE-xmJ=Urfjj+7vMyy4R5dw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 24, 2014 at 8:49 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Let's try to outline what this would look like then.
>
> Taking your approach, we'd have:
>
> CREATE POLICY p1;
> CREATE POLICY p2;
>
> ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;

This seems like a very nice, flexible framework.

> GRANT SELECT ON TABLE t1 TO role1 USING p1;
> GRANT SELECT ON TABLE t1 TO role2 USING p2;

Instead of doing it this way, we could instead do:

ALTER ROLE role1 ADD POLICY p1;
ALTER ROLE role2 ADD POLICY p2;

We could possibly allow multiple policies to be set for the same user,
but given an error (or OR the quals together) if there are conflicting
policies for the same table. A user with no policies would see
everything to which they've been granted access.

To support different policies on different operations, you could have
something like:

ALTER TABLE t1 SET POLICY p1 ON INSERT TO t1_p1_quals;

Without the ON clause, it would establish the given policy for all operations.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-25 19:26:18
Message-ID: 20140625192618.GV16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Tue, Jun 24, 2014 at 8:49 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Let's try to outline what this would look like then.
> >
> > Taking your approach, we'd have:
> >
> > CREATE POLICY p1;
> > CREATE POLICY p2;
> >
> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> > ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;
>
> This seems like a very nice, flexible framework.
>
> > GRANT SELECT ON TABLE t1 TO role1 USING p1;
> > GRANT SELECT ON TABLE t1 TO role2 USING p2;
>
> Instead of doing it this way, we could instead do:
>
> ALTER ROLE role1 ADD POLICY p1;
> ALTER ROLE role2 ADD POLICY p2;
>
> We could possibly allow multiple policies to be set for the same user,
> but given an error (or OR the quals together) if there are conflicting
> policies for the same table. A user with no policies would see
> everything to which they've been granted access.

Ok, I like that as it means that "normal" GRANTs, etc, remain the same
and are just constrained by RLS when there is an RLS policy on the
table, which I believe is really the right approach.

> To support different policies on different operations, you could have
> something like:
>
> ALTER TABLE t1 SET POLICY p1 ON INSERT TO t1_p1_quals;
>
> Without the ON clause, it would establish the given policy for all operations.

Right, this makes sense also and is similar to what we were angling
towards initially. I'll think further on this and propose a catalog
structure and try to delve into the semantics of query operations, etc.

One issue that occurs to me is trying to think through how to address
the plancache invalidation, such that we are not invalidating constantly
but also setting the correct quals for the current query. We had gone
down a road where we saved a plan for each role under which a query was
run but then ripped that out because the RLS policy would handle the
per-role issues (modulo whether RLS should be enabled or not). This
approach means that we'd have to bring back the notion of per-role plan
cacheing. I'm not against that, just making note of it.

Thanks,

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-25 20:48:57
Message-ID: CAEZATCUeFrJ_0xiysv-8owy-EibQ4zg_X3PrJBLp-8SR4h7ykw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 25 June 2014 16:44, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jun 24, 2014 at 8:49 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> Let's try to outline what this would look like then.
>>
>> Taking your approach, we'd have:
>>
>> CREATE POLICY p1;
>> CREATE POLICY p2;
>>
>> ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
>> ALTER TABLE t1 SET POLICY p2 TO t1_p2_quals;
>
> This seems like a very nice, flexible framework.
>
>> GRANT SELECT ON TABLE t1 TO role1 USING p1;
>> GRANT SELECT ON TABLE t1 TO role2 USING p2;
>
> Instead of doing it this way, we could instead do:
>
> ALTER ROLE role1 ADD POLICY p1;
> ALTER ROLE role2 ADD POLICY p2;
>
> We could possibly allow multiple policies to be set for the same user,
> but given an error (or OR the quals together) if there are conflicting
> policies for the same table. A user with no policies would see
> everything to which they've been granted access.
>

I'm a bit uneasy about allowing overlapping policies like this,
because I think it is more likely to lead to unintended consequences
than solve real use cases. For example, suppose you define policies p1
and p2 and set them up on table t1, and you grant role1 permissions on
t1 and allow role1 the use of policy p1. Then you set up policy p2 on
another table t2, and decide you want to allow role1 access to t2
using this policy. The only way to do it is to add p2 to role1, but
doing so also then gives role1 access to t1 using p2, which might not
have been what you intended.

With the GRANT ... USING policy syntax, you have greater flexibility
to pick and choose which policies each user has permission to use with
each table. To me at least, that seems much less error prone, since
you are being much more explicit about exactly what privileges you are
granting. The ALTER ROLE ... ADD POLICY syntax is potentially adding a
whole bunch of extra privileges to the role, and you have to work
quite hard to see exactly what it's adding.

> To support different policies on different operations, you could have
> something like:
>
> ALTER TABLE t1 SET POLICY p1 ON INSERT TO t1_p1_quals;
>
> Without the ON clause, it would establish the given policy for all operations.
>

Yes, that makes sense. But as I was arguing above, I think the ACLs
should be attached to the specific RLS policy identified uniquely by
(table, policy, command). So, for example, if you did

ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
ALTER TABLE t1 SET POLICY p1 ON UPDATE TO t1_p1_upd_quals;

you could also do

GRANT SELECT ON TABLE t1 TO role1 USING p1;
GRANT UPDATE ON TABLE t1 TO role1 USING p1;

but it would be an error to do

GRANT DELETE ON TABLE t1 TO role1 USING p1;

because there is no p1 delete policy for t1;

Regards,
Dean


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-26 17:04:46
Message-ID: CA+TgmobC=R3aQ630CQCoHuWj96+Vf4uVsFhY_rApY+axYAEuQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 25, 2014 at 4:48 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> Instead of doing it this way, we could instead do:
>>
>> ALTER ROLE role1 ADD POLICY p1;
>> ALTER ROLE role2 ADD POLICY p2;
>>
>> We could possibly allow multiple policies to be set for the same user,
>> but given an error (or OR the quals together) if there are conflicting
>> policies for the same table. A user with no policies would see
>> everything to which they've been granted access.
>>
> I'm a bit uneasy about allowing overlapping policies like this,
> because I think it is more likely to lead to unintended consequences
> than solve real use cases. For example, suppose you define policies p1
> and p2 and set them up on table t1, and you grant role1 permissions on
> t1 and allow role1 the use of policy p1. Then you set up policy p2 on
> another table t2, and decide you want to allow role1 access to t2
> using this policy. The only way to do it is to add p2 to role1, but
> doing so also then gives role1 access to t1 using p2, which might not
> have been what you intended.

I guess that's true but it just seems like a configuration error. I
have it in mind that most people will define policies for
non-overlapping sets of tables and then apply those policies as
appropriate to each user. Whether that's true or not, I don't see it
as being materially different from granting membership in a role - you
could easily give the user permission to do stuff they shouldn't be
able to do, but if you don't carefully examine the bundle of
privileges that come with that GRANT before executing on it, that's
your fault, not the system's.

>> To support different policies on different operations, you could have
>> something like:
>>
>> ALTER TABLE t1 SET POLICY p1 ON INSERT TO t1_p1_quals;
>>
>> Without the ON clause, it would establish the given policy for all operations.
>
> Yes, that makes sense. But as I was arguing above, I think the ACLs
> should be attached to the specific RLS policy identified uniquely by
> (table, policy, command). So, for example, if you did
>
> ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
> ALTER TABLE t1 SET POLICY p1 ON UPDATE TO t1_p1_upd_quals;
>
> you could also do
>
> GRANT SELECT ON TABLE t1 TO role1 USING p1;
> GRANT UPDATE ON TABLE t1 TO role1 USING p1;
>
> but it would be an error to do
>
> GRANT DELETE ON TABLE t1 TO role1 USING p1;

As I see it, the downside of this is that it gets a lot more complex.
We have to revise the ACL representation, which is already pretty darn
complicated, to keep track not only of the grantee, grantor, and
permissions, but also the policies qualifying those permissions. The
changes to GRANT will need to propagate into GRANT ON ALL TABLES IN
SCHEMA and AFTER DEFAULT PRIVILEGES. There is administrative
complexity as well, because if you want to policy-protect an
additional table, you've got to add the table to the policy and then
update all the grants as well. I think what will happen in practice
is that people will grant to PUBLIC all rights on the policy, and then
do all the access control through the GRANT statements.

An interesting question we haven't much considered is: who can set up
policies and add then to users? Maybe we should flip this around, and
instead of adding users to policies, we should exempt users from
policies.

CREATE POLICY p1;

And then, if they own p1 and t1, they can do:

ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
(or maybe we should associate it to the policy instead of the table:
ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)

And then the policy applies to everyone who doesn't have the grantable
EXEMPT privilege on the policy. The policy owner and superuser have
that privilege by default and it can be handed out to others like
this:

GRANT EXEMPT ON POLICY p1 TO snowden;

Then users who have row_level_security=on will bypass RLS if possible,
and otherwise it will be applied. Users who have
row_level_security=off will bypass RLS if possible, and otherwise
error. And users who have row_level_security=force will apply RLS
even if they are entitled to bypass it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-27 07:05:19
Message-ID: CAEZATCUGfujF8G2iWctCL56ETVV_1UW5x=Dh8Xqie2JzTDA+Lw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 June 2014 18:04, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
>> GRANT SELECT ON TABLE t1 TO role1 USING p1;
>
> As I see it, the downside of this is that it gets a lot more complex.
> We have to revise the ACL representation, which is already pretty darn
> complicated, to keep track not only of the grantee, grantor, and
> permissions, but also the policies qualifying those permissions. The
> changes to GRANT will need to propagate into GRANT ON ALL TABLES IN
> SCHEMA and AFTER DEFAULT PRIVILEGES.

No, it can be done without any changes to the permissions code by
storing the ACLs on the catalog entries where the RLS quals are held,
rather than modifying the ACL items on the table. I.e., instead of
thinking of "USING polname" as a modifier to the grant, think of it as
as an additional qualifier on the thing being granted.

That means the syntax I proposed earlier is wrong/misleading. Instead of

GRANT SELECT ON TABLE tbl TO role USING polname;

it should really be

GRANT SELECT USING polname ON TABLE tbl TO role;

> There is administrative
> complexity as well, because if you want to policy-protect an
> additional table, you've got to add the table to the policy and then
> update all the grants as well. I think what will happen in practice
> is that people will grant to PUBLIC all rights on the policy, and then
> do all the access control through the GRANT statements.
>

If you assume that most users will only have one policy through which
they can access any given table, then there is no more administrative
overhead than we have right now. Right now you have to grant each user
permissions on each table you define. The only difference is that now
you throw in a "USING polname". We could also simplify administration
by supporting

GRANT SELECT USING polname ON ALL TABLES IN SCHEMA sch TO role;

The important distinction is that this is only granting permissions on
tables that exist now, not on tables that might be created later.

> An interesting question we haven't much considered is: who can set up
> policies and add then to users? Maybe we should flip this around, and
> instead of adding users to policies, we should exempt users from
> policies.
>
> CREATE POLICY p1;
>
> And then, if they own p1 and t1, they can do:
>
> ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> (or maybe we should associate it to the policy instead of the table:
> ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)
>
> And then the policy applies to everyone who doesn't have the grantable
> EXEMPT privilege on the policy. The policy owner and superuser have
> that privilege by default and it can be handed out to others like
> this:
>
> GRANT EXEMPT ON POLICY p1 TO snowden;
>
> Then users who have row_level_security=on will bypass RLS if possible,
> and otherwise it will be applied. Users who have
> row_level_security=off will bypass RLS if possible, and otherwise
> error. And users who have row_level_security=force will apply RLS
> even if they are entitled to bypass it.
>

That's interesting. I need to think some more about what that means.

Regards,
Dean


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-29 19:42:06
Message-ID: 20140629194206.GD16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert, Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 26 June 2014 18:04, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
> >> GRANT SELECT ON TABLE t1 TO role1 USING p1;
> >
> > As I see it, the downside of this is that it gets a lot more complex.
> > We have to revise the ACL representation, which is already pretty darn
> > complicated, to keep track not only of the grantee, grantor, and
> > permissions, but also the policies qualifying those permissions. The
> > changes to GRANT will need to propagate into GRANT ON ALL TABLES IN
> > SCHEMA and AFTER DEFAULT PRIVILEGES.
>
> No, it can be done without any changes to the permissions code by
> storing the ACLs on the catalog entries where the RLS quals are held,
> rather than modifying the ACL items on the table. I.e., instead of
> thinking of "USING polname" as a modifier to the grant, think of it as
> as an additional qualifier on the thing being granted.

Yeah, I agree that we could do it without changing the existing ACL
structure by using another table and having a flag in pg_class which
indicates if there are RLS policies on the table or not.

Regarding the concerns about users not using the RLS capabilities
correctly- I find that concern to be much more appropriate for the
current permissions system rather than RLS. If a user is going to the
level of even looking at RLS then, I'd hope at least, they'll be able to
understand and make good use of RLS to implement what they need and they
would appreciate the flexibility.

To try and clarify what this distinction is-

Dean's approach with GRANT allows specifying the policy to be
used when a given role queries a given table. Through this mechanism,
one role might have access to many different tables, possibly with a
different policy granting that access for each table.

Robert's approach defines a policy for a user and that policy is used
for all tables that user accesses. This ties the policy very closely to
the role.

With either approach, I wonder how we are going to address the role
membership question. Do you inherit policies through role membership?
What happens on 'set role'? Robert points out that we should be using
"OR" for these situations of overlapping policies and I tend to agree.
Therefore, we would look at the RLS policies for a table and extract out
all of them for all of the roles which the current user is a member of,
OR them together and that would be the set of quals used.

I'm leaning towards Dean's approach. With Robert's approach, one could
emulate Dean's approach but I suspect it would devolve quickly into one
policy per user with that policy simply being a proxy for the role
instead of being useful on its own. With Dean's approach though, I
don't think there's a need for a policy to be a stand-alone object. The
policy is simply a proxy for the set of quals to be added and therefore
the policy could really live as a per-table object.

> That means the syntax I proposed earlier is wrong/misleading. Instead of
>
> GRANT SELECT ON TABLE tbl TO role USING polname;
>
> it should really be
>
> GRANT SELECT USING polname ON TABLE tbl TO role;

This would work, though the 'polname' could be a per-table object, no?

This could even be:

GRANT SELECT USING (sec_level=manager) ON TABLE tbl TO role;

> > There is administrative
> > complexity as well, because if you want to policy-protect an
> > additional table, you've got to add the table to the policy and then
> > update all the grants as well. I think what will happen in practice
> > is that people will grant to PUBLIC all rights on the policy, and then
> > do all the access control through the GRANT statements.

I agree that if you want to policy protect a table that you'll need to
set the policies on the table (that's required either way) and that,
with Dean's approach, you'd have to modify the GRANTs done to that table
as well. I don't follow what you're suggesting with granting to PUBLIC
all rights on the policy though..?

With your approach though, if you have a policy which covers all
managers and one which covers all VPs and then you have one VP whose
access should be different, you'd have to create a new policy just for
that VP and then modify all of the tables which have manager/VP access
to also have that new VP's policy too, or something along those lines,
no?

> If you assume that most users will only have one policy through which
> they can access any given table, then there is no more administrative
> overhead than we have right now. Right now you have to grant each user
> permissions on each table you define. The only difference is that now
> you throw in a "USING polname". We could also simplify administration
> by supporting
>
> GRANT SELECT USING polname ON ALL TABLES IN SCHEMA sch TO role;
>
> The important distinction is that this is only granting permissions on
> tables that exist now, not on tables that might be created later.

Sure, that's the same as it is now.. Robert's correct, imv, that we'll
need to make GRANT .. ON ALL, and ALTER DEFAULT PRIVS work with this.

> > An interesting question we haven't much considered is: who can set up
> > policies and add then to users? Maybe we should flip this around, and
> > instead of adding users to policies, we should exempt users from
> > policies.
> >
> > CREATE POLICY p1;
> >
> > And then, if they own p1 and t1, they can do:
> >
> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> > (or maybe we should associate it to the policy instead of the table:
> > ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)
> >
> > And then the policy applies to everyone who doesn't have the grantable
> > EXEMPT privilege on the policy. The policy owner and superuser have
> > that privilege by default and it can be handed out to others like
> > this:
> >
> > GRANT EXEMPT ON POLICY p1 TO snowden;
> >
> > Then users who have row_level_security=on will bypass RLS if possible,
> > and otherwise it will be applied. Users who have
> > row_level_security=off will bypass RLS if possible, and otherwise
> > error. And users who have row_level_security=force will apply RLS
> > even if they are entitled to bypass it.
>
> That's interesting. I need to think some more about what that means.

I'm not a fan of the EXEMPT approach..

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-30 13:24:57
Message-ID: CA+TgmoZ=ZO+zzYVygPgtpUHuCnwf6inkBHWTqpcUHKeuFmcHtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jun 29, 2014 at 3:42 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > An interesting question we haven't much considered is: who can set up
>> > policies and add then to users? Maybe we should flip this around, and
>> > instead of adding users to policies, we should exempt users from
>> > policies.
>> >
>> > CREATE POLICY p1;
>> >
>> > And then, if they own p1 and t1, they can do:
>> >
>> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
>> > (or maybe we should associate it to the policy instead of the table:
>> > ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)
>> >
>> > And then the policy applies to everyone who doesn't have the grantable
>> > EXEMPT privilege on the policy. The policy owner and superuser have
>> > that privilege by default and it can be handed out to others like
>> > this:
>> >
>> > GRANT EXEMPT ON POLICY p1 TO snowden;
>> >
>> > Then users who have row_level_security=on will bypass RLS if possible,
>> > and otherwise it will be applied. Users who have
>> > row_level_security=off will bypass RLS if possible, and otherwise
>> > error. And users who have row_level_security=force will apply RLS
>> > even if they are entitled to bypass it.
>>
>> That's interesting. I need to think some more about what that means.
>
> I'm not a fan of the EXEMPT approach..

Just out of curiosity, why not?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-30 13:42:43
Message-ID: 20140630134243.GH16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Sun, Jun 29, 2014 at 3:42 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >> > An interesting question we haven't much considered is: who can set up
> >> > policies and add then to users? Maybe we should flip this around, and
> >> > instead of adding users to policies, we should exempt users from
> >> > policies.
> >> >
> >> > CREATE POLICY p1;
> >> >
> >> > And then, if they own p1 and t1, they can do:
> >> >
> >> > ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals;
> >> > (or maybe we should associate it to the policy instead of the table:
> >> > ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals)
> >> >
> >> > And then the policy applies to everyone who doesn't have the grantable
> >> > EXEMPT privilege on the policy. The policy owner and superuser have
> >> > that privilege by default and it can be handed out to others like
> >> > this:
> >> >
> >> > GRANT EXEMPT ON POLICY p1 TO snowden;
> >> >
> >> > Then users who have row_level_security=on will bypass RLS if possible,
> >> > and otherwise it will be applied. Users who have
> >> > row_level_security=off will bypass RLS if possible, and otherwise
> >> > error. And users who have row_level_security=force will apply RLS
> >> > even if they are entitled to bypass it.
> >>
> >> That's interesting. I need to think some more about what that means.
> >
> > I'm not a fan of the EXEMPT approach..
>
> Just out of curiosity, why not?

I don't see it as really solving the flexibility need and it feels quite
a bit more complicated to reason about. Would someone who is EXEMPT
from one policy on a given table still have other policies on that table
applied to them? Would a user be able to be EXEMPT from multiple
policies? I feel like that's what you're suggesting with this approach,
otherwise I don't see it as really different from the 'DIRECT SELECT'
privilege discussed previously..

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-30 14:09:39
Message-ID: CA+TgmobQTpJx9sD+wOALY=MdqVhdyqZqmt2YSRnUH8p8TWF=OA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 30, 2014 at 9:42 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > I'm not a fan of the EXEMPT approach..
>>
>> Just out of curiosity, why not?
>
> I don't see it as really solving the flexibility need and it feels quite
> a bit more complicated to reason about. Would someone who is EXEMPT
> from one policy on a given table still have other policies on that table
> applied to them?

Yes; otherwise, EXEMPT couldn't be granted by non-superusers, and the
whole point of that proposal was to come up with something that would
be clearly safe for ordinary users to use.

> Would a user be able to be EXEMPT from multiple
> policies?

Yes, clearly. It would be a privilege on the policy object, so
different objects can have different privileges.

> I feel like that's what you're suggesting with this approach,
> otherwise I don't see it as really different from the 'DIRECT SELECT'
> privilege discussed previously..

Right. If you took that away, it wouldn't be different.

The number of possible approaches here has expanded beyond what I can
keep in my head; I'm assuming you are planning to think this over and
propose something comprehensive, or maybe Dean or someone else will do
that. But I'm not sure that all the approaches proposed would make it
safe for non-superusers to use RLS, and I think it would be good if
they could.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-06-30 14:31:22
Message-ID: 20140630143122.GJ16098@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Mon, Jun 30, 2014 at 9:42 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I don't see it as really solving the flexibility need and it feels quite
> > a bit more complicated to reason about. Would someone who is EXEMPT
> > from one policy on a given table still have other policies on that table
> > applied to them?
>
> Yes; otherwise, EXEMPT couldn't be granted by non-superusers, and the
> whole point of that proposal was to come up with something that would
> be clearly safe for ordinary users to use.

I'm confused on this part- granting EXEMPT and/or DIRECT SELECT would
definitely need to be supported by a non-superuser, though someone who
had the appropriate rights on the object involved (either the policy or
the table, depending on where we feel that definition should go).

> > Would a user be able to be EXEMPT from multiple
> > policies?
>
> Yes, clearly. It would be a privilege on the policy object, so
> different objects can have different privileges.

Ok.. then I'm not entirely sure how this is different from Dean's
proposal except that it's a way of defining the inverse, which we don't
do anywhere else in the system today..

> > I feel like that's what you're suggesting with this approach,
> > otherwise I don't see it as really different from the 'DIRECT SELECT'
> > privilege discussed previously..
>
> Right. If you took that away, it wouldn't be different.

Ok.

> The number of possible approaches here has expanded beyond what I can
> keep in my head; I'm assuming you are planning to think this over and
> propose something comprehensive, or maybe Dean or someone else will do
> that. But I'm not sure that all the approaches proposed would make it
> safe for non-superusers to use RLS, and I think it would be good if
> they could.

I've been thinking about it quite a bit over the past few days (weeks?)
and trying to continue to outline the proposals as they've changed.
I'll try and work up another comprehensive email which covers the
options currently under discussion as I understand them. Allowing
non-superuser to use RLS is absolutely key to this in any case- it'd be
great if you could voice any specific concerns you see there. We've
already been working through the GUCs previously discussed, as I feel
those will be necessary for any of these approaches (in particular the
"bypass RLS-or-error" GUC which pg_dump will enable by default).

Thanks,

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-01 07:33:46
Message-ID: CAEZATCVftksFH=X+9mVmBNMZo5KsUP+RK0kb4oRO92JOfjO29g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 June 2014 20:42, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> To try and clarify what this distinction is-
>
> Dean's approach with GRANT allows specifying the policy to be
> used when a given role queries a given table. Through this mechanism,
> one role might have access to many different tables, possibly with a
> different policy granting that access for each table.
>
> Robert's approach defines a policy for a user and that policy is used
> for all tables that user accesses. This ties the policy very closely to
> the role.
>

Actually I think they were both originally Robert's ideas in one form
or another, but at this point I'm losing track a bit :-)

> With either approach, I wonder how we are going to address the role
> membership question. Do you inherit policies through role membership?
> What happens on 'set role'? Robert points out that we should be using
> "OR" for these situations of overlapping policies and I tend to agree.
> Therefore, we would look at the RLS policies for a table and extract out
> all of them for all of the roles which the current user is a member of,
> OR them together and that would be the set of quals used.
>

Yes I think that's right. I had hoped to avoid overlapping policies,
but maybe they're more-or-less inevitable and we should just allow
them. It seems justifiable in terms of GRANTs --- one GRANT gives you
permission to access one set of rows from a table, another GRANT gives
you permission to access another set of rows, so in the end you have
access to the union of both sets.

> I'm leaning towards Dean's approach. With Robert's approach, one could
> emulate Dean's approach but I suspect it would devolve quickly into one
> policy per user with that policy simply being a proxy for the role
> instead of being useful on its own. With Dean's approach though, I
> don't think there's a need for a policy to be a stand-alone object. The
> policy is simply a proxy for the set of quals to be added and therefore
> the policy could really live as a per-table object.
>

Yes I think that's right too. I had thought that stand-alone policies
would be useful for selecting which policies to apply to each role,
but maybe that's not necessary if you rely entirely on GRANTs to
decide which policies apply.

>> That means the syntax I proposed earlier is wrong/misleading. Instead of
>>
>> GRANT SELECT ON TABLE tbl TO role USING polname;
>>
>> it should really be
>>
>> GRANT SELECT USING polname ON TABLE tbl TO role;
>
> This would work, though the 'polname' could be a per-table object, no?
>

Right.

> This could even be:
>
> GRANT SELECT USING (sec_level=manager) ON TABLE tbl TO role;
>

Maybe. The important thing is that it's granting the role access to a
{table,command,policy} set or equivalently a {table,command,quals} set
--- i.e., the right to access a sub-set of the table's rows with a
particular command.

Let's explore this further to see where it leads. In some ways, I
think it has ended up even simpler than I thought. To setup RLS, you
would just need to do 2 things:

1). Add a bunch of RLS policies to your tables (not connected to any
particular commands, since that is done using GRANTs). This could use
Robert's earlier syntax:

ALTER TABLE t1 ADD POLICY p1 WHERE p1_quals;
ALTER TABLE t1 ADD POLICY p2 WHERE p2_quals;
...
(or some similar syntax)

where the policy names p1 and p2 need only be unique within the table.

For maintenance purposes you'd also need to be able to do

ALTER TABLE t1 DROP POLICY pol;

and maybe in the future we'd support

ALTER TABLE t1 ALTER POLICY pol TO new_quals;

2). Once each table has the required set of policies, grant each role
permissions, specifying the allowed commands and policies together:

GRANT SELECT USING p1 ON TABLE t1 TO role1;
GRANT SELECT USING p2 ON TABLE t1 TO role1;
GRANT UPDATE USING p3 ON TABLE t1 TO role1;
...
(or some similar syntax)

So in this example, if role1 SELECTed from t1, the system would
automatically apply the combined quals (p1_quals OR p2_quals), whereas
when role1 UPDATEd t1, it would apply p3_quals. So that takes care of
the different-quals-for-different-commands requirement without even
needing any special syntax for it in ALTER TABLE.

A straight "GRANT SELECT ON TABLE .. TO .." would grant access to the
whole table without any RLS quals, as it always has done, which is
good because it means nothing changes for users who aren't interested
in RLS.

Finally, pg_dump would require a GUC to ensure that RLS was not in
effect. Perhaps something like SET require_direct_table_access = true,
which would cause an error to be thrown if the user hadn't been
granted straight select permissions on the tables in question.

That all seems relatively easy to understand, whilst giving a lot of
flexibility.

An annoying complication, however, is how this interacts with column
privileges. Right now "GRANT SELECT(col1) ON t1 TO role1" gives role1
access to every row in col1, and I think that has to remain the case,
since GRANTs only ever give you more access. But that leads to a
situation where the RLS quals applied would depend on the columns
selected. That could be avoided by consistent use of

GRANT SELECT(col1,col2,...) USING p1 ON TABLE t1 TO role1;

so that the same policy applied to all accessible columns. But what if
different policies applied to different columns? Logically that would
require the sets of quals for each of the selected columns to be ANDed
together, or perhaps we would throw an error in that case. My
inclination is to allow it, because it's probably as much effort to
detect and forbid it.

Despite this complication, I still quite like this approach because it
seems to build naturally on existing technology, giving a lot of
flexibility, without requiring too much additional syntax.

Regards,
Dean


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-01 16:42:06
Message-ID: CA+TgmobOH+BvqmntaRFN5G+jZeZ2HUDAWDy42bMfRSRcKaroyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> An annoying complication, however, is how this interacts with column
> privileges. Right now "GRANT SELECT(col1) ON t1 TO role1" gives role1
> access to every row in col1, and I think that has to remain the case,
> since GRANTs only ever give you more access. But that leads to a
> situation where the RLS quals applied would depend on the columns
> selected.

Wow, that seems pretty horrible to me. That means that if I do:

SELECT a FROM tab;

and then:

SELECT a, b FROM tab;

...the second one might return fewer rows than the first one.

I think there's a good argument that RLS is unlike other grantable
privileges, and that it really ought to be defined as something which
is imposed rather than a kind of access grant. If RLS is merely a
modifier to an access grant, then every access grant has to make sure
to include that modifier, or you have a security hole. But if it's a
separate constrain on access, then you just do it once, and exempt
people from it only as needed. That seems less error-prone to me --
it's sort of a default-deny policy, which is generally viewed as good
for security -- and it avoids weird cases like the above, which I
think could easily break application logic.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-01 19:20:39
Message-ID: CAEZATCV3shrCPt1cARrfWJZ4EhNaRFDAWoP5-1qcX7MBGTQ=JA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1 July 2014 17:42, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> An annoying complication, however, is how this interacts with column
>> privileges. Right now "GRANT SELECT(col1) ON t1 TO role1" gives role1
>> access to every row in col1, and I think that has to remain the case,
>> since GRANTs only ever give you more access. But that leads to a
>> situation where the RLS quals applied would depend on the columns
>> selected.
>
> Wow, that seems pretty horrible to me. That means that if I do:
>
> SELECT a FROM tab;
>
> and then:
>
> SELECT a, b FROM tab;
>
> ...the second one might return fewer rows than the first one.
>
> I think there's a good argument that RLS is unlike other grantable
> privileges, and that it really ought to be defined as something which
> is imposed rather than a kind of access grant. If RLS is merely a
> modifier to an access grant, then every access grant has to make sure
> to include that modifier, or you have a security hole. But if it's a
> separate constrain on access, then you just do it once, and exempt
> people from it only as needed. That seems less error-prone to me --
> it's sort of a default-deny policy, which is generally viewed as good
> for security -- and it avoids weird cases like the above, which I
> think could easily break application logic.
>

That seems like a pretty strong argument.

If RLS quals are instead regarded as constraints on access, and
multiple policies apply, then it seems that the quals should now be
combined with AND rather than OR, right?

Regards,
Dean


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-01 19:51:48
Message-ID: CA+TgmoY0uOTqkjNYULuSpRaVKO2MTeJgT-xcFAA=zYVB-brzSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> On 1 July 2014 17:42, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>>> An annoying complication, however, is how this interacts with column
>>> privileges. Right now "GRANT SELECT(col1) ON t1 TO role1" gives role1
>>> access to every row in col1, and I think that has to remain the case,
>>> since GRANTs only ever give you more access. But that leads to a
>>> situation where the RLS quals applied would depend on the columns
>>> selected.
>>
>> Wow, that seems pretty horrible to me. That means that if I do:
>>
>> SELECT a FROM tab;
>>
>> and then:
>>
>> SELECT a, b FROM tab;
>>
>> ...the second one might return fewer rows than the first one.
>>
>> I think there's a good argument that RLS is unlike other grantable
>> privileges, and that it really ought to be defined as something which
>> is imposed rather than a kind of access grant. If RLS is merely a
>> modifier to an access grant, then every access grant has to make sure
>> to include that modifier, or you have a security hole. But if it's a
>> separate constrain on access, then you just do it once, and exempt
>> people from it only as needed. That seems less error-prone to me --
>> it's sort of a default-deny policy, which is generally viewed as good
>> for security -- and it avoids weird cases like the above, which I
>> think could easily break application logic.
>
> That seems like a pretty strong argument.
>
> If RLS quals are instead regarded as constraints on access, and
> multiple policies apply, then it seems that the quals should now be
> combined with AND rather than OR, right?

Yeah, maybe. I intuitively feel that OR would be more useful, so it
would be nice to find a design where that makes sense. But it depends
a lot, in my view, on what syntax we end up with. For example,
suppose we add just one command:

ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;

If the given role inherits from multiple roles that have different
filters, I think the user will naturally expect all of the filters to
be applied. But you could do it other ways. For example:

ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;

If a table is set to NO ROW LEVEL SECURITY then it behaves just like
it does now: anyone who accesses it sees all the rows, restricted to
those columns for which they have permission. If the table is set to
ROW LEVEL SECURITY then the default is to show no rows. The second
command then allows access to a subset of the rows for a give role
name. In this case, it is probably logical for access to be combined
via OR.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RLS Design
Date: 2014-07-02 08:33:56
Message-ID: 53B3C3F4.9080300@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/07/14 21:51, Robert Haas wrote:
> On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>>
>> That seems like a pretty strong argument.
>>
>> If RLS quals are instead regarded as constraints on access, and
>> multiple policies apply, then it seems that the quals should now be
>> combined with AND rather than OR, right?
> Yeah, maybe. I intuitively feel that OR would be more useful, so it
> would be nice to find a design where that makes sense.
Looking at the use cases we described earlier in
http://www.postgresql.org/message-id/attachment/32196/mini-rim.sql I see
more OR than AND, for instance 'if the row is sensitive then the user
must be related to the row' which translates to (NOT sensitive) OR the
user is related.

An addition to that rule could be a breakglass method or other reasons
to access, e.g.
(NOT sensitive) OR user is related OR break glass OR legally required
access.

> But it depends
> a lot, in my view, on what syntax we end up with. For example,
> suppose we add just one command:
>
> ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;
>
> If the given role inherits from multiple roles that have different
> filters, I think the user will naturally expect all of the filters to
> be applied.

Suppose a building administrator gives a single person that has multiple
roles multiple key cards to access appropriate rooms in a building. You
could draw a venn diagram of the rooms those key cards open, and the
intuition here probably is that the person can enter any room if one of
the key cards matches, not all cards.

> But you could do it other ways. For example:
>
> ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;
>
> If a table is set to NO ROW LEVEL SECURITY then it behaves just like
> it does now: anyone who accesses it sees all the rows, restricted to
> those columns for which they have permission. If the table is set to
> ROW LEVEL SECURITY then the default is to show no rows. The second
> command then allows access to a subset of the rows for a give role
> name. In this case, it is probably logical for access to be combined
> via OR.
>
regards,
Yeb Havinga


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-02 13:47:47
Message-ID: 20140702134747.GE16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> > If RLS quals are instead regarded as constraints on access, and
> > multiple policies apply, then it seems that the quals should now be
> > combined with AND rather than OR, right?

I do feel that RLS quals are constraints on access, but I don't see how
it follows that multiple quals should be AND'd together because of that.
I view the RLS policies on each table as being independent and "standing
alone" regarding what can be seen. If you have access to a table today
through policy A, and then later policy B is added, using AND would mean
that the set of rows returned is less than if only policy A existed.
That doesn't seem correct to me.

> Yeah, maybe. I intuitively feel that OR would be more useful, so it
> would be nice to find a design where that makes sense. But it depends
> a lot, in my view, on what syntax we end up with. For example,
> suppose we add just one command:
>
> ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;
>
> If the given role inherits from multiple roles that have different
> filters, I think the user will naturally expect all of the filters to
> be applied.

Agreed.

> But you could do it other ways. For example:
>
> ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;
>
> If a table is set to NO ROW LEVEL SECURITY then it behaves just like
> it does now: anyone who accesses it sees all the rows, restricted to
> those columns for which they have permission. If the table is set to
> ROW LEVEL SECURITY then the default is to show no rows. The second
> command then allows access to a subset of the rows for a give role
> name. In this case, it is probably logical for access to be combined
> via OR.

I can see value is having a table-level option to indicate if RLS is
applied for that table or not, but I had been thinking we'd just
automatically manage that. That is to say that once you define an RLS
policy for a table, we go look and see what policy should be applied in
each case. With the user able to control that, what happens if they say
"row security" on the table and there are no policies? All access would
show the table as empty? What if policies exist and they decide to
'turn off' RLS for the table- suddenly everyone can see all the rows?

My answers to the above (which are making me like the idea more,
actually...) would be:

Yes, if they turn on RLS for the table and there aren't any policies,
then the table appears empty for anyone with normal SELECT rights (table
owner and superusers would still see everything).

If policies exist and the user asks to turn off RLS, I'd throw an ERROR
as there is a security risk there. We could support a CASCADE option
which would go and drop the policies from the table first.

Otherwise, I'm generally liking Dean's thoughts in
http://www.postgresql.org/message-id/CAEZATCVftksFH=X+9mVmBNMZo5KsUP+RK0kb4oRO92JOfjO29g@mail.gmail.com
along with the table-level "enable RLS" option.

Are we getting to a point where there is sufficient agreement that it'd
be worthwhile to really start implementing this? I'd suggest that we
either forgo or at least table the notion of per-column policy
definitions- RLS controls whole rows and so I don't feel that per-column
policies really make sense.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-02 15:01:41
Message-ID: CA+TgmobC2=rn_kNK0SWDiVCFwUVa1up7Sjg2jONxW+UYkT=CSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jul 2, 2014 at 9:47 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> But you could do it other ways. For example:
>>
>> ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
>> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;
>>
>> If a table is set to NO ROW LEVEL SECURITY then it behaves just like
>> it does now: anyone who accesses it sees all the rows, restricted to
>> those columns for which they have permission. If the table is set to
>> ROW LEVEL SECURITY then the default is to show no rows. The second
>> command then allows access to a subset of the rows for a give role
>> name. In this case, it is probably logical for access to be combined
>> via OR.
>
> I can see value is having a table-level option to indicate if RLS is
> applied for that table or not, but I had been thinking we'd just
> automatically manage that. That is to say that once you define an RLS
> policy for a table, we go look and see what policy should be applied in
> each case. With the user able to control that, what happens if they say
> "row security" on the table and there are no policies? All access would
> show the table as empty?

I said the same thing in the text you quoted immediately above this reply.

> What if policies exist and they decide to
> 'turn off' RLS for the table- suddenly everyone can see all the rows?

That'd be my vote. Sorta like disabling triggers.

> Are we getting to a point where there is sufficient agreement that it'd
> be worthwhile to really start implementing this?

I think we're converging, but it might be a good idea to summarize a
specific proposal before you start implementing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-02 15:42:54
Message-ID: 20140702154254.GH16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Jul 2, 2014 at 9:47 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >> But you could do it other ways. For example:
> >>
> >> ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
> >> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;
> >>
> >> If a table is set to NO ROW LEVEL SECURITY then it behaves just like
> >> it does now: anyone who accesses it sees all the rows, restricted to
> >> those columns for which they have permission. If the table is set to
> >> ROW LEVEL SECURITY then the default is to show no rows. The second
> >> command then allows access to a subset of the rows for a give role
> >> name. In this case, it is probably logical for access to be combined
> >> via OR.
> >
> > I can see value is having a table-level option to indicate if RLS is
> > applied for that table or not, but I had been thinking we'd just
> > automatically manage that. That is to say that once you define an RLS
> > policy for a table, we go look and see what policy should be applied in
> > each case. With the user able to control that, what happens if they say
> > "row security" on the table and there are no policies? All access would
> > show the table as empty?
>
> I said the same thing in the text you quoted immediately above this reply.

huh. Somehow I managed to only read the first sentence in that
paragraph. Clearly I need to go get (more) coffee. Still- sounds like
agreement. :)

> > What if policies exist and they decide to
> > 'turn off' RLS for the table- suddenly everyone can see all the rows?
>
> That'd be my vote. Sorta like disabling triggers.

Hmm. Ok- how would you feel about at least spitting out a WARNING if
there are still policies on the table in that case..? Just makes me a
bit nervous to have a case where policies can be defined on a table but
are not actually being enforced..

> > Are we getting to a point where there is sufficient agreement that it'd
> > be worthwhile to really start implementing this?
>
> I think we're converging, but it might be a good idea to summarize a
> specific proposal before you start implementing.

Right- will do so later today.

Thanks!

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-02 15:48:41
Message-ID: CA+Tgmoah83X-0v6ZMTEfiRe-09TBVB-Zmhkn-FOcRanRQkin7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jul 2, 2014 at 11:42 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > What if policies exist and they decide to
>> > 'turn off' RLS for the table- suddenly everyone can see all the rows?
>>
>> That'd be my vote. Sorta like disabling triggers.
>
> Hmm. Ok- how would you feel about at least spitting out a WARNING if
> there are still policies on the table in that case..? Just makes me a
> bit nervous to have a case where policies can be defined on a table but
> are not actually being enforced..

Sounds like nanny-ism to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-02 15:49:34
Message-ID: 20140702154934.GJ16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Jul 2, 2014 at 11:42 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >> > What if policies exist and they decide to
> >> > 'turn off' RLS for the table- suddenly everyone can see all the rows?
> >>
> >> That'd be my vote. Sorta like disabling triggers.
> >
> > Hmm. Ok- how would you feel about at least spitting out a WARNING if
> > there are still policies on the table in that case..? Just makes me a
> > bit nervous to have a case where policies can be defined on a table but
> > are not actually being enforced..
>
> Sounds like nanny-ism to me.

Alright, fair enough. Clearly, the individual changing the RLS on the
table will have to have appropriate rights to do so.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-03 05:14:32
Message-ID: 20140703051431.GM16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert, all,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> I think we're converging, but it might be a good idea to summarize a
> specific proposal before you start implementing.

Alright, apologies for it being a bit later than intended, but here's
what I've come up with thus far.

-- policies defined at a table scope
-- allows using the same policy name for different tables
-- with quals appropriate for each table
ALTER TABLE t1 ADD POLICY p1 USING p1_quals;
ALTER TABLE t1 ADD POLICY p2 USING p2_quals;

-- used to drop a policy definition from a table
ALTER TABLE t1 DROP POLICY p1;

-- cascade required when references exist for the policy
-- from roles
ALTER TABLE t1 DROP POLICY p1 CASCADE;

ALTER TABLE t1 ALTER POLICY p1 USING new_quals;

-- Controls if any RLS is applied to this table or not
-- If enabled, all users must access through some policy
ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;

-- Associates roles to policies
ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING p1;
ALTER TABLE table_name REVOKE ROW ACCESS FROM role_name USING p1;

-- "all" provides a policy which equates to full access (eg: 'true' or
-- 'direct' access). Used to explicitly state when RLS can be bypassed
-- and therefore a GUC can be set which says "bypass-RLS-or-error" and
-- not have an error if this policy is granted to the role.
ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING all;

-- Per-command-type control
ALTER TABLE table_name GRANT SELECT ROW ACCESS TO role_name USING all;
ALTER TABLE table_name GRANT UPDATE ROW ACCESS TO role_name USING all;

Policies for a table are checked against pg_has_role() and all which
apply are OR'd together.

Added to pg_class:

relrlsenabled boolean

pg_rowsecurity
oid oid
rlsrel oid
rlspol name
rlsquals text
rlsacls aclitem[]..? cmdtype(s) + role

If relrlsenabled then scan pg_rowsecurity for the policies associated
with the table, testing each to see if any apply for the current role
based on pg_has_role() against the aclitem array. Any which apply are
added and OR'd together.

Thoughts?

Thanks,

Stephen


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-04 06:00:12
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8FB7D04@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sorry for my late responding, now I'm catching up the discussion.

> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
> wrote:
> > > If RLS quals are instead regarded as constraints on access, and
> > > multiple policies apply, then it seems that the quals should now be
> > > combined with AND rather than OR, right?
>
> I do feel that RLS quals are constraints on access, but I don't see how
> it follows that multiple quals should be AND'd together because of that.
> I view the RLS policies on each table as being independent and "standing
> alone" regarding what can be seen. If you have access to a table today
> through policy A, and then later policy B is added, using AND would mean
> that the set of rows returned is less than if only policy A existed.
> That doesn't seem correct to me.
>
It seems to me direction of the constraints (RLS-policy) works to is reverse.

In case when we have no RLS-policy, 100% of rows are visible isn't it?
Addition of a constraint usually reduces the number of rows being visible,
or same number of rows at least. Constraint shall never work to the direction
to increase the number of rows being visible.

If multiple RLS-policies are connected with OR-operator, the first policy
works to the direction to reduce number of visible rows, but the second
policy works to the reverse direction.

If we would have OR'd RLS-policy, how does it merged with user given
qualifiers with?
For example, if RLS-policy of t1 is (t1.credential < get_user_credential)
and user's query is:
SELECT * FROM t1 WHERE t1.x = t1.x;
Do you think RLS-policy shall be merged with OR'd form?

> > Yeah, maybe. I intuitively feel that OR would be more useful, so it
> > would be nice to find a design where that makes sense. But it depends
> > a lot, in my view, on what syntax we end up with. For example,
> > suppose we add just one command:
> >
> > ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;
> >
> > If the given role inherits from multiple roles that have different
> > filters, I think the user will naturally expect all of the filters to
> > be applied.
>
> Agreed.
>
> > But you could do it other ways. For example:
> >
> > ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; ALTER TABLE
> > table_name GRANT ROW ACCESS TO role_name USING qual;
> >
> > If a table is set to NO ROW LEVEL SECURITY then it behaves just like
> > it does now: anyone who accesses it sees all the rows, restricted to
> > those columns for which they have permission. If the table is set to
> > ROW LEVEL SECURITY then the default is to show no rows. The second
> > command then allows access to a subset of the rows for a give role
> > name. In this case, it is probably logical for access to be combined
> > via OR.
>
> I can see value is having a table-level option to indicate if RLS is applied
> for that table or not, but I had been thinking we'd just automatically manage
> that. That is to say that once you define an RLS policy for a table, we
> go look and see what policy should be applied in each case. With the user
> able to control that, what happens if they say "row security" on the table
> and there are no policies? All access would show the table as empty? What
> if policies exist and they decide to 'turn off' RLS for the table- suddenly
> everyone can see all the rows?
>
> My answers to the above (which are making me like the idea more,
> actually...) would be:
>
> Yes, if they turn on RLS for the table and there aren't any policies, then
> the table appears empty for anyone with normal SELECT rights (table owner
> and superusers would still see everything).
>
> If policies exist and the user asks to turn off RLS, I'd throw an ERROR
> as there is a security risk there. We could support a CASCADE option which
> would go and drop the policies from the table first.
>
Hmm... This approach starts from the empty permission then adds permission
to reference a particular range of the configured table. It's one attitude.

However, I think it has a dark side we cannot ignore. Usually, the purpose
of security mechanism is to ensure which is readable/writable according to
the rules. Once multiple RLS-policies are merged with OR'd form, its results
are unpredicatable.
Please assume here are two individual applications that use RLS on table-X.
Even if application-1 want only rows being "public" become visible, it may
expose "credential" or "secret" rows by interaction of orthogonal policy
configured by application-2 (that may configure the policy according to the
source ip-address). It seems to me application-2 partially invalidated the
RLS-policy configured by application-1.

I think, an important characteristic is things to be invisible is invisible
even though multiple rules are configured.

> Otherwise, I'm generally liking Dean's thoughts in
> http://www.postgresql.org/message-id/CAEZATCVftksFH=X+9mVmBNMZo5KsUP+R
> K0kb4oRO92JOfjO29g(at)mail(dot)gmail(dot)com
> along with the table-level "enable RLS" option.
>
> Are we getting to a point where there is sufficient agreement that it'd
> be worthwhile to really start implementing this? I'd suggest that we either
> forgo or at least table the notion of per-column policy
> definitions- RLS controls whole rows and so I don't feel that per-column
> policies really make sense.
>
Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-04 15:45:12
Message-ID: CAOuzzgrNdEAzrKcHrRcmQeHoKan43Fo4LXQWVAieVNGMWptKWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kaigai,

On Thursday, July 3, 2014, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:

> Sorry for my late responding, now I'm catching up the discussion.
>
> > * Robert Haas (robertmhaas(at)gmail(dot)com <javascript:;>) wrote:
> > > On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com
> <javascript:;>>
> > wrote:
> > > > If RLS quals are instead regarded as constraints on access, and
> > > > multiple policies apply, then it seems that the quals should now be
> > > > combined with AND rather than OR, right?
> >
> > I do feel that RLS quals are constraints on access, but I don't see how
> > it follows that multiple quals should be AND'd together because of that.
> > I view the RLS policies on each table as being independent and "standing
> > alone" regarding what can be seen. If you have access to a table today
> > through policy A, and then later policy B is added, using AND would mean
> > that the set of rows returned is less than if only policy A existed.
> > That doesn't seem correct to me.
> >
> It seems to me direction of the constraints (RLS-policy) works to is
> reverse.
>
> In case when we have no RLS-policy, 100% of rows are visible isn't it?

No, as outlined later, the table would appear empty if no policies exist
and RLS is enabled for the table.

> Addition of a constraint usually reduces the number of rows being visible,
> or same number of rows at least. Constraint shall never work to the
> direction
> to increase the number of rows being visible.

Can you clarify where this is coming from..? It sounds like you're
referring to an existing implementation and, if so, it'd be good to get
more information on how that works exactly.

> If multiple RLS-policies are connected with OR-operator, the first policy
> works to the direction to reduce number of visible rows, but the second
> policy works to the reverse direction.

This isn't accurate, as mentioned. Each policy stands alone to define what
is visible through it and if no policy exists then no rows are visible.

> If we would have OR'd RLS-policy, how does it merged with user given
> qualifiers with?

The RLS quals are all applied together with OR's and the result is AND'd
with any user quals provided. This is only when multiple policies are being
applied for a given query and seems pretty straight forward to me.

> For example, if RLS-policy of t1 is (t1.credential < get_user_credential)
> and user's query is:
> SELECT * FROM t1 WHERE t1.x = t1.x;
> Do you think RLS-policy shall be merged with OR'd form?

Only the RLS policies are OR'd together, not user provided quals. The above
would result in:

Where t1.x = t1.x and (t1.credential < get_user_credential)

If another policy also applies for this query, such as t1.cred2 <
get_user_credential then we would have:

Where t1.x = t1.x and (t1.credential < get_user_credential OR t1.cred2 <
get_user_credential)

This is similar to how roles work- your overall access includes all access
granted to any roles you are a member of. You don't need SELECT rights
granted to every role you are a member of to select from the table.
Additionally, if an admin wants to AND the quals together then they can
simply create a policy which does that rather than have 2 policies.

> > Yeah, maybe. I intuitively feel that OR would be more useful, so it
> > > would be nice to find a design where that makes sense. But it depends
> > > a lot, in my view, on what syntax we end up with. For example,
> > > suppose we add just one command:
> > >
> > > ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;
> > >
> > > If the given role inherits from multiple roles that have different
> > > filters, I think the user will naturally expect all of the filters to
> > > be applied.
> >
> > Agreed.
> >
> > > But you could do it other ways. For example:
> > >
> > > ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY; ALTER TABLE
> > > table_name GRANT ROW ACCESS TO role_name USING qual;
> > >
> > > If a table is set to NO ROW LEVEL SECURITY then it behaves just like
> > > it does now: anyone who accesses it sees all the rows, restricted to
> > > those columns for which they have permission. If the table is set to
> > > ROW LEVEL SECURITY then the default is to show no rows. The second
> > > command then allows access to a subset of the rows for a give role
> > > name. In this case, it is probably logical for access to be combined
> > > via OR.
> >
> > I can see value is having a table-level option to indicate if RLS is
> applied
> > for that table or not, but I had been thinking we'd just automatically
> manage
> > that. That is to say that once you define an RLS policy for a table, we
> > go look and see what policy should be applied in each case. With the
> user
> > able to control that, what happens if they say "row security" on the
> table
> > and there are no policies? All access would show the table as empty?
> What
> > if policies exist and they decide to 'turn off' RLS for the table-
> suddenly
> > everyone can see all the rows?
> >
> > My answers to the above (which are making me like the idea more,
> > actually...) would be:
> >
> > Yes, if they turn on RLS for the table and there aren't any policies,
> then
> > the table appears empty for anyone with normal SELECT rights (table owner
> > and superusers would still see everything).
> >
> > If policies exist and the user asks to turn off RLS, I'd throw an ERROR
> > as there is a security risk there. We could support a CASCADE option
> which
> > would go and drop the policies from the table first.
> >
> Hmm... This approach starts from the empty permission then adds permission
> to reference a particular range of the configured table. It's one attitude.
>
>
Right- just like how our grant system works.

> However, I think it has a dark side we cannot ignore. Usually, the purpose
> of security mechanism is to ensure which is readable/writable according to
> the rules. Once multiple RLS-policies are merged with OR'd form, its
> results
> are unpredicatable.

I don't see how it's unpredictable at all.

> Please assume here are two individual applications that use RLS on table-X.
> Even if application-1 want only rows being "public" become visible, it may
> expose "credential" or "secret" rows by interaction of orthogonal policy
> configured by application-2 (that may configure the policy according to the
> source ip-address). It seems to me application-2 partially invalidated the
> RLS-policy configured by application-1.

You are suggesting instead that if application 2 sets up policies on the
table and then application 1 adds another policy that it should reduce what
application 2's users can see? That doesn't make any sense to me. I'd
actually expect these applications to at least use different roles anyway,
which means they could each have a single role specific policy which only
returns what that application is allowed to see.

> I think, an important characteristic is things to be invisible is invisible
> even though multiple rules are configured.

This is addressed through the ability to associate roles to policies.

Thanks,

Stephen


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-04 21:38:05
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8FB8095@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Kaigai,
>
> On Thursday, July 3, 2014, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>
>
> Sorry for my late responding, now I'm catching up the discussion.
>
> > * Robert Haas (robertmhaas(at)gmail(dot)com <javascript:;> ) wrote:
> > > On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed
> <dean(dot)a(dot)rasheed(at)gmail(dot)com <javascript:;> >
> > wrote:
> > > > If RLS quals are instead regarded as constraints on access,
> and
> > > > multiple policies apply, then it seems that the quals should
> now be
> > > > combined with AND rather than OR, right?
> >
> > I do feel that RLS quals are constraints on access, but I don't
> see how
> > it follows that multiple quals should be AND'd together because
> of that.
> > I view the RLS policies on each table as being independent and
> "standing
> > alone" regarding what can be seen. If you have access to a table
> today
> > through policy A, and then later policy B is added, using AND
> would mean
> > that the set of rows returned is less than if only policy A existed.
> > That doesn't seem correct to me.
> >
> It seems to me direction of the constraints (RLS-policy) works to
> is reverse.
>
> In case when we have no RLS-policy, 100% of rows are visible isn't
> it?
>
>
> No, as outlined later, the table would appear empty if no policies exist
> and RLS is enabled for the table.
>
>
> Addition of a constraint usually reduces the number of rows being
> visible,
> or same number of rows at least. Constraint shall never work to
> the direction
> to increase the number of rows being visible.
>
>
> Can you clarify where this is coming from..? It sounds like you're
> referring to an existing implementation and, if so, it'd be good to get
> more information on how that works exactly.
>

Oracle VPD - Multiple Policies for Each Table, View, or Synonym
http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351

It says - Note that all policies applied to a table are enforced with AND syntax.

Not only Oracle VPD, it fits attitude of defense in depth.
Please assume a system that installs network firewall, unix permissions
and selinux. If somebody wants to reference an information asset within
a file, he has to connect the server from the network address being allowed
by the firewall configuration AND both of DAC and MAC has to allow his
access.
Usually, we have to pass all the access control to reference the target
information, not one of the access control stuffs being installed.

> For example, if RLS-policy of t1 is (t1.credential <
> get_user_credential)
> and user's query is:
> SELECT * FROM t1 WHERE t1.x = t1.x;
> Do you think RLS-policy shall be merged with OR'd form?
>
>
> Only the RLS policies are OR'd together, not user provided quals. The above
> would result in:
>
> Where t1.x = t1.x and (t1.credential < get_user_credential)
>
> If another policy also applies for this query, such as t1.cred2 <
> get_user_credential then we would have:
>
> Where t1.x = t1.x and (t1.credential < get_user_credential OR t1.cred2 <
> get_user_credential)
>
> This is similar to how roles work- your overall access includes all access
> granted to any roles you are a member of. You don't need SELECT rights granted
> to every role you are a member of to select from the table. Additionally,
> if an admin wants to AND the quals together then they can simply create
> a policy which does that rather than have 2 policies.
>
It seems to me a pain on database administration, if we have to pay attention
not to conflict each RLS-policy. I expect 90% of RLS-policy will be configured
to PUBLIC user, to apply everybody same rules on access. In this case, DBA
has to ensure the target table has no policy or existing policy does not
conflict with the new policy to be set.
I don't think it is a good idea to enforce DBA these checks.

> Please assume here are two individual applications that use RLS
> on table-X.
> Even if application-1 want only rows being "public" become visible,
> it may
> expose "credential" or "secret" rows by interaction of orthogonal
> policy
> configured by application-2 (that may configure the policy
> according to the
> source ip-address). It seems to me application-2 partially
> invalidated the
> RLS-policy configured by application-1.
>
>
> You are suggesting instead that if application 2 sets up policies on the
> table and then application 1 adds another policy that it should reduce what
> application 2's users can see? That doesn't make any sense to me. I'd
> actually expect these applications to at least use different roles anyway,
> which means they could each have a single role specific policy which only
> returns what that application is allowed to see.
>
I don't think this assumption is reasonable.
Please expect two applications: app-X that is a database security product
to apply access control based on remote ip-address of the client for any
table accesses by any database roles. app-Y that is a usual enterprise
package for daily business data, with RLS-policy.
What is the expected behavior in this case?

App-X provides overall access control towards whole of the database.
So, it expects any client out of 192.168.0.0/16 should not reference
any credential information for example.
How does it interact with the RLS-policy by app-Y? If RLS-policies
are merged with OR'd form, it seems to me it invalidate control of
app-Y if client connected from inside of 192.168.0.0/16 or if client
connects with a particular app-Y's role from out of 192.168.0.0/16.

How to solve the situation above?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-06 05:19:49
Message-ID: 20140706051949.GV16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kaigai,

* Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
> > Can you clarify where this is coming from..? It sounds like you're
> > referring to an existing implementation and, if so, it'd be good to get
> > more information on how that works exactly.
>
> Oracle VPD - Multiple Policies for Each Table, View, or Synonym
> http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351
>
> It says - Note that all policies applied to a table are enforced with AND syntax.

While I'm not against using this as an example to consider, it's much
more complex than what we're talking about here- and it supports
application contexts which allow groups of RLS rights to be applied or
not applied; essentially it allows both "AND" and "OR" for sets of RLS
policies, along with "default" policies which are applied no matter
what.

> Not only Oracle VPD, it fits attitude of defense in depth.
> Please assume a system that installs network firewall, unix permissions
> and selinux. If somebody wants to reference an information asset within
> a file, he has to connect the server from the network address being allowed
> by the firewall configuration AND both of DAC and MAC has to allow his
> access.

These are not independent systems and your argument would apply to our
GRANT system also, which I hope it's agreed would make it far less
useful. Note also that SELinux brings in another complexity- it needs
to make system calls out to check the access.

> Usually, we have to pass all the access control to reference the target
> information, not one of the access control stuffs being installed.

This is true in some cases, and not in others. Only one role you are a
member of needs to have access to a relation, not all of them. There
are other examples of 'OR'-style security policies, this is merely one.
I'm simply not convinced that it applies in the specific case we're
talking about.

In the end, I expect that either way people will be upset because they
won't be able to specify fully which should be AND vs. which should be
OR with the kind of flexibility other systems provide. What I'm trying
to get to is an initial implementation which is generally useful and is
able to add such support later.

> > This is similar to how roles work- your overall access includes all access
> > granted to any roles you are a member of. You don't need SELECT rights granted
> > to every role you are a member of to select from the table. Additionally,
> > if an admin wants to AND the quals together then they can simply create
> > a policy which does that rather than have 2 policies.
> >
> It seems to me a pain on database administration, if we have to pay attention
> not to conflict each RLS-policy.

This notion of 'conflict' doesn't make much sense to me. What is
'conflicting' here? Each policy would simply need to stand on its own
for the role which it's being applied to. That's very simple and
straight-forward.

> I expect 90% of RLS-policy will be configured
> to PUBLIC user, to apply everybody same rules on access. In this case, DBA
> has to ensure the target table has no policy or existing policy does not
> conflict with the new policy to be set.
> I don't think it is a good idea to enforce DBA these checks.

If the DBA only uses PUBLIC then they have to ensure that each policy
they set up for PUBLIC can stand on its own- though, really, I expect if
they go that route they'd end up with just one policy that calls a
stored procedure...

> > You are suggesting instead that if application 2 sets up policies on the
> > table and then application 1 adds another policy that it should reduce what
> > application 2's users can see? That doesn't make any sense to me. I'd
> > actually expect these applications to at least use different roles anyway,
> > which means they could each have a single role specific policy which only
> > returns what that application is allowed to see.
> >
> I don't think this assumption is reasonable.
> Please expect two applications: app-X that is a database security product
> to apply access control based on remote ip-address of the client for any
> table accesses by any database roles. app-Y that is a usual enterprise
> package for daily business data, with RLS-policy.
> What is the expected behavior in this case?

That the DBA manage the rights on the tables. I expect that will be
required for quite a while with PG. It's nice to think of these
application products that will manage all access for users by setting up
their own policies, but we have yet to even discuss how they would have
appropriate rights on the table to be able to do so (and to not
interfere with each other..).

Let's at least get something which is generally useful in. I'm all for
trying to plan out how to get there and would welcome suggestions you
have which are specific to PG on what we could do here (I'm not keen on
just trying to mimic another product completely...), but at the level
we're talking about (either AND them all or OR them all), I don't think
we'd actually solve the use-cases you're describing with either answer.

Without getting to the full level of having the flexibility to choose
which policies should be AND'd and which should be OR'd, do you see an
issue with adding initial support where each policy has to stand on its
own and then working to address the more complex cases later?

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-07 17:55:34
Message-ID: CA+TgmoaFXm9EV4po+9FftHmqa_6+nCn-KoKHMRR1HcocGHEL9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jul 3, 2014 at 1:14 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Alright, apologies for it being a bit later than intended, but here's
> what I've come up with thus far.
>
> -- policies defined at a table scope
> -- allows using the same policy name for different tables
> -- with quals appropriate for each table
> ALTER TABLE t1 ADD POLICY p1 USING p1_quals;
> ALTER TABLE t1 ADD POLICY p2 USING p2_quals;
>
> -- used to drop a policy definition from a table
> ALTER TABLE t1 DROP POLICY p1;
>
> -- cascade required when references exist for the policy
> -- from roles
> ALTER TABLE t1 DROP POLICY p1 CASCADE;
>
> ALTER TABLE t1 ALTER POLICY p1 USING new_quals;
>
> -- Controls if any RLS is applied to this table or not
> -- If enabled, all users must access through some policy
> ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
>
> -- Associates roles to policies
> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING p1;
> ALTER TABLE table_name REVOKE ROW ACCESS FROM role_name USING p1;

If you're going to have predicates be table-level and access grants be
table-level, then what's the value in having policies? You could just
do:

ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING quals;

As I see it, the only value in having policies as separate objects is
that you can then, by granting access to the policy, give a particular
user a bundle of rights rather than having to grant each right
individually. But with this design, you've got to create the policy,
then add the quals to it for each table, and then you still have to
give access individually for every <row, table> combination, so what
value is the policy object itself providing?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-08 07:01:54
Message-ID: 53BB9762.9060602@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all

I was jotting notes about this last sleepless night, and was really glad
to see the suggestion of enabling RLS on a table being a requirement for
OR-style quals suggested in the thread when I woke.

The only sane way to do OR-ing of multiple rules is to require that
tables be switched to deny-by-default before RLS quals can be added to
then selectively enable access.

The next step is DENY rules that override ALLOW rules, and are also
ORed, so any DENY rule overrides any ALLOW rule. Like in ACLs. But that
can be a "later" - I just think room for it should be left in any
catalog definition.

My concern with the talk of policies, etc, is with making it possible to
impliment this for 9.5. I'd really like to see a robust declarative
row-security framework with access policies - but I'm not sure sure it's
a good idea to try to assemble policies directly out of low level row
security predicates.

Tying things into a policy model that isn't tried or tested might create
more problems than it solves unless we implement multiple real-world
test cases on top of the model to show it works.

For how I think we should be pursuing this in the long run, take a look
at how TeraData does it, with heirachical and non-heirachical rules -
basically bitmaps or thresholds - that get grouped into access policies.
It's a very good way to abstract the low level stuff. If we have low
level table predicate filters, we can build this sort of thing on top.

For 9.5, unless the basics turn out to be way easier than they look and
it's all done soon in the release process, surely we should be sticking
to just getting the basics of row security in place? Leaving room for
enhancement, sure, but sticking to the core feature which to my mind is:

- A row security on/off flag for a table;

- Room in the catalogs for multiple row security rules per table
and a type flag for them. The initial type flag, for ALLOW rules,
specifies that all ALLOW rules be ORed together.

- Syntax for creating and dropping row security predicates. If there
can be multiple ones per table they'll need names, like we have with
triggers, indexes, etc.

- psql support for listing row security predicates on a table if running
as superuser or if you've been explicitly GRANTed access to the
catalog table listing row security quals.

- The hooks for contribs to inject their own row security rules. The
API will need a tweak - right now it assumes these rules are ANDed
with any row security predicates in the catalogs, but we'd want the
option of treating them as ALLOW or DENY rules to get ORed with the
rest of the set *or* as a pre-filter predicate like currently.

- A row-security-exempt right, at the user-level,
to assuage the concerns about malicious predicates. I maintain that
in the first rev this should be simple: "superuser is row security
exempt". I don't think I'm going to win that one though, so a
user/role attribute that makes the role ignore row security
seems like the next simplest option.

- A way to test whether the current user is row-security exempt
so pg_dump can complain unless explicitly told it's allowed
to do a selective dump via a cmdline option;

Plus a number of fixes:

- Fixing the security barrier view isssue with row level lock pushdown
that's breaking the row security regression tests;

- Enhancing plan cache invalidation so that row-security exempt-ness
of a user is part of the plancache key;

- Adding session state like current_user to portals, so security_barrier
functions returning refcursor, and cursors created before SET SESSION
AUTHORIZATION or SET ROLE, get the correct values when they use
session information like current_user

Note that this doesn't even consider the "with check option" style
write-filtering side of row security and the corresponding challenges
with the semantics around RETURNING.

It's already a decent sized amount of work on top of the existing row
security patch.

If we start adding policy groups, etc, this will never get done.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-08 14:14:29
Message-ID: CADyhKSVzyWry1JHqY77sP0=npPPF4VkzRtqoSyBsT63yE44w-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014-07-06 14:19 GMT+09:00 Stephen Frost <sfrost(at)snowman(dot)net>:
> Kaigai,
>
> * Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
>> > Can you clarify where this is coming from..? It sounds like you're
>> > referring to an existing implementation and, if so, it'd be good to get
>> > more information on how that works exactly.
>>
>> Oracle VPD - Multiple Policies for Each Table, View, or Synonym
>> http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351
>>
>> It says - Note that all policies applied to a table are enforced with AND syntax.
>
> While I'm not against using this as an example to consider, it's much
> more complex than what we're talking about here- and it supports
> application contexts which allow groups of RLS rights to be applied or
> not applied; essentially it allows both "AND" and "OR" for sets of RLS
> policies, along with "default" policies which are applied no matter
> what.
>
>> Not only Oracle VPD, it fits attitude of defense in depth.
>> Please assume a system that installs network firewall, unix permissions
>> and selinux. If somebody wants to reference an information asset within
>> a file, he has to connect the server from the network address being allowed
>> by the firewall configuration AND both of DAC and MAC has to allow his
>> access.
>
> These are not independent systems and your argument would apply to our
> GRANT system also, which I hope it's agreed would make it far less
> useful. Note also that SELinux brings in another complexity- it needs
> to make system calls out to check the access.
>
>> Usually, we have to pass all the access control to reference the target
>> information, not one of the access control stuffs being installed.
>
> This is true in some cases, and not in others. Only one role you are a
> member of needs to have access to a relation, not all of them. There
> are other examples of 'OR'-style security policies, this is merely one.
> I'm simply not convinced that it applies in the specific case we're
> talking about.
>
> In the end, I expect that either way people will be upset because they
> won't be able to specify fully which should be AND vs. which should be
> OR with the kind of flexibility other systems provide. What I'm trying
> to get to is an initial implementation which is generally useful and is
> able to add such support later.
>
>> > This is similar to how roles work- your overall access includes all access
>> > granted to any roles you are a member of. You don't need SELECT rights granted
>> > to every role you are a member of to select from the table. Additionally,
>> > if an admin wants to AND the quals together then they can simply create
>> > a policy which does that rather than have 2 policies.
>> >
>> It seems to me a pain on database administration, if we have to pay attention
>> not to conflict each RLS-policy.
>
> This notion of 'conflict' doesn't make much sense to me. What is
> 'conflicting' here? Each policy would simply need to stand on its own
> for the role which it's being applied to. That's very simple and
> straight-forward.
>
>> I expect 90% of RLS-policy will be configured
>> to PUBLIC user, to apply everybody same rules on access. In this case, DBA
>> has to ensure the target table has no policy or existing policy does not
>> conflict with the new policy to be set.
>> I don't think it is a good idea to enforce DBA these checks.
>
> If the DBA only uses PUBLIC then they have to ensure that each policy
> they set up for PUBLIC can stand on its own- though, really, I expect if
> they go that route they'd end up with just one policy that calls a
> stored procedure...
>
>> > You are suggesting instead that if application 2 sets up policies on the
>> > table and then application 1 adds another policy that it should reduce what
>> > application 2's users can see? That doesn't make any sense to me. I'd
>> > actually expect these applications to at least use different roles anyway,
>> > which means they could each have a single role specific policy which only
>> > returns what that application is allowed to see.
>> >
>> I don't think this assumption is reasonable.
>> Please expect two applications: app-X that is a database security product
>> to apply access control based on remote ip-address of the client for any
>> table accesses by any database roles. app-Y that is a usual enterprise
>> package for daily business data, with RLS-policy.
>> What is the expected behavior in this case?
>
> That the DBA manage the rights on the tables. I expect that will be
> required for quite a while with PG. It's nice to think of these
> application products that will manage all access for users by setting up
> their own policies, but we have yet to even discuss how they would have
> appropriate rights on the table to be able to do so (and to not
> interfere with each other..).
>
> Let's at least get something which is generally useful in. I'm all for
> trying to plan out how to get there and would welcome suggestions you
> have which are specific to PG on what we could do here (I'm not keen on
> just trying to mimic another product completely...), but at the level
> we're talking about (either AND them all or OR them all), I don't think
> we'd actually solve the use-cases you're describing with either answer.
>
> Without getting to the full level of having the flexibility to choose
> which policies should be AND'd and which should be OR'd, do you see an
> issue with adding initial support where each policy has to stand on its
> own and then working to address the more complex cases later?
>
Let me sort out. Probably, the reason of opinion differences come from
the point where I and you focus on.
It seems to me you try to position the upcoming RLS feature in the context
of existing database role and acl mechanism. I think it is a straightforward
approach and never argue. On the other hand, I'm worrying about whether
we can utilize the RLS feature as a basis to implement different security
model that performs independently from database roles and acl.

As long as RLS-policy quals are connected with OR, it is a design choice
to fit behavior of database acl and grant / revoke.
Things I'd like you to pay attention is, how much flexible to use this RLS
feature as a basis of other security model. One candidate is selinux; that
does not pay attention on database roles, so row-level security policy
attached by selinux should not be over-written by database roles.

As you mentioned above, RLS-policy is connected with user-given quals
by AND'd manner, like:
SELECT * FROM t1 WHERE x like '%abc%';
being replaced to
SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS);

What I'd like to implement is adjustment of query like:
SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS)
AND (quals by extension-1) AND ... AND (quals by extension-N);
I never mind even if qualifiers in the second block are connected with OR'd
manner, however, I want RLS infrastructure to accept additional security
models provided by extensions.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-09 06:04:02
Message-ID: 20140709060402.GF16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig,

* Craig Ringer (craig(at)2ndquadrant(dot)com) wrote:
> I was jotting notes about this last sleepless night, and was really glad
> to see the suggestion of enabling RLS on a table being a requirement for
> OR-style quals suggested in the thread when I woke.

Thanks for your thoughts and input!

> The only sane way to do OR-ing of multiple rules is to require that
> tables be switched to deny-by-default before RLS quals can be added to
> then selectively enable access.

Right.

> The next step is DENY rules that override ALLOW rules, and are also
> ORed, so any DENY rule overrides any ALLOW rule. Like in ACLs. But that
> can be a "later" - I just think room for it should be left in any
> catalog definition.

I'm not convinced regarding DENY rules, and I've seen very little of
their use in practice.. The expectation is generally a deny-by-default
setups with access granted explicity.

> My concern with the talk of policies, etc, is with making it possible to
> impliment this for 9.5. I'd really like to see a robust declarative
> row-security framework with access policies - but I'm not sure sure it's
> a good idea to try to assemble policies directly out of low level row
> security predicates.

+1000%- we really need to solidify what should go into 9.5 and get that
committed, then work out if there is more we can do in this release
cycle. I'm fine with a simple approach to begin with, provided we can
build on it moving forward without causing upgrade headaches, provided
we can get to where we want to go, of course.

> Tying things into a policy model that isn't tried or tested might create
> more problems than it solves unless we implement multiple real-world
> test cases on top of the model to show it works.

To this I would say- the original single-policy-per-table approach has
been vetted by actual users to be valuable in their environments. It
does not solve all cases, certainly, but it's simple and usable as-is
and is the minimum which I would like to see in 9.5. Ideally, we can do
better than that, but lets not throw out that win because we insist on a
complete solution before it goes into core- because then we'll never get
there.

> For how I think we should be pursuing this in the long run, take a look
> at how TeraData does it, with heirachical and non-heirachical rules -
> basically bitmaps or thresholds - that get grouped into access policies.
> It's a very good way to abstract the low level stuff. If we have low
> level table predicate filters, we can build this sort of thing on top.

I keep thinking that a bitmap or similar might make sense here..
Consider a set of policies where we assign them numbers-per-table, a we
can then build a bitmap of them, and then store what bitmap is applied
to a given query. That then allows us to compare those bitmaps during
plan cache checking to make sure that the policies applied last time are
the same which we would be applying now, and therefore the existing
cached plan is sufficient. It gets a bit more complicated when you
allow AND-vs-OR and groups or hierarchies of policies, of course, but
I'd like to think we can come up with a sensible way to represent that
to allow for a quick check during plan cache lookup.

> For 9.5, unless the basics turn out to be way easier than they look and
> it's all done soon in the release process, surely we should be sticking
> to just getting the basics of row security in place? Leaving room for
> enhancement, sure, but sticking to the core feature which to my mind is:

Agreed..

> - A row security on/off flag for a table;

Yes; I like this approach in general.

> - Room in the catalogs for multiple row security rules per table
> and a type flag for them. The initial type flag, for ALLOW rules,
> specifies that all ALLOW rules be ORed together.

Works for me. I'm open to a per-table toggle which says "AND" instead
of "OR", provided we could implement that sanely and simply.

> - Syntax for creating and dropping row security predicates. If there
> can be multiple ones per table they'll need names, like we have with
> triggers, indexes, etc.

Agreed. To Robert's question about having policy names at all, rather
than just quals, I feel like we'll need them eventually anyway and
having them earlier will simplify things. Additionally, it's simpler to
reason about and to manage- one can expect a one-to-many relationship
between policies and roles, making it simpler to work with the policy
name when associating it it to a role rather than having to remember all
of the quals involved.

> - psql support for listing row security predicates on a table if running
> as superuser or if you've been explicitly GRANTed access to the
> catalog table listing row security quals.

We need psql support to list the RLS policies.. I don't wish to get
into the question about what kind of access that requires though. At
least initially, I wouldn't try to limit access to the policies or quals
in the catalog... Perhaps we need that but I'd like a bit more
discussion about it first- and we'll need to figure out how to address
that when it comes to both psql and the 'rlsenabled' flag.

> - The hooks for contribs to inject their own row security rules. The
> API will need a tweak - right now it assumes these rules are ANDed
> with any row security predicates in the catalogs, but we'd want the
> option of treating them as ALLOW or DENY rules to get ORed with the
> rest of the set *or* as a pre-filter predicate like currently.

I'm really not interested in contrib modules with this first go around..
We can work to address their requests later on. I don't think many
contrib authors will be very happy with the low-level support which
we'll provide in 9.5 anyway and it'd probably be better off for everyone
if we hold off on adding hooks, etc, for them until we have a better
idea about how this will be used and it will work.

> - A row-security-exempt right, at the user-level,
> to assuage the concerns about malicious predicates. I maintain that
> in the first rev this should be simple: "superuser is row security
> exempt". I don't think I'm going to win that one though, so a
> user/role attribute that makes the role ignore row security
> seems like the next simplest option.

Yes, we'll need this.

> - A way to test whether the current user is row-security exempt
> so pg_dump can complain unless explicitly told it's allowed
> to do a selective dump via a cmdline option;

Agreed. Adam has a patch for this already, more or less.

> Plus a number of fixes:
>
> - Fixing the security barrier view isssue with row level lock pushdown
> that's breaking the row security regression tests;

No- this is not the responsibility of this particular patch or
functionality. I agree that we will want to address it at some point,
but it's very complicated and not required at this time.

> - Enhancing plan cache invalidation so that row-security exempt-ness
> of a user is part of the plancache key;

We need to ensure that the plan cache is hanlded correctly. I'm not
convinced, at this point, that we actually need to inclue the user as
part of the key for looking up a plan cache. It might come to that, but
I'm not quite convinced it's necessary yet.

> - Adding session state like current_user to portals, so security_barrier
> functions returning refcursor, and cursors created before SET SESSION
> AUTHORIZATION or SET ROLE, get the correct values when they use
> session information like current_user

Yeah, we need to consider this and how it *should* behave. Have we
really thought about and documented that, ideally as regression tests?
We need to do so, to ensure that we have the correct behavior in this
case.

> Note that this doesn't even consider the "with check option" style
> write-filtering side of row security and the corresponding challenges
> with the semantics around RETURNING.

Yeah, not sure how we want to handle these. At this point, I'm open to
simply throwing an ERROR in cases which are not well defined or which do
not work as expected. Ideally we can do better than that, but throwing
an ERROR for cases which don't exist today and which are not yet
supported is reasonable, imv.

> It's already a decent sized amount of work on top of the existing row
> security patch.

Indeed.

> If we start adding policy groups, etc, this will never get done.

Agreed!

Thanks!

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-09 06:07:17
Message-ID: 20140709060717.GG16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

KaiGai,

* Kohei KaiGai (kaigai(at)kaigai(dot)gr(dot)jp) wrote:
> What I'd like to implement is adjustment of query like:
> SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS)
> AND (quals by extension-1) AND ... AND (quals by extension-N);
> I never mind even if qualifiers in the second block are connected with OR'd
> manner, however, I want RLS infrastructure to accept additional security
> models provided by extensions.

Would having a table-level 'AND'-vs-'OR' modifier for the RLS policies
on that table be sufficient for what you're looking for? That seems a
simple enough addition which would still allow more complex groups to be
developed later on...

Thanks!

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-09 06:13:49
Message-ID: 20140709061349.GI16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> If you're going to have predicates be table-level and access grants be
> table-level, then what's the value in having policies? You could just
> do:
>
> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING quals;

Yes, this would be possible (and is nearly identical to the original
patch, except that this includes per-role considerations), however, my
thinking is that it'd be simpler to work with policy names rather than
sets of quals, to use when mapping to roles, and they would potentially
be useful later for other things (eg: for setting up which policies
should be applied when, or which should be OR' or AND"d with other
policies, or having groups of policies, etc).

> As I see it, the only value in having policies as separate objects is
> that you can then, by granting access to the policy, give a particular
> user a bundle of rights rather than having to grant each right
> individually. But with this design, you've got to create the policy,
> then add the quals to it for each table, and then you still have to
> give access individually for every <row, table> combination, so what
> value is the policy object itself providing?

To clarify this part- the idea is that you would simply declare a policy
name to be a set of quals for a particular table, so you declare them
and then map a policy to roles for which it should be used. In this
arrangement, you don't declare the policy explicitly before setting the
quals, those are done at the same time.

Thanks,

Stephen


From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-09 06:27:39
Message-ID: CADyhKSXP-QQoBRSHM_cM7M94apdHCTm+EdGpALrii2LAoROTpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014-07-09 15:07 GMT+09:00 Stephen Frost <sfrost(at)snowman(dot)net>:
> KaiGai,
>
> * Kohei KaiGai (kaigai(at)kaigai(dot)gr(dot)jp) wrote:
>> What I'd like to implement is adjustment of query like:
>> SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS)
>> AND (quals by extension-1) AND ... AND (quals by extension-N);
>> I never mind even if qualifiers in the second block are connected with OR'd
>> manner, however, I want RLS infrastructure to accept additional security
>> models provided by extensions.
>
> Would having a table-level 'AND'-vs-'OR' modifier for the RLS policies
> on that table be sufficient for what you're looking for? That seems a
> simple enough addition which would still allow more complex groups to be
> developed later on...
>
Probably, things I'm considering is more simple.
If a table has multiple built-in RLS policies, its expression node will be
represented as a BoolExpr with OR_EXPR and every policies are linked
to its args field, isn't it? We assume the built-in RLS model merges
multiple policies by OR manner.
In case when an extension want to apply additional security model on
top of RLS infrastructure, a straightforward way is to add its own rules
in addition to the built-in rules. If extension can get control to modify
the above expression node and RLS infrastructure works well on the
modified expression node, I think it's sufficient to implement multiple
security models on the RLS infrastructure.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-09 15:45:48
Message-ID: 20140709154548.GJ16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

KaiGai,

* Kohei KaiGai (kaigai(at)kaigai(dot)gr(dot)jp) wrote:
> 2014-07-09 15:07 GMT+09:00 Stephen Frost <sfrost(at)snowman(dot)net>:
> > * Kohei KaiGai (kaigai(at)kaigai(dot)gr(dot)jp) wrote:
> >> What I'd like to implement is adjustment of query like:
> >> SELECT * FROM t1 WHERE (x like '%abc%') AND (quals by built-in RLS)
> >> AND (quals by extension-1) AND ... AND (quals by extension-N);
> >> I never mind even if qualifiers in the second block are connected with OR'd
> >> manner, however, I want RLS infrastructure to accept additional security
> >> models provided by extensions.
> >
> > Would having a table-level 'AND'-vs-'OR' modifier for the RLS policies
> > on that table be sufficient for what you're looking for? That seems a
> > simple enough addition which would still allow more complex groups to be
> > developed later on...
> >
> Probably, things I'm considering is more simple.
> If a table has multiple built-in RLS policies, its expression node will be
> represented as a BoolExpr with OR_EXPR and every policies are linked
> to its args field, isn't it? We assume the built-in RLS model merges
> multiple policies by OR manner.
> In case when an extension want to apply additional security model on
> top of RLS infrastructure, a straightforward way is to add its own rules
> in addition to the built-in rules. If extension can get control to modify
> the above expression node and RLS infrastructure works well on the
> modified expression node, I think it's sufficient to implement multiple
> security models on the RLS infrastructure.

Another way would be to have a single RLS policy which extensions can
modify, sure. That was actually along the lines of the originally
proposed patch.. That approach would work if we OR'd multiple policies
together too, provided the user took care to only have one policy
implemented.. Not sure how easy that would be to work with for
extension authors though.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-10 15:17:59
Message-ID: CA+TgmoZO=2UeswNfGwTHf_sLUcoVAOyR-Nnnk-Ckt9a4u1O01w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jul 9, 2014 at 2:13 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Robert,
>
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> If you're going to have predicates be table-level and access grants be
>> table-level, then what's the value in having policies? You could just
>> do:
>>
>> ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING quals;
>
> Yes, this would be possible (and is nearly identical to the original
> patch, except that this includes per-role considerations), however, my
> thinking is that it'd be simpler to work with policy names rather than
> sets of quals, to use when mapping to roles, and they would potentially
> be useful later for other things (eg: for setting up which policies
> should be applied when, or which should be OR' or AND"d with other
> policies, or having groups of policies, etc).

Hmm. I guess that's reasonable. Should the policy be a per-table
object (like rules, constraints, etc.) instead of a global object?

You could do:

ALTER TABLE table_name ADD POLICY policy_name (quals);
ALTER TABLE table_name POLICY FOR role_name IS policy_name;
ALTER TABLE table_name DROP POLICY policy_name;

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-11 08:55:53
Message-ID: CAOuzzgqJXoC4U5dzpa7gxoX3pn9sAFKZJ+EcQqyoDQF7yEijFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, July 10, 2014, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Wed, Jul 9, 2014 at 2:13 AM, Stephen Frost <sfrost(at)snowman(dot)net
> <javascript:;>> wrote:
> > Yes, this would be possible (and is nearly identical to the original
> > patch, except that this includes per-role considerations), however, my
> > thinking is that it'd be simpler to work with policy names rather than
> > sets of quals, to use when mapping to roles, and they would potentially
> > be useful later for other things (eg: for setting up which policies
> > should be applied when, or which should be OR' or AND"d with other
> > policies, or having groups of policies, etc).
>
> Hmm. I guess that's reasonable. Should the policy be a per-table
> object (like rules, constraints, etc.) instead of a global object?
>
> You could do:
>
> ALTER TABLE table_name ADD POLICY policy_name (quals);
> ALTER TABLE table_name POLICY FOR role_name IS policy_name;
> ALTER TABLE table_name DROP POLICY policy_name;
>

Right, I was thinking they would be per table as they would specifically
provide a name for a set of quals, and quals are naturally table-specific.
I don't see a need to have them be global- that had been brought up before
with the notion of applications picking their policy, but we could also add
that later through another term (eg: contexts) which would then map to
policies or similar. We could even extend policies to be global by mapping
existing per-table ones to be global if we really needed to...

My feeling at the moment is that having them be per-table makes sense and
we'd still have flexibility to change later if we had some compelling
reason to do so.

Thanks!

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-11 17:43:01
Message-ID: CA+TgmoZ7waKyitMu_sgtp6Z_xcHJd1Rsdg=DRn1HnuVEry2u0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jul 11, 2014 at 4:55 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> On Thursday, July 10, 2014, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Jul 9, 2014 at 2:13 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > Yes, this would be possible (and is nearly identical to the original
>> > patch, except that this includes per-role considerations), however, my
>> > thinking is that it'd be simpler to work with policy names rather than
>> > sets of quals, to use when mapping to roles, and they would potentially
>> > be useful later for other things (eg: for setting up which policies
>> > should be applied when, or which should be OR' or AND"d with other
>> > policies, or having groups of policies, etc).
>>
>> Hmm. I guess that's reasonable. Should the policy be a per-table
>> object (like rules, constraints, etc.) instead of a global object?
>>
>> You could do:
>>
>> ALTER TABLE table_name ADD POLICY policy_name (quals);
>> ALTER TABLE table_name POLICY FOR role_name IS policy_name;
>> ALTER TABLE table_name DROP POLICY policy_name;
>
> Right, I was thinking they would be per table as they would specifically
> provide a name for a set of quals, and quals are naturally table-specific. I
> don't see a need to have them be global- that had been brought up before
> with the notion of applications picking their policy, but we could also add
> that later through another term (eg: contexts) which would then map to
> policies or similar. We could even extend policies to be global by mapping
> existing per-table ones to be global if we really needed to...
>
> My feeling at the moment is that having them be per-table makes sense and
> we'd still have flexibility to change later if we had some compelling reason
> to do so.

I don't think you can really change it later. If policies are
per-table, then you could have a policy p1 on table t1 and also on
table t2; if they become global objects, then you can't have p1 in two
places. I hope I'm not beating a dead horse here, but changing syntax
after it's been released is very, very hard.

But that's not an argument against doing it this way; I think
per-table policies are probably simpler and better here. It means,
for example, that policies need not have their own permissions and
ownership structure - they're part of the table, just like a
constraint, trigger, or rule, and the table owner's permissions
control. I like that, and I think our users will, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adam Brightwell <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-11 18:30:17
Message-ID: CAOuzzgrxiXJkt1nkxH+TFhy8OHRffBpg-CeczcKzmXE_LsygLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

On Friday, July 11, 2014, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Jul 11, 2014 at 4:55 AM, Stephen Frost <sfrost(at)snowman(dot)net
> <javascript:;>> wrote:
> > My feeling at the moment is that having them be per-table makes sense and
> > we'd still have flexibility to change later if we had some compelling
> reason
> > to do so.
>
> I don't think you can really change it later. If policies are
> per-table, then you could have a policy p1 on table t1 and also on
> table t2; if they become global objects, then you can't have p1 in two
> places. I hope I'm not beating a dead horse here, but changing syntax
> after it's been released is very, very hard.

Fair enough. My thinking was we'd come up with a way to map them (eg:
table_policy), but I do agree that changing it later would really suck and
having them be per-table makes a lot of sense.

> But that's not an argument against doing it this way; I think
> per-table policies are probably simpler and better here. It means,
> for example, that policies need not have their own permissions and
> ownership structure - they're part of the table, just like a
> constraint, trigger, or rule, and the table owner's permissions
> control. I like that, and I think our users will, too.

Agreed and I believe this is more-or-less what I had proposed up-thread
(not at a computer at the moment). I hope to have a chance to review and
update the design and flush out the catalog definition this weekend.

Thanks!

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-17 00:04:49
Message-ID: 7874.1405555489@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com> writes:
>> You could do:
>>
>> ALTER TABLE table_name ADD POLICY policy_name (quals);
>> ALTER TABLE table_name POLICY FOR role_name IS policy_name;
>> ALTER TABLE table_name DROP POLICY policy_name;

> I am attempting to modify the grammar to support the above syntax.
> Unfortunately, I am encountering quite a number (280) shift/reduce
> errors/conflicts in bison. I have reviewed the bison documentation as well
> as the wiki page on resolving such conflicts. However, I am not entirely
> certain on the direction I should take in order to resolve these conflicts.
> I attempted to create a more redundant production like the wiki described,
> but unfortunately that was not successful. I have attached both the patch
> and bison report. Any help, recommendations or suggestions would be
> greatly appreciated.

20MB messages to the list aren't that friendly. Please don't do that
again, unless asked to.

FWIW, the above syntax is a nonstarter, at least unless we're willing to
make POLICY a reserved word (hint: we're not). The reason is that the
ADD/DROP COLUMN forms consider COLUMN to be optional, meaning that the
column name could directly follow ADD; and the column type name, which
could also be just a plain identifier, would directly follow that. So
there's no way to resolve the ambiguity with one token of lookahead.
This actually isn't just bison being stupid: in fact, you simply
cannot tell whether

ALTER TABLE tab ADD POLICY varchar(42);

is an attempt to add a column named "policy" of type varchar(42), or an
attempt to add a policy named "varchar" with quals "42".

Pick a different syntax.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-17 00:49:58
Message-ID: 20140717004958.GI16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Adam,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com> writes:
> >> ALTER TABLE table_name ADD POLICY policy_name (quals);
> >> ALTER TABLE table_name POLICY FOR role_name IS policy_name;
> >> ALTER TABLE table_name DROP POLICY policy_name;
[...]
> This actually isn't just bison being stupid: in fact, you simply
> cannot tell whether
>
> ALTER TABLE tab ADD POLICY varchar(42);
>
> is an attempt to add a column named "policy" of type varchar(42), or an
> attempt to add a policy named "varchar" with quals "42".
>
> Pick a different syntax.

Yeah, now that we're trying to bake this into ALTER TABLE we need to be
a bit more cautious. I'd think:

ALTER TABLE tab POLICY ADD ...

Would work though? (note: haven't looked/tested myself)

Thanks!

Stephen


From: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-17 01:49:50
Message-ID: CAKRt6CSyeq4=YgoSo+nvEKSngk9313-xB4TuqEf2byM4B7tK0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

Thanks for the feedback.

20MB messages to the list aren't that friendly. Please don't do that
> again, unless asked to.
>

Apologies, I didn't realize it was so large until after it was sent. At
any rate, it won't happen again.

> FWIW, the above syntax is a nonstarter, at least unless we're willing to
> make POLICY a reserved word (hint: we're not). The reason is that the
> ADD/DROP COLUMN forms consider COLUMN to be optional, meaning that the
> column name could directly follow ADD; and the column type name, which
> could also be just a plain identifier, would directly follow that. So
> there's no way to resolve the ambiguity with one token of lookahead.
> This actually isn't just bison being stupid: in fact, you simply
> cannot tell whether
>
> ALTER TABLE tab ADD POLICY varchar(42);
>
> is an attempt to add a column named "policy" of type varchar(42), or an
> attempt to add a policy named "varchar" with quals "42".
>

Ok. Make sense and I was afraid that was the case.

Thanks,
Adam

--
Adam Brightwell - adam(dot)brightwell(at)crunchydatasolutions(dot)com
Database Engineer - www.crunchydatasolutions.com


From: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-17 02:04:27
Message-ID: CAKRt6CSsPxwb2i8g2P+6NWO2GwmdCqLdTaKcYLgTZ=cDiUWjYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen,

Yeah, now that we're trying to bake this into ALTER TABLE we need to be
> a bit more cautious. I'd think:
>
> ALTER TABLE tab POLICY ADD ...
>
> Would work though? (note: haven't looked/tested myself)
>

Yes, I just tested it and the following would work from a grammar
perspective:

ALTER TABLE <table_name> POLICY ADD <policy_name> (policy_quals)
ALTER TABLE <table_name> POLICY DROP <policy_name>

Though, it would obviously require the addition of POLICY to the list of
unreserved keywords. I don't suspect that would be a concern, as it is not
"reserved", but thought I would point it out just in case.

Another thought I had was, would we also want the following, so that
policies could be modified?

ALTER TABLE <table_name> POLICY ALTER <policy_name> (policy_quals)

Thanks,
Adam

--
Adam Brightwell - adam(dot)brightwell(at)crunchydatasolutions(dot)com
Database Engineer - www.crunchydatasolutions.com


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-17 02:06:01
Message-ID: 20140717020601.GO16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Adam,

* Brightwell, Adam (adam(dot)brightwell(at)crunchydatasolutions(dot)com) wrote:
> > Yeah, now that we're trying to bake this into ALTER TABLE we need to be
> > a bit more cautious. I'd think:
> >
> > ALTER TABLE tab POLICY ADD ...
> >
> > Would work though? (note: haven't looked/tested myself)
>
> Yes, I just tested it and the following would work from a grammar
> perspective:
>
> ALTER TABLE <table_name> POLICY ADD <policy_name> (policy_quals)
> ALTER TABLE <table_name> POLICY DROP <policy_name>

Excellent, glad to hear it.

> Though, it would obviously require the addition of POLICY to the list of
> unreserved keywords. I don't suspect that would be a concern, as it is not
> "reserved", but thought I would point it out just in case.

Right, I don't anticipate anyone complaining too loudly about that..

> Another thought I had was, would we also want the following, so that
> policies could be modified?
>
> ALTER TABLE <table_name> POLICY ALTER <policy_name> (policy_quals)

Sounds like a good idea to me.

Thanks!

Stephen


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-17 14:26:07
Message-ID: 20140717142607.GJ11811@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:

> 20MB messages to the list aren't that friendly. Please don't do that
> again, unless asked to.

FWIW the message was not distributed to the list. I got a note from
Adam and dropped it from the moderation queue.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-18 18:33:26
Message-ID: CA+TgmoZPEBpGq-sXrk15W51eMJ2PD-ajzq9jv0v8XuW48xySqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jul 16, 2014 at 10:04 PM, Brightwell, Adam
<adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:

> Yes, I just tested it and the following would work from a grammar
> perspective:
>
> ALTER TABLE <table_name> POLICY ADD <policy_name> (policy_quals)
> ALTER TABLE <table_name> POLICY DROP <policy_name>
>
> Though, it would obviously require the addition of POLICY to the list of
> unreserved keywords. I don't suspect that would be a concern, as it is not
> "reserved", but thought I would point it out just in case.
>
> Another thought I had was, would we also want the following, so that
> policies could be modified?
>
> ALTER TABLE <table_name> POLICY ALTER <policy_name> (policy_quals)

I think we do want a way to modify policies. However, we tend to
avoid syntax that involves unnatural word order, as this certainly
does. Maybe it's better to follow the example of CREATE RULE and
CREATE TRIGGER and do something this instead:

CREATE POLICY policy_name ON table_name USING quals;
ALTER POLICY policy_name ON table_name USING quals;
DROP POLICY policy_name ON table_name;

The advantage of this is that you can regard "policy_name ON
table_name" as the identifier for the policy throughout the system.
You need some kind of identifier of that sort anyway to support
COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-18 23:01:13
Message-ID: CAKRt6CSAvMxf83eh88cu2crsQ9gibd=BumdhTHm2Wbym9KqHWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> I think we do want a way to modify policies. However, we tend to
> avoid syntax that involves unnatural word order, as this certainly
> does. Maybe it's better to follow the example of CREATE RULE and
> CREATE TRIGGER and do something this instead:
>
> CREATE POLICY policy_name ON table_name USING quals;
> ALTER POLICY policy_name ON table_name USING quals;
> DROP POLICY policy_name ON table_name;
>
> The advantage of this is that you can regard "policy_name ON
> table_name" as the identifier for the policy throughout the system.
> You need some kind of identifier of that sort anyway to support
> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.

Sounds good. I certainly think it makes a lot of sense to include the
ALTER functionality, if for no other reason than ease of use.

Another item to consider, though I believe it can come later, is per-action
policies. Following the above suggested syntax, perhaps that might look
like the following?

CREATE POLICY policy_name ON table_name FOR action USING quals;
ALTER POLICY policy_name ON table_name FOR action USING quals;
DROP POLICY policy_name ON table_name FOR action;

I was also giving some thought to the use of "POLICY", perhaps I am wrong,
but it does seem it could be at risk of becoming ambiguous down the road.
I can't think of any specific examples at the moment, but my concern is
what happens if we wanted to add another "type" of policy, whatever that
might be, later? Would it make more sense to go ahead and qualify this a
little more with "ROW SECURITY POLICY"?

Thanks,
Adam

--
Adam Brightwell - adam(dot)brightwell(at)crunchydatasolutions(dot)com
Database Engineer - www.crunchydatasolutions.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-07-21 15:38:06
Message-ID: CA+Tgmobhv8B_JEJonvSxHXCvq52minMrphF5kZruDcD0UJmDrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jul 18, 2014 at 7:01 PM, Brightwell, Adam
<adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:
>> I think we do want a way to modify policies. However, we tend to
>> avoid syntax that involves unnatural word order, as this certainly
>> does. Maybe it's better to follow the example of CREATE RULE and
>> CREATE TRIGGER and do something this instead:
>>
>> CREATE POLICY policy_name ON table_name USING quals;
>> ALTER POLICY policy_name ON table_name USING quals;
>> DROP POLICY policy_name ON table_name;
>>
>> The advantage of this is that you can regard "policy_name ON
>> table_name" as the identifier for the policy throughout the system.
>> You need some kind of identifier of that sort anyway to support
>> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.
>
> Sounds good. I certainly think it makes a lot of sense to include the ALTER
> functionality, if for no other reason than ease of use.
>
> Another item to consider, though I believe it can come later, is per-action
> policies. Following the above suggested syntax, perhaps that might look
> like the following?
>
> CREATE POLICY policy_name ON table_name FOR action USING quals;
> ALTER POLICY policy_name ON table_name FOR action USING quals;
> DROP POLICY policy_name ON table_name FOR action;

That seems reasonable. You need to give some thought to what happens
if the user types:

CREATE POLICY pol1 ON tab1 FOR SELECT USING q1;
ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;

I guess you end up with q1 as the SELECT policy and q2 as the INSERT
policy. Similarly, had you typed:

CREATE POLICY pol1 ON tab1 USING q1;
ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;

...then I guess you end up with q2 for INSERTs and q1 for everything
else. I'm wondering if it might be better, though, not to allow the
quals to be specified in CREATE POLICY, or else to allow multiple
actions. Otherwise, getting pg_dump to DTRT might be complicated.

Perhaps:

CREATE POLICY pol1 ON tab1 ( [ [ FOR operation [ OR operation ] ... ]
USING quals ] ... );
where operation = SELECT | INSERT | UPDATE | DELETE

So that you can write things like:

CREATE POLICY pol1 ON tab1 (USING a = 1);
CREATE POLICY pol2 ON tab2 (FOR INSERT USING a = 1, FOR UPDATE USING b
= 1, FOR DELETE USING c = 1);

And then, for ALTER, just allow one change at a time, syntax as you
proposed. That way each policy can be dumped as a single CREATE
statement.

> I was also giving some thought to the use of "POLICY", perhaps I am wrong,
> but it does seem it could be at risk of becoming ambiguous down the road. I
> can't think of any specific examples at the moment, but my concern is what
> happens if we wanted to add another "type" of policy, whatever that might
> be, later? Would it make more sense to go ahead and qualify this a little
> more with "ROW SECURITY POLICY"?

I think that's probably over-engineering. I'm not aware of anything
else we might add that would be likely to be called a policy, and if
we did add something we could probably call it something else instead.
And long command names are annoying.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-08-19 02:19:09
Message-ID: CAKRt6CQnghzWUGwb5Pkwg5gfXwd+-joy8MmMEnqh+O6vpLYzfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

All,

Attached is a patch for RLS that incorporates the following changes:

* Syntax:
- CREATE POLICY <policy_name> ON <table_name> FOR <command> USING (
<qual> )
- ALTER POLICY <policy_name> ON <table_name> FOR <command> USING ( <qual>
)
- DROP POLICY <policy_name> ON <table_name> FOR <command>

* "row_security" GUC Setting - enable/disable row level security.

* BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if
row_security GUC is set to OFF.

There are still some remaining issues but we hope to have those resolved
soon.

Any comments or suggestions would be greatly appreciated.

Thanks,
Adam

On Mon, Jul 21, 2014 at 11:38 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Jul 18, 2014 at 7:01 PM, Brightwell, Adam
> <adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:
> >> I think we do want a way to modify policies. However, we tend to
> >> avoid syntax that involves unnatural word order, as this certainly
> >> does. Maybe it's better to follow the example of CREATE RULE and
> >> CREATE TRIGGER and do something this instead:
> >>
> >> CREATE POLICY policy_name ON table_name USING quals;
> >> ALTER POLICY policy_name ON table_name USING quals;
> >> DROP POLICY policy_name ON table_name;
> >>
> >> The advantage of this is that you can regard "policy_name ON
> >> table_name" as the identifier for the policy throughout the system.
> >> You need some kind of identifier of that sort anyway to support
> >> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.
> >
> > Sounds good. I certainly think it makes a lot of sense to include the
> ALTER
> > functionality, if for no other reason than ease of use.
> >
> > Another item to consider, though I believe it can come later, is
> per-action
> > policies. Following the above suggested syntax, perhaps that might look
> > like the following?
> >
> > CREATE POLICY policy_name ON table_name FOR action USING quals;
> > ALTER POLICY policy_name ON table_name FOR action USING quals;
> > DROP POLICY policy_name ON table_name FOR action;
>
> That seems reasonable. You need to give some thought to what happens
> if the user types:
>
> CREATE POLICY pol1 ON tab1 FOR SELECT USING q1;
> ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
>
> I guess you end up with q1 as the SELECT policy and q2 as the INSERT
> policy. Similarly, had you typed:
>
> CREATE POLICY pol1 ON tab1 USING q1;
> ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
>
> ...then I guess you end up with q2 for INSERTs and q1 for everything
> else. I'm wondering if it might be better, though, not to allow the
> quals to be specified in CREATE POLICY, or else to allow multiple
> actions. Otherwise, getting pg_dump to DTRT might be complicated.
>
> Perhaps:
>
> CREATE POLICY pol1 ON tab1 ( [ [ FOR operation [ OR operation ] ... ]
> USING quals ] ... );
> where operation = SELECT | INSERT | UPDATE | DELETE
>
> So that you can write things like:
>
> CREATE POLICY pol1 ON tab1 (USING a = 1);
> CREATE POLICY pol2 ON tab2 (FOR INSERT USING a = 1, FOR UPDATE USING b
> = 1, FOR DELETE USING c = 1);
>
> And then, for ALTER, just allow one change at a time, syntax as you
> proposed. That way each policy can be dumped as a single CREATE
> statement.
>
> > I was also giving some thought to the use of "POLICY", perhaps I am
> wrong,
> > but it does seem it could be at risk of becoming ambiguous down the
> road. I
> > can't think of any specific examples at the moment, but my concern is
> what
> > happens if we wanted to add another "type" of policy, whatever that might
> > be, later? Would it make more sense to go ahead and qualify this a
> little
> > more with "ROW SECURITY POLICY"?
>
> I think that's probably over-engineering. I'm not aware of anything
> else we might add that would be likely to be called a policy, and if
> we did add something we could probably call it something else instead.
> And long command names are annoying.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

--
Adam Brightwell - adam(dot)brightwell(at)crunchydatasolutions(dot)com
Database Engineer - www.crunchydatasolutions.com

Attachment Content-Type Size
rls_8-18-2014.patch text/x-patch 185.8 KB

From: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-08-30 00:16:46
Message-ID: CAKRt6CRG8JJ_XtmByjxQHyCaCmMK4-SqQPF6oeWMTseHc9shRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

All,

Attached is a patch for RLS that was create against master at
01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.

Overview:

This patch provides the capability to create multiple named row level
security policies for a table on a per command basis and assign them to be
applied to specific roles/users.

It contains the following changes:

* Syntax:

CREATE POLICY <name> ON <table>
[ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
[ TO { PUBLIC | <role> [, <role> ] } ]
USING (<condition>)

Creates a RLS policy named <name> on <table>. Specifying a command is
optional, but the default is ALL. Specifying a role is options, but the
default is PUBLIC. If PUBLIC and other roles are specified, ONLY PUBLIC is
applied and a warning is raised.

ALTER POLICY <name> ON <table>
[ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
[ TO { PUBLIC | <role> [, <role> ] } ]
USING (<condition>)

Alter a RLS policy named <name> on <table>. Specifying a command is
optional, if provided then the policy's command is changed otherwise it is
left as-is. Specifying a role is optional, if provided then the policy's
role is changed otherwise it is left as-is. The <condition> must always be
provided and is therefore always replaced.

DROP POLICY <name> ON <table>

Drop a RLS policy named <name> on <table>.

* Plancache Invalidation: If a relation has a row-security policy and
row-security is enabled then the invalidation will occur when either the
row_security GUC is changed OR when a the current user changes. This
invalidation ONLY takes place for cached plans where the target relation
has a row security policy.

* Security Qual Expression: All row-security policies are OR'ed together.
In the case where another security qual is added, such as in the case of a
Security Barrier Views, the the row-security policies are AND'ed with those
quals.

Example:

If a table has policies p1 and p2 and a security barrier view is created
for that table called rls_sbv, then SELECT * FROM rls_sbv WHERE
<some_condition> would result in the following expression: <some_condition>
AND (p1 OR p2)

* row_security GUC - enable/disable row level security.

* BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if
row_security GUC is set to OFF. If a user sets row_security to OFF and
does not have this attribute, then an error is raised when attempting to
query a relation with a RLS policy.

* psql \d <table> support: psql describe support for listing policy
information per table.

* pg_policies system view: lists all row-security policy information.

Any feedback, comments or suggestions would be greatly appreciated.

Thanks,
Adam

On Mon, Aug 18, 2014 at 10:19 PM, Brightwell, Adam <
adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:

> All,
>
> Attached is a patch for RLS that incorporates the following changes:
>
> * Syntax:
> - CREATE POLICY <policy_name> ON <table_name> FOR <command> USING (
> <qual> )
> - ALTER POLICY <policy_name> ON <table_name> FOR <command> USING (
> <qual> )
> - DROP POLICY <policy_name> ON <table_name> FOR <command>
>
> * "row_security" GUC Setting - enable/disable row level security.
>
> * BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if
> row_security GUC is set to OFF.
>
> There are still some remaining issues but we hope to have those resolved
> soon.
>
> Any comments or suggestions would be greatly appreciated.
>
> Thanks,
> Adam
>
>
> On Mon, Jul 21, 2014 at 11:38 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
>
>> On Fri, Jul 18, 2014 at 7:01 PM, Brightwell, Adam
>> <adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:
>> >> I think we do want a way to modify policies. However, we tend to
>> >> avoid syntax that involves unnatural word order, as this certainly
>> >> does. Maybe it's better to follow the example of CREATE RULE and
>> >> CREATE TRIGGER and do something this instead:
>> >>
>> >> CREATE POLICY policy_name ON table_name USING quals;
>> >> ALTER POLICY policy_name ON table_name USING quals;
>> >> DROP POLICY policy_name ON table_name;
>> >>
>> >> The advantage of this is that you can regard "policy_name ON
>> >> table_name" as the identifier for the policy throughout the system.
>> >> You need some kind of identifier of that sort anyway to support
>> >> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.
>> >
>> > Sounds good. I certainly think it makes a lot of sense to include the
>> ALTER
>> > functionality, if for no other reason than ease of use.
>> >
>> > Another item to consider, though I believe it can come later, is
>> per-action
>> > policies. Following the above suggested syntax, perhaps that might look
>> > like the following?
>> >
>> > CREATE POLICY policy_name ON table_name FOR action USING quals;
>> > ALTER POLICY policy_name ON table_name FOR action USING quals;
>> > DROP POLICY policy_name ON table_name FOR action;
>>
>> That seems reasonable. You need to give some thought to what happens
>> if the user types:
>>
>> CREATE POLICY pol1 ON tab1 FOR SELECT USING q1;
>> ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
>>
>> I guess you end up with q1 as the SELECT policy and q2 as the INSERT
>> policy. Similarly, had you typed:
>>
>> CREATE POLICY pol1 ON tab1 USING q1;
>> ALTER POLICY pol1 ON tab1 FOR INSERT USING q2;
>>
>> ...then I guess you end up with q2 for INSERTs and q1 for everything
>> else. I'm wondering if it might be better, though, not to allow the
>> quals to be specified in CREATE POLICY, or else to allow multiple
>> actions. Otherwise, getting pg_dump to DTRT might be complicated.
>>
>> Perhaps:
>>
>> CREATE POLICY pol1 ON tab1 ( [ [ FOR operation [ OR operation ] ... ]
>> USING quals ] ... );
>> where operation = SELECT | INSERT | UPDATE | DELETE
>>
>> So that you can write things like:
>>
>> CREATE POLICY pol1 ON tab1 (USING a = 1);
>> CREATE POLICY pol2 ON tab2 (FOR INSERT USING a = 1, FOR UPDATE USING b
>> = 1, FOR DELETE USING c = 1);
>>
>> And then, for ALTER, just allow one change at a time, syntax as you
>> proposed. That way each policy can be dumped as a single CREATE
>> statement.
>>
>> > I was also giving some thought to the use of "POLICY", perhaps I am
>> wrong,
>> > but it does seem it could be at risk of becoming ambiguous down the
>> road. I
>> > can't think of any specific examples at the moment, but my concern is
>> what
>> > happens if we wanted to add another "type" of policy, whatever that
>> might
>> > be, later? Would it make more sense to go ahead and qualify this a
>> little
>> > more with "ROW SECURITY POLICY"?
>>
>> I think that's probably over-engineering. I'm not aware of anything
>> else we might add that would be likely to be called a policy, and if
>> we did add something we could probably call it something else instead.
>> And long command names are annoying.
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>
>
>
> --
> Adam Brightwell - adam(dot)brightwell(at)crunchydatasolutions(dot)com
> Database Engineer - www.crunchydatasolutions.com
>

--
Adam Brightwell - adam(dot)brightwell(at)crunchydatasolutions(dot)com
Database Engineer - www.crunchydatasolutions.com

Attachment Content-Type Size
rls_8-29-2014.patch text/x-patch 229.7 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-01 04:15:08
Message-ID: 20140901041507.GP16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Adam, all,

* Brightwell, Adam (adam(dot)brightwell(at)crunchydatasolutions(dot)com) wrote:
> Attached is a patch for RLS that was create against master at
> 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.

Many thanks for posting this. As others may realize already, I've
reviewed and modified this patch already, working with Adam to get it
ready. I'm continuing to review and test it, but in general I'm quite
happy with how it's shaping up- additional review, testing, and comments
are always appreciated though.

> Alter a RLS policy named <name> on <table>. Specifying a command is
> optional, if provided then the policy's command is changed otherwise it is
> left as-is. Specifying a role is optional, if provided then the policy's
> role is changed otherwise it is left as-is. The <condition> must always be
> provided and is therefore always replaced.

I'm pretty sure the <condition> is also optional in this patch (that was
a late change that I made), but the documentation needs to be updated.

> * Plancache Invalidation: If a relation has a row-security policy and
> row-security is enabled then the invalidation will occur when either the
> row_security GUC is changed OR when a the current user changes. This
> invalidation ONLY takes place for cached plans where the target relation
> has a row security policy.

I know there was a lot of discussion about this previously, but I'm fine
with the initial version simply invalidating plans which involve
RLS-enabled relations and role changes. This patch doesn't create any
regressions for individuals who are not using RLS. We can certainly
look into improving this in the future to have per-role plan caches but
it's a fair bit of additional non-trivial code that can be added
independently.

> * Security Qual Expression: All row-security policies are OR'ed together.

This was also a point of much discussion, but I continue to feel this is
the right approach for the initial version. We can add flexability here
later, if necessary, but OR'ing these together is in-line with how role
membership works today (you have right for all roles you are a member
of, persuant to inherit/noinherit status, of course).

> * row_security GUC - enable/disable row level security.

Note that, as discussed, pg_dump will set row_security off, unless
specifically asked to enable it. row_security will also be set to off
when the user logging in is a superuser or does a 'set role' to a
superuser. Currently, if a user logging in is *not* a superuser, or a
'set role' is done to a non-superuser, row_security gets re-set to
enabled. This is one aspect of the patch that I think we should change
(which is a matter of removing just a few lines of code and then
updating the regression tests to do 'set row_security = on;' before
running), because if you log in as a superuser and then 'set role' to a
non-superuser, it occurs to me now (it didn't really when I wrote this
originally) as a bit surprising that row_security gets set to 'on' when
doing a 'set role'.

One thing that I really like about this approach is that a superuser can
explicitly set 'row_security' to on and be able to see what happens.
Clearly, in an environment of untrusted users, that could be dangerous,
but it can also be an extremely useful way of testing things,
particularly in development environments where everyone is a superuser.

This deserves a bit more documentation also.

> * BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if
> row_security GUC is set to OFF. If a user sets row_security to OFF and
> does not have this attribute, then an error is raised when attempting to
> query a relation with a RLS policy.

(note that the superuser is always considered to have the bypassrls
attribute)

> * psql \d <table> support: psql describe support for listing policy
> information per table.

This works pretty well for me, but we may want to add some indication
that RLS is on a table in the \dp listing.

Thanks!

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-03 14:17:15
Message-ID: CA+TgmobqO0z87EiVfDEwjCac1dC4ahh5wCVoQoxrSaTeU1T-RA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Aug 29, 2014 at 8:16 PM, Brightwell, Adam
<adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:
> Attached is a patch for RLS that was create against master at
> 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.
>
> Overview:
>
> This patch provides the capability to create multiple named row level
> security policies for a table on a per command basis and assign them to be
> applied to specific roles/users.
>
> It contains the following changes:
>
> * Syntax:
>
> CREATE POLICY <name> ON <table>
> [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> [ TO { PUBLIC | <role> [, <role> ] } ]
> USING (<condition>)
>
> Creates a RLS policy named <name> on <table>. Specifying a command is
> optional, but the default is ALL. Specifying a role is options, but the
> default is PUBLIC. If PUBLIC and other roles are specified, ONLY PUBLIC is
> applied and a warning is raised.
>
> ALTER POLICY <name> ON <table>
> [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> [ TO { PUBLIC | <role> [, <role> ] } ]
> USING (<condition>)
>
> Alter a RLS policy named <name> on <table>. Specifying a command is
> optional, if provided then the policy's command is changed otherwise it is
> left as-is. Specifying a role is optional, if provided then the policy's
> role is changed otherwise it is left as-is. The <condition> must always be
> provided and is therefore always replaced.

This is not a full review of this patch; as we're mid-CommitFest, I
assume this will get added to the next CommitFest.

In earlier discussions, it was proposed (and I thought the proposal
was viewed favorably) that when enabling row-level security for a
table (i.e. before doing CREATE POLICY), you'd have to first flip the
table to a default-deny mode:

ALTER TABLE <name> ENABLE ROW LEVEL SECURITY;

In this design, I'm not sure what happens when there are policies for
some but not all users or some but not all actions. Does creating a
INSERT policy for one particular user cause a default-deny policy to
be turned on for all other users and all other operations? That might
be OK, but at the very least it should be documented more clearly.
Does dropping the very last policy then instantaneously flip the table
back to default-allow?

As far as I can tell from the patch, and that's not too far since I've
only looked at briefly, there's a default-deny policy only if there is
at least 1 policy that applies to your user ID for this operation. As
far as making it easy to create a watertight combination of policies,
that seems like a bad plan.

+ elog(ERROR, "Table \"%s\" already has a policy named \"%s\"."
+ " Use a different name for the policy or to modify this policy"
+ " use ALTER POLICY %s ON %s USING (qual)",
+ RelationGetRelationName(target_table), stmt->policy_name,
+ RelationGetRelationName(target_table), stmt->policy_name);
+

That needs to be an ereport, be capitalized properly, and the hint, if
it's to be included at all, needs to go into errhint().

+ errhint("all roles are considered members
of public")));

Wrong message style for a hint. Also, not sure that's actually
appropriate for a hint.

+ case EXPR_KIND_ROW_SECURITY:
+ return "ROW SECURITY";

This is quite simply bizarre. That's not the SQL syntax of anything.

+ | ROW SECURITY row_security_option
+ {
+ VariableSetStmt *n = makeNode(VariableSetStmt);
+ n->kind = VAR_SET_VALUE;
+ n->name = "row_security";
+ n->args = list_make1(makeStringConst($3, @3));
+ $$ = n;
+ }

I object to this. There's no reason that we should bloat the parser
to allow SET ROW SECURITY in lieu of SET row_security unless this is a
standard-mandated syntax with standard-mandated semantics, which I bet
it isn't.

/*
+ * Although only "on" and"off" are documented, we accept all likely
variants of
+ * "on" and "off".
+ */
+ static const struct config_enum_entry row_security_options[] = {
+ {"off", ROW_SECURITY_OFF, false},
+ {"on", ROW_SECURITY_ON, false},
+ {"true", ROW_SECURITY_ON, true},
+ {"false", ROW_SECURITY_OFF, true},
+ {"yes", ROW_SECURITY_ON, true},
+ {"no", ROW_SECURITY_OFF, true},
+ {"1", ROW_SECURITY_ON, true},
+ {"0", ROW_SECURITY_OFF, true},
+ {NULL, 0, false}
+ };

Just make it a bool and you get all this for free.

+ /*
+ * is_rls_enabled -
+ * determines if row-security is enabled by checking the value of the system
+ * configuration "row_security".
+ */
+ bool
+ is_rls_enabled()
+ {
+ char const *rls_option;
+
+ rls_option = GetConfigOption("row_security", true, false);
+
+ return (strcmp(rls_option, "on") == 0);
+ }

Words fail me.

+ if (AuthenticatedUserIsSuperuser)
+ SetConfigOption("row_security", "off", PGC_INTERNAL, PGC_S_OVERRIDE);

Injecting this kind of magic into InitializeSessionUserId(),
SetSessionAuthorization(), and SetCurrentRoleId() seems 100%
unacceptable to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-03 14:20:49
Message-ID: CAOuzzgrVn1-2TjwbcmgyK+1cyjJ7PaMTLUbKfkejooMKobsjMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hey Robert,

On my phone at the moment but wanted to reply.

I'm working through a few of these issues already actually (noticed as I've
been going over it with Adam), but certainly appreciate the additional
review. We've not posted another update quite yet but plan to shortly.

Thanks!

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-03 14:40:45
Message-ID: CAOuzzgpG04ZZYVZK-_vMtgV5ih4GHXTOrthu9mcc97+=DJtCwA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

Alright, I can't help it so I'll try and reply from my phone for a couple
of these. :)

On Wednesday, September 3, 2014, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Aug 29, 2014 at 8:16 PM, Brightwell, Adam
> <adam(dot)brightwell(at)crunchydatasolutions(dot)com <javascript:;>> wrote:
> > Attached is a patch for RLS that was create against master at
> > 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.
> >
> > Overview:
> >
> > This patch provides the capability to create multiple named row level
> > security policies for a table on a per command basis and assign them to
> be
> > applied to specific roles/users.
> >
> > It contains the following changes:
> >
> > * Syntax:
> >
> > CREATE POLICY <name> ON <table>
> > [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> > [ TO { PUBLIC | <role> [, <role> ] } ]
> > USING (<condition>)
> >
> > Creates a RLS policy named <name> on <table>. Specifying a command is
> > optional, but the default is ALL. Specifying a role is options, but the
> > default is PUBLIC. If PUBLIC and other roles are specified, ONLY PUBLIC
> is
> > applied and a warning is raised.
> >
> > ALTER POLICY <name> ON <table>
> > [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> > [ TO { PUBLIC | <role> [, <role> ] } ]
> > USING (<condition>)
> >
> > Alter a RLS policy named <name> on <table>. Specifying a command is
> > optional, if provided then the policy's command is changed otherwise it
> is
> > left as-is. Specifying a role is optional, if provided then the policy's
> > role is changed otherwise it is left as-is. The <condition> must always
> be
> > provided and is therefore always replaced.
>
> This is not a full review of this patch; as we're mid-CommitFest, I
> assume this will get added to the next CommitFest.

As per usual, the expectation is that the patch is reviewed and updated
during the commitfest. Given that the commitfest isn't even over according
to the calendar it seems a bit premature to talk about the next one, but
certainly if it's not up to a commitable level before the end of this
commitfest then it'll be submitted for the next.

> In earlier discussions, it was proposed (and I thought the proposal
> was viewed favorably) that when enabling row-level security for a
> table (i.e. before doing CREATE POLICY), you'd have to first flip the
> table to a default-deny mode:

I do recall that (now that you remind me- clearly it had been lost during
the subsequent discussion, from my point of view at least) and agree that
it'd be useful. I don't believe it'll be difficult to address.

> ALTER TABLE <name> ENABLE ROW LEVEL SECURITY;

Sounds reasonable to me.

> + elog(ERROR, "Table \"%s\" already has a policy named \"%s\"."
> + " Use a different name for the policy or to modify this
> policy"
> + " use ALTER POLICY %s ON %s USING (qual)",
> + RelationGetRelationName(target_table), stmt->policy_name,
> + RelationGetRelationName(target_table), stmt->policy_name);
> +

That needs to be an ereport, be capitalized properly, and the hint, if
> it's to be included at all, needs to go into errhint().

Already addressed.

> + errhint("all roles are considered members
> of public")));
>
> Wrong message style for a hint. Also, not sure that's actually
> appropriate for a hint.

Fair enough. Will address.

> + case EXPR_KIND_ROW_SECURITY:
> + return "ROW SECURITY";
>
> This is quite simply bizarre. That's not the SQL syntax of anything.

Will address.

> + | ROW SECURITY row_security_option
> + {
> + VariableSetStmt *n = makeNode(VariableSetStmt);
> + n->kind = VAR_SET_VALUE;
> + n->name = "row_security";
> + n->args = list_make1(makeStringConst($3, @3));
> + $$ = n;
> + }
>
> I object to this. There's no reason that we should bloat the parser
> to allow SET ROW SECURITY in lieu of SET row_security unless this is a
> standard-mandated syntax with standard-mandated semantics, which I bet
> it isn't.

Agreed. Seemed like a nice idea but it's not necessary.

> /*
> + * Although only "on" and"off" are documented, we accept all likely
> variants of
> + * "on" and "off".
> + */
> + static const struct config_enum_entry row_security_options[] = {
> + {"off", ROW_SECURITY_OFF, false},
> + {"on", ROW_SECURITY_ON, false},
> + {"true", ROW_SECURITY_ON, true},
> + {"false", ROW_SECURITY_OFF, true},
> + {"yes", ROW_SECURITY_ON, true},
> + {"no", ROW_SECURITY_OFF, true},
> + {"1", ROW_SECURITY_ON, true},
> + {"0", ROW_SECURITY_OFF, true},
> + {NULL, 0, false}
> + };
>
> Just make it a bool and you get all this for free.

Right- holdover from an earlier attempt to make it more complicated but now
we've simplified it and so it should just be a bool.

> + if (AuthenticatedUserIsSuperuser)
> + SetConfigOption("row_security", "off", PGC_INTERNAL,
> PGC_S_OVERRIDE);
>
> Injecting this kind of magic into InitializeSessionUserId(),
> SetSessionAuthorization(), and SetCurrentRoleId() seems 100%
> unacceptable to me.
>

I was struggling with the right way to address this and welcome
suggestions. The primary issue is that I really want to support a superuser
turning it on, so we can't simply have it disabled for all superusers all
the time. The requirement that it not be enabled by default for superusers
makes sense, but how far does that extend and how do we address upgrades?
In particular, can we simply set row_security=off as a custom GUC setting
when superusers are created or roles altered to be made superusers? Would
we do that in pg_upgrade?

Thanks!

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-03 15:21:04
Message-ID: CA+TgmoYA=uixXmN390SFgfQgVmLL-As5bJaL0oM7yrpPVwNPxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Sep 3, 2014 at 10:40 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> This is not a full review of this patch; as we're mid-CommitFest, I
>> assume this will get added to the next CommitFest.
>
> As per usual, the expectation is that the patch is reviewed and updated
> during the commitfest. Given that the commitfest isn't even over according
> to the calendar it seems a bit premature to talk about the next one, but
> certainly if it's not up to a commitable level before the end of this
> commitfest then it'll be submitted for the next.

The first version of this patch that was described as "ready for
review" was submitted on August 29th. The previous revision was
submitted on August 18th. Both of those dates are after the
CommitFest deadline of August 15th. So from where I sit this is not
timely submitted for this CommitFest. The last version before August
was submitted in April (there's a link to a version supposedly
submitted in June in the CommitFest application, but it doesn't point
to an email with a patch attached). I don't want to (and don't feel I
should have to) decide between dropping everything to review an
untimely-submitted patch and having it get committed with no review
from anyone who wasn't involved in writing it.

>> + if (AuthenticatedUserIsSuperuser)
>> + SetConfigOption("row_security", "off", PGC_INTERNAL,
>> PGC_S_OVERRIDE);
>>
>> Injecting this kind of magic into InitializeSessionUserId(),
>> SetSessionAuthorization(), and SetCurrentRoleId() seems 100%
>> unacceptable to me.
>
> I was struggling with the right way to address this and welcome suggestions.
> The primary issue is that I really want to support a superuser turning it
> on, so we can't simply have it disabled for all superusers all the time. The
> requirement that it not be enabled by default for superusers makes sense,
> but how far does that extend and how do we address upgrades? In particular,
> can we simply set row_security=off as a custom GUC setting when superusers
> are created or roles altered to be made superusers? Would we do that in
> pg_upgrade?

I think you need to have the GUC have one default value, not one
default for superusers and another default for everybody else. I
previously proposed making the GUC on/off/force, with "on" meaning
"apply row-level security unless we have permission to bypass it,
either because we are the table owner or the superuser", "off" meaning
"error out if we would be forced to apply row-level security", and
"force" meaning "always apply row-level security even if we have
permission to bypass it". I still think that's a good proposal.
There may be other reasonable alternatives as well, but making changes
to one GUC magically change other GUCs under the hood isn't one of
them.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-06 06:54:50
Message-ID: CAKRt6CR=XnRo4rJEGt0LQKFV=BvBq5+mHtwqdUZyDwcziwXgGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

All,

Attached is a updated patch taking into account the recommendations
provided.

This patch created against master
at ad5d46a4494b0b480a3af246bb4227d9bdadca37

The following items have been addressed:

* Add ALTER TABLE <name> { ENABLE | DISABLE } ROW LEVEL SECURITY - set flag
on table to allow for a default-deny capability. If RLS is enabled on a
table and has no policies, then a default-deny policy is automatically
applied. If RLS is disabled on table and the table still has policies on
it then then an error is raised. Though if DISABLE is accompanied with
CASCADE, then all policies will be removed and no error is raised.

* Update CREATE POLICY to include WITH CHECK ( <expression> ). Therefore,
the syntax is now as follows:
CREATE POLICY <name> ON <table>
[ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
[ USING ( <expression> ) ]
[ WITH CHECK ( <expression> ) ]

A WITH CHECK expression is required for creating an INSERT policy and is
optional on UPDATE and ALL. The intended purpose is to provide a VIEW-like
WITH CHECK OPTION functionality to RLS.

* Add ALTER POLICY <name> ON <table> RENAME TO <new_name> - renames a
policy.

* Updated GUC row_security to allow ON | OFF | FORCE. Each option breaks
down as follows:
- ON - RLS is appled to all roles except the table owner and superusers.
- OFF - RLS can be bypassed, but only by roles with BYPASSRLS. If the
roles does not have BYPASSRLS, then an error is raised.
- FORCE - RLS is applied to all roles, regardless of ownership,
superuser or BYPASSRLS.

* Removed SET ROW SECURITY { ON | OFF } as requested.

* Removed all GetConfigOption for "row_security" GUC.

* Removed setting row_security GUC to OFF in SET SESSION/SET ROLE for
superuser.

* Add psql \dp support. Displays RLS information in new column "Policies".

* Updated documentation.

* Other cleanup and improvements.

There are still some minor issues being worked through, however, it is
expected that those will be resolved soon. However, any feedback, comments
or suggestions on the above and in general would be greatly appreciated.

Thanks,
Adam

On Wed, Sep 3, 2014 at 10:17 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Aug 29, 2014 at 8:16 PM, Brightwell, Adam
> <adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:
> > Attached is a patch for RLS that was create against master at
> > 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.
> >
> > Overview:
> >
> > This patch provides the capability to create multiple named row level
> > security policies for a table on a per command basis and assign them to
> be
> > applied to specific roles/users.
> >
> > It contains the following changes:
> >
> > * Syntax:
> >
> > CREATE POLICY <name> ON <table>
> > [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> > [ TO { PUBLIC | <role> [, <role> ] } ]
> > USING (<condition>)
> >
> > Creates a RLS policy named <name> on <table>. Specifying a command is
> > optional, but the default is ALL. Specifying a role is options, but the
> > default is PUBLIC. If PUBLIC and other roles are specified, ONLY PUBLIC
> is
> > applied and a warning is raised.
> >
> > ALTER POLICY <name> ON <table>
> > [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
> > [ TO { PUBLIC | <role> [, <role> ] } ]
> > USING (<condition>)
> >
> > Alter a RLS policy named <name> on <table>. Specifying a command is
> > optional, if provided then the policy's command is changed otherwise it
> is
> > left as-is. Specifying a role is optional, if provided then the policy's
> > role is changed otherwise it is left as-is. The <condition> must always
> be
> > provided and is therefore always replaced.
>
> This is not a full review of this patch; as we're mid-CommitFest, I
> assume this will get added to the next CommitFest.
>
> In earlier discussions, it was proposed (and I thought the proposal
> was viewed favorably) that when enabling row-level security for a
> table (i.e. before doing CREATE POLICY), you'd have to first flip the
> table to a default-deny mode:
>
> ALTER TABLE <name> ENABLE ROW LEVEL SECURITY;
>
> In this design, I'm not sure what happens when there are policies for
> some but not all users or some but not all actions. Does creating a
> INSERT policy for one particular user cause a default-deny policy to
> be turned on for all other users and all other operations? That might
> be OK, but at the very least it should be documented more clearly.
> Does dropping the very last policy then instantaneously flip the table
> back to default-allow?
>
> As far as I can tell from the patch, and that's not too far since I've
> only looked at briefly, there's a default-deny policy only if there is
> at least 1 policy that applies to your user ID for this operation. As
> far as making it easy to create a watertight combination of policies,
> that seems like a bad plan.
>
> + elog(ERROR, "Table \"%s\" already has a policy named \"%s\"."
> + " Use a different name for the policy or to modify this
> policy"
> + " use ALTER POLICY %s ON %s USING (qual)",
> + RelationGetRelationName(target_table), stmt->policy_name,
> + RelationGetRelationName(target_table), stmt->policy_name);
> +
>
> That needs to be an ereport, be capitalized properly, and the hint, if
> it's to be included at all, needs to go into errhint().
>
> + errhint("all roles are considered members
> of public")));
>
> Wrong message style for a hint. Also, not sure that's actually
> appropriate for a hint.
>
> + case EXPR_KIND_ROW_SECURITY:
> + return "ROW SECURITY";
>
> This is quite simply bizarre. That's not the SQL syntax of anything.
>
> + | ROW SECURITY row_security_option
> + {
> + VariableSetStmt *n = makeNode(VariableSetStmt);
> + n->kind = VAR_SET_VALUE;
> + n->name = "row_security";
> + n->args = list_make1(makeStringConst($3, @3));
> + $$ = n;
> + }
>
> I object to this. There's no reason that we should bloat the parser
> to allow SET ROW SECURITY in lieu of SET row_security unless this is a
> standard-mandated syntax with standard-mandated semantics, which I bet
> it isn't.
>
> /*
> + * Although only "on" and"off" are documented, we accept all likely
> variants of
> + * "on" and "off".
> + */
> + static const struct config_enum_entry row_security_options[] = {
> + {"off", ROW_SECURITY_OFF, false},
> + {"on", ROW_SECURITY_ON, false},
> + {"true", ROW_SECURITY_ON, true},
> + {"false", ROW_SECURITY_OFF, true},
> + {"yes", ROW_SECURITY_ON, true},
> + {"no", ROW_SECURITY_OFF, true},
> + {"1", ROW_SECURITY_ON, true},
> + {"0", ROW_SECURITY_OFF, true},
> + {NULL, 0, false}
> + };
>
> Just make it a bool and you get all this for free.
>
> + /*
> + * is_rls_enabled -
> + * determines if row-security is enabled by checking the value of the
> system
> + * configuration "row_security".
> + */
> + bool
> + is_rls_enabled()
> + {
> + char const *rls_option;
> +
> + rls_option = GetConfigOption("row_security", true, false);
> +
> + return (strcmp(rls_option, "on") == 0);
> + }
>
> Words fail me.
>
> + if (AuthenticatedUserIsSuperuser)
> + SetConfigOption("row_security", "off", PGC_INTERNAL,
> PGC_S_OVERRIDE);
>
> Injecting this kind of magic into InitializeSessionUserId(),
> SetSessionAuthorization(), and SetCurrentRoleId() seems 100%
> unacceptable to me.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

--
Adam Brightwell - adam(dot)brightwell(at)crunchydatasolutions(dot)com
Database Engineer - www.crunchydatasolutions.com

Attachment Content-Type Size
rls_9-6-2014.patch text/x-patch 276.0 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-10 21:50:05
Message-ID: 20140910215005.GT16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

All,

* Brightwell, Adam (adam(dot)brightwell(at)crunchydatasolutions(dot)com) wrote:
> Attached is a updated patch taking into account the recommendations
> provided.

Alright, attached is a patch which I've been over in a great deal more
detail, as we seem to have moved beyond grammar and simple
functionality. It's been much reworked and improved (particularly in
rewrite/rowsecurity.c, but also commands/policy.c). Other
improvements of note (not including the improvements made and
mentioned by Adam previously):

Lots of additional comments around what's happening
Improved SGML documentation
Better \d and \dp support
Explicit function for check row-security requirements
Correct handling for views run under policies
Simplified changes to copy.c
Use normal DROP and RENAME processes (eg: DropStmt and friends)
Default-deny policy implementation, and regression tests
Handle sub-queries in WITH CHECK
Avoid duplicate policy application
Corrected plancache invalidation
Improved and additional regression tests
tab completion

This addresses all of the comments brought up previously, as far as
I'm aware, along with quite a few other issues which I found while
doing my review and rework.

As always- testing, reviews, comments are welcome. We've done a fair
bit of testing internally, but it's great to see how others are
imaginging and trying to use new capabilities like these- especially
if they run into any problems! :)

This took quite a bit longer than I had expected, but I think the
rework, review and additional testing was well worth it.

I'm planning to break from this for a few days and resume helping with
the commitfest more-or-less full-time until I have to head out for
PostgresOpen.

Thanks!

Stephen

Attachment Content-Type Size
rls_9-10-2014.patch text/x-diff 320.8 KB

From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Stephen Frost" <sfrost(at)snowman(dot)net>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Dean Rasheed" <dean(dot)a(dot)rasheed(at)gmail(dot)com>, "Craig Ringer" <craig(at)2ndquadrant(dot)com>, "Yeb Havinga" <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-11 07:35:24
Message-ID: 4727139a97940d6879293f487ce27de8.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, September 10, 2014 23:50, Stephen Frost wrote:
> [rls_9-10-2014.patch]

I can't get this to apply; I attach the complaints of patch.

Erik Rijkers

Attachment Content-Type Size
out.txt text/plain 3.8 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-11 11:25:21
Message-ID: 20140911112521.GX16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Erik,

* Erik Rijkers (er(at)xs4all(dot)nl) wrote:
> On Wed, September 10, 2014 23:50, Stephen Frost wrote:
> > [rls_9-10-2014.patch]
>
> I can't get this to apply; I attach the complaints of patch.

Thanks for taking a look at this!

[...]
> patching file src/include/catalog/catversion.h
> Hunk #1 FAILED at 53.
> 1 out of 1 hunk FAILED -- saving rejects to file src/include/catalog/catversion.h.rej

That's just the catversion bump- you can simply ignore it and everything
should be fine. Look forward to hearing how it works for you!

Thanks again,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-11 16:52:46
Message-ID: CA+TgmoaxyLRqVkjz9fqrc7aaMoHhC8jcZ-hFGVMSBMdX0w3qVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Sep 6, 2014 at 2:54 AM, Brightwell, Adam
<adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:
> * Add ALTER TABLE <name> { ENABLE | DISABLE } ROW LEVEL SECURITY - set flag
> on table to allow for a default-deny capability. If RLS is enabled on a
> table and has no policies, then a default-deny policy is automatically
> applied. If RLS is disabled on table and the table still has policies on it
> then then an error is raised. Though if DISABLE is accompanied with
> CASCADE, then all policies will be removed and no error is raised.

This text doesn't make it clear that all of the cases have been
covered; in particular, you didn't specify whether an error is thrown
if you try to add a policy to a table with DISABLE ROW LEVEL SECURITY
in effect. Backing up a bit, I think there are two sensible designs
here:

1. Row level security policies can't exist for a table with DISABLE
ROW LEVEL SECURITY in effect. It sounds like this is what you have
implemented, modulo any hypothetical bugs. You can't add policies
without enabling RLS, and you can't disable RLS without dropping them
all.

2. Row level security policies can exist for a table with DISABLE ROW
LEVEL SECURITY in effect, but they don't do anything until RLS is
enabled. A possible advantage of this approach is that you could
*temporarily* shut off RLS for a table without having to drop all of
your policies and put them back. I kind of like this approach; we
have something similar for triggers, and I think it could be useful to
people.

If you stick with approach #1, make sure pg_dump is guaranteed to
enable RLS before applying the policies. And either way, you should
that pg_dump behaves sanely in the case where there are circular
dependencies, like you have two table A and B, and each has a RLS
policy that manages to use the other table's row-type. (You probably
also want to check that DROP and DROP .. CASCADE on either policy or
either table does the right thing in that situation, but that's
probably easier to get right.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-11 19:08:53
Message-ID: 20140911190853.GF16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Sat, Sep 6, 2014 at 2:54 AM, Brightwell, Adam
> <adam(dot)brightwell(at)crunchydatasolutions(dot)com> wrote:
> > * Add ALTER TABLE <name> { ENABLE | DISABLE } ROW LEVEL SECURITY - set flag
> > on table to allow for a default-deny capability. If RLS is enabled on a
> > table and has no policies, then a default-deny policy is automatically
> > applied. If RLS is disabled on table and the table still has policies on it
> > then then an error is raised. Though if DISABLE is accompanied with
> > CASCADE, then all policies will be removed and no error is raised.
>
> This text doesn't make it clear that all of the cases have been
> covered; in particular, you didn't specify whether an error is thrown
> if you try to add a policy to a table with DISABLE ROW LEVEL SECURITY
> in effect. Backing up a bit, I think there are two sensible designs
> here:

Ah, yeah, the text could certainly be clearer.

> 1. Row level security policies can't exist for a table with DISABLE
> ROW LEVEL SECURITY in effect. It sounds like this is what you have
> implemented, modulo any hypothetical bugs. You can't add policies
> without enabling RLS, and you can't disable RLS without dropping them
> all.

Right, this was the approach we were taking. Specifically, adding
policies would implicitly enable RLS for the relation.

> 2. Row level security policies can exist for a table with DISABLE ROW
> LEVEL SECURITY in effect, but they don't do anything until RLS is
> enabled. A possible advantage of this approach is that you could
> *temporarily* shut off RLS for a table without having to drop all of
> your policies and put them back. I kind of like this approach; we
> have something similar for triggers, and I think it could be useful to
> people.

I like the idea of being able to turn them off without dropping them.
We have that with row_security = off, but that would only work for the
owner or a superuser (or a user with bypassrls). This would allow
disabling RLS temporairly for everything accessing the table.

The one thing I'm wondering about with this design is- what happens when
a policy is initially added? Currently, we automatically turn on RLS
for the table when that happens. I'm not thrilled with the idea that
you have to add policies AND turn on RLS explicitly- someone might add
policies but then forget to turn RLS on..

> If you stick with approach #1, make sure pg_dump is guaranteed to
> enable RLS before applying the policies.

Currently, adding a policy automatically turns on RLS, so we don't have
any issue with pg_dump from that perspective. Handling cases where RLS
is disabled but policies exist would get more complicated for pg_dump if
we keep the current idea that adding policies implicitly turns on RLS-
it'd essentially have to go back and turn it off after the policies are
added. Not a big fan of that either.

> And either way, you should
> that pg_dump behaves sanely in the case where there are circular
> dependencies, like you have two table A and B, and each has a RLS
> policy that manages to use the other table's row-type. (You probably
> also want to check that DROP and DROP .. CASCADE on either policy or
> either table does the right thing in that situation, but that's
> probably easier to get right.)

Agreed, we'll double-check that this is working. As these are
attributes of the table which get added later on by pg_dump, similar to
permissions, I'd think it'd all work fine, but good to make sure (and
ditto with DROP/DROP CASCADE.. We have some checks for that, but good
to make sure it works in a circular-dependency case too).

If we want to be able to disable RLS w/o dropping the policies, then I
think we have to completely de-couple the two and users would then have
both add policies AND turn on RLS to have RLS actually be enabled for a
given table. I'm on the fence about that.

Thoughts?

Thanks!

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-11 20:22:20
Message-ID: CA+TgmobwKFxV1eTRf9mBNWY653EeS5gLp8O5g7t4L5axnMk35g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> 2. Row level security policies can exist for a table with DISABLE ROW
>> LEVEL SECURITY in effect, but they don't do anything until RLS is
>> enabled. A possible advantage of this approach is that you could
>> *temporarily* shut off RLS for a table without having to drop all of
>> your policies and put them back. I kind of like this approach; we
>> have something similar for triggers, and I think it could be useful to
>> people.
>
> I like the idea of being able to turn them off without dropping them.
> We have that with row_security = off, but that would only work for the
> owner or a superuser (or a user with bypassrls). This would allow
> disabling RLS temporairly for everything accessing the table.
>
> The one thing I'm wondering about with this design is- what happens when
> a policy is initially added? Currently, we automatically turn on RLS
> for the table when that happens. I'm not thrilled with the idea that
> you have to add policies AND turn on RLS explicitly- someone might add
> policies but then forget to turn RLS on..

Whoa. I think that's a bad idea. I think the default value for RLS
should be disabled, and users should have to turn it on explicitly if
they want to get it. It's arguable whether the behavior if you try to
create a policy beforehand should be (1) outright failure or (2)
command accepted but no effect, but I think (3) automagically enable
the feature is a POLA violation. When somebody adds a policy and then
drops it again, they will expect to be back in the same state they
started out in, and for good reason.

> If we want to be able to disable RLS w/o dropping the policies, then I
> think we have to completely de-couple the two and users would then have
> both add policies AND turn on RLS to have RLS actually be enabled for a
> given table. I'm on the fence about that.
>
> Thoughts?

A strong +1 for doing just that. Look, anybody who is going to use
row-level security but isn't careful enough to verify that it's
actually working as desired after configuring it is a lost cause
anyway. That is the moral equivalent of a locksmith who comes out and
replaces a lock for you and at no point while he's there does he ever
close the door and verify that it latches and won't reopen. I'm sure
somebody has done that, but if a security breach results, surely
everybody would agree that the locksmith is at fault, not the lock
manufacturer. Personally, I have to test every GRANT and REVOKE I
issue, because there's no error for granting a privilege that the
target already has or revoking one they don't, and with group
membership and PUBLIC it's quite easy to have not done what you
thought you did. Fixing that might be worthwhile but it doesn't take
away from the fact that, like any other configuration change you make,
security-relevant changes need to be tested.

There is another possible advantage of the explicit-enable approach as
well, which is that you might want to create several policies and then
turn them all on at once. With what you have now, creating the first
policy will enable RLS on the table and then everyone who wasn't the
beneficiary of that initial policy is locked out. Now, granted, you
can probably get around that by doing all of the operations in one
transaction, so it's a minor point. But it's still nice to think
about being able to add several policies and then flip them on. If it
doesn't work out, flip them off, adjust, and flip them back on again.
Now, again, the core design issue, IMHO, is that the switch from
default-allow to default-deny should be explicit and unmistakable, so
the rest of this is just tinkering around the edges. But we might as
well make those edges as nice as possible, and the usability of this
approach feels good to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-11 20:36:10
Message-ID: 20140911203610.GG16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > The one thing I'm wondering about with this design is- what happens when
> > a policy is initially added? Currently, we automatically turn on RLS
> > for the table when that happens. I'm not thrilled with the idea that
> > you have to add policies AND turn on RLS explicitly- someone might add
> > policies but then forget to turn RLS on..
>
> Whoa. I think that's a bad idea. I think the default value for RLS
> should be disabled, and users should have to turn it on explicitly if
> they want to get it. It's arguable whether the behavior if you try to
> create a policy beforehand should be (1) outright failure or (2)
> command accepted but no effect, but I think (3) automagically enable
> the feature is a POLA violation. When somebody adds a policy and then
> drops it again, they will expect to be back in the same state they
> started out in, and for good reason.

Yeah, that I can agree with. Prior to adding the ability to explicitly
enable RLS, that's what they got, but that's changed now that we've made
the ability to turn on/off RLS half-way independent of policies. Also..

> > If we want to be able to disable RLS w/o dropping the policies, then I
> > think we have to completely de-couple the two and users would then have
> > both add policies AND turn on RLS to have RLS actually be enabled for a
> > given table. I'm on the fence about that.
>
> A strong +1 for doing just that. Look, anybody who is going to use
> row-level security but isn't careful enough to verify that it's
> actually working as desired after configuring it is a lost cause
> anyway.

I had been thinking the same, which is why I was on the fence about if
it was really an issue or not. This all amounts to actually making the
patch smaller also, which isn't a bad thing.

> Personally, I have to test every GRANT and REVOKE I
> issue, because there's no error for granting a privilege that the
> target already has or revoking one they don't, and with group
> membership and PUBLIC it's quite easy to have not done what you
> thought you did. Fixing that might be worthwhile but it doesn't take
> away from the fact that, like any other configuration change you make,
> security-relevant changes need to be tested.

Hmm, pretty sure that'd end up going against the spec too, but that's
a whole different discussion anyway.

> There is another possible advantage of the explicit-enable approach as
> well, which is that you might want to create several policies and then
> turn them all on at once. With what you have now, creating the first
> policy will enable RLS on the table and then everyone who wasn't the
> beneficiary of that initial policy is locked out. Now, granted, you
> can probably get around that by doing all of the operations in one
> transaction, so it's a minor point. But it's still nice to think
> about being able to add several policies and then flip them on. If it
> doesn't work out, flip them off, adjust, and flip them back on again.
> Now, again, the core design issue, IMHO, is that the switch from
> default-allow to default-deny should be explicit and unmistakable, so
> the rest of this is just tinkering around the edges. But we might as
> well make those edges as nice as possible, and the usability of this
> approach feels good to me.

Fair enough.

Thanks!

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-14 15:38:34
Message-ID: 20140914153833.GY16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > If we want to be able to disable RLS w/o dropping the policies, then I
> > think we have to completely de-couple the two and users would then have
> > both add policies AND turn on RLS to have RLS actually be enabled for a
> > given table. I'm on the fence about that.
> >
> > Thoughts?
>
> A strong +1 for doing just that.

Alright, updated patch attached which does just that (thanks to Adam
for the updates for this and testing pg_dump- I just reviewed it and
added some documentation updates and other minor improvements), and
rebased to master. Also removed the catversion bump, so it should apply
cleanly for people, for a while anyway.

Thanks!

Stephen

Attachment Content-Type Size
rls_9-14-2014.patch text/x-diff 324.1 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 15:53:06
Message-ID: CA+TgmoasrDk8JmQqZNs_1uNnMyH7JAcat8BjoQm2huBg7iZcbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Sep 14, 2014 at 11:38 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > If we want to be able to disable RLS w/o dropping the policies, then I
>> > think we have to completely de-couple the two and users would then have
>> > both add policies AND turn on RLS to have RLS actually be enabled for a
>> > given table. I'm on the fence about that.
>> >
>> > Thoughts?
>>
>> A strong +1 for doing just that.
>
> Alright, updated patch attached which does just that (thanks to Adam
> for the updates for this and testing pg_dump- I just reviewed it and
> added some documentation updates and other minor improvements), and
> rebased to master. Also removed the catversion bump, so it should apply
> cleanly for people, for a while anyway.

I specifically asked you to hold off on committing this until there
was adequate opportunity for review, and explained my reasoning. You
committed it anyway.

I wonder if I am equally free to commit my own patches without
properly observing the CommitFest process, because it would be a whole
lot faster. My pg_background patches have been pending since before
the start of the August CommitFest and I accepted that I would have to
wait an extra two months to commit those because of a *clerical
error*, namely my failure to actually add them to the CommitFest.
This patch, on the other hand, was massively revised after the start
of the CommitFest after many months of inactivity and committed with
no thorough review by anyone who was truly independent of the
development effort. It was then committed with no warning over a
specific request, from another committer, that more time be allowed
for review.

I'm really disappointed by that. I feel I'm essentially getting
punished for trying to follow what I understand to the process, which
has involved me doing huge amounts of review of other people's patches
and waiting a very long time to get my own stuff committed, while you
bull ahead with your own patches.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 16:03:45
Message-ID: 20140919160345.GA13527@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-09-19 11:53:06 -0400, Robert Haas wrote:
> On Sun, Sep 14, 2014 at 11:38 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> >> On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >> > If we want to be able to disable RLS w/o dropping the policies, then I
> >> > think we have to completely de-couple the two and users would then have
> >> > both add policies AND turn on RLS to have RLS actually be enabled for a
> >> > given table. I'm on the fence about that.
> >> >
> >> > Thoughts?
> >>
> >> A strong +1 for doing just that.
> >
> > Alright, updated patch attached which does just that (thanks to Adam
> > for the updates for this and testing pg_dump- I just reviewed it and
> > added some documentation updates and other minor improvements), and
> > rebased to master. Also removed the catversion bump, so it should apply
> > cleanly for people, for a while anyway.
>
> I specifically asked you to hold off on committing this until there
> was adequate opportunity for review, and explained my reasoning. You
> committed it anyway.

I was also rather surprised by the push. I wanted to write something
about it, but:

> This patch, on the other hand, was massively revised after the start
> of the CommitFest after many months of inactivity and committed with
> no thorough review by anyone who was truly independent of the
> development effort. It was then committed with no warning over a
> specific request, from another committer, that more time be allowed
> for review.

says it better.

I think that's generally the case, but doubly so with sensitive stuff
like this.

> I wonder if I am equally free to commit my own patches without
> properly observing the CommitFest process, because it would be a whole
> lot faster. My pg_background patches have been pending since before
> the start of the August CommitFest and I accepted that I would have to
> wait an extra two months to commit those because of a *clerical
> error*, namely my failure to actually add them to the CommitFest.

FWIW, I think if a patch has been sent in time and has gotten a decent
amount of review *and* agreement it's fair for a committer to push
forward. That doesn't apply to this thread, but sometimes does for
others.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Thom Brown <thom(at)linux(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 16:29:29
Message-ID: CAA-aLv63QbiG9_yG0+dqHe5v6G=auTMuE5Xtf1nqQJMKXACPRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 14 September 2014 16:38, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > On Thu, Sep 11, 2014 at 3:08 PM, Stephen Frost <sfrost(at)snowman(dot)net>
> wrote:
> > > If we want to be able to disable RLS w/o dropping the policies, then I
> > > think we have to completely de-couple the two and users would then have
> > > both add policies AND turn on RLS to have RLS actually be enabled for a
> > > given table. I'm on the fence about that.
> > >
> > > Thoughts?
> >
> > A strong +1 for doing just that.
>
> Alright, updated patch attached which does just that (thanks to Adam
> for the updates for this and testing pg_dump- I just reviewed it and
> added some documentation updates and other minor improvements), and
> rebased to master. Also removed the catversion bump, so it should apply
> cleanly for people, for a while anyway.
>

This is testing what has been committed:

# create table colours (id serial, name text, visible boolean);
CREATE TABLE

# insert into colours (name, visible) values
('blue',true),('yellow',true),('ultraviolet',false),('green',true),('infrared',false);
INSERT 0 5

# create policy visible_colours on colours for all to joe using (visible =
true);
CREATE POLICY

# grant all on colours to public;
GRANT

# grant all on sequence colours_id_seq to public;
GRANT

# alter table colours enable row level security ;
ALTER TABLE

\c - joe

> select * from colours;
id | name | visible
----+--------+---------
1 | blue | t
2 | yellow | t
4 | green | t
(3 rows)

> insert into colours (name, visible) values ('purple',true);
INSERT 0 1

> insert into colours (name, visible) values ('transparent',false);
ERROR: new row violates WITH CHECK OPTION for "colours"
DETAIL: Failing row contains (7, transparent, f).

> select * from pg_policies ;
policyname | tablename | roles | cmd | qual | with_check
-----------------+-----------+-------+-----+------------------+------------
visible_colours | colours | {joe} | ALL | (visible = true) |
(1 row)

There was no WITH CHECK OPTION.

--
Thom


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 16:32:30
Message-ID: 20140919163230.GG16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thom,

Thanks!

* Thom Brown (thom(at)linux(dot)com) wrote:
> On 14 September 2014 16:38, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> # create policy visible_colours on colours for all to joe using (visible =
> true);
> CREATE POLICY
[...]
> > insert into colours (name, visible) values ('transparent',false);
> ERROR: new row violates WITH CHECK OPTION for "colours"
> DETAIL: Failing row contains (7, transparent, f).
>
> > select * from pg_policies ;
> policyname | tablename | roles | cmd | qual | with_check
> -----------------+-----------+-------+-----+------------------+------------
> visible_colours | colours | {joe} | ALL | (visible = true) |
> (1 row)
>
> There was no WITH CHECK OPTION.

As I hope is clear if you look at the documentation- if the WITH CHECK
clause is omitted, then the USING clause is used for both filtering and
checking new records, otherwise you'd be able to add records which
aren't visible to you.

Thanks!

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 16:38:39
Message-ID: 20140919163839.GH16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Sun, Sep 14, 2014 at 11:38 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Alright, updated patch attached which does just that (thanks to Adam
> > for the updates for this and testing pg_dump- I just reviewed it and
> > added some documentation updates and other minor improvements), and
> > rebased to master. Also removed the catversion bump, so it should apply
> > cleanly for people, for a while anyway.
>
> I specifically asked you to hold off on committing this until there
> was adequate opportunity for review, and explained my reasoning. You
> committed it anyway.

Hum- my apologies, I honestly don't recall you specifically asking for
it to be held off indefinitely. :( There was discussion back and
forth, quite a bit of it with you, and I thank you for your help with
that and certainly welcome any additional comments.

> This patch, on the other hand, was massively revised after the start
> of the CommitFest after many months of inactivity and committed with
> no thorough review by anyone who was truly independent of the
> development effort. It was then committed with no warning over a
> specific request, from another committer, that more time be allowed
> for review.

I would not (nor do I feel that I did..) have committed it over a
specific request to not do so from another committer. I had been hoping
that there would be another review coming from somewhere, but there is
always a trade-off between waiting longer to get a review ahead of a
commit and having it committed and then available more easily for others
to work with, review, and generally moving forward.

> I'm really disappointed by that. I feel I'm essentially getting
> punished for trying to follow what I understand to the process, which
> has involved me doing huge amounts of review of other people's patches
> and waiting a very long time to get my own stuff committed, while you
> bull ahead with your own patches.

While I wasn't public about it, I actually specifically discussed this
question with others, a few times even, to try and make sure that I
wasn't stepping out of line by moving forward.

That said, I do see that Andres feels similairly. It certainly wasn't
my intent to surprise anyone by it but simply to continue to move
forward- in part, to allow me to properly break from it and work on
other things, including reviewing other patches in the commitfest.
I fear I've simply been overly focused on it these past few weeks, for a
variety of reasons that would likely best be discussed at the pub.

All-in-all, I feel appropriately chastised and certainly don't wish to
be surprising fellow committers. Perhaps we can discuss at the dev
meeting.

Thanks,

Stephen


From: Thom Brown <thom(at)linux(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 16:45:41
Message-ID: CAA-aLv4qUH9bwhwHK93XrYS4YfscYSc6mSMj-WzCYkEsR7-pfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19 September 2014 17:32, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Thom,
>
> Thanks!
>
> * Thom Brown (thom(at)linux(dot)com) wrote:
> > On 14 September 2014 16:38, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > # create policy visible_colours on colours for all to joe using (visible
> =
> > true);
> > CREATE POLICY
> [...]
> > > insert into colours (name, visible) values ('transparent',false);
> > ERROR: new row violates WITH CHECK OPTION for "colours"
> > DETAIL: Failing row contains (7, transparent, f).
> >
> > > select * from pg_policies ;
> > policyname | tablename | roles | cmd | qual |
> with_check
> >
> -----------------+-----------+-------+-----+------------------+------------
> > visible_colours | colours | {joe} | ALL | (visible = true) |
> > (1 row)
> >
> > There was no WITH CHECK OPTION.
>
> As I hope is clear if you look at the documentation- if the WITH CHECK
> clause is omitted, then the USING clause is used for both filtering and
> checking new records, otherwise you'd be able to add records which
> aren't visible to you.

I can see that now, although I do find the error message somewhat
confusing. Firstly, it looks like "OPTION" is part of the parameter name,
which it isn't.

Also, I seem to get an error message with the following:

# create policy nice_colours ON colours for all to joe using (visible =
true) with check (name in ('blue','green','yellow'));
CREATE POLICY

\c - joe

> insert into colours (name, visible) values ('blue',false);
ERROR: function with OID 0 does not exist

And if this did work, but I only violated the USING clause, would this
still say the WITH CHECK clause was the cause?

Thom


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 16:48:58
Message-ID: 20140919164858.GB13527@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-09-19 12:38:39 -0400, Stephen Frost wrote:
> I would not (nor do I feel that I did..) have committed it over a
> specific request to not do so from another committer. I had been hoping
> that there would be another review coming from somewhere, but there is
> always a trade-off between waiting longer to get a review ahead of a
> commit and having it committed and then available more easily for others
> to work with, review, and generally moving forward.

Sure, there is such a tradeoff. But others have to wait months to get
enough review. The first revision of the patch in the form you
committed was sent 2014-08-19, the first marked *ready for review* (not
my words) is from 2014-08-30. 19 days really isn't very far along the
tradeoff from waiting for a review to uselessly waiting.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 16:50:09
Message-ID: CA+TgmoaX+ptioOxx42rxJxsgrvxPfUVyndkpeR0JsRiTeZ36Ng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Sep 19, 2014 at 12:38 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> This patch, on the other hand, was massively revised after the start
>> of the CommitFest after many months of inactivity and committed with
>> no thorough review by anyone who was truly independent of the
>> development effort. It was then committed with no warning over a
>> specific request, from another committer, that more time be allowed
>> for review.
>
> I would not (nor do I feel that I did..) have committed it over a
> specific request to not do so from another committer.

Well, you're wrong. How could this email possibly have been any more clear?

http://www.postgresql.org/message-id/CA+TgmoYA=uixXmN390SFgfQgVmLL-As5bJaL0oM7yrpPVwNPxQ@mail.gmail.com

You can hardly tell me you didn't see that email when you incorporated
the technical content into the next patch version.

> While I wasn't public about it, I actually specifically discussed this
> question with others, a few times even, to try and make sure that I
> wasn't stepping out of line by moving forward.

And yet you completely ignored the only public commentary on the
issue, which was from me.

I *should not have had* to object to this patch going in. It was
clearly untimely for the August CommitFest, and as a long-time
community member, you ought to know full well that any such patch
should be resubmitted to a later CommitFest. This patch sat on the
shelf for 4 months because you were too busy to work on it, and was
committed 5 days from the last posted version, which version had zero
review comments. If you didn't have time to work on it for 4 months,
you can hardly expect everyone else who has an opinion to comment
within 5 days.

But, you know, because I could tell that you were fixated on pushing
this patch through to commit quickly, I took the time to send you a
message on that specific point, even though you should have known full
well. In fact I took the time to send TWO. Here's the other one:

http://www.postgresql.org/message-id/CA+TgmobqO0z87EiVfDEwjCac1dC4ahh5wCVoQoxrSaTeU1T-RA@mail.gmail.com

> All-in-all, I feel appropriately chastised and certainly don't wish to
> be surprising fellow committers. Perhaps we can discuss at the dev
> meeting.

No, I think we should discuss it right now, not nine months from now
when the issue has faded from everyone's mind.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-19 16:54:12
Message-ID: 20140919165411.GJ16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thom,

* Thom Brown (thom(at)linux(dot)com) wrote:
> On 19 September 2014 17:32, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Thom Brown (thom(at)linux(dot)com) wrote:
> > > On 14 September 2014 16:38, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > > # create policy visible_colours on colours for all to joe using (visible
> > =
> > > true);
> > > CREATE POLICY
> > [...]
> > > > insert into colours (name, visible) values ('transparent',false);
> > > ERROR: new row violates WITH CHECK OPTION for "colours"
> > > DETAIL: Failing row contains (7, transparent, f).
> > >
> > > > select * from pg_policies ;
> > > policyname | tablename | roles | cmd | qual |
> > with_check
> > >
> > -----------------+-----------+-------+-----+------------------+------------
> > > visible_colours | colours | {joe} | ALL | (visible = true) |
> > > (1 row)
> > >
> > > There was no WITH CHECK OPTION.
> >
> > As I hope is clear if you look at the documentation- if the WITH CHECK
> > clause is omitted, then the USING clause is used for both filtering and
> > checking new records, otherwise you'd be able to add records which
> > aren't visible to you.
>
> I can see that now, although I do find the error message somewhat
> confusing. Firstly, it looks like "OPTION" is part of the parameter name,
> which it isn't.

Hmm, the notion of 'with check option' is from the SQL standard, which
is why I felt the error message was appropriate as-is..

> Also, I seem to get an error message with the following:
>
> # create policy nice_colours ON colours for all to joe using (visible =
> true) with check (name in ('blue','green','yellow'));
> CREATE POLICY
>
> \c - joe
>
> > insert into colours (name, visible) values ('blue',false);
> ERROR: function with OID 0 does not exist

Now *that* one is interesting and I'll definitely go take a look at it.
We added quite a few regression tests to try and make sure these things
work.

> And if this did work, but I only violated the USING clause, would this
> still say the WITH CHECK clause was the cause?

WITH CHECK applies for INSERT and UPDATE for the new records going into
the table. You can't actually violate the USING clause for an INSERT
as USING is for filtering records, not checking that records being added
to the table are valid.

To try and clarify- by explicitly setting both USING and WITH CHECK, you
*are* able to INSERT records which are not visible to you. We felt that
was an important capability to support.

Thanks for taking a look at it!

Stephen


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-20 07:23:01
Message-ID: 541D2B55.50208@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09/20/2014 12:38 AM, Stephen Frost wrote:

> I would not (nor do I feel that I did..) have committed it over a
> specific request to not do so from another committer. I had been hoping
> that there would be another review coming from somewhere, but there is
> always a trade-off between waiting longer to get a review ahead of a
> commit and having it committed and then available more easily for others
> to work with, review, and generally moving forward.

Y'know what helps with that? Publishing clean git branches for
non-trivial work, rather than just lobbing patches around.

I'm finding the reliance on a patch based workflow increasingly
frustrating for complex work, and wonder if it's time to revisit
introducing a git repo+ref to the commitfest app.

I find the need to find the latest patch on the list, apply it, and fix
it up really frustrating. "git am --3way" helps a lot, but only if the
patch is created with "git format-patch".

Perhaps it's time to look at whether git can do more to help us with the
testing and review process.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-20 18:18:21
Message-ID: 541DC4ED.6000302@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09/20/2014 12:23 AM, Craig Ringer wrote:
> On 09/20/2014 12:38 AM, Stephen Frost wrote:
>
>> I would not (nor do I feel that I did..) have committed it over a
>> specific request to not do so from another committer. I had been hoping
>> that there would be another review coming from somewhere, but there is
>> always a trade-off between waiting longer to get a review ahead of a
>> commit and having it committed and then available more easily for others
>> to work with, review, and generally moving forward.
>
> Y'know what helps with that? Publishing clean git branches for
> non-trivial work, rather than just lobbing patches around.
>
> I'm finding the reliance on a patch based workflow increasingly
> frustrating for complex work, and wonder if it's time to revisit
> introducing a git repo+ref to the commitfest app.
>
> I find the need to find the latest patch on the list, apply it, and fix
> it up really frustrating. "git am --3way" helps a lot, but only if the
> patch is created with "git format-patch".
>
> Perhaps it's time to look at whether git can do more to help us with the
> testing and review process.

We discussed this at the last developer meeting, without coming up with
a written procedure. Your ideas can help ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Thom Brown <thom(at)linux(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-25 13:42:04
Message-ID: CAA-aLv6HuzSuYSceM0k9D3iOzod7GQYnCyXWMv6XUMVkS3idOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19 September 2014 17:54, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> Thom,
>
> * Thom Brown (thom(at)linux(dot)com) wrote:
> > On 19 September 2014 17:32, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > > * Thom Brown (thom(at)linux(dot)com) wrote:
> > > > On 14 September 2014 16:38, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > > > # create policy visible_colours on colours for all to joe using (visible
> > > =
> > > > true);
> > > > CREATE POLICY
> > > [...]
> > > > > insert into colours (name, visible) values ('transparent',false);
> > > > ERROR: new row violates WITH CHECK OPTION for "colours"
> > > > DETAIL: Failing row contains (7, transparent, f).
> > > >
> > > > > select * from pg_policies ;
> > > > policyname | tablename | roles | cmd | qual |
> > > with_check
> > > >
> > > -----------------+-----------+-------+-----+------------------+------------
> > > > visible_colours | colours | {joe} | ALL | (visible = true) |
> > > > (1 row)
> > > >
> > > > There was no WITH CHECK OPTION.
> > >
> > > As I hope is clear if you look at the documentation- if the WITH CHECK
> > > clause is omitted, then the USING clause is used for both filtering and
> > > checking new records, otherwise you'd be able to add records which
> > > aren't visible to you.
> >
> > I can see that now, although I do find the error message somewhat
> > confusing. Firstly, it looks like "OPTION" is part of the parameter name,
> > which it isn't.
>
> Hmm, the notion of 'with check option' is from the SQL standard, which
> is why I felt the error message was appropriate as-is..
>
> > Also, I seem to get an error message with the following:
> >
> > # create policy nice_colours ON colours for all to joe using (visible =
> > true) with check (name in ('blue','green','yellow'));
> > CREATE POLICY
> >
> > \c - joe
> >
> > > insert into colours (name, visible) values ('blue',false);
> > ERROR: function with OID 0 does not exist
>
> Now *that* one is interesting and I'll definitely go take a look at it.
> We added quite a few regression tests to try and make sure these things
> work.
>
> > And if this did work, but I only violated the USING clause, would this
> > still say the WITH CHECK clause was the cause?
>
> WITH CHECK applies for INSERT and UPDATE for the new records going into
> the table. You can't actually violate the USING clause for an INSERT
> as USING is for filtering records, not checking that records being added
> to the table are valid.
>
> To try and clarify- by explicitly setting both USING and WITH CHECK, you
> *are* able to INSERT records which are not visible to you. We felt that
> was an important capability to support.

I find it a bit of a limitation that I can't specify both INSERT and
UPDATE for a policy. I'd want to be able to specify something like
this:

CREATE POLICY no_greys_allowed
ON colours
FOR INSERT, UPDATE
WITH CHECK (name NOT IN ('grey','gray'));

I would expect this to be rather common to prevent certain values
making their way into a table. Instead I'd have to create 2 policies
as it stands.

In order to debug issues with accessing table data, perhaps it would
be useful to output the name of the policy that was violated. If a
table had 20 policies on, it could become time-consuming to debug.

I keep getting tripped up by overlapping policies. On the one hand, I
created a policy to ensure rows being added or selected have a
"visible" column set to true. On the other hand, I have a policy that
ensures that the name of a colour doesn't appear in a list. Policy 1
is violated until policy 2 is added:

(using the table I created in a previous post on this thread...)

# create policy must_be_visible ON colours for all to joe using
(visible = true) with check (visible = true);
CREATE POLICY

\c - joe

> insert into colours (name, visible) values ('pink',false);
ERROR: new row violates WITH CHECK OPTION for "colours"
DETAIL: Failing row contains (28, pink, f).

\c - thom

# create policy no_greys_allowed on colours for insert with check
(name not in ('grey','gray'));
CREATE POLICY

\c - joe

# insert into colours (name, visible) values ('pink',false);
INSERT 0 1

I expected this to still trigger an error due to the first policy. Am
I to infer from this that the policy model is permissive rather than
restrictive?

I've also attached a few corrections for the docs.

Thom

Attachment Content-Type Size
policy_doc_corrections.diff text/plain 2.5 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-25 14:26:13
Message-ID: 20140925142613.GZ16422@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thom,

* Thom Brown (thom(at)linux(dot)com) wrote:
> I find it a bit of a limitation that I can't specify both INSERT and
> UPDATE for a policy. I'd want to be able to specify something like
> this:
>
> CREATE POLICY no_greys_allowed
> ON colours
> FOR INSERT, UPDATE
> WITH CHECK (name NOT IN ('grey','gray'));
>
> I would expect this to be rather common to prevent certain values
> making their way into a table. Instead I'd have to create 2 policies
> as it stands.

That's not actually the case...

CREATE POLICY no_greys_allowed
ON colours
FOR ALL
USING (true) -- assuming this is what you intended
WITH CHECK (name NOT IN ('grey','gray'));

Right? That said, I'm not against the idea of supporting mulitple
commands with one policy (similar to how ALL is done). It wouldn't be
difficult or much of a change- make the 'cmd' a bitfield instead. If
others feel the same then I'll look at doing that.

> In order to debug issues with accessing table data, perhaps it would
> be useful to output the name of the policy that was violated. If a
> table had 20 policies on, it could become time-consuming to debug.

Good point. That'll involve a bit more as I'll need to look at the
existing with check options structure, but I believe it's just adding
the field to the structure, populating it when adding the WCO entries,
and then checking for it in the ereport() call. The policy name is
already stashed in the relcache entry, so it's already pretty easily
available.

> I keep getting tripped up by overlapping policies. On the one hand, I
> created a policy to ensure rows being added or selected have a
> "visible" column set to true. On the other hand, I have a policy that
> ensures that the name of a colour doesn't appear in a list. Policy 1
> is violated until policy 2 is added:
>
> (using the table I created in a previous post on this thread...)
>
> # create policy must_be_visible ON colours for all to joe using
> (visible = true) with check (visible = true);
> CREATE POLICY
>
> \c - joe
>
> > insert into colours (name, visible) values ('pink',false);
> ERROR: new row violates WITH CHECK OPTION for "colours"
> DETAIL: Failing row contains (28, pink, f).
>
> \c - thom
>
> # create policy no_greys_allowed on colours for insert with check
> (name not in ('grey','gray'));
> CREATE POLICY
>
> \c - joe
>
> # insert into colours (name, visible) values ('pink',false);
> INSERT 0 1
>
> I expected this to still trigger an error due to the first policy. Am
> I to infer from this that the policy model is permissive rather than
> restrictive?

That's correct and I believe pretty clear in the documentation- policies
are OR'd together, just the same as how roles are handled. As a
logged-in user, you have the rights of all of the roles you are a member
of (subject to inheiritance rules, of course), and similairly, you are
able to view and add all rows which match any policy which applies to
you (either through role membership or through different policies).

> I've also attached a few corrections for the docs.

Thanks! I'll plan to include these with a few other typos and the fix
for the bug that Andres pointed out, once I finish testing (and doing
another CLOBBER_CACHE_ALWAYS run..).

Thanks again,

Stephen


From: Thom Brown <thom(at)linux(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Brightwell, Adam" <adam(dot)brightwell(at)crunchydatasolutions(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Yeb Havinga <yeb(dot)havinga(at)portavita(dot)nl>
Subject: Re: RLS Design
Date: 2014-09-25 16:04:15
Message-ID: CAA-aLv7phXW+AvFN0q0pqHR_iG-b1642Y9ZdX-P_x+_uxWqYAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 25 September 2014 15:26, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> I expected this to still trigger an error due to the first policy. Am
>> I to infer from this that the policy model is permissive rather than
>> restrictive?
>
> That's correct and I believe pretty clear in the documentation- policies
> are OR'd together, just the same as how roles are handled. As a
> logged-in user, you have the rights of all of the roles you are a member
> of (subject to inheiritance rules, of course), and similairly, you are
> able to view and add all rows which match any policy which applies to
> you (either through role membership or through different policies).

Okay, I see now. This is a mindset issue for me as I'm looking at
them like constraints rather than permissions. Thanks for the
explanation.

Thom