Re: INSERT ... ON CONFLICT UPDATE and RLS

Lists: pgsql-hackers
From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>
Subject: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-05 17:54:02
Message-ID: CAM3SWZRFwmmk4ZctRb5V_V4nqgZ2wZ7B=HJeArjoDPXv_ModXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

The patch that implements INSERT ... ON CONFLICT UPDATE has support
and tests for per-column privileges (which are not relevant to the
IGNORE variant, AFAICT). However, RLS support is another thing
entirely. It has not been properly thought out, and unlike per-column
privileges requires careful consideration, as the correct behavior
isn't obvious.

I've documented the current problems with RLS here:

https://wiki.postgresql.org/wiki/UPSERT#RLS

It's not clear whether or not the auxiliary UPDATE within an INSERT...
ON CONFLICT UPDATE statement should have security quals appended.
Stephen seemed to think that that might not be the best solution [1].
I am not sure. I'd like to learn what other people think.

What is the best way of integrating RLS with ON CONFLICT UPDATE? What
behavior is most consistent with the guarantees of RLS? In particular,
should the implementation append security quals to the auxiliary
UPDATE, or fail sooner?

[1] http://www.postgresql.org/message-id/20141121205926.GK28859@tamriel.snowman.net
--
Peter Geoghegan


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-06 17:39:12
Message-ID: CA+TgmoZSZxpies3UJAYKZknacKoEB9Z_Xs88CsnH5i650V368A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 5, 2015 at 12:54 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> The patch that implements INSERT ... ON CONFLICT UPDATE has support
> and tests for per-column privileges (which are not relevant to the
> IGNORE variant, AFAICT). However, RLS support is another thing
> entirely. It has not been properly thought out, and unlike per-column
> privileges requires careful consideration, as the correct behavior
> isn't obvious.
>
> I've documented the current problems with RLS here:
>
> https://wiki.postgresql.org/wiki/UPSERT#RLS
>
> It's not clear whether or not the auxiliary UPDATE within an INSERT...
> ON CONFLICT UPDATE statement should have security quals appended.
> Stephen seemed to think that that might not be the best solution [1].
> I am not sure. I'd like to learn what other people think.
>
> What is the best way of integrating RLS with ON CONFLICT UPDATE? What
> behavior is most consistent with the guarantees of RLS? In particular,
> should the implementation append security quals to the auxiliary
> UPDATE, or fail sooner?

I think the INSERT .. ON CONFLICT UPDATE shouldn't be able to attempt
an update unless the UPDATE policies of the table are such that a
regular UPDATE would find the affected row. The post-image of the row
needs to satisfy any UPDATE CHECK OPTION. If the INSERT fails due to
a conflict with an unseen row, and the UPDATE can't find that row
either due to RLS, then it should probably error out; the alternative
is to silently do nothing, but that feels wrong.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-06 19:34:54
Message-ID: CAM3SWZTUcyq6-Pf8PeKWzXiLjLDD0Cye1F-M7hAJbqSp3SZ=Cg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 6, 2015 at 9:39 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think the INSERT .. ON CONFLICT UPDATE shouldn't be able to attempt
> an update unless the UPDATE policies of the table are such that a
> regular UPDATE would find the affected row. The post-image of the row
> needs to satisfy any UPDATE CHECK OPTION. If the INSERT fails due to
> a conflict with an unseen row, and the UPDATE can't find that row
> either due to RLS, then it should probably error out; the alternative
> is to silently do nothing, but that feels wrong.

I can certainly see the argument for that behavior. I'm inclined to
agree that this is better.

With th existing implementation, UPDATE check options are effective at
preventing updates. However, the implementation is not effective at
preventing row locking from finding a row that the user would not
otherwise be able to find (with a conventional UPDATE). So I guess
what I have to do now is figure out a way of also having the "... FOR
UPDATE .... USING ( )" qual be enforced after row locking in respect
of the row locked (locked, but not yet used to generate a post-image)
in the target table. If it isn't satisfied, throw an error that
doesn't leak anything about the target row, rather than silently not
affecting the row. So, as you say, a divergence from what regular RLS
UPDATEs do that happens to make more sense here.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-06 20:44:38
Message-ID: CAM3SWZReVFYQkUDTELwkJTYjZM_GmH6AnQnLiowUeUhADTbBSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Another issue that I see is that there is only one resultRelation
between the INSERT and UPDATE. That means that without some additional
special measure, both UPDATE and INSERT "WITH CHECK ( ... ) " options
are applied regardless of whether the INSERT path was taken, or the
UPDATE path. Maybe that's actually sensible (note that even this
doesn't happen right now, since the auxiliary UPDATE is currently
minimally processed by the rewriter). What do people think about that?
It seems like it might be less surprising to have that fail earlier
when there are separate policies for INSERT and UPDATE -- for UPSERT,
all policies are applied regardless of which path is taken.

The task of making the security qual enforced like a check option
before we go to update a locked row (at the start of the UPDATE path,
for any UPDATE policy with a USING security qual) looks complicated.
I'd appreciate a little help on the implementation details.

Apparently Oracle does not offer "fine-grained access control" with
MERGE, which I believe means they just don't support this kind of
thing at all. Obviously I'd rather avoid that here, but the correct
semantics are not obvious. ON CONFLICT UPDATE could almost justify
making CREATE POLICY FOR INSERT accept a USING expression, since
that's really where the row to UPDATE comes from.
--
Peter Geoghegan


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-07 09:04:24
Message-ID: CAEZATCXohHQhvU4x=V5WpM30_fSTMiViUCb-F4hTp47CgnYNog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6 January 2015 at 20:44, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Another issue that I see is that there is only one resultRelation
> between the INSERT and UPDATE. That means that without some additional
> special measure, both UPDATE and INSERT "WITH CHECK ( ... ) " options
> are applied regardless of whether the INSERT path was taken, or the
> UPDATE path. Maybe that's actually sensible (note that even this
> doesn't happen right now, since the auxiliary UPDATE is currently
> minimally processed by the rewriter). What do people think about that?
> It seems like it might be less surprising to have that fail earlier
> when there are separate policies for INSERT and UPDATE -- for UPSERT,
> all policies are applied regardless of which path is taken.
>

I think the policies applied should depend on the path taken, so if it
does an INSERT, then only the INSERT CHECK policy should be applied
(after the insert), but if it ends up doing an UPDATE, I would expect
the UPDATE USING policy to be applied (before the update) and the
UPDATE CHECK policy to be applied (after the update). I would not
expect the INSERT CHECK policy to be applied on the UPDATE path.

As to whether the UPDATE USING policy should cause an error to be
thrown if it is not satisfied, my inclination would be to not error,
and use the command tag to report that no rows were updated, since
that is what would happen with a regular UPDATE.

So overall INSERT .. ON CONFLICT UPDATE would be consistent with
either an INSERT or an UPDATE, depending on whether the row existed
beforehand, which is easier to explain than having some special UPSERT
semantics.

Regards,
Dean


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-07 16:01:52
Message-ID: CA+TgmoZrPMp=M10pEybzMuqXni9BH6Rbb=iGkd+T6bZt0phmzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> I think the policies applied should depend on the path taken, so if it
> does an INSERT, then only the INSERT CHECK policy should be applied
> (after the insert), but if it ends up doing an UPDATE, I would expect
> the UPDATE USING policy to be applied (before the update) and the
> UPDATE CHECK policy to be applied (after the update). I would not
> expect the INSERT CHECK policy to be applied on the UPDATE path.

I agree.

> As to whether the UPDATE USING policy should cause an error to be
> thrown if it is not satisfied, my inclination would be to not error,
> and use the command tag to report that no rows were updated, since
> that is what would happen with a regular UPDATE.

My inclination would be to error, but I'd be OK with your proposal.

> So overall INSERT .. ON CONFLICT UPDATE would be consistent with
> either an INSERT or an UPDATE, depending on whether the row existed
> beforehand, which is easier to explain than having some special UPSERT
> semantics.

Yeah. We won't escape the question so easily where triggers are
concerned, but at least for RLS it seems like it should be possible to
avoid confusing, one-off semantics.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-07 19:42:23
Message-ID: 20150107194223.GX3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 6 January 2015 at 20:44, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> > Another issue that I see is that there is only one resultRelation
> > between the INSERT and UPDATE. That means that without some additional
> > special measure, both UPDATE and INSERT "WITH CHECK ( ... ) " options
> > are applied regardless of whether the INSERT path was taken, or the
> > UPDATE path. Maybe that's actually sensible (note that even this
> > doesn't happen right now, since the auxiliary UPDATE is currently
> > minimally processed by the rewriter). What do people think about that?
> > It seems like it might be less surprising to have that fail earlier
> > when there are separate policies for INSERT and UPDATE -- for UPSERT,
> > all policies are applied regardless of which path is taken.
>
> I think the policies applied should depend on the path taken, so if it
> does an INSERT, then only the INSERT CHECK policy should be applied
> (after the insert), but if it ends up doing an UPDATE, I would expect
> the UPDATE USING policy to be applied (before the update) and the
> UPDATE CHECK policy to be applied (after the update). I would not
> expect the INSERT CHECK policy to be applied on the UPDATE path.

You're making a distinction where I'm not convinced there should be one.
While I understand that, internally, there are two paths (INSERT vs.
UPDATE), it's still only one command for the user and their goal is
simply "update if this exists, insert if it doesn't." I also don't like
the idea that the policy to be applied depends on if it ends up being an
INSERT or an UPDATE. I liked Peter's suggestion that the row must pass
both WITH CHECK conditions- at least that's consistent and
understandable.

> As to whether the UPDATE USING policy should cause an error to be
> thrown if it is not satisfied, my inclination would be to not error,
> and use the command tag to report that no rows were updated, since
> that is what would happen with a regular UPDATE.

Yes, for a regular UPDATE that's what would happen- but this isn't a
regular UPDATE, it's an UPSERT. Consider how users handle this now-
they have routines which continually try to do one or the other until
either the INSERT or the UPDATE is successful. I've never seen such a
function, properly written, that says "try to INSERT, if that fails, try
to UPDATE and if that doesn't update anything, then simply exit." What
is the use-case for that?

> So overall INSERT .. ON CONFLICT UPDATE would be consistent with
> either an INSERT or an UPDATE, depending on whether the row existed
> beforehand, which is easier to explain than having some special UPSERT
> semantics.

Any semantics which result in a no-op would be surprising for users of
UPSERT. I like the idea of failing earlier- if you try to INSERT .. ON
CONFLICT UPDATE a row which you're not allowed to INSERT, erroring
strikes me as the right result. If you try to INSERT .. ON CONFLICT
UPDATE a row which you're not allowed to UPDATE, I'd also think an error
would be the right answer, even if you don't end up doing an actual
UPDATE- you ran a command asking for an UPDATE to happen under a certain
condition (the row already exists) and the policy states that you're not
allowed to do such an UPDATE.

As for the UPDATE's USING clause, if you're not allowed to view the
records to be updated (which evidently exists because the INSERT
failed), I'd handle that the same way we handle the case that a SELECT
policy might not allow a user to see rows that they can attempt to
INSERT- they'll get an error back in that case too.

In the end, these are edge cases as policies are generally self
consistent and so I think we have some leeway, but I don't think we have
the right to turn an INSERT .. ON CONFLICT UPDATE into a no-op. That
would be like allowing an INSERT to be a no-op.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-07 20:01:20
Message-ID: 20150107200120.GY3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> > I think the policies applied should depend on the path taken, so if it
> > does an INSERT, then only the INSERT CHECK policy should be applied
> > (after the insert), but if it ends up doing an UPDATE, I would expect
> > the UPDATE USING policy to be applied (before the update) and the
> > UPDATE CHECK policy to be applied (after the update). I would not
> > expect the INSERT CHECK policy to be applied on the UPDATE path.
>
> I agree.

I can certainly understand the appeal of this approach, but I don't
think it makes sense. Consider what happens later on down the road,
after the code has been written and deployed using INSERT .. ON CONFLICT
UPDATE where 99% of the time only one path or the other is taken. Then
the other path is taken and suddenly the exact same command and row ends
up returning errors. Additional testing should have been done to check
if that happens, of course, but I really don't like the idea that the
exact same command, with the exact same policies, would succeed or fail,
due to policies, based on the data in the database.

That's not to say that, generally, speaking, the data in the database
shouldn't impact policy failures, but in this case, the policies might
not even reference any other tables in the database.

> > As to whether the UPDATE USING policy should cause an error to be
> > thrown if it is not satisfied, my inclination would be to not error,
> > and use the command tag to report that no rows were updated, since
> > that is what would happen with a regular UPDATE.
>
> My inclination would be to error, but I'd be OK with your proposal.

I'm pretty strongly in favor of erroring. :)

> > So overall INSERT .. ON CONFLICT UPDATE would be consistent with
> > either an INSERT or an UPDATE, depending on whether the row existed
> > beforehand, which is easier to explain than having some special UPSERT
> > semantics.
>
> Yeah. We won't escape the question so easily where triggers are
> concerned, but at least for RLS it seems like it should be possible to
> avoid confusing, one-off semantics.

Triggers are another question..

One approach that I don't recall seeing (and I'm not convinced that it's
a good idea either, but thought it deserved mention) is the idea of
having an UPSERT-specific set of policies (and perhaps an
UPSERT-specific trigger...?). That gets pretty grotty when you consider
existing systems which have INSERT and UPDATE triggers already, but
UPSERT is a pretty big deal and a big change and for users who are just
starting to use it, requiring that they write triggers and policies
specifically to address that case might be acceptable. That doesn't
address the cases where users have direct SQL access and might be able
to then bypass existing triggers, but we might be able to work out an
answer to that case..

Other databases have this capability and have triggers and at least one
ends up firing both INSERT and UPDATE triggers, with many complaints
from users about how that ends up making the performance suck. Perhaps
we could use that as a fallback but support the explicit single trigger
option too.. Just some thoughts, apologies if it's already been
convered in depth previously.

Thanks!

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-07 20:17:06
Message-ID: CA+TgmoaNMqVE+jwHzWjOD2fuo88ouHStZTeRr=ygKwPC=p5Y6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 7, 2015 at 3:01 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> > I think the policies applied should depend on the path taken, so if it
>> > does an INSERT, then only the INSERT CHECK policy should be applied
>> > (after the insert), but if it ends up doing an UPDATE, I would expect
>> > the UPDATE USING policy to be applied (before the update) and the
>> > UPDATE CHECK policy to be applied (after the update). I would not
>> > expect the INSERT CHECK policy to be applied on the UPDATE path.
>>
>> I agree.
>
> I can certainly understand the appeal of this approach, but I don't
> think it makes sense. Consider what happens later on down the road,
> after the code has been written and deployed using INSERT .. ON CONFLICT
> UPDATE where 99% of the time only one path or the other is taken. Then
> the other path is taken and suddenly the exact same command and row ends
> up returning errors.

I'd say: that's life. If you don't test your policies, they might not work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-07 20:26:22
Message-ID: 20150107202622.GC3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Jan 7, 2015 at 3:01 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> >> On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> >> > I think the policies applied should depend on the path taken, so if it
> >> > does an INSERT, then only the INSERT CHECK policy should be applied
> >> > (after the insert), but if it ends up doing an UPDATE, I would expect
> >> > the UPDATE USING policy to be applied (before the update) and the
> >> > UPDATE CHECK policy to be applied (after the update). I would not
> >> > expect the INSERT CHECK policy to be applied on the UPDATE path.
> >>
> >> I agree.
> >
> > I can certainly understand the appeal of this approach, but I don't
> > think it makes sense. Consider what happens later on down the road,
> > after the code has been written and deployed using INSERT .. ON CONFLICT
> > UPDATE where 99% of the time only one path or the other is taken. Then
> > the other path is taken and suddenly the exact same command and row ends
> > up returning errors.
>
> I'd say: that's life. If you don't test your policies, they might not work.

I agree with that up to a point- this case feels more like a POLA
violation though rather than a run-of-the-mill "you need to test all
that." I'm a bit worried this might lead to cases where users can't use
UPSERT because they don't know which set of policies will end up getting
applied, although the above approach is strictly a subset of the
approach I'm advocating, but at least with the 'throw it all together'
case, you know exactly what you're getting with UPSERT.

Thanks,

Stephen


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-07 23:14:03
Message-ID: CAM3SWZRzgH3u5QDHNGtwSJ8VAj-g4a2LSYyYdigMModEwLbigQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 7, 2015 at 12:01 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Other databases have this capability and have triggers and at least one
> ends up firing both INSERT and UPDATE triggers, with many complaints
> from users about how that ends up making the performance suck. Perhaps
> we could use that as a fallback but support the explicit single trigger
> option too.. Just some thoughts, apologies if it's already been
> convered in depth previously.

I would like to expose whether or not statement-level triggers are
being called in the context of an INSERT ... ON CONFLICT UPDATE, FWIW.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-07 23:34:08
Message-ID: CAM3SWZSkHtQtioyzQV4JQBq==ZhXoM9NbM+nCZYEyb3S6-cNjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 7, 2015 at 12:26 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> I agree with that up to a point- this case feels more like a POLA
> violation though rather than a run-of-the-mill "you need to test all
> that." I'm a bit worried this might lead to cases where users can't use
> UPSERT because they don't know which set of policies will end up getting
> applied, although the above approach is strictly a subset of the
> approach I'm advocating, but at least with the 'throw it all together'
> case, you know exactly what you're getting with UPSERT.

I think that it makes sense to bunch the policies together, because
INSERT ... ON CONFLICT UPDATE is mostly an INSERT - informally
speaking, the "UPDATE part" is just enough to resolve a conflict -
this is ultimately just a new clause of INSERT. Besides, I think that
making the permissions as restrictive as possible is the least
surprising behavior. It's clearly the most secure behavior.

I think that we collectively lean towards actively throwing an error
as the preferred behavior - I think that's good, because the intent of
the user to UPDATE some particular tuple that they're not allowed to
UPDATE is clear and unambiguous. As long as we're doing that, clearly
writing a query that will fail based on the "wrong" path being taken
is wrong-headed. Now, you can construct an UPSERT case that a "bunch
together policies" approach doesn't serve well. I could be mistaken,
but I think that case is likely to be more than a little contrived.

We're now going to be testing the TARGET.* tuple before going to
update, having failed to insert (and after row locking the TARGET.*
tuple) - the target tuple is really how the update finds the existing
row to update, so it should be tested, without necessarily being
blamed directly (certainly, we cannot leak the TARGET.* tuple values,
so maybe we should try blaming the EXCLUDED.* tuple first for error
messaging reasons). If it fails because of the insert check after row
locking but before update, well, it was liable to anyway, when the
insert path was taken. Conversely, if it failed after actually
inserting where that would not happen with a vanilla INSERT (due to a
bunched-together update "USING()" security barrier qual or explicit
user-supplied CHECK OPTION), well, you'd have gotten the same thing if
you took the UPDATE path anyway.

That just leaves the regular ExecUpdate() call of
ExecWithCheckOptions(). When that is called with "bunched together"
policies, you're now testing the final tuple. So it might be a bit
confusing that the policy of final tuples originating from INSERTs is
also enforced here, for the final tuple of an "UPDATE". But in order
for even that to be the case, you'd have to have a final tuple that
was substantially different from the one originally proposed for
insertion that made the difference between a qual passing and not
passing. How likely is that in realistic use cases? And besides, isn't
the policy for INSERTs still quite relevant even here? We're talking
about a final tuple that originated from an INSERT statement, after
all.

--
Peter Geoghegan


From: David Fetter <david(at)fetter(dot)org>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-08 01:16:31
Message-ID: 20150108011631.GA13548@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 07, 2015 at 03:01:20PM -0500, Stephen Frost wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> > > I think the policies applied should depend on the path taken, so if it
> > > does an INSERT, then only the INSERT CHECK policy should be applied
> > > (after the insert), but if it ends up doing an UPDATE, I would expect
> > > the UPDATE USING policy to be applied (before the update) and the
> > > UPDATE CHECK policy to be applied (after the update). I would not
> > > expect the INSERT CHECK policy to be applied on the UPDATE path.
> >
> > I agree.
>
> I can certainly understand the appeal of this approach, but I don't
> think it makes sense. Consider what happens later on down the road,
> after the code has been written and deployed using INSERT .. ON CONFLICT
> UPDATE where 99% of the time only one path or the other is taken. Then
> the other path is taken and suddenly the exact same command and row ends
> up returning errors. Additional testing should have been done to check
> if that happens, of course, but I really don't like the idea that the
> exact same command, with the exact same policies, would succeed or fail,
> due to policies, based on the data in the database.

There's precedent. Unique constraints, for example.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-08 01:18:58
Message-ID: CAM3SWZRzht4WM4fjHq23C9C+-WDGJk+-66NX47f33DhttJZnNA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 7, 2015 at 5:16 PM, David Fetter <david(at)fetter(dot)org> wrote:
> There's precedent. Unique constraints, for example.

I don't see that as any kind of precedent.

--
Peter Geoghegan


From: David Fetter <david(at)fetter(dot)org>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-08 01:33:43
Message-ID: 20150108013343.GA14183@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 07, 2015 at 05:18:58PM -0800, Peter Geoghegan wrote:
> On Wed, Jan 7, 2015 at 5:16 PM, David Fetter <david(at)fetter(dot)org> wrote:
> > There's precedent. Unique constraints, for example.
>
> I don't see that as any kind of precedent.

In the part you sliced off, Stephen described a situation where the
contents of a database either do or don't cause a query to violate a
constraint. Uniqueness constraints are one example of this.

CREATE TABLE foo(id INTEGER PRIMARY KEY);

INSERT INTO foo(id) VALUES(1); /* Works the first time */
INSERT INTO foo(id) VALUES(1); /* Fails the next time */

Same database, same constraints, different outcome.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-08 01:55:18
Message-ID: CAM3SWZTZV+okAm_23A5u0CLdYSD+CFEqzmUoWdoN_ggHQjGS3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 7, 2015 at 5:33 PM, David Fetter <david(at)fetter(dot)org> wrote:
> Same database, same constraints, different outcome.

You're missing the point, I think. UPSERT means that the user doesn't
care about whether or not a given tuple proposed for insertion was
handled with an insert or an update. If you have distinct INSERT and
UPDATE policies (not that I think that many people will), and if they
differ in such a way as to make the difference between some actual
INSERT ... ON CONFLICT UPDATE in the app throwing an error or not
throwing an error depending only on whether the UPDATE (or INSERT)
path was taken, then that combination (the combination of that UPSERT
command, that UPDATE policy and that INSERT policy) is wrong-headed.
It's not based on some particular set of data in the database, but any
and all possible sets. Stephen's objection isn't that the update
operation throws an error; it's that the insert doesn't. I'm with
Stephen; ISTM that selectively enforcing RLS like that is a foot gun.
For column level privileges, you wouldn't expect to only get an error
about not having the relevant update permissions at runtime, when the
update path happens to be taken. And so it is for RLS.

--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 00:49:24
Message-ID: 20150109004924.GK3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter,

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> For column level privileges, you wouldn't expect to only get an error
> about not having the relevant update permissions at runtime, when the
> update path happens to be taken. And so it is for RLS.

Right, that's the precedent we should be considering. Column-level
privileges is a great example- you need both insert and update
privileges for the columns involved for the command to succeed. It
shouldn't depend on which path actually ends up being taken.

Thanks!

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 08:19:10
Message-ID: CAEZATCVPvEbZG6L_Le4+m2t0XWjLJYnEE_35rVE+vK71NPoMTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 9 January 2015 at 00:49, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Peter,
>
> * Peter Geoghegan (pg(at)heroku(dot)com) wrote:
>> For column level privileges, you wouldn't expect to only get an error
>> about not having the relevant update permissions at runtime, when the
>> update path happens to be taken. And so it is for RLS.
>
> Right, that's the precedent we should be considering. Column-level
> privileges is a great example- you need both insert and update
> privileges for the columns involved for the command to succeed. It
> shouldn't depend on which path actually ends up being taken.
>

It's not the same as column-level privileges though, because RLS
policies depend on the new/existing data, not just the table metadata.
So for example an "UPDATE ... WHERE false" can fail column-level
privilege checks even though it isn't updating anything, but it cannot
fail a RLS policy check.

I was trying to think up an example where you might actually have
different INSERT and UPDATE policies, and the best I can think of is
some sort of mod_count column where you have an INSERT CHECK
(mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
checking both policies would make an UPSERT impossible, whereas if you
think of it as doing either an INSERT or an UPDATE, as the syntax
suggests, it becomes possible.

(I'm assuming here from your description that you intend for the
policy expressions to be AND'ed together, so it might end up being an
AND of 2 OR expressions.)

Regards,
Dean


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 08:49:00
Message-ID: CAM3SWZQzoK9vuEjgJCMDhKmgj0F9f-kkxAmEhPS8wOONGgGS6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> I was trying to think up an example where you might actually have
> different INSERT and UPDATE policies, and the best I can think of is
> some sort of mod_count column where you have an INSERT CHECK
> (mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
> checking both policies would make an UPSERT impossible, whereas if you
> think of it as doing either an INSERT or an UPDATE, as the syntax
> suggests, it becomes possible.

Why does this user want to do this upsert? If they're upserting, then
the inserted row could only reasonably have a value of (mod_count =
0). If updating, then they must have a constant value for the update
path (a value that's greater than 0, naturally - say 2), which doesn't
make any sense in the context of an upsert's auxiliary update - what
happened to the 0 value? Sorry, but I don't think your example makes
sense - I can't see what would motivate anyone to write a query like
that with those RLS policies in place. It sounds like you're talking
about an insert and a separate update that may or may not affect the
same row, and not an upsert. Then those policies make sense, but in
practice they render the upsert you describe contradictory.

FWIW, I'm not suggesting that there couldn't possibly be a use case
that doesn't do well with this convention where we enforce RLS
deepening on the path taken. The cases are just very marginal, as I
think your difficulty in coming up with a convincing counter-argument
shows. I happen to think what Stephen and I favor ("bunching together"
USING() barrier quals and check options from INSERT and UPDATE
policies) is likely to be the best alternative available on balance.

More generally, you could point out that I'm actually testing
different tuples at different points in query processing under that
regime (e.g. the post-insert tuple, or the before-update conflicting,
existing tuple from the target, or the post update tuple) and so
things could fail when the update path is taken despite the fact that
they didn't fail when the insert path was taken. That's technically
true, of course, but with idiomatic usage it isn't true, and that's
what I care about.

Does anyone have another counter-example of a practical upsert
statement that cannot be used with certain RLS policies due to the
fact that we chose to "bunch together" INSERT and UPDATE RLS policies?
--
Peter Geoghegan


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 10:22:03
Message-ID: CAEZATCWo6iN+XLdruuE=XgJV0pS_LtrMudnhZeRM0sSewTND-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 9 January 2015 at 08:49, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> I was trying to think up an example where you might actually have
>> different INSERT and UPDATE policies, and the best I can think of is
>> some sort of mod_count column where you have an INSERT CHECK
>> (mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
>> checking both policies would make an UPSERT impossible, whereas if you
>> think of it as doing either an INSERT or an UPDATE, as the syntax
>> suggests, it becomes possible.
>
> Why does this user want to do this upsert? If they're upserting, then
> the inserted row could only reasonably have a value of (mod_count =
> 0). If updating, then they must have a constant value for the update
> path (a value that's greater than 0, naturally - say 2), which doesn't
> make any sense in the context of an upsert's auxiliary update - what
> happened to the 0 value? Sorry, but I don't think your example makes
> sense - I can't see what would motivate anyone to write a query like
> that with those RLS policies in place. It sounds like you're talking
> about an insert and a separate update that may or may not affect the
> same row, and not an upsert. Then those policies make sense, but in
> practice they render the upsert you describe contradictory.
>

Whoa, hang on. I think you're being a bit quick to dismiss that
example. Why shouldn't I want an upsert where the majority of the
table columns follow the usual "make it so" pattern of an upsert, but
there is also this kind of audit column to be maintained? Then I would
write something like

INSERT INTO tbl (<some values>, 0)
ON CONFLICT UPDATE SET <same values>, mod_count=mod_count+1;

The root of the problem is the way that you're proposing to combine
the RLS policies (using AND), which runs contrary to the way RLS
policies are usually combined (using OR), which is why this kind of
example fails -- RLS policies in general aren't intended to all be
true simultaneously.

Regards,
Dean


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 20:26:07
Message-ID: 20150109202607.GE3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean, Peter,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 9 January 2015 at 08:49, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> > On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> >> I was trying to think up an example where you might actually have
> >> different INSERT and UPDATE policies, and the best I can think of is
> >> some sort of mod_count column where you have an INSERT CHECK
> >> (mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
> >> checking both policies would make an UPSERT impossible, whereas if you
> >> think of it as doing either an INSERT or an UPDATE, as the syntax
> >> suggests, it becomes possible.
> >
> > Why does this user want to do this upsert? If they're upserting, then
> > the inserted row could only reasonably have a value of (mod_count =
> > 0). If updating, then they must have a constant value for the update
> > path (a value that's greater than 0, naturally - say 2), which doesn't
> > make any sense in the context of an upsert's auxiliary update - what
> > happened to the 0 value? Sorry, but I don't think your example makes
> > sense - I can't see what would motivate anyone to write a query like
> > that with those RLS policies in place. It sounds like you're talking
> > about an insert and a separate update that may or may not affect the
> > same row, and not an upsert. Then those policies make sense, but in
> > practice they render the upsert you describe contradictory.
> >
>
> Whoa, hang on. I think you're being a bit quick to dismiss that
> example. Why shouldn't I want an upsert where the majority of the
> table columns follow the usual "make it so" pattern of an upsert, but
> there is also this kind of audit column to be maintained? Then I would
> write something like
>
> INSERT INTO tbl (<some values>, 0)
> ON CONFLICT UPDATE SET <same values>, mod_count=mod_count+1;

That's a good point- it doesn't make sense to compare the INSERT values
against the UPDATE policy as the result of the UPDATE could be a
completely different tuple, one we can't know the contents of until we
actually find a conflicting tuple. We can't simply combine the policies
and run them at the same time, but I'm not sure that then means we
should let an INSERT .. ON CONFLICT UPDATE pass in values to the INSERT
which are not permitted by any INSERT policy on the relation.

Further, forgetting about RLS, what happens if a user doesn't have
INSERT rights on the mod_count column at all? We still allow the above
to execute if the row already exists? That doesn't strike me as a good
idea. In this case, I do think it makes sense to consider RLS and our
regular permissions system as they are both 'OR' cases. I might have
UPDATE rights for a particular column through multiple different roles
for a given relation, but if none of them give me INSERT rights for that
column, then I'd expect to get an error if I try to do an INSERT into
that column.

> The root of the problem is the way that you're proposing to combine
> the RLS policies (using AND), which runs contrary to the way RLS
> policies are usually combined (using OR), which is why this kind of
> example fails -- RLS policies in general aren't intended to all be
> true simultaneously.

I agree that RLS policies, in general, are not intended to all be true
simultaneously. I don't think it then follows that an INSERT .. ON
CONFLICT UPDATE should be allowed if, for example, there are no INSERT
policies on an RLS-enabled relation, because the call happens to end up
doing an UPDATE.

To try and outline another possible use-case, consider that you have
tuples coming in from two independent sets of users, orange and blue.
All users are allowed to INSERT, but the 'color' column must equal the
user's. For UPDATEs, the blue users can see both orange and blue
records while orange users can only see orange records. Rows resulting
from an UPDATE can be orange or blue for blue users, but must be orange
for orange users.

Further, in this environment, attempts by orange users to insert blue
records are flagged to be audited for review because orange users should
never have access to nor be handling blue data. With the approach
you're outlining, a command like:

INSERT INTO data VALUES (... , 'blue')
ON CONFLICT UPDATE set ... = ...;

run by an orange user wouldn't error if it happened to result in an
UPDATE (presuming, of course, that the UPDATE didn't try to change the
existing color), but from a security perspective, it's clearly an error
for an orange user to be attempting to insert blue data.

I don't see this as impacting the more general case of how RLS policies
are applied. An individual blue user might also be a member of some
orange-related role and it wouldn't make any sense for that user to then
be only able to see orange records (were we to AND RLS policies
together).

Where this leaves me, at least, is feeling like we should always apply
the INSERT WITH CHECK policy, then if there is a conflict, check the
UPDATE USING policy and throw an error if the row isn't visible but
otherwise perform the UPDATE and then check the UPDATE WITH CHECK
policy. I see your point that this runs counter to the 'mod_count'
example use-case and could cause problems for users using RLS with such
a strategy. For my part, I expect users of RLS to expect errors in such
a case rather than allowing it, but it's certainly a judgement call.

The only reasonable way that I can see to support both sides would be to
allow UPSERT to be a top-level policy definition in its own right and
let users specify exactly what is allowed in the UPSERT case (possibly
requiring three different expressions to represent the initial INSERT,
what the UPDATE can see, and what the UPDATE results in..). I tend to
doubt it would be worth it unless we end up supporting UPSERT-specific
triggers and permissions..

Thanks,

Stephen


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 20:43:57
Message-ID: CAM3SWZQt-GaTRMbmrC5Xxx+QHzePhRtA1_kuh4d7beFRtaQUVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 9, 2015 at 2:22 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> Whoa, hang on. I think you're being a bit quick to dismiss that
> example. Why shouldn't I want an upsert where the majority of the
> table columns follow the usual "make it so" pattern of an upsert, but
> there is also this kind of audit column to be maintained? Then I would
> write something like
>
> INSERT INTO tbl (<some values>, 0)
> ON CONFLICT UPDATE SET <same values>, mod_count=mod_count+1;
>
> The root of the problem is the way that you're proposing to combine
> the RLS policies (using AND), which runs contrary to the way RLS
> policies are usually combined (using OR), which is why this kind of
> example fails -- RLS policies in general aren't intended to all be
> true simultaneously.

In case I wasn't clear, I'm proposing that we AND together the already
OR'd together UPDATE and INSERT quals.

--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 20:53:18
Message-ID: 20150109205318.GG3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter,

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> On Fri, Jan 9, 2015 at 2:22 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> > Whoa, hang on. I think you're being a bit quick to dismiss that
> > example. Why shouldn't I want an upsert where the majority of the
> > table columns follow the usual "make it so" pattern of an upsert, but
> > there is also this kind of audit column to be maintained? Then I would
> > write something like
> >
> > INSERT INTO tbl (<some values>, 0)
> > ON CONFLICT UPDATE SET <same values>, mod_count=mod_count+1;
> >
> > The root of the problem is the way that you're proposing to combine
> > the RLS policies (using AND), which runs contrary to the way RLS
> > policies are usually combined (using OR), which is why this kind of
> > example fails -- RLS policies in general aren't intended to all be
> > true simultaneously.
>
> In case I wasn't clear, I'm proposing that we AND together the already
> OR'd together UPDATE and INSERT quals.

I'm not sure how that would work exactly though, since the tuple the
UPDATE results in might be different from what the INSERT has, as Dean
pointed out. The INSERT tuple might even pass the UPDATE policy where
the resulting tuple from the actual UPDATE part doesn't.

Thanks!

Stephen


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 20:53:22
Message-ID: CAM3SWZS88OhkJ-auub7=CYga78A6X9CUPn+=qaFxXgJRrKUn5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 9, 2015 at 12:26 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Where this leaves me, at least, is feeling like we should always apply
> the INSERT WITH CHECK policy, then if there is a conflict, check the
> UPDATE USING policy and throw an error if the row isn't visible but
> otherwise perform the UPDATE and then check the UPDATE WITH CHECK
> policy. I see your point that this runs counter to the 'mod_count'
> example use-case and could cause problems for users using RLS with such
> a strategy. For my part, I expect users of RLS to expect errors in such
> a case rather than allowing it, but it's certainly a judgement call.

I mostly agree, but I don't know that I fully agree. Specifically, I
also think we should check the update policy even when we only insert,
on the theory that if we did go to update, the would-be inserted row
would be a proxy for what we'd check then (the existing, target
relation's tuple). What do you think of that?

I certainly agree that the correct behavior here is at least a bit
subjective. We cannot exactly generalize from other areas of the code,
nor can we look for a precedent set by other systems (AFAICT there is
none).

> The only reasonable way that I can see to support both sides would be to
> allow UPSERT to be a top-level policy definition in its own right and
> let users specify exactly what is allowed in the UPSERT case (possibly
> requiring three different expressions to represent the initial INSERT,
> what the UPDATE can see, and what the UPDATE results in..). I tend to
> doubt it would be worth it unless we end up supporting UPSERT-specific
> triggers and permissions..

Agreed. That would technically make everyone happy, in that it defers
to the user, but is unlikely to be worth it.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 21:02:41
Message-ID: CAM3SWZTwXSOLzyMot0chABoCktFgZR676D93LXWyasC4_+oWuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 9, 2015 at 12:53 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> I'm not sure how that would work exactly though, since the tuple the
> UPDATE results in might be different from what the INSERT has, as Dean
> pointed out. The INSERT tuple might even pass the UPDATE policy where
> the resulting tuple from the actual UPDATE part doesn't.

I'm not suggesting that Dean's example is totally without merit (his
revised explanation made it clearer what he meant). I just think, on
balance, that enforcing all INSERT + UPDATE policies at various points
is the best behavior. Part of the reason for that is that if you ask
people how they think it works, you'll get as many answers as the
number of people you ask. My proposal (which may or may not be the
same as yours, but if not is only slightly more restrictive) is
unlikely to affect many users negatively, and is relatively easy to
explain and reason about. There are workarounds for Dean's case, too.

--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 21:09:20
Message-ID: 20150109210919.GI3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> On Fri, Jan 9, 2015 at 12:26 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Where this leaves me, at least, is feeling like we should always apply
> > the INSERT WITH CHECK policy, then if there is a conflict, check the
> > UPDATE USING policy and throw an error if the row isn't visible but
> > otherwise perform the UPDATE and then check the UPDATE WITH CHECK
> > policy. I see your point that this runs counter to the 'mod_count'
> > example use-case and could cause problems for users using RLS with such
> > a strategy. For my part, I expect users of RLS to expect errors in such
> > a case rather than allowing it, but it's certainly a judgement call.
>
> I mostly agree, but I don't know that I fully agree. Specifically, I
> also think we should check the update policy even when we only insert,
> on the theory that if we did go to update, the would-be inserted row
> would be a proxy for what we'd check then (the existing, target
> relation's tuple). What do you think of that?

To flip it around a bit, I don't think we can avoid checking the
*resulting* tuple from the UPDATE against the UPDATE policy. Therefore,
I'm not sure that I see the point in checking the INSERT tuple against
the UPDATE policy. I also don't like the idea that a completely
legitimate command (one which allows the to-be-inserted row via the
INSERT policy AND allows the tuple-post-update via the UPDATE policy) to
throw an error because the to-be-inserted row violated the UPDATE
policy. That doesn't make sense to me.

Thanks!

Stephen


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 21:25:13
Message-ID: CAM3SWZTvJCP3f0rW-JsYEOEWebUh1SFGUhgGEsjZ1vKPJPCNdA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 9, 2015 at 1:09 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> To flip it around a bit, I don't think we can avoid checking the
> *resulting* tuple from the UPDATE against the UPDATE policy.

We can avoid it - by not updating. What I'm suggesting is that an
enforcement occurs that is more or less equivalent to the enforcement
that occurs when the UPDATE path is taken, without it actually being
taken. I accept that that isn't perfect, but it has its advantages.

> Therefore,
> I'm not sure that I see the point in checking the INSERT tuple against
> the UPDATE policy.

I guess it wouldn't be hard to modify the struct WithCheckOption to
store the cmd (e.g. RowSecurityPolicy-style ACL_*_CHR permissions),
usable only in this context. That way, when ExecWithCheckOptions() is
called from the INSERT, we can tell it to only enforce
INSERT-applicable policy cmds (in particular, not
UPDATE-only-applicable policy cmds that happen to end up associated
with the same ResultRelation). Whereas, at the end of ExecUpdate(),
that final ExecWithCheckOptions() call (call 3 of 3 when the UPDATE
path is taken), INSERT-based policies can still be enforced against
the final tuple (as can USING() quals that would ordinarily not be
checked at that point either). A given ResultRelation's policies
wouldn't necessarily always need to be enforced within
ExecWithCheckOptions(), which is a change. Does that make sense to
you?

Note that I've already taught ExecWithCheckOptions() to not leak the
existing, target tuple when the UPDATE path is taken (the tuple that
can be referenced in the UPDATE using the TARGET.* alias), while still
performing enforcement against it. I can teach it this too.

--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-09 21:40:41
Message-ID: 20150109214041.GK3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> On Fri, Jan 9, 2015 at 1:09 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Therefore,
> > I'm not sure that I see the point in checking the INSERT tuple against
> > the UPDATE policy.
>
> I guess it wouldn't be hard to modify the struct WithCheckOption to
> store the cmd (e.g. RowSecurityPolicy-style ACL_*_CHR permissions),
> usable only in this context. That way, when ExecWithCheckOptions() is
> called from the INSERT, we can tell it to only enforce
> INSERT-applicable policy cmds (in particular, not
> UPDATE-only-applicable policy cmds that happen to end up associated
> with the same ResultRelation). Whereas, at the end of ExecUpdate(),
> that final ExecWithCheckOptions() call (call 3 of 3 when the UPDATE
> path is taken), INSERT-based policies can still be enforced against
> the final tuple (as can USING() quals that would ordinarily not be
> checked at that point either). A given ResultRelation's policies
> wouldn't necessarily always need to be enforced within
> ExecWithCheckOptions(), which is a change. Does that make sense to
> you?

That sounds reasonable on first blush. I'll note that I've not looked
deeply at the UPSERT patch, but if the above means that the INSERT
policy is always checked against the INSERT tuple and the UPDATE policy
is always checked against the tuple-resulting-from-an-UPDATE, and that
tuples which are not visible due to the UPDATE policy throw an error in
that case, then it's working as I'd expect.

This does mean that the UPDATE policy won't be checked when the INSERT
path is used but that seems appropriate to me, as we can't check a
policy against a tuple that never exists (the result of the update
applied to an existing tuple).

> Note that I've already taught ExecWithCheckOptions() to not leak the
> existing, target tuple when the UPDATE path is taken (the tuple that
> can be referenced in the UPDATE using the TARGET.* alias), while still
> performing enforcement against it. I can teach it this too.

Ok.

Thanks!

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-10 11:12:07
Message-ID: CAEZATCXXY+kF2aP=JyiKvUJ=--4Q3gVPDi2DWWVZzB2gFNN-cw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 9 January 2015 at 20:26, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Where this leaves me, at least, is feeling like we should always apply
> the INSERT WITH CHECK policy, then if there is a conflict, check the
> UPDATE USING policy and throw an error if the row isn't visible but
> otherwise perform the UPDATE and then check the UPDATE WITH CHECK
> policy.

Yeah, I can see the logic in that, but I'm starting to question the
final "then" in the above, i.e., the order in which the checks are
done.

Currently we're applying RLS CHECKs after the INSERT or UPDATE, like
WITH CHECK OPTIONs on views. The SQL spec says that WITH CHECK OPTIONs
on views have to be applied after the INSERT/UPDATE on the base
relation, but we're free to do something different for RLS CHECKs if
that makes more sense. If we want RLS to be more like column-level
privilege checking, then it does make sense to do the checks sooner,
so perhaps we should be checking the RLS policies before the
INSERT/UPDATE, like CHECK constraints.

Doing that, it makes sense to first check INSERT CHECK policies,
potentially throwing an error before any PK conflict is detected, so
an error would result even in the INSERT .. ON CONFLICT IGNORE case.
That way INSERT CHECK policies would always be applied regardless of
the path taken, as you suggest.

I would still say that the RLS UPDATE USING/CHECK policies should only
be applied if the UPDATE path is taken (before the UPDATE is applied),
and they need to be applied to the (old/new) update tuples only. It's
a bit like there is a WHERE <record already exists> clause attached to
the UPDATE. The RLS UPDATE policies should only be applied to the rows
affected by the UPDATE.

> I see your point that this runs counter to the 'mod_count'
> example use-case and could cause problems for users using RLS with such
> a strategy. For my part, I expect users of RLS to expect errors in such
> a case rather than allowing it, but it's certainly a judgement call.
>

Actually I don't think this is a problem for the mod_count example.
This isn't the same as AND'ing together the INSERT and UPDATE quals,
because each set of policies is being applied to a different tuple.
There are 3 tuples in play here:-

1). The insert tuple, which has INSERT CHECK polcies applied to it
2). The existing (old) update tuple, which has UPDATE USING policies
applied to it
3). The (new) update tuple, which has UPDATE CHECK policies applied to it

NOTE: If we do change RLS CHECKs to be executed prior to modifying any
data, that's potentially a change that could be made independently of
the UPSERT patch. We should also probably then stop referring to them
as WITH CHECK OPTIONs in the docs and in error messages because
they're not the same thing, and the code might be neater if they had
their own data-structure rather than overloading WithCheckOption.

Regards,
Dean


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-10 15:12:40
Message-ID: 20150110151240.GP3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 9 January 2015 at 20:26, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Where this leaves me, at least, is feeling like we should always apply
> > the INSERT WITH CHECK policy, then if there is a conflict, check the
> > UPDATE USING policy and throw an error if the row isn't visible but
> > otherwise perform the UPDATE and then check the UPDATE WITH CHECK
> > policy.
>
> Yeah, I can see the logic in that, but I'm starting to question the
> final "then" in the above, i.e., the order in which the checks are
> done.
>
> Currently we're applying RLS CHECKs after the INSERT or UPDATE, like
> WITH CHECK OPTIONs on views. The SQL spec says that WITH CHECK OPTIONs
> on views have to be applied after the INSERT/UPDATE on the base
> relation, but we're free to do something different for RLS CHECKs if
> that makes more sense. If we want RLS to be more like column-level
> privilege checking, then it does make sense to do the checks sooner,
> so perhaps we should be checking the RLS policies before the
> INSERT/UPDATE, like CHECK constraints.

I've been wondering about the placement of the checks also. I keep
meaning to go back to the SQL spec and review the WITH CHECK requirement
as it feels a bit odd to me, but then again, it is the spec. As you
say, we're not obligated to follow it for RLS in any case.

> Doing that, it makes sense to first check INSERT CHECK policies,
> potentially throwing an error before any PK conflict is detected, so
> an error would result even in the INSERT .. ON CONFLICT IGNORE case.
> That way INSERT CHECK policies would always be applied regardless of
> the path taken, as you suggest.

Right.

> I would still say that the RLS UPDATE USING/CHECK policies should only
> be applied if the UPDATE path is taken (before the UPDATE is applied),
> and they need to be applied to the (old/new) update tuples only. It's
> a bit like there is a WHERE <record already exists> clause attached to
> the UPDATE. The RLS UPDATE policies should only be applied to the rows
> affected by the UPDATE.

Agreed. I don't see how we could possibly do otherwise. The only tuple
we have at the outset is the to-be-INSERT'd tuple. We won't find the
conflicting tuple until we try and even then that conflicting tuple
might be different from the to-be-INSERT'd tuple, so we can't check that
until we've discovered the conflict exists. Further, we won't have the
tuple to check for the UPDATE's CHECK policy until all of the
expressions (if any) are run on the UPDATE side.

> > I see your point that this runs counter to the 'mod_count'
> > example use-case and could cause problems for users using RLS with such
> > a strategy. For my part, I expect users of RLS to expect errors in such
> > a case rather than allowing it, but it's certainly a judgement call.
>
> Actually I don't think this is a problem for the mod_count example.
> This isn't the same as AND'ing together the INSERT and UPDATE quals,
> because each set of policies is being applied to a different tuple.
> There are 3 tuples in play here:-
>
> 1). The insert tuple, which has INSERT CHECK polcies applied to it
> 2). The existing (old) update tuple, which has UPDATE USING policies
> applied to it
> 3). The (new) update tuple, which has UPDATE CHECK policies applied to it

Right.

> NOTE: If we do change RLS CHECKs to be executed prior to modifying any
> data, that's potentially a change that could be made independently of
> the UPSERT patch. We should also probably then stop referring to them
> as WITH CHECK OPTIONs in the docs and in error messages because
> they're not the same thing, and the code might be neater if they had
> their own data-structure rather than overloading WithCheckOption.

I agree that's an independent change. I'm not sure if it would end up
being neater or not offhand but I can see some advantages to breaking it
up from WithCheck as it might be simpler for users to understand
(particularly those users who are not already familiar with WithCheck).
Were you thinking about working up a patch for such a change? If not,
I'll see about finding time to do it, unless someone else wants to
volunteer. :)

Thanks!

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-10 21:38:55
Message-ID: CAEZATCWc1mx19Oy_W5mVOaEGGM_fWgZq8Pq6FvdHsiACqzN+JQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10 January 2015 at 15:12, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> NOTE: If we do change RLS CHECKs to be executed prior to modifying any
>> data, that's potentially a change that could be made independently of
>> the UPSERT patch. We should also probably then stop referring to them
>> as WITH CHECK OPTIONs in the docs and in error messages because
>> they're not the same thing, and the code might be neater if they had
>> their own data-structure rather than overloading WithCheckOption.
>
> I agree that's an independent change. I'm not sure if it would end up
> being neater or not offhand but I can see some advantages to breaking it
> up from WithCheck as it might be simpler for users to understand
> (particularly those users who are not already familiar with WithCheck).

A new RlsCheck struct wouldn't need the cascade option that
WithCheckOption has, and it's not nice the way viewname is being
abused. For UPSERT it will need a field indicating the command type,
as Peter pointed out, so I think it's different enough to warrant it's
own struct, even if we weren't changing the firing time.

> Were you thinking about working up a patch for such a change?

OK, I'll take a look at it. I started reading the existing code that
expands RLS policies and spotted a couple of bugs that should be fixed
first:-

1). In prepend_row_security_policies(), defaultDeny should be
initialised to false at the top.

2). In fireRIRrules(), activeRIRs isn't being reset properly after
recursing, which will cause a self-join to fail.

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-12 12:28:26
Message-ID: CAEZATCWoRxoyVx3vtUeuZPnn_vQ0970mFoFS7dBnVmmz3Sz90g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10 January 2015 at 21:38, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> OK, I'll take a look at it. I started reading the existing code that
> expands RLS policies and spotted a couple of bugs that should be fixed
> first:-
>
> 1). In prepend_row_security_policies(), defaultDeny should be
> initialised to false at the top.
>
> 2). In fireRIRrules(), activeRIRs isn't being reset properly after
> recursing, which will cause a self-join to fail.
>

So as I starting looking into these bugs, which looked fairly trivial,
the test case I came up with revealed another more subtle bug with
RLS, using a more obscure query -- given an update of the form UPDATE
t1 ... FROM t2 ..., if t1 is part of an inheritance hierarchy and both
t1 and t2 have RLS enabled, the inheritance planner was doing the
wrong thing. There is code in there to copy subquery RTEs into each
child plan, which needed to do the same for non-target RTEs with
security barrier quals, which haven't yet been turned into subqueries.
Similarly the rowmark preprocessing code needed to be taught about
(non-target) RTEs with security barrier quals to handle this kind of
UPDATE with a FROM clause.

The attached patch fixes those issues.

This bug can't happen with updatable security barrier views, since
non-target security barrier views are expanded in the rewriter, so
technically this doesn't need bach-patching to 9.4. However, I think
the planner changes should probably be back-patched anyway, to keep
the code in the 2 branches in sync, and more maintainable. Also it
feels like the planner ought to be able to handle any legal query
thrown at it, even if it looks like the 9.4 rewriter couldn't generate
such a query.

Regards,
Dean

Attachment Content-Type Size
rls.patch text/x-diff 17.2 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-12 14:24:43
Message-ID: 20150112142443.GZ3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 10 January 2015 at 21:38, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> > OK, I'll take a look at it. I started reading the existing code that
> > expands RLS policies and spotted a couple of bugs that should be fixed
> > first:-
> >
> > 1). In prepend_row_security_policies(), defaultDeny should be
> > initialised to false at the top.
> >
> > 2). In fireRIRrules(), activeRIRs isn't being reset properly after
> > recursing, which will cause a self-join to fail.
>
> So as I starting looking into these bugs, which looked fairly trivial,
> the test case I came up with revealed another more subtle bug with
> RLS, using a more obscure query -- given an update of the form UPDATE
> t1 ... FROM t2 ..., if t1 is part of an inheritance hierarchy and both
> t1 and t2 have RLS enabled, the inheritance planner was doing the
> wrong thing. There is code in there to copy subquery RTEs into each
> child plan, which needed to do the same for non-target RTEs with
> security barrier quals, which haven't yet been turned into subqueries.
> Similarly the rowmark preprocessing code needed to be taught about
> (non-target) RTEs with security barrier quals to handle this kind of
> UPDATE with a FROM clause.

Interesting, thanks for the work! I had been suspicious that there was
an issue with the recursion handling.

> The attached patch fixes those issues.

Excellent. Will take a look.

> This bug can't happen with updatable security barrier views, since
> non-target security barrier views are expanded in the rewriter, so
> technically this doesn't need bach-patching to 9.4. However, I think
> the planner changes should probably be back-patched anyway, to keep
> the code in the 2 branches in sync, and more maintainable. Also it
> feels like the planner ought to be able to handle any legal query
> thrown at it, even if it looks like the 9.4 rewriter couldn't generate
> such a query.

I'm not anxious to back-patch if there's no issue in existing branches,
but I understand your point about making sure it can handle legal
queries even if we don't believe current 9.4 can't generate them. We
don't tend to back-patch just to keep things in sync as that could
possibly have unintended side-effects.

Looking at the regression tests a bit though, I notice that this removes
the lower-level LockRows.. There had been much discussion about that
last spring which I believe you were a part of..? I specifically recall
discussing it with Craig, at least.

Thanks!

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-12 19:23:33
Message-ID: CAEZATCVfn+BUKmWi+tLyi=jHK=WAUYi6zG7c5-xODCOOkTiC_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12 January 2015 at 14:24, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Looking at the regression tests a bit though, I notice that this removes
> the lower-level LockRows.. There had been much discussion about that
> last spring which I believe you were a part of..? I specifically recall
> discussing it with Craig, at least.
>

Ah, yes you're right. Looking back over that discussion it shouldn't
be removing those lower-level LockRows. I was a bit aggressive with my
change to the rowmark preprocessing -- the first loop applies to
requested locks, like SELECT .. FOR UPDATE, so it shouldn't be messing
with that, presecurity.c handles that fine. It's only the second loop
that needs to be taught about RTEs with security quals that will
become subqueries.

Here's an updated patch, that passes with the original regression test results.

Regards,
Dean

Attachment Content-Type Size
rls.v2.patch text/x-diff 14.0 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-14 08:43:25
Message-ID: CAEZATCWHXodErT13=f6VaQybWsZKPSs6cDbAfv4baZvCHkmOkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12 January 2015 at 14:24, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Interesting, thanks for the work! I had been suspicious that there was
> an issue with the recursion handling.
>

So continuing to review the RLS code, I spotted the following in
prepend_row_security_policies():

/*
* We may end up getting called multiple times for the same RTE, so check
* to make sure we aren't doing double-work.
*/
if (rte->securityQuals != NIL)
return false;

which looked suspicious. I assume that's just a hang-up from an
earlier attempt to prevent infinite recursion in RLS expansion, but
actually it's wrong because in the case of an UPDATE to a security
barrier view on top of a table with RLS enabled, the view's
securityQuals will get added to the RTE first, and that shouldn't
prevent the underlying table's RLS securityQuals from being added.

AFAICT, it should be safe to just remove the above code. I can't see
how prepend_row_security_policies() could end up getting called more
than once for the same RTE.

Also, I'm thinking that it would be better to refactor things a bit
and have prepend_row_security_policies() just return the new
securityQuals and withCheckOptions to add. Then fireRIRrules() would
only have to recurse into the new quals being added, not the
already-processed quals.

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-14 19:02:00
Message-ID: CAEZATCUprUf+8Xb3vumaTt59FcuAF6ewMfraHtJoFdmpa5jm7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 14 January 2015 at 08:43, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> On 12 January 2015 at 14:24, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> Interesting, thanks for the work! I had been suspicious that there was
>> an issue with the recursion handling.
>>
>
> So continuing to review the RLS code, I spotted the following in
> prepend_row_security_policies():
>
> /*
> * We may end up getting called multiple times for the same RTE, so check
> * to make sure we aren't doing double-work.
> */
> if (rte->securityQuals != NIL)
> return false;
>
> which looked suspicious. I assume that's just a hang-up from an
> earlier attempt to prevent infinite recursion in RLS expansion, but
> actually it's wrong because in the case of an UPDATE to a security
> barrier view on top of a table with RLS enabled, the view's
> securityQuals will get added to the RTE first, and that shouldn't
> prevent the underlying table's RLS securityQuals from being added.
>
> AFAICT, it should be safe to just remove the above code. I can't see
> how prepend_row_security_policies() could end up getting called more
> than once for the same RTE.
>

Turns out it wasn't as simple as that. prepend_row_security_policies()
really could get called multiple times for the same RTE, because the
call to query_tree_walker() at the end of fireRIRrules() would descend
into the just-added quals again. The simplest fix seems to be to
process RLS in a separate loop at the end, so that it can have it's
own infinite recursion detection, which is different from that needed
for pre-existing security quals and with check options from security
barrier views. This approach simplifies things a bit, and ensures that
we only try to expand RLS once for each RTE.

> Also, I'm thinking that it would be better to refactor things a bit
> and have prepend_row_security_policies() just return the new
> securityQuals and withCheckOptions to add. Then fireRIRrules() would
> only have to recurse into the new quals being added, not the
> already-processed quals.
>

Turns out that refactoring actually became necessary in order to fix
this bug, but I think it makes things cleaner and more efficient.

Here's an updated patch with a new test for this bug. I've been
developing the fixes for these RLS issues as one big patch, but I
suppose it would be easy to split up, if that's preferred.

Regards,
Dean

Attachment Content-Type Size
rls.v3.patch text/x-diff 30.5 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-14 19:23:07
Message-ID: 20150114192307.GC3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> Turns out it wasn't as simple as that. prepend_row_security_policies()
> really could get called multiple times for the same RTE, because the
> call to query_tree_walker() at the end of fireRIRrules() would descend
> into the just-added quals again. The simplest fix seems to be to
> process RLS in a separate loop at the end, so that it can have it's
> own infinite recursion detection, which is different from that needed
> for pre-existing security quals and with check options from security
> barrier views. This approach simplifies things a bit, and ensures that
> we only try to expand RLS once for each RTE.

Right, I specifically recall having prepend_row_security_policies()
getting called multiple times for the same RTE. I like this approach of
using a separate loop though and it strikes me that it lends more
weight to the argument that we're better off with these as independent
considerations.

> > Also, I'm thinking that it would be better to refactor things a bit
> > and have prepend_row_security_policies() just return the new
> > securityQuals and withCheckOptions to add. Then fireRIRrules() would
> > only have to recurse into the new quals being added, not the
> > already-processed quals.

Hmm, good point.

> Turns out that refactoring actually became necessary in order to fix
> this bug, but I think it makes things cleaner and more efficient.

Sounds good, I'll take a look.

> Here's an updated patch with a new test for this bug. I've been
> developing the fixes for these RLS issues as one big patch, but I
> suppose it would be easy to split up, if that's preferred.

I'm alright with it as-is for now.

Thanks!

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-19 08:21:35
Message-ID: CAEZATCUBHjQs2H5w8pxr9ZSExHRRAQ0cNReYKbKhLhn5Wvjiiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10 January 2015 at 15:12, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
>> Currently we're applying RLS CHECKs after the INSERT or UPDATE, like
>> WITH CHECK OPTIONs on views. The SQL spec says that WITH CHECK OPTIONs
>> on views have to be applied after the INSERT/UPDATE on the base
>> relation, but we're free to do something different for RLS CHECKs if
>> that makes more sense. If we want RLS to be more like column-level
>> privilege checking, then it does make sense to do the checks sooner,
>> so perhaps we should be checking the RLS policies before the
>> INSERT/UPDATE, like CHECK constraints.
>
> Were you thinking about working up a patch for such a change? If not,
> I'll see about finding time to do it, unless someone else wants to
> volunteer. :)
>

Attached is a patch to make RLS checks run before attempting to
insert/update any data rather than afterwards.

In the end I decided not to create a new structure for RLS checks
because most of the code that handles them treats them the same as
WCOs. Instead, I just added a new 'kind' enum field to the existing
structure and renamed/reworded things a bit.

The patch also changes the error message for a RLS check violation, to
make the cause of the error clearer. One thing I'm not sure about is
what sqlstate code to use for this error, but I don't think that using
WITH_CHECK_OPTION_VIOLATION is appropriate, because that seems to be
specifically intended for views.

Regards,
Dean

Attachment Content-Type Size
rls-timing.patch text/x-diff 18.3 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-01-20 02:56:14
Message-ID: 20150120025614.GX3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> Attached is a patch to make RLS checks run before attempting to
> insert/update any data rather than afterwards.

Excellent, many thanks!

> In the end I decided not to create a new structure for RLS checks
> because most of the code that handles them treats them the same as
> WCOs. Instead, I just added a new 'kind' enum field to the existing
> structure and renamed/reworded things a bit.

Makes sense to me. I like being able to easily differentiate the two
from each other now while also not needing to duplicate a bunch of code.
I also like the reworded error messages.

I am wondering if we should, perhaps, rename ri_WithCheckOptions or
ExecWithCheckOptions() (or even more..) to indicate that they're used
for both view-based WITH CHECK options and for RLS.

We'll need to update this patch once the fixes for the WITH CHECK leaks
go in, of course.

> The patch also changes the error message for a RLS check violation, to
> make the cause of the error clearer. One thing I'm not sure about is
> what sqlstate code to use for this error, but I don't think that using
> WITH_CHECK_OPTION_VIOLATION is appropriate, because that seems to be
> specifically intended for views.

Hmm, agreed. Any thoughts on what to use? We could fall back on
something like ERRCODE_INSUFFICIENT_PRIVILEGE if necessary, I suppose.

Thanks!

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-02-26 05:41:12
Message-ID: 20150226054112.GK29780@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> Here's an updated patch with a new test for this bug. I've been
> developing the fixes for these RLS issues as one big patch, but I
> suppose it would be easy to split up, if that's preferred.

Thanks for working on all of this!

I've brought this up to date with master and addressed the little bit
of bitrot. I'm still reviewing it but I'm generally happy with the
approach and would certainly welcome any additional thoughts from you or
feedback from others.

Thanks!

Stephen

Attachment Content-Type Size
rls.v4.patch text/x-diff 27.3 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-02-26 05:43:59
Message-ID: 20150226054359.GL29780@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean,

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> Attached is a patch to make RLS checks run before attempting to
> insert/update any data rather than afterwards.

Excellent, this I really like and it's a pretty straight-forward change.
I wonder if there are some documentation updates which need to be done
for this also? I'm planning to look as I vauguely recall mentioning the
ordering of operations somewhere along the way.

I also addressed the bitrot from the column-priv leak patch. Would be
great to have you take a look at the latest and let me know if you have
any further comments or suggestions. I'm definitely looking forward to
getting these changes in soon.

Thanks!

Stephen

Attachment Content-Type Size
rls-timing-v2.patch text/x-diff 15.3 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-02-26 09:27:31
Message-ID: CAEZATCUXu9RMqa2hOPKKK19RCAR5Jy=x4mtTSX5FLkzcaj1+VA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2015 at 05:41, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Dean,
>
> * Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
>> Here's an updated patch with a new test for this bug. I've been
>> developing the fixes for these RLS issues as one big patch, but I
>> suppose it would be easy to split up, if that's preferred.
>
> Thanks for working on all of this!
>
> I've brought this up to date with master and addressed the little bit
> of bitrot. I'm still reviewing it but I'm generally happy with the
> approach and would certainly welcome any additional thoughts from you or
> feedback from others.
>

Thanks for doing that. I had been meaning to update it myself, but
kept getting distracted by my day job.

I think this should probably be committed as 2 separate patches,
because there are really 2 separate issues being fixed here, and the
commit comment only mentions the second one:

1). The planner changes and the first new regression test (update with
from clause self join). This is a fix to the inheritance planner code,
which wasn't properly handling the case of a query containing both
target and non-target relations with RLS and inheritance.

2). The rewriter changes and the other regression tests (S.b. view on
top of Row-level security). This is more than just a code tidy-up --
there's a real bug there. If you run those tests against HEAD, the
update against the s.b. view fails to apply the RLS policy of the
underlying table because the old recursion detection code thinks it
has already handled that RTE.

I should get more time to look at this over the weekend.

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-02-26 09:50:24
Message-ID: CAEZATCVCtPMM22Q+=hy41NUuWCBFXWV1bm2ayT_+AUfkf91q_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2015 at 05:43, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Dean,
>
> * Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
>> Attached is a patch to make RLS checks run before attempting to
>> insert/update any data rather than afterwards.
>
> Excellent, this I really like and it's a pretty straight-forward change.
> I wonder if there are some documentation updates which need to be done
> for this also? I'm planning to look as I vauguely recall mentioning the
> ordering of operations somewhere along the way.
>
> I also addressed the bitrot from the column-priv leak patch. Would be
> great to have you take a look at the latest and let me know if you have
> any further comments or suggestions. I'm definitely looking forward to
> getting these changes in soon.
>

Thanks for the update. I don't recall any mention of ordering in the
docs, but if there isn't anything there it makes sense to add
something.

I haven't thought about this patch in a while, but I still like the
look of it. It's definitely neater to do these checks earlier and it's
good to be able to get RLS-specific errors too. I'll take a look at
the latest updates.

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-03-02 10:16:27
Message-ID: CAEZATCVoaNiy5qLGda4K-nmP7GbJJgpVGucBAtA1ckVm-uWvgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2015 at 09:27, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> I think this should probably be committed as 2 separate patches...

On closer inspection, the 2 parts are interrelated since the new
self-join test that tests the inheritance planner changes also
requires the rewriter changes, otherwise it falls over with an
infinite recursion error in the rewriter. I suppose the rewriter
changes could be committed first with the self-join without
inheritance test to test the fix to the infinite recursion problem,
but I think it's probably easier to just treat this as a single set of
related bug fixes.

Attached is an updated patch. There are no more code changes, but I've
added a few more regression tests.

Regards,
Dean

Attachment Content-Type Size
rls.v5.patch text/x-diff 38.6 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-03-06 09:04:30
Message-ID: CAEZATCVz1u3dyjdzXN3F26rxks2BYXDz--_2rTZfRhuU0zfWSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 February 2015 at 09:50, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> On 26 February 2015 at 05:43, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> I wonder if there are some documentation updates which need to be done
>> for this also? I'm planning to look as I vauguely recall mentioning the
>> ordering of operations somewhere along the way.
>>

I couldn't find any mention of the timing of the check in the existing
documentation, although it does vaguely imply that the check is done
before inserting any new data. There is an existing paragraph
describing the timing of USING conditions, so I added a new paragraph
immediately after that to explain when CHECK expressions are enforced,
since that seemed like the best place for it.

>> I also addressed the bitrot from the column-priv leak patch. Would be
>> great to have you take a look at the latest and let me know if you have
>> any further comments or suggestions.

I looked it over again, and I'm happy with these updates, except there
was a missing "break" in the switch statement in
ExecWithCheckOptions().

Here's an updated patch.

Regards,
Dean

Attachment Content-Type Size
rls-timing-v3.patch text/x-diff 18.5 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE and RLS
Date: 2015-03-06 14:23:44
Message-ID: 20150306142344.GO29780@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Dean Rasheed (dean(dot)a(dot)rasheed(at)gmail(dot)com) wrote:
> On 26 February 2015 at 09:50, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> > On 26 February 2015 at 05:43, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >> I wonder if there are some documentation updates which need to be done
> >> for this also? I'm planning to look as I vauguely recall mentioning the
> >> ordering of operations somewhere along the way.
>
> I couldn't find any mention of the timing of the check in the existing
> documentation, although it does vaguely imply that the check is done
> before inserting any new data. There is an existing paragraph
> describing the timing of USING conditions, so I added a new paragraph
> immediately after that to explain when CHECK expressions are enforced,
> since that seemed like the best place for it.

Ah, ok, I must have been thinking about the USING discussion. Thanks
for adding the paragraph about the CHECK timing.

> >> I also addressed the bitrot from the column-priv leak patch. Would be
> >> great to have you take a look at the latest and let me know if you have
> >> any further comments or suggestions.
>
> I looked it over again, and I'm happy with these updates, except there
> was a missing "break" in the switch statement in
> ExecWithCheckOptions().

Oh, thanks for catching that!

> Here's an updated patch.

Excellent, will review.

Thanks again!

Stephen