Re: Proof of concept: auto updatable views

Lists: pgsql-hackers
From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Proof of concept: auto updatable views
Date: 2012-07-01 22:35:54
Message-ID: CAEZATCXOFPmQFDssO=dncqJ1a+jw74gUfPZE=pgRm4OQ1iSHdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I've been playing around with the idea of supporting automatically
updatable views, and I have a working proof of concept. I've taken a
different approach than the previous attempts to implement this
feature (e.g., http://archives.postgresql.org/pgsql-hackers/2009-01/msg01746.php),
instead doing all the work in the rewriter, substituting the view for
its base relation rather than attempting to auto-generate any rules or
triggers.

Basically what it does is this: in the first stage of query rewriting,
just after any non-SELECT rules are applied, the new code kicks in -
if the target relation is a view, and there were no unqualified
INSTEAD rules, and there are no INSTEAD OF triggers, it tests if the
view is simply updatable. If so, the target view is replaced by its
base relation and columns are re-mapped. Then the remainder of the
rewriting process continues, recursively handling any further
non-SELECT rules or additional simply updatable views. This handles
the case of views on top of views, with or without rules and/or
triggers.

Here's a simple example:

CREATE TABLE my_table(id int primary key, val text);
CREATE VIEW my_view AS SELECT * FROM my_table WHERE id > 0;

then any modifications to the view get redirected to underlying table:

EXPLAIN ANALYSE INSERT INTO my_view VALUES(1, 'Test row');
QUERY PLAN
------------------------------------------------------------------------------------------------
Insert on my_table (cost=0.00..0.01 rows=1 width=0) (actual
time=0.208..0.208 rows=0 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual
time=0.004..0.004 rows=1 loops=1)
Total runtime: 0.327 ms
(3 rows)

EXPLAIN ANALYSE UPDATE my_view SET val='Updated' WHERE id=1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Update on my_table (cost=0.00..8.27 rows=1 width=10) (actual
time=0.039..0.039 rows=0 loops=1)
-> Index Scan using my_table_pkey on my_table (cost=0.00..8.27
rows=1 width=10) (actual time=0.014..0.015 rows=1 loops=1)
Index Cond: ((id > 0) AND (id = 1))
Total runtime: 0.090 ms
(4 rows)

EXPLAIN ANALYSE DELETE FROM my_view;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on my_table (cost=0.00..1.01 rows=1 width=6) (actual
time=0.030..0.030 rows=0 loops=1)
-> Seq Scan on my_table (cost=0.00..1.01 rows=1 width=6) (actual
time=0.015..0.016 rows=1 loops=1)
Filter: (id > 0)
Total runtime: 0.063 ms
(4 rows)

The patch is currently very strict about what kinds of views can be
updated (based on SQL-92), and there is no support for WITH CHECK
OPTION, because I wanted to keep this patch as simple as possible.

The consensus last time seemed to be that backwards compatibility
should be offered through a new GUC variable to allow this feature to
be disabled globally, which I've not implemented yet.

I'm also aware that my new function ChangeVarAttnos() is almost
identical to the function map_variable_attnos() that Tom recently
added, but I couldn't immediately see a neat way to merge the two. My
function handles whole-row references to the view by mapping them to a
generic RowExpr based on the view definition. I don't think a
ConvertRowtypeExpr can be used in this case, because the column names
don't necessarily match.

Obviously there's still more work to do but the early signs seem to be
encouraging.

Thoughts?

Regards,
Dean

Attachment Content-Type Size
auto-update-views.patch application/octet-stream 20.5 KB

From: Darren Duncan <darren(at)darrenduncan(dot)net>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-07-01 23:54:58
Message-ID: 4FF0E352.8010400@darrenduncan.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

My thoughts on this is that it would be a very valuable feature to have, and
would make Postgres views behave more like they always were intended to behave,
which is indistinguishible to users from tables in behavior where all possible,
and that the reverse mapping would be automatic with the DBMS being given only
the view-defining SELECT, where possible. -- Darren Duncan

Dean Rasheed wrote:
> I've been playing around with the idea of supporting automatically
> updatable views, and I have a working proof of concept. I've taken a
> different approach than the previous attempts to implement this
> feature (e.g., http://archives.postgresql.org/pgsql-hackers/2009-01/msg01746.php),
> instead doing all the work in the rewriter, substituting the view for
> its base relation rather than attempting to auto-generate any rules or
> triggers.
>
> Basically what it does is this: in the first stage of query rewriting,
> just after any non-SELECT rules are applied, the new code kicks in -
> if the target relation is a view, and there were no unqualified
> INSTEAD rules, and there are no INSTEAD OF triggers, it tests if the
> view is simply updatable. If so, the target view is replaced by its
> base relation and columns are re-mapped. Then the remainder of the
> rewriting process continues, recursively handling any further
> non-SELECT rules or additional simply updatable views. This handles
> the case of views on top of views, with or without rules and/or
> triggers.
>
<snip>
>
> Obviously there's still more work to do but the early signs seem to be
> encouraging.
>
> Thoughts?


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-07-02 20:27:25
Message-ID: CA+Tgmoa=4m8-kGmYVEbzE5b9fG_OjQc5BNxacuSUoZ+RF2TJ0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jul 1, 2012 at 6:35 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> I've been playing around with the idea of supporting automatically
> updatable views, and I have a working proof of concept. I've taken a
> different approach than the previous attempts to implement this
> feature (e.g., http://archives.postgresql.org/pgsql-hackers/2009-01/msg01746.php),
> instead doing all the work in the rewriter, substituting the view for
> its base relation rather than attempting to auto-generate any rules or
> triggers.
>
> Basically what it does is this: in the first stage of query rewriting,
> just after any non-SELECT rules are applied, the new code kicks in -
> if the target relation is a view, and there were no unqualified
> INSTEAD rules, and there are no INSTEAD OF triggers, it tests if the
> view is simply updatable. If so, the target view is replaced by its
> base relation and columns are re-mapped. Then the remainder of the
> rewriting process continues, recursively handling any further
> non-SELECT rules or additional simply updatable views. This handles
> the case of views on top of views, with or without rules and/or
> triggers.

Regrettably, I don't have time to look at this in detail right now,
but please add it to the next CommitFest so it gets looked at. It
sounds like it could be very cool.

> The consensus last time seemed to be that backwards compatibility
> should be offered through a new GUC variable to allow this feature to
> be disabled globally, which I've not implemented yet.

I think the backward-compatibility concerns with this approach would
be much less than with the previously-proposed approach, so I'm not
100% sure we need a backward compatibility knob. If we're going to
have one, a reloption would probably be a better fit than a GUC, now
that views support reloptions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-07-02 20:34:05
Message-ID: 2226.1341261245@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, Jul 1, 2012 at 6:35 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> Basically what it does is this: in the first stage of query rewriting,
>> just after any non-SELECT rules are applied, the new code kicks in -
>> if the target relation is a view, and there were no unqualified
>> INSTEAD rules, and there are no INSTEAD OF triggers, it tests if ...

>> The consensus last time seemed to be that backwards compatibility
>> should be offered through a new GUC variable to allow this feature to
>> be disabled globally, which I've not implemented yet.

> I think the backward-compatibility concerns with this approach would
> be much less than with the previously-proposed approach, so I'm not
> 100% sure we need a backward compatibility knob.

If the above description is correct, the behavior is changed only in
cases that previously threw errors, so I tend to agree that no
"backwards compatibility knob" is needed.

regards, tom lane


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-07-03 06:53:02
Message-ID: CAEZATCXio55=4Ns0EbQHK3XsyQ5Kmn0BnNr95VEHfETuwLewtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 July 2012 21:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Sun, Jul 1, 2012 at 6:35 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>>> Basically what it does is this: in the first stage of query rewriting,
>>> just after any non-SELECT rules are applied, the new code kicks in -
>>> if the target relation is a view, and there were no unqualified
>>> INSTEAD rules, and there are no INSTEAD OF triggers, it tests if ...
>
>>> The consensus last time seemed to be that backwards compatibility
>>> should be offered through a new GUC variable to allow this feature to
>>> be disabled globally, which I've not implemented yet.
>
>> I think the backward-compatibility concerns with this approach would
>> be much less than with the previously-proposed approach, so I'm not
>> 100% sure we need a backward compatibility knob.
>
> If the above description is correct, the behavior is changed only in
> cases that previously threw errors, so I tend to agree that no
> "backwards compatibility knob" is needed.
>

Yeah, I think you're right - the default ACLs will typically give
sensible behaviour. So unless users have been cavalier with the use of
GRANT ALL, their existing views are only going to start becoming
updatable to the owners of the views (and only then if the view owner
already has write permission on the underlying table).

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-08-12 21:14:06
Message-ID: CAEZATCXsismrGFbKR_nWn+XUtUmfJ3QFhu+nCSGRxovhJdCXaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've been looking at this further and I have made some improvements,
but also found a problem.

On 1 July 2012 23:35, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> I'm also aware that my new function ChangeVarAttnos() is almost
> identical to the function map_variable_attnos() that Tom recently
> added, but I couldn't immediately see a neat way to merge the two. My
> function handles whole-row references to the view by mapping them to a
> generic RowExpr based on the view definition. I don't think a
> ConvertRowtypeExpr can be used in this case, because the column names
> don't necessarily match.

I improved on this by reusing the existing function ResolveNew() which
reduced the size of the patch.

The problem, however, is that the original patch is not safe for
UPDATE/DELETE on security barrier views, because it mixes the user's
quals with those on the view. For example a query like

UPDATE some_view SET col=... WHERE user_quals

will get rewritten as

UPDATE base_table SET base_col=... WHERE user_quals AND view_quals

which potentially leaks data hidden by the view's quals.

So I think that it needs to use a subquery to isolate user_quals from
view_quals. The least invasive way I could see to do that was to
record the view quals in a new security barrier quals field on the
RTE, and then turn that into a subquery at the end of the rewriting
process. This approach avoids the extensive changes to the rewriter
that I think would otherwise be needed if the RTE were changed into a
subquery near the start of the rewriting process. It is also possible
that this code might be reusable in the row-level security patch by
Kohei KaiGai to handle modifications to tables with RLS quals,
although I haven't looked too closely at that patch yet.

Another complication is that the executor appears to have no
rechecking code for subqueries in nodeSubqueryscan.c, so it looks like
I need to lock rows coming from the base relation in the subquery. So
for example, given the following setup

CREATE TABLE private_t(a int, b text);
INSERT INTO private_t VALUES (1, 'Private');
INSERT INTO private_t
SELECT i, 'Public '||i FROM generate_series(2,10000) g(i);
CREATE INDEX private_t_a_idx ON private_t(a);
ANALYSE private_t;

CREATE VIEW public_v WITH (security_barrier=true)
AS SELECT b AS bb, a AS aa FROM private_t WHERE a != 1;

CREATE OR REPLACE FUNCTION snoop(b text)
RETURNS boolean AS
$$
BEGIN
RAISE NOTICE 'b=%', b;
RETURN false;
END;
$$
LANGUAGE plpgsql STRICT IMMUTABLE COST 0.000001;

an update on the view will get rewritten as an update on the base
table from a subquery scan as follows:

EXPLAIN (costs off) UPDATE public_v SET aa=aa WHERE snoop(bb) AND aa<=2;

Update on private_t
-> Subquery Scan on private_t
Filter: snoop(private_t.b)
-> LockRows
-> Index Scan using private_t_a_idx on private_t
Index Cond: (a <= 2)
Filter: (a <> 1)

The LockRows node appears to give the expected behaviour for
concurrently modified rows, but I'm not really sure if that's the
right approach.

None of this new code kicks in for non-security barrier views, so the
kinds of plans I posted upthread remain unchanged in that case. But
now a significant fraction of the patch is code added to handle
security barrier views. Of course we could simply say that such views
aren't updatable, but that seems like an annoying limitation if there
is a feasible way round it.

Thoughts?

Regards,
Dean

Attachment Content-Type Size
auto-update-views.patch application/octet-stream 36.3 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-08-27 19:26:11
Message-ID: CAEZATCWrUx8YTT4wksc6uqZmC4RWOrW=SD37y7F3NbmLSH-+NQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12 August 2012 22:14, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> I've been looking at this further and I have made some improvements...

Here's an updated WIP patch which I'll add to the next commitfest.
I've added information schema updates and some new regression tests. I
still need to work on some doc updates.

Regards,
Dean

Attachment Content-Type Size
auto-update-views.patch application/octet-stream 84.7 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-08-28 06:48:11
Message-ID: CAEZATCW5d4r0kwBVWXwkQQ82C-d3z2f4SqeoeYJ_V8aRF94nnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27 August 2012 20:26, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> Here's an updated WIP patch which I'll add to the next commitfest.

Re-sending gzipped (apparently the mail system corrupted it last time).

Regards,
Dean

Attachment Content-Type Size
auto-update-views.patch.gz application/x-gzip 15.9 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-08-30 19:05:36
Message-ID: CA+Tgmoba=sbvS5M5tJ6AN4zBaeBsKRk_v71=sfSVTWfCqkGDbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Aug 12, 2012 at 5:14 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> None of this new code kicks in for non-security barrier views, so the
> kinds of plans I posted upthread remain unchanged in that case. But
> now a significant fraction of the patch is code added to handle
> security barrier views. Of course we could simply say that such views
> aren't updatable, but that seems like an annoying limitation if there
> is a feasible way round it.

Maybe it'd be a good idea to split this into two patches: the first
could implement the feature but exclude security_barrier views, and
the second could lift that restriction.

Just a thought.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-08-31 06:59:43
Message-ID: CAEZATCV9Z1rnjZh_sHFN+p_PmiD-E8OMMe0UEmC4hcwEi4SN7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 30 August 2012 20:05, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Aug 12, 2012 at 5:14 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> None of this new code kicks in for non-security barrier views, so the
>> kinds of plans I posted upthread remain unchanged in that case. But
>> now a significant fraction of the patch is code added to handle
>> security barrier views. Of course we could simply say that such views
>> aren't updatable, but that seems like an annoying limitation if there
>> is a feasible way round it.
>
> Maybe it'd be a good idea to split this into two patches: the first
> could implement the feature but exclude security_barrier views, and
> the second could lift that restriction.
>

Yes, I think that makes sense.
I should hopefully get some time to look at it over the weekend.

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: auto updatable views
Date: 2012-09-02 20:37:40
Message-ID: CAEZATCXX04uUkK6=BCSZAwd3xunzYfyid4itGHTcz3cg_y1TtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 31 August 2012 07:59, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> On 30 August 2012 20:05, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Sun, Aug 12, 2012 at 5:14 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>>> None of this new code kicks in for non-security barrier views, so the
>>> kinds of plans I posted upthread remain unchanged in that case. But
>>> now a significant fraction of the patch is code added to handle
>>> security barrier views. Of course we could simply say that such views
>>> aren't updatable, but that seems like an annoying limitation if there
>>> is a feasible way round it.
>>
>> Maybe it'd be a good idea to split this into two patches: the first
>> could implement the feature but exclude security_barrier views, and
>> the second could lift that restriction.
>>
>
> Yes, I think that makes sense.
> I should hopefully get some time to look at it over the weekend.
>

Here's an updated patch for the base feature (without support for
security barrier views) with updated docs and regression tests.

Regards,
Dean

Attachment Content-Type Size
auto-update-views.patch.gz application/x-gzip 14.1 KB