Re: About "Allow VIEW/RULE recompilation when the underlying tables change"

Lists: pgsql-hackers
From: "suzhiyang" <suzhiyang(at)gmail(dot)com>
To: "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-19 02:58:21
Message-ID: 200912191058189847804@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'm doing the Todo List's "Allow VIEW/RULE recompilation when the underlying tables change ". I've a very simple idea that I save the "create view" query_string in systable. When I found that I select from a view, I drop the view and recreate the view by execute the source query_string. Then go on execute this "select from a view" and will get the right answer. My problem is that how could I get the orginal query which user typed to postgres?
Er, maybe drop and recreate the view is an ugly idea...Is there any better methods to do recompilation?

Thank you!

2009-12-19

suzhiyang


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: suzhiyang <suzhiyang(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-19 03:17:43
Message-ID: 603c8f070912181917p333b25a3i9f368c160b5ae099@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Dec 18, 2009 at 9:58 PM, suzhiyang <suzhiyang(at)gmail(dot)com> wrote:
> I'm doing the Todo List's "Allow VIEW/RULE recompilation when the underlying
> tables change ". I've a very simple idea that I save the "create view"
> query_string in systable. When I found that I select from a view, I drop the
> view and recreate the view by execute the source query_string. Then go on
> execute this "select from a view" and will get the right answer. My problem
> is that how could I get the orginal query which user typed to postgres?
> Er, maybe drop and recreate the view is an ugly idea...Is there any better
> methods to do recompilation?

I am not sure what this TODO item is supposed to refer to, but saving
the query string and re-executing it is clearly not acceptable. What
if a column or table or function referenced in the query has been
renamed since the view/rule was created?

...Robert


From: suzhiyang <suzhiyang(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [HACKERS] About "Allow VIEW/RULE recompilation when theunderlying tables change"
Date: 2009-12-19 03:25:22
Message-ID: 200912191125176878598@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Yeah, when a column of the underlying table renamed, this re-executing must be failed...

2009-12-19

suzhiyang

发件人: Robert Haas
发送时间: 2009-12-19 11:17:44
收件人: suzhiyang
抄送: pgsql-hackers
主题: Re: [HACKERS] About "Allow VIEW/RULE recompilation when theunderlying tables change"

On Fri, Dec 18, 2009 at 9:58 PM, suzhiyang <suzhiyang(at)gmail(dot)com> wrote:
> I'm doing the Todo List's "Allow VIEW/RULE recompilation when the underlying
> tables change ". I've a very simple idea that I save the "create view"
> query_string in systable. When I found that I select from a view, I drop the
> view and recreate the view by execute the source query_string. Then go on
> execute this "select from a view" and will get the right answer. My problem
> is that how could I get the orginal query which user typed to postgres?
> Er, maybe drop and recreate the view is an ugly idea...Is there any better
> methods to do recompilation?
I am not sure what this TODO item is supposed to refer to, but saving
the query string and re-executing it is clearly not acceptable. What
if a column or table or function referenced in the query has been
renamed since the view/rule was created?
...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: suzhiyang <suzhiyang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-19 03:39:45
Message-ID: 17527.1261193985@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Dec 18, 2009 at 9:58 PM, suzhiyang <suzhiyang(at)gmail(dot)com> wrote:
>> I'm doing the Todo List's "Allow VIEW/RULE recompilation when the underlying
>> tables change ".

> I am not sure what this TODO item is supposed to refer to, but saving
> the query string and re-executing it is clearly not acceptable. What
> if a column or table or function referenced in the query has been
> renamed since the view/rule was created?

The TODO item is terribly underdocumented, but I think what it's on
about is that right now we refuse commands like ALTER COLUMN TYPE if
the column is referenced in a view. It would be nice to propagate
such a change into views if possible. The hard part is that the view
might contain operations on the column --- such as "func(x)" or "x+1"
or "ORDER BY x" --- so you'd have to determine new semantics for those,
which is why the term "recompilation" seems appropriate.

The other point that is alluded to in the TODO item is that if the view
was originally specified as "SELECT * FROM foo" then one might wonder
whether ALTER TABLE foo ADD COLUMN bar adds the new column to the view
too. Now the SQL standard is perfectly clear that the answer is "no",
but that hasn't dissuaded certain people from lobbying for us to do it
anyway.

As you say, the current implementation has a number of advantages that
"reread the original query string" would not --- including compliance to
the standard on the above point --- and I doubt we're going to want to
give those up. So a patch that has ambitions of getting accepted is
going to have to do something smarter. Don't know exactly what.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: suzhiyang <suzhiyang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-19 03:58:00
Message-ID: 603c8f070912181958w7d57db23q7cf9029ec3e75b9a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Dec 18, 2009 at 10:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Dec 18, 2009 at 9:58 PM, suzhiyang <suzhiyang(at)gmail(dot)com> wrote:
>>> I'm doing the Todo List's "Allow VIEW/RULE recompilation when the underlying
>>> tables change ".
>
>> I am not sure what this TODO item is supposed to refer to, but saving
>> the query string and re-executing it is clearly not acceptable.   What
>> if a column or table or function referenced in the query has been
>> renamed since the view/rule was created?
>
> The TODO item is terribly underdocumented, but I think what it's on
> about is that right now we refuse commands like ALTER COLUMN TYPE if
> the column is referenced in a view.  It would be nice to propagate
> such a change into views if possible.  The hard part is that the view
> might contain operations on the column --- such as "func(x)" or "x+1"
> or "ORDER BY x" --- so you'd have to determine new semantics for those,
> which is why the term "recompilation" seems appropriate.

I'm unconvinced that this is, as you would say, even mildly sane. I
don't think changing the type of a column in one table constitutes a
license to randomly reinterpret the semantics of views that reference
it. Suppose that we have functions foo(int) and bar(varchar).
Someone creates a view like this:

SELECT foo(x.x1) FROM x;

Next, they rename the function foo to bar. Now, they change the type
of x1 from integer to varchar. Do you REALLY want to now go grab the
other function? Because that sounds like an incredibly bad idea to
me. I think we have a consistent principle that views, foreign key
constraints, and all manner of objects generally within the database
are tied to a particular object rather than a particular name. I
think that's unquestionably a good decision, but even someone were to
feel otherwise, I don't see how you can argue that we would SOMETIMES
want to follow the object and OTHER TIMES the name.

Maybe you could make an argument for treating foo(x.x1) call as
foo(x.x1::integer) after the type of x.x1 gets changed, but even that
seems suspiciously like black magic.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: suzhiyang <suzhiyang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-19 04:24:01
Message-ID: 18746.1261196641@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Dec 18, 2009 at 10:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The TODO item is terribly underdocumented, but I think what it's on
>> about is that right now we refuse commands like ALTER COLUMN TYPE if
>> the column is referenced in a view. It would be nice to propagate
>> such a change into views if possible.

> I'm unconvinced that this is, as you would say, even mildly sane.

I've updated the TODO item to link to this discussion, so at least the
next three people who pop up with "let's just store the view source!"
will have some idea of what they're up against.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: suzhiyang <suzhiyang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-19 04:28:17
Message-ID: 603c8f070912182028g1de986f5p30b2d0cc84716e90@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Dec 18, 2009 at 11:24 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Dec 18, 2009 at 10:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> The TODO item is terribly underdocumented, but I think what it's on
>>> about is that right now we refuse commands like ALTER COLUMN TYPE if
>>> the column is referenced in a view.  It would be nice to propagate
>>> such a change into views if possible.
>
>> I'm unconvinced that this is, as you would say, even mildly sane.
>
> I've updated the TODO item to link to this discussion, so at least the
> next three people who pop up with "let's just store the view source!"
> will have some idea of what they're up against.

Excellent. :-)

...Robert


From: Caleb Welton <cwelton(at)greenplum(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: suzhiyang <suzhiyang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-19 18:56:16
Message-ID: C75261D0.3FE4%cwelton@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I was dealing with a customer recently who very much wanted this behavior, during discussions with them
I wrote up a little something describing how different database vendors treat views and alter statements.
...

Part of the issue here is that the SQL Standard does a very poor job of expressing what correct behavior is of VIEWS when the underlying table is altered, as a result nearly every major database vendor has different behavior. <the customer> would be having similar (but slightly different) problems if they moved from almost any database to almost any other database.

Oracle: Treats all views as the text used to define them and allows for the possibility of "invalid" views
Terradata: Expands and fully qualifies the text used to define the views, but still treats them as text, and allows for the possibility of "invalid" views.
DB2: Treats views as logical and does not allow for the possibility of "invalid" views. It tries to allow ALTER statements but only under limited circumstances.
Postgres: Treats views as logical and does not allow for the possibility of "invalid" views. It tries to allow ALTER statements but only under limited circumstances (not the same circumstances as DB2).
Microsoft: Supports two different kinds of views.

These different approaches allow for different sorts of DDL operations to succeed and can leave views in different levels of usability.

ALTER TABLE example RENAME TO example_old;
- In oracle and Terradata views over "example" are now invalid.
- In Postgres and DB2 views over "example" continue to work even though the table has a different name.

DROP TABLE example;
- In oracle and Terradata views over "example" are now invalid.
- In Postgres and DB2 the DROP fails unless CASCADE is specified.

ALTER TABLE example SET SCHEMA new_schema;
- In Oracle the views become invalid unless the new schema is in the search path
- In Terradata the views become invalid
- In Postgres and DB2 the views still refer to the original table.

ALTER TABLE example ADD COLUMN new int;
- In Oracle views may return the new column
- In Terradata, Postgres, and DB2 the new column does not show up in existing views.

ALTER TABLE example DROP COLUMN old;
- In Oracle views may return fewer columns and/or become invalid
- In Terradata views that reference the stated column will become invalid (even when the view was simply "SELECT *").
- In Postgres and DB2 the ALTER statement will fail if the view references the specified column.

ALTER TABLE example RENAME COLUMN old TO new;
- In Oracle views will return different columns and/or become invalid, dependent views may become invalid.
- In Terradata views referencing the stated column become invalid.
- In Postgres, DB2 existing views will automatically update with the change.

ALTER TABLE example ALTER COLUMN old TYPE text;
- In Oracle and Terradata views may update automatically, or may become invalid.
- In DB2 views will try to rewrite themselves and may or may not fail depending on contents
- In Postgres the ALTER statement will fail if the view references the specified column.

Note that in the above NO database will always be able to keep views in sync with alterations to the underlying tables, this is because there is not a single well defined answer to how that update should occur. For every single database vendor certain types of update operations will require manual user intervention to go through the entire dependent view tree and manually fix the views under some circumstances. The question is only /which/ circumstances.

I maintain that the approaches that inform the user that they have met that condition via ALTER statement failures (Postgres/DB2/Microsoft Bound Views) have certain advantages over databases that allow the update but may silently leave views in an usable state (Oracle, Terradata), in that at least the user Knows when they have to re-examine their views.

There might be some slight inaccuracies above since I was going off documentation and extrapolation of the described behavior, but the general points still hold.
...

As far as I can tell there are three approaches that could be taken to help address this problem:
1) DB2 like approach - try to perform rewrites where able, but if the rewrite fails then the alter operation fails. Would allow simple edits such as ALTER TYPE that are only changes in typmod, or if done more ambitiously would allow numbers to be changed to other numbers. But as Robert says this quickly approaches the territory of black magic.
2) Microsoft like approach - create a new kind of view that is just stored as the view text and can become invalid. The people who want this type of view can use it combined with all the headaches associated with this type of view.
3) We extend things in a way that just makes dropping and recreating views more convenient. E.G. Some syntax for "drop all dependents" would be helpful to make schema changes easier.

Regards,
Caleb

On 12/18/09 8:28 PM, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:

On Fri, Dec 18, 2009 at 11:24 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Dec 18, 2009 at 10:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> The TODO item is terribly underdocumented, but I think what it's on
>>> about is that right now we refuse commands like ALTER COLUMN TYPE if
>>> the column is referenced in a view. It would be nice to propagate
>>> such a change into views if possible.
>
>> I'm unconvinced that this is, as you would say, even mildly sane.
>
> I've updated the TODO item to link to this discussion, so at least the
> next three people who pop up with "let's just store the view source!"
> will have some idea of what they're up against.

Excellent. :-)

...Robert


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Caleb Welton <cwelton(at)greenplum(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, suzhiyang <suzhiyang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-19 22:38:51
Message-ID: 603c8f070912191438m3faf39al6f948c19e8a0a10f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Dec 19, 2009 at 1:56 PM, Caleb Welton <cwelton(at)greenplum(dot)com> wrote:
> I maintain that the approaches that inform the user that they have met that
> condition via ALTER statement failures (Postgres/DB2/Microsoft Bound Views)
>  have certain advantages over databases that allow the update but may
> silently leave views in an usable state (Oracle, Terradata), in that at
> least the user Knows when they have to re-examine their views.

Agreed.

> As far as I can tell there are three approaches that could be taken to help
> address this problem:
>   1) DB2 like approach - try to perform rewrites where able, but if the
> rewrite fails then the alter operation fails.  Would allow simple edits such
> as ALTER TYPE that are only changes in typmod, or if done more ambitiously
> would allow numbers to be changed to other numbers.  But as Robert says this
> quickly approaches the territory of black magic.

And it can easily lead to silent breakage - e.g. if you change an
integer column to text, the view's attempt to coerce the text back to
integer will continue working as long as that coercion is valid for
all the data the view examines, but you have to think the user had a
reason for changing the type...

>   2) Microsoft like approach - create a new kind of view that is just stored
> as the view text and can become invalid.  The people who want this type of
> view can use it combined with all the headaches associated with this type of
> view.

This could be emulated fairly easily. Just create a table with all
your view definitions in it and write a function that drops and
recreates them all. Call it whenever you change anything.

>   3) We extend things in a way that just makes dropping and recreating views
> more convenient.   E.G. Some syntax for "drop all dependents" would be
> helpful to make schema changes easier.

How is that different from CASCADE?

...Robert


From: decibel <decibel(at)decibel(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Caleb Welton <cwelton(at)greenplum(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, suzhiyang <suzhiyang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: About "Allow VIEW/RULE recompilation when the underlying tables change"
Date: 2009-12-22 22:12:56
Message-ID: E3118575-1E46-404A-8543-F071B27A6C94@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec 19, 2009, at 4:38 PM, Robert Haas wrote:
> On Sat, Dec 19, 2009 at 1:56 PM, Caleb Welton <cwelton(at)greenplum(dot)com> wrote:
>> I maintain that the approaches that inform the user that they have met that
>> condition via ALTER statement failures (Postgres/DB2/Microsoft Bound Views)
>> have certain advantages over databases that allow the update but may
>> silently leave views in an usable state (Oracle, Terradata), in that at
>> least the user Knows when they have to re-examine their views.
>
> Agreed.
>
>> As far as I can tell there are three approaches that could be taken to help
>> address this problem:
>> 1) DB2 like approach - try to perform rewrites where able, but if the
>> rewrite fails then the alter operation fails. Would allow simple edits such
>> as ALTER TYPE that are only changes in typmod, or if done more ambitiously
>> would allow numbers to be changed to other numbers. But as Robert says this
>> quickly approaches the territory of black magic.
>
> And it can easily lead to silent breakage - e.g. if you change an
> integer column to text, the view's attempt to coerce the text back to
> integer will continue working as long as that coercion is valid for
> all the data the view examines, but you have to think the user had a
> reason for changing the type...

Or we could simply disallow those types of cases. It's not optimal, but would still provide a lot of benefit in other cases.

BTW, +1 on allowing something akin to SELECT * to propagate ADD COLUMN, though for compatibility we should use something other that "SELECT *".
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net