Materialized views WIP patch

Lists: pgsql-hackers
From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Materialized views WIP patch
Date: 2012-11-15 02:28:19
Message-ID: 20121115022819.90140@gmx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Attached is a patch that is still WIP but that I think is getting
pretty close to completion. It is not intended to be the be-all and
end-all for materialized views, but the minimum useful feature set --
which is all that I've had time to do for this release. In
particular, the view is only updated on demand by a complete rebuild.
For the next release, I hope to build on this base to allow more
eager and incremental updates, and perhaps a concurrent batch update.

1.  CREATE MATERIALIZED VIEW syntax is stolen directly from CREATE
   TABLE AS, with all the same clauses supported. That includes
   declaring a materialized view to be temporary or unlogged.
2.  MVs don't support inheritance.
3.  MVs can't define foreign keys.
4.  MVs can't be the target of foreign keys.
5.  MVs can't have triggers.
6.  Users can't create rules which reference MVs (although MVs
   [ab]use the rules mechanism internally, similar to how views do).
7.  MVs can't be converted to views, nor vice versa.
8.  Users may not directly use INSERT/UPDATE/DELETE on an MV.
9.  MVs can't directly be used in a COPY statement, but can be the
   source of data using a SELECT.
10. MVs can't own sequences.
11. MVs can't be the target of LOCK statements, although other
   statements get locks just like a table.
12. MVs can't use data modifying CTEs in their definitions.
13. pg_class now has a relisvalid column, which is true if an MV is
   truncated or created WITH NO DATA. You can not scan a relation
   flagged as invalid.
14. ALTER MATERIALIZED VIEW is supported for the options that seemed
   to make sense. For example, you can change the tablespace or
   schema, but you cannot add or drop column with ALTER.
15. The SELECT query used to define the MV may not contain a
   data-modifying CTE.
16. To get new data into the MV, the command is LOAD MATERIALIZED
   VIEW mat view_name. This seemed more descriptive to me that the
   alternatives and avoids declaring any new keywords beyond
   MATERIALIZED. If the MV is flagged as relisvalid == false, this
   will change it to true.
17. Since the data viewed in an MV is not up-to-date with the latest
   committed transaction, it didn't seem to make any sense to try to
   apply SERIALIZABLE transaction semantics to queries looking at
   the contents of an MV, although if LMV is run in a SERIALIZABLE
   transaction the MV data is guaranteed to be free of serialization
   anomalies. This does leave the transaction running the LOAD
   command vulnerable to serialization failures unless it is also
   READ ONLY DEFERRABLE.
18. Bound parameters are not supported for the CREATE MATERIALIZED
   VIEW statement.
19. LMV doesn't show a row count. It wouldn't be hard to add, it just
   seemed a little out of place to do that, when CLUSTER, etc.,
   don't.

I wasn't able to wrap up a few things in time for this commitfest:

- Documentation is incomplete.
- pg_dump support needs addtional dependencies added to properly
  handle MVs which are defined using other MVs.
- pg_dump binary hasn't had a lot of attention yet.
- There are no regression tests yet.
- I ran into problems getting the validity check working right, so I
  have disabled it by commenting out the function body in this
  patch.
- TRUNCATE should probably support a MATERIALIZED VIEW clause.

It would be good to have some discussion to try to reach a consensus
about whether we need to differentiate between *missing* datat (where
a materialized view which has been loaded WITH NO DATA or TRUNCATEd
and has not been subsequently LOADed) and potentially *stale* data.
If we don't care to distinguish between a view which generated no
rows when it ran and a one for which the query has not been run, we
can avoid adding the relisvalid flag, and we could support UNLOGGED
MVs. Perhaps someone can come up with a better solution to that
problem.

In the long term, we will probably need to separate the
implementation of CREATE TABLE AS and CREATE MATERIALIZED VIEW, but
for now there is so little that they need to do differently it seemed
less evil to have a few "if" clauses that that much duplicated code.

The paint is pretty wet still, but hopefully people can evaluate the
approach and work out any issues with the design choices in the CF so
that it can be wrapped up nicely for the next one.

92 files changed, 2377 insertions(+), 440 deletions(-)

-Kevin

Attachment Content-Type Size
matview-v1.patch text/x-patch 235.2 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-15 04:57:13
Message-ID: 20121115045712.GD12213@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin Grittner wrote:

Interesting stuff.

> /*
> + * SetRelationIsValid
> + * Set the value of the relation's relisvalid field in pg_class.
> + *
> + * NOTE: caller must be holding an appropriate lock on the relation.
> + * ShareUpdateExclusiveLock is sufficient.
> + *
> + * NOTE: an important side-effect of this operation is that an SI invalidation
> + * message is sent out to all backends --- including me --- causing plans
> + * referencing the relation to be rebuilt with the new list of children.
> + * This must happen even if we find that no change is needed in the pg_class
> + * row.
> + */
> + void
> + SetRelationIsValid(Oid relationId, bool relisvalid)
> + {

It's not clear to me that it's right to do this by doing regular heap
updates here instead of heap_inplace_update. Also, I think this might
end up causing a lot of pg_class tuple churn (at least for matviews that
delete rows at xact end), which would be nice to avoid.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-15 10:34:43
Message-ID: 50A4C543.4090605@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/14/12 6:28 PM, Kevin Grittner wrote:
> - Documentation is incomplete.
> ...
> - There are no regression tests yet.

Do you have any simple test cases you've been using you could attach?
With epic new features like this, when things don't work it's hard to
distinguish between "that just isn't implemented yet" and "the author
never tested that". Having some known good samples you have tested,
even if they're not proper regression tests, would be helpful for
establishing the code baseline works.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Materialized views WIP patch
Date: 2012-11-15 18:36:18
Message-ID: 50A53622.8090504@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin,

> Attached is a patch that is still WIP but that I think is getting
> pretty close to completion. It is not intended to be the be-all and
> end-all for materialized views, but the minimum useful feature set --
> which is all that I've had time to do for this release. In
> particular, the view is only updated on demand by a complete rebuild.
> For the next release, I hope to build on this base to allow more
> eager and incremental updates, and perhaps a concurrent batch update.

Nice to see this come in!

> 1. CREATE MATERIALIZED VIEW syntax is stolen directly from CREATE
> TABLE AS, with all the same clauses supported. That includes
> declaring a materialized view to be temporary or unlogged.

What use would a temporary matview be?

Unlogged is good.

> 2. MVs don't support inheritance.

In which direction? Can't inherit, or can't be inherited from?

> 3. MVs can't define foreign keys.
> 4. MVs can't be the target of foreign keys.
> 5. MVs can't have triggers.

Makes sense.

> 9. MVs can't directly be used in a COPY statement, but can be the
> source of data using a SELECT.

Hmmm? I don't understand the reason for this.

> 13. pg_class now has a relisvalid column, which is true if an MV is
> truncated or created WITH NO DATA. You can not scan a relation
> flagged as invalid.

What error would a user see?

> 14. ALTER MATERIALIZED VIEW is supported for the options that seemed
> to make sense. For example, you can change the tablespace or
> schema, but you cannot add or drop column with ALTER.

How would you change the definition of an MV then?

> 16. To get new data into the MV, the command is LOAD MATERIALIZED
> VIEW mat view_name. This seemed more descriptive to me that the
> alternatives and avoids declaring any new keywords beyond
> MATERIALIZED. If the MV is flagged as relisvalid == false, this
> will change it to true.

UPDATE MATERIALIZED VIEW was problematic?

Does LOAD automatically TRUNCATE the view before reloading it? If not,
why not?

> It would be good to have some discussion to try to reach a consensus
> about whether we need to differentiate between *missing* datat (where
> a materialized view which has been loaded WITH NO DATA or TRUNCATEd
> and has not been subsequently LOADed) and potentially *stale* data.
> If we don't care to distinguish between a view which generated no
> rows when it ran and a one for which the query has not been run, we
> can avoid adding the relisvalid flag, and we could support UNLOGGED
> MVs. Perhaps someone can come up with a better solution to that
> problem.

Hmmm. I understand the distinction you're making here, but I'm not sure
it actually matters to the user. MVs, by their nature, always have
potentially stale data. Being empty (in an inaccurate way) is just one
kind of stale data.

It would be nice for the user to have some way to know that a matview is
empty due to never being LOADed or recently being TRUNCATEd. However, I
don't think that relisvalid flag -- and preventing scanning the relation
-- is a good solution. What I'd rather have instead is a timestamp of
when the MV was last LOADed. If the MV was never loaded (or was
truncated) that timestamp would be NULL. Such a timestamp would allow
users to construct all kinds of ad-hoc refresh schemes for MVs which
would not be possible without it.

I don't see how this relates to UNLOGGED matviews either way.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-15 23:35:37
Message-ID: 1353022537.14335.26.camel@sussancws0025
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2012-11-14 at 21:28 -0500, Kevin Grittner wrote:
> Attached is a patch that is still WIP but that I think is getting
> pretty close to completion. It is not intended to be the be-all and
> end-all for materialized views, but the minimum useful feature set --
> which is all that I've had time to do for this release. In
> particular, the view is only updated on demand by a complete rebuild.
> For the next release, I hope to build on this base to allow more
> eager and incremental updates, and perhaps a concurrent batch update.

The documentation says that a materialized view is basically a
create-table-as-select except that it remembers the query. Would you say
that there is a compelling use case for this alone, or is this a
building block for more sophisticated materialized view support (e.g.
eager updating) later?

Regards,
Jeff Davis


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-16 13:13:43
Message-ID: m2fw491yig.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> The documentation says that a materialized view is basically a
> create-table-as-select except that it remembers the query. Would you say
> that there is a compelling use case for this alone, or is this a
> building block for more sophisticated materialized view support (e.g.
> eager updating) later?

The implementation of the re-LOAD'ing command makes it already
worthwile. Bonus point if locking is limited to when the new content is
all computer and ready, but even without that, I want to have it. ;)

I'd bikeshed and prefer the UPDATE MATERIALIZED VIEW nsp.foo; of course.

The alternative is creating a view, a matching table and a stored
procedure that will implement the rebuilding, for each mat view you want
to have. So that's already a big step forward in my eyes.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-16 15:40:51
Message-ID: CAHyXU0zgufXskD71kfRr1C_FHO8C4dFCrcbyArWxKLBpqDW9hQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Nov 16, 2012 at 7:13 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>> The documentation says that a materialized view is basically a
>> create-table-as-select except that it remembers the query. Would you say
>> that there is a compelling use case for this alone, or is this a
>> building block for more sophisticated materialized view support (e.g.
>> eager updating) later?
>
> The implementation of the re-LOAD'ing command makes it already
> worthwile. Bonus point if locking is limited to when the new content is
> all computer and ready, but even without that, I want to have it. ;)

Seconded. Background lock free refresh of materialization table would
be wonderful. But moving dependency between source and materialized
table out of plpgsql function and into defined schema justifies
feature on its own merits.

merlin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Materialized views WIP patch
Date: 2012-11-16 17:02:05
Message-ID: CA+TgmobjhnkRdQkiw6a2tMhvs_a73+BX5VR8FYjr2GaVNKL8=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 15, 2012 at 1:36 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Hmmm. I understand the distinction you're making here, but I'm not sure
> it actually matters to the user. MVs, by their nature, always have
> potentially stale data. Being empty (in an inaccurate way) is just one
> kind of stale data.

This is my feeling also.

> I don't see how this relates to UNLOGGED matviews either way.

Right now, Kevin has things set up so that when you do "TRUNCATE mv",
it clears the relisvalid flag. If we allowed unlogged materialized
views, the table would be automatically truncated on a crash, but
there wouldn't be any way to clear relisvalid in that situation, so
Kevin felt we should simply disable unlogged MVs. Personally, I'm not
excited about having a relisvalid flag at all, and doubly not excited
if it means we can't have unlogged MVs.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Thom Brown <thom(at)linux(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-16 17:14:43
Message-ID: CAA-aLv41_g1jMOLanPCWUzVE5tZaBTSeXoA4xpUENtqu-5caLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 November 2012 02:28, Kevin Grittner <kgrittn(at)mail(dot)com> wrote:

> Attached is a patch that...
>

Got this error:

postgres=# create view v_test as select 1;
CREATE VIEW
postgres=# create materialized view mv_test as select * from v_test;
ERROR: could not open file "base/12064/16425": No such file or directory

--
Thom


From: Antonin Houska <antonin(dot)houska(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Materialized views WIP patch
Date: 2012-11-16 21:02:05
Message-ID: 50A6A9CD.1030702@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

By chance (?) I got similar one today too, when dropping extension:

ERROR: could not open file "base/12623/12548": No such file or directory

I thought something might have gone wrong during Linux upgrade 2 days
ago, but it's not likely that we both have the issue.

I wonder if something is broken in the catalog. The last commit I have
in my environment is

commit 4af3dda13601d859a20425e3554533fde0549056
Author: Peter Eisentraut <peter_e(at)gmx(dot)net>
Date: Sun Oct 28 10:35:46 2012 -0400

Kind regards,
Tony.

On 11/16/2012 06:14 PM, Thom Brown wrote:
> On 15 November 2012 02:28, Kevin Grittner <kgrittn(at)mail(dot)com
> <mailto:kgrittn(at)mail(dot)com>> wrote:
>
> Attached is a patch that...
>
>
> Got this error:
>
> postgres=# create view v_test as select 1;
> CREATE VIEW
> postgres=# create materialized view mv_test as select * from v_test;
> ERROR: could not open file "base/12064/16425": No such file or directory
>
> --
> Thom


From: Marko Tiikkaja <pgmail(at)joh(dot)to>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 00:30:09
Message-ID: 50B2B811.3060300@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Kevin,

On 15/11/2012 03:28, Kevin Grittner wrote:
> Attached is a patch that is still WIP but that I think is getting
> pretty close to completion.

I've been looking at this, but I unfortunately haven't had as much time
as I had hoped for, and have not looked at the code in detail yet. It's
also a relatively big patch, so I wouldn't mind another pair of eyes on it.

I have been testing the patch a bit, and I'm slightly disappointed by
the fact that it still doesn't solve this problem (and I apologize if I
have missed discussion about this in the docs or in this thread):

<assume "foo" is a non-empty materialized view>

T1: BEGIN;
T1: LOAD MATERIALIZED VIEW foo;

T2: SELECT * FROM foo;

T1: COMMIT;

<T2 sees an empty table>

As others have pointed out, replacing the contents of a table is
something which people have been wanting to do for a long time, and I
think having this ability would make this patch a lot better; now it
just feels like syntactic sugar.

> 1. CREATE MATERIALIZED VIEW syntax is stolen directly from CREATE
> TABLE AS, with all the same clauses supported. That includes
> declaring a materialized view to be temporary or unlogged.
> 2. MVs don't support inheritance.
> 3. MVs can't define foreign keys.
> 4. MVs can't be the target of foreign keys.
> 5. MVs can't have triggers.
> 6. Users can't create rules which reference MVs (although MVs
> [ab]use the rules mechanism internally, similar to how views do).
> 7. MVs can't be converted to views, nor vice versa.
> 8. Users may not directly use INSERT/UPDATE/DELETE on an MV.
> 9. MVs can't directly be used in a COPY statement, but can be the
> source of data using a SELECT.
> 10. MVs can't own sequences.
> 11. MVs can't be the target of LOCK statements, although other
> statements get locks just like a table.
> 12. MVs can't use data modifying CTEs in their definitions.
> 13. pg_class now has a relisvalid column, which is true if an MV is
> truncated or created WITH NO DATA. You can not scan a relation
> flagged as invalid.
> 14. ALTER MATERIALIZED VIEW is supported for the options that seemed
> to make sense. For example, you can change the tablespace or
> schema, but you cannot add or drop column with ALTER.
> 16. To get new data into the MV, the command is LOAD MATERIALIZED
> VIEW mat view_name. This seemed more descriptive to me that the
> alternatives and avoids declaring any new keywords beyond
> MATERIALIZED. If the MV is flagged as relisvalid == false, this
> will change it to true.
> 17. Since the data viewed in an MV is not up-to-date with the latest
> committed transaction, it didn't seem to make any sense to try to
> apply SERIALIZABLE transaction semantics to queries looking at
> the contents of an MV, although if LMV is run in a SERIALIZABLE
> transaction the MV data is guaranteed to be free of serialization
> anomalies. This does leave the transaction running the LOAD
> command vulnerable to serialization failures unless it is also
> READ ONLY DEFERRABLE.
> 18. Bound parameters are not supported for the CREATE MATERIALIZED
> VIEW statement.

I believe all of these points have been under discussion, and I don't
have anything to add to the ongoing discussions.

> 19. LMV doesn't show a row count. It wouldn't be hard to add, it just
> seemed a little out of place to do that, when CLUSTER, etc.,
> don't.

This sounds like a useful feature, but your point about CLUSTER and
friends still stands.

> In the long term, we will probably need to separate the
> implementation of CREATE TABLE AS and CREATE MATERIALIZED VIEW, but
> for now there is so little that they need to do differently it seemed
> less evil to have a few "if" clauses that that much duplicated code.

Seems sensible.

I'll get back when I manage to get a better grasp of the code.

Regards,
Marko Tiikkaja


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Marko Tiikkaja <pgmail(at)joh(dot)to>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 13:07:17
Message-ID: CA+TgmoZpvJ+M7ix8B=5EO-jYgAW7Mqn-Weh25drg5raYQM8fAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 25, 2012 at 7:30 PM, Marko Tiikkaja <pgmail(at)joh(dot)to> wrote:
> As others have pointed out, replacing the contents of a table is something
> which people have been wanting to do for a long time, and I think having
> this ability would make this patch a lot better; now it just feels like
> syntactic sugar.

I agree that it's mostly syntactic sugar, but I think we need to have
realistic expectations for what is possible in an initial patch. When
I committed the first patch for foreign data wrappers, it didn't work
at all: it was just syntax support. Tom later committed a follow-on
patch that made them work. Similarly, I split the event trigger patch
into two halves, one of which added the syntax support and the other
of which made them functional: and even with both commits in, I think
it's fair to say that event triggers are still in a fairly primitive
state.

None of those patches were small patches. It's going to take multiple
years to get materialized views up to a state where they're really
useful to a broad audience in production applications, but I don't
think we should sneer at anyone for writing a patch that is "just
syntactic sugar". As it turns out, adding a whole new object type is
a lot of work and generates a big patch even if it doesn't do much
just yet. Rejecting such patches on the grounds that they aren't
comprehensive enough is, IMHO, extremely unwise; we'll either end up
landing even larger patches that are almost impossible to review
comprehensively and therefore more likely to break something, or else
we'll kill the projects outright and end up with nothing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Marko Tiikkaja <pgmail(at)joh(dot)to>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 13:14:42
Message-ID: 50B36B42.7060603@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/26/12 2:07 PM, Robert Haas wrote:
> On Sun, Nov 25, 2012 at 7:30 PM, Marko Tiikkaja <pgmail(at)joh(dot)to> wrote:
>> As others have pointed out, replacing the contents of a table is something
>> which people have been wanting to do for a long time, and I think having
>> this ability would make this patch a lot better; now it just feels like
>> syntactic sugar.
>
> I agree that it's mostly syntactic sugar, but I think we need to have
> realistic expectations for what is possible in an initial patch. When
> I committed the first patch for foreign data wrappers, it didn't work
> at all: it was just syntax support. Tom later committed a follow-on
> patch that made them work. Similarly, I split the event trigger patch
> into two halves, one of which added the syntax support and the other
> of which made them functional: and even with both commits in, I think
> it's fair to say that event triggers are still in a fairly primitive
> state.
>
> None of those patches were small patches. It's going to take multiple
> years to get materialized views up to a state where they're really
> useful to a broad audience in production applications, but I don't
> think we should sneer at anyone for writing a patch that is "just
> syntactic sugar". As it turns out, adding a whole new object type is
> a lot of work and generates a big patch even if it doesn't do much
> just yet. Rejecting such patches on the grounds that they aren't
> comprehensive enough is, IMHO, extremely unwise; we'll either end up
> landing even larger patches that are almost impossible to review
> comprehensively and therefore more likely to break something, or else
> we'll kill the projects outright and end up with nothing.

First of all, I have to apologize. Re-reading the email I sent out last
night, it does indeed feel a bit harsh and I can understand your reaction.

At no point did I mean to belittle Kevin's efforts or the patch itself.
I was mostly looking for Kevin's input on how hard it would be to
solve the particular problem and whether it would be possible to do so
for 9.3.

While I feel like the problem I pointed out is a small caveat and should
be at least documented for 9.3, I think this patch has merits of its own
even if that problem never gets fixed, and I will continue to review
this patch.

Regards,
Marko Tiikkaja


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Marko Tiikkaja <pgmail(at)joh(dot)to>, Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 13:42:05
Message-ID: CAEYLb_Wu8A0Svg1Vqn3+OpQ=Ggqbza+RfRE_LFaM5ZyahRM8Sw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 November 2012 13:07, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> None of those patches were small patches. It's going to take multiple
> years to get materialized views up to a state where they're really
> useful to a broad audience in production applications, but I don't
> think we should sneer at anyone for writing a patch that is "just
> syntactic sugar".

+1. I have a sweet tooth. I don't like it when people criticise
patches on the basis of "obviously you could achieve the same effect
with $CONVOLUTION". Making things simpler is a desirable outcome. Now,
that isn't to say that we should disregard everything or even anything
else in pursuit of simplicity; just that "needing a Ph.D is
Postgresology", as you once put it, to do something routine to many is
really hard to defend.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 14:46:33
Message-ID: 50B380C9.6060101@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/14/12 9:28 PM, Kevin Grittner wrote:
> 17. Since the data viewed in an MV is not up-to-date with the latest
> committed transaction,

So, the way I understand it, in Oracle terms, this feature is a
"snapshot", not a materialized view. Maybe that's what it should be
called then.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Marko Tiikkaja <pgmail(at)joh(dot)to>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 15:12:10
Message-ID: CA+Tgmob73JUeCCTTCYd-avPPnq1nfMeD3pHo8mvUbJYLS4PRog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 26, 2012 at 8:14 AM, Marko Tiikkaja <pgmail(at)joh(dot)to> wrote:
> First of all, I have to apologize. Re-reading the email I sent out last
> night, it does indeed feel a bit harsh and I can understand your reaction.
>
> At no point did I mean to belittle Kevin's efforts or the patch itself. I
> was mostly looking for Kevin's input on how hard it would be to solve the
> particular problem and whether it would be possible to do so for 9.3.
>
> While I feel like the problem I pointed out is a small caveat and should be
> at least documented for 9.3, I think this patch has merits of its own even
> if that problem never gets fixed, and I will continue to review this patch.

OK, no worries. I didn't really interpret your email as belittling; I
just want to make sure this feature doesn't get feature-creeped to
death. I think everyone, including Kevin, understands that the
real-world applicability of v1 is going to be limited and many people
will choose alternative techniques rather than relying on this new
feature. But I also think that we'll never get to a really awesome,
kick-ass feature unless we're willing to commit an initial version
that isn't all that awesome or kick-ass. If I understand Kevin's
goals correctly, the plan is to get this basic version committed for
9.3, and then to try to expand the capability during the 9.4 release
cycle (and maybe 9.5, too, there's a lot of work to do here). I think
that's a pretty sound plan.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 15:19:21
Message-ID: 50B38879.1070307@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 11/26/2012 09:46 AM, Peter Eisentraut wrote:
> On 11/14/12 9:28 PM, Kevin Grittner wrote:
>> 17. Since the data viewed in an MV is not up-to-date with the latest
>> committed transaction,
> So, the way I understand it, in Oracle terms, this feature is a
> "snapshot", not a materialized view. Maybe that's what it should be
> called then.
>
>

If you use Jonathan Gardner's taxonomy at
<http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views>,
snapshots are a subclass of materialized views.

cheers

andrew


From: David Fetter <david(at)fetter(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 15:24:56
Message-ID: 20121126152456.GA31816@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 26, 2012 at 09:46:33AM -0500, Peter Eisentraut wrote:
> On 11/14/12 9:28 PM, Kevin Grittner wrote:
> > 17. Since the data viewed in an MV is not up-to-date with the
> > latest committed transaction,
>
> So, the way I understand it, in Oracle terms, this feature is a
> "snapshot", not a materialized view. Maybe that's what it should be
> called then.

"Snapshot" is one of the options for refreshing an Oracle materialized
view. There are others, which we'll eventually add if past is any
prologue :)

I hate to add to the bike-shedding, but we should probably add REFRESH
SNAPSHOT as an optional piece of the grammar, with more REFRESH
options to come.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 16:02:17
Message-ID: CAEYLb_U_D=gp-DmfmkeRvaQqwO5UZj75y96Z0J3te7puetRnqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 November 2012 15:24, David Fetter <david(at)fetter(dot)org> wrote:
> I hate to add to the bike-shedding, but we should probably add REFRESH
> SNAPSHOT as an optional piece of the grammar, with more REFRESH
> options to come.

I don't know that they should be called materialised views, but do we
really need to overload the word snapshot? I'd just as soon invent a
new word as use the Oracle one, since I don't think the term snapshot
is widely recognised as referring to anything other than snapshot
isolation.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: David Fetter <david(at)fetter(dot)org>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-26 16:05:21
Message-ID: 20121126160521.GD31816@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 26, 2012 at 04:02:17PM +0000, Peter Geoghegan wrote:
> On 26 November 2012 15:24, David Fetter <david(at)fetter(dot)org> wrote:
> > I hate to add to the bike-shedding, but we should probably add
> > REFRESH SNAPSHOT as an optional piece of the grammar, with more
> > REFRESH options to come.
>
> I don't know that they should be called materialised views, but do
> we really need to overload the word snapshot? I'd just as soon
> invent a new word as use the Oracle one, since I don't think the
> term snapshot is widely recognised as referring to anything other
> than snapshot isolation.

I believe that the meaning here is unambiguous, and is used in other
descriptions than Oracle's, including the one on our wiki.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-27 00:35:03
Message-ID: 1353976503.4992.10.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2012-11-26 at 09:46 -0500, Peter Eisentraut wrote:
> On 11/14/12 9:28 PM, Kevin Grittner wrote:
> > 17. Since the data viewed in an MV is not up-to-date with the latest
> > committed transaction,
>
> So, the way I understand it, in Oracle terms, this feature is a
> "snapshot", not a materialized view. Maybe that's what it should be
> called then.

OK, I take everything back and claim the opposite.

In current Oracle, SNAPSHOT is an obsolete alias for MATERIALIZED VIEW.
Materialized views have the option of REFRESH ON DEMAND and REFRESH ON
COMMIT, with the former being the default. So it seems that the syntax
of what you are proposing is in line with Oracle.

I'm not fond of overloading LOAD as the refresh command. Maybe you
could go the Oracle route here as well and use a stored procedure. That
would also allow things like SELECT pg_refresh_mv(oid) FROM ... more
easily.


From: "David Rowley" <dgrowleyml(at)gmail(dot)com>
To: "'Peter Eisentraut'" <peter_e(at)gmx(dot)net>, "'Kevin Grittner'" <kgrittn(at)mail(dot)com>
Cc: "'Pgsql Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-27 04:46:39
Message-ID: 000901cdcc5a$33734e10$9a59ea30$@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of Peter Eisentraut
> Sent: 27 November 2012 13:35
> To: Kevin Grittner
> Cc: Pgsql Hackers
> Subject: Re: [HACKERS] Materialized views WIP patch
>
> On Mon, 2012-11-26 at 09:46 -0500, Peter Eisentraut wrote:
> > On 11/14/12 9:28 PM, Kevin Grittner wrote:
> > > 17. Since the data viewed in an MV is not up-to-date with the latest
> > > committed transaction,
> >
> > So, the way I understand it, in Oracle terms, this feature is a
> > "snapshot", not a materialized view. Maybe that's what it should be
> > called then.
>
> OK, I take everything back and claim the opposite.
>
> In current Oracle, SNAPSHOT is an obsolete alias for MATERIALIZED VIEW.
> Materialized views have the option of REFRESH ON DEMAND and REFRESH
> ON COMMIT, with the former being the default. So it seems that the syntax
> of what you are proposing is in line with Oracle.
>
> I'm not fond of overloading LOAD as the refresh command. Maybe you could
> go the Oracle route here as well and use a stored procedure. That would also
> allow things like SELECT pg_refresh_mv(oid) FROM ... more easily.
>
>
+1 to this.
I can see a use case where you might want to refresh all MVs that are X number of days/hours old. Rather than having to execute statements for each one. Something like pg_refresh_mv() within a query would allow this.

Pretty exciting work Kevin, I understand what Robert said about feature creep etc and agree 100%, but I'm really looking forward to when we can *one day* have the planner make use of an eager MV to optimise a query!

Regards

David Rowley

>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To make
> changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-27 10:25:51
Message-ID: m2ehjf1gwg.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> I'm not fond of overloading LOAD as the refresh command. Maybe you
> could go the Oracle route here as well and use a stored procedure. That
> would also allow things like SELECT pg_refresh_mv(oid) FROM ... more
> easily.

I would like that we have a way to refresh a Materialized View by
calling a stored procedure, but I don't think it should be the main UI.

The wholesale refreshing of a matview appears to me to be comparable to
TRUNCATE is that it's both a DDL and a DML. The incremental refreshing
modes we want to have later are clearly DML only, either on commit
refresh or incrementally on demand.

I would then propose that we use ALTER MATERIALIZED VIEW as the UI for
the wholesale on demand refreshing operation, and UPDATE MATERIALIZED
VIEW as the incremental command (to come later).

So my proposal for the current feature would be:

ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
UPDATE MATERIALIZED VIEW mv;

The choice of keywords and syntax here hopefully clearly hint the user
about the locking behavior of the commands, too. And as we said, the
bare minimum for this patch does *not* include the CONCURRENTLY option,
which we still all want to have (someday). :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-27 10:31:00
Message-ID: CAFj8pRAPLvJRLbuWQa+i4XvpxNA3OAXw=svQFpD5XwJYQGOXTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/11/27 Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> I'm not fond of overloading LOAD as the refresh command. Maybe you
>> could go the Oracle route here as well and use a stored procedure. That
>> would also allow things like SELECT pg_refresh_mv(oid) FROM ... more
>> easily.
>
> I would like that we have a way to refresh a Materialized View by
> calling a stored procedure, but I don't think it should be the main UI.
>
> The wholesale refreshing of a matview appears to me to be comparable to
> TRUNCATE is that it's both a DDL and a DML. The incremental refreshing
> modes we want to have later are clearly DML only, either on commit
> refresh or incrementally on demand.
>
> I would then propose that we use ALTER MATERIALIZED VIEW as the UI for
> the wholesale on demand refreshing operation, and UPDATE MATERIALIZED
> VIEW as the incremental command (to come later).
>
> So my proposal for the current feature would be:
>
> ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
> UPDATE MATERIALIZED VIEW mv;
>
> The choice of keywords and syntax here hopefully clearly hint the user
> about the locking behavior of the commands, too. And as we said, the
> bare minimum for this patch does *not* include the CONCURRENTLY option,
> which we still all want to have (someday). :)
>

+1

Pavel

> Regards,
> --
> Dimitri Fontaine
> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: David Johnston <polobo(at)yahoo(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Kevin Grittner <kgrittn(at)mail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-11-27 14:17:14
Message-ID: 68BBE549-5AB1-4776-BC8F-4ED301C0227B@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Nov 27, 2012, at 5:25, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> wrote:

>
> So my proposal for the current feature would be:
>
> ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
> UPDATE MATERIALIZED VIEW mv;
>
> The choice of keywords and syntax here hopefully clearly hint the user
> about the locking behavior of the commands, too. And as we said, the
> bare minimum for this patch does *not* include the CONCURRENTLY option,
> which we still all want to have (someday). :)
>

I dislike using ALTER syntax to perform a data-only action.

The other advantage of non-functional syntax is that you could more easily supply some form of where clause should you only want to perform a partial refresh. With a function call that becomes more obtuse.

David J.