Re: Materialized views WIP patch

From: Noah Misch <noah(at)leadboat(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <pgmail(at)joh(dot)to>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2013-02-16 19:46:45
Message-ID: 20130216194645.GA31264@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Sat, Feb 16, 2013 at 09:53:14AM -0800, Kevin Grittner wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Fri, Feb 15, 2013 at 8:01 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> >> There is one odd aspect to pg_dump, but I think the way it is
> >> behaving is the best way to handle it, although I invite other
> >> opinions.  If you load from pg_dump output, it will try to
> >> populated materialized views which were populated on dump, and
> >> leave the ones which were not scannable because the contents had
> >> not been generated in an empty and unscannable state on restore.
> >> That much seems pretty obvious.  Where it gets  a little tricky is
> >> if mva is generated with data, and mvb is generated based on mva.
> >> Then mva is truncated.  Then you dump.  mvb was populated at the
> >> time of the dump, but its contents can't be regenerated on restore
> >> because mva is not scannable.  As the patch currently stands, you
> >> get an error on the attempt to REFRESH mvb.  I think that's a good
> >> thing, but I'm open to arguments to the contrary.
> >
> > Hmm, anything that means a dump-and-restore can fail seems like a bad
> > thing to me.  There's nothing outrageous about that scenario.  It's
> > arguable what state dump-and-restore should leave the new database in,
> > but I don't see why it shouldn't work.  I predict we'll end up with
> > unhappy users if we leave it like this.

I agree that making the dump fail on this account is bad.

> Keeping in mind that mva may take hours to refresh, and mvb may
> take only minutes once you have the data from mva, what behavior do
> you think is preferable?
>
> The alternatives I can think of are:
>
> (1)  Force mva to refresh on restore, even though it was empty and
> not scannable on dump.  This may delay completion of the restore
> for an extended time.  It would leave both mva and mvb populated.

This is reasonable. If the user doesn't like it, he can presumably use an
edited dump list to remove the REFRESHes.

> (2)  Populate mvb by using mva's query as a regular view.  This
> would leave things in the same state as they were on dump, and
> might possibly optimized to something faster than generating mva
> and then mvb; but probably would not be much faster in most cases.

Interesting idea, but I don't think adding novel server behavior is justified
to achieve this.

> (3)  Change the failure to generate data for mvb in this case as a
> WARNING rather than an ERROR.

This is also fair. However, I think it's better to restore more valid MVs
(option 1) than fewer.

> (4)  Actually dump and restore data with COPY statements for
> materialized views, rather than having the dump create REFRESH
> statements.  The main down side of this, it seems to me, is that it
> opens up materialized views to direct tinkering of contents via SQL
> statements, which I was hoping to avoid.  Perhaps this can be
> mitigated in some way.

This is a door better left closed.

Overall, I recommend option 1. None of the options will furnish the desire of
every database, but the DBA can always tailor the outcome by replacing the
dumped REFRESH statements with his own. I'm not envisioning that MVs left
invalid for the long term will be a typical thing, anyway.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Bruce Momjian 2013-02-16 23:52:55 pgsql: Warn about initdb using mount-points
Previous Message Kevin Grittner 2013-02-16 18:00:34 Re: Materialized views WIP patch

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2013-02-16 19:55:26 Re: JSON Function Bike Shedding
Previous Message Pavel Stehule 2013-02-16 19:08:23 Re: is it bug? - printing boolean domains in sql/xml function