Re: Materialized views WIP patch

Lists: pgsql-hackers
From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "Noah Misch" <noah(at)leadboat(dot)com>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marko Tiikkaja" <pgmail(at)joh(dot)to>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2013-01-24 20:14:15
Message-ID: 20130124201416.119080@gmx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch wrote:
> On Thu, Jan 24, 2013 at 01:29:10PM -0500, Kevin Grittner wrote:
>> Noah Misch wrote:
>>> For the benefit of the archives, I note that we almost need not truncate an
>>> unlogged materialized view during crash recovery. MVs are refreshed in a
>>> VACUUM FULL-like manner: fill a new relfilenode, fsync it, and point the MV's
>>> pg_class to that relfilenode. When a crash occurs with no refresh in flight,
>>> the latest refresh had been safely synced. When a crash cuts short a refresh,
>>> the pg_class update will not stick, and the durability of the old heap is not
>>> in doubt. However, non-btree index builds don't have the same property; we
>>> would need to force an immediate sync of the indexes to be safe here. It
>>> would remain necessary to truncate unlogged MVs when recovering a base backup,
>>> which may contain a partially-written refresh that did eventually commit.
>>> Future MV variants that modify the MV in place would also need the usual
>>> truncate on crash.
>>
>> Hmm. That's a very good observation. Perhaps the issue can be
>> punted to a future release where we start adding more incremental
>> updates to them. I'll think on that, but on the face of it, it
>> sounds like the best choice.
>
> That situation is challenging for the same reason pg_class.relisvalid was hard
> to implement for unlogged relations. The startup process doesn't know the
> relkind of the unlogged-relation relfilenodes it cleans. If you can work
> through all that, it's certainly a nice endpoint to not lose unlogged snapshot
> MVs on crash. But I intended the first half of my message as the
> recommendation and the above as a wish for the future.

Well, if I just don't create an init fork for MVs, they are left as
they were on recovery, aren't they? So for 9.3, that solves that
issue, I think. pg_class.relisvald is a separate issue.

>> You might want to ignore the interim work on detecting the new
>> pg_dump dependencies through walking the internal structures. I
>> decided that was heading in a direction which might be
>> unnecessarily fragile and slow; so I tried writing it as a query
>> against the system tables. I'm pretty happy with the results.
>> Here's the query:
>>
>> with recursive w as
> [snip]
>
> Why is the dependency problem of ordering MV refreshes and MV index builds so
> different from existing pg_dump dependency problems?

If mva has indexes and is referenced by mvb, the CREATE statements
are all properly ordered, but you want mva populated and indexed
before you attempt to populate mvb. (Populated to get correct
results, indexed to get them quickly.) We don't have anything else
like that.

>> If we bail on having pg_class.relisvalid, then it will obviously
>> need adjustment.
>
> Even if we don't have the column, we can have the fact of an MV's validity
> SQL-visible in some other way.

Sure, I didn't say we had to abandon the query -- probably just
replace the relisvalid tests with a function call using the oid of
the MV.

-Kevin


From: Noah Misch <noah(at)leadboat(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <pgmail(at)joh(dot)to>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2013-01-24 20:49:54
Message-ID: 20130124204954.GA4877@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 24, 2013 at 03:14:15PM -0500, Kevin Grittner wrote:
> Noah Misch wrote:
> > On Thu, Jan 24, 2013 at 01:29:10PM -0500, Kevin Grittner wrote:
> >> Noah Misch wrote:
> >>> For the benefit of the archives, I note that we almost need not truncate an
> >>> unlogged materialized view during crash recovery. MVs are refreshed in a
> >>> VACUUM FULL-like manner: fill a new relfilenode, fsync it, and point the MV's
> >>> pg_class to that relfilenode. When a crash occurs with no refresh in flight,
> >>> the latest refresh had been safely synced. When a crash cuts short a refresh,
> >>> the pg_class update will not stick, and the durability of the old heap is not
> >>> in doubt. However, non-btree index builds don't have the same property; we
> >>> would need to force an immediate sync of the indexes to be safe here. It
> >>> would remain necessary to truncate unlogged MVs when recovering a base backup,
> >>> which may contain a partially-written refresh that did eventually commit.
> >>> Future MV variants that modify the MV in place would also need the usual
> >>> truncate on crash.
> >>
> >> Hmm. That's a very good observation. Perhaps the issue can be
> >> punted to a future release where we start adding more incremental
> >> updates to them. I'll think on that, but on the face of it, it
> >> sounds like the best choice.
> >
> > That situation is challenging for the same reason pg_class.relisvalid was hard
> > to implement for unlogged relations. The startup process doesn't know the
> > relkind of the unlogged-relation relfilenodes it cleans. If you can work
> > through all that, it's certainly a nice endpoint to not lose unlogged snapshot
> > MVs on crash. But I intended the first half of my message as the
> > recommendation and the above as a wish for the future.
>
> Well, if I just don't create an init fork for MVs, they are left as
> they were on recovery, aren't they? So for 9.3, that solves that
> issue, I think. pg_class.relisvald is a separate issue.

The startup process just looks for init forks, yes. But it's acceptable to
leave the unlogged MV materials alone during *crash* recovery only. When
recovering from a base backup, we once again need an init fork to refresh the
unlogged-MV relations. In turn, we would still need a relisvalid
implementation that copes. This is all solvable, sure, but it looks like a
trip off into the weeds relative to the core aim of this patch.

> > Why is the dependency problem of ordering MV refreshes and MV index builds so
> > different from existing pg_dump dependency problems?
>
> If mva has indexes and is referenced by mvb, the CREATE statements
> are all properly ordered, but you want mva populated and indexed
> before you attempt to populate mvb. (Populated to get correct
> results, indexed to get them quickly.) We don't have anything else
> like that.

Is the REFRESH order just a replay of the CREATE order (with index builds
interspersed), or can it differ?

nm