Re: refresh materialized view concurrently

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: refresh materialized view concurrently
Date: 2013-06-17 12:57:30
Message-ID: 1371473850.55293.YahooMailNeo@web162904.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> On 14.06.2013 19:05, Kevin Grittner wrote:
>> Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY
>> for 9.4 CF1.  The goal of this patch is to allow a refresh
>> without interfering with concurrent reads, using transactional
>> semantics.
>>
>> It is my hope to get this committed during this CF to allow me
>> to focus on incremental maintenance for the rest of the release
>> cycle.
>
> I must say this seems a bit pointless on its own.

I completely disagree.  When I read what people were posting about
the materialized view creation that went into 9.3, there were many
comments by people that they can't use it until the materialized
views can be refreshed without blocking readers.  There is a clear
need for this.  It doesn't do much to advance incremental
maintenance, but it is a much smaller patch which will make
matviews usable by a lot of people who can't use the initial
feature set.

> I didn't understand this error message:
>
> +     if (!foundUniqueIndex)
> +         ereport(ERROR,
> +                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +                  errmsg("concurrent refresh requires a unique index on just columns for all rows of the materialized view")));
> +
>
> What does that mean?

It means that the REFRESH MATERIALIZED VIEW CONCURRENTLY command
cannot be used on a materialized view unless it has at least one
UNIQUE index which is not partial (i.e., there is no WHERE clause)
and is not indexing on an expression (i.e., the index is entirely
on bare column names).  Set logic to do the "diff" is hard to get
right if the tables are not proper sets (i.e., they contain
duplicate rows).  I can see at least three ways it *could* be done,
but all of them are much more complex and significantly slower.
With a UNIQUE index on some set of columns in all rows the correct
guarantees exist to use fast set logic.  It isn't that it's needed
for access; it is needed to provide a guarantee that there is no
row without NULLs that exactly duplicates another row.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2013-06-17 13:09:36 Re: GIN improvements part2: fast scan
Previous Message Alexander Korotkov 2013-06-17 12:56:28 Re: GIN improvements part 3: ordering in index