Re: refresh materialized view concurrently

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: refresh materialized view concurrently
Date: 2013-06-21 09:20:53
Message-ID: CAP7QgmkcGzd=wOs0TZgqszU-taQE6_e7j3MGw4nKzPreoNRwVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:

> Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
> 9.4 CF1. The goal of this patch is to allow a refresh without
> interfering with concurrent reads, using transactional semantics.
>

I spent a few hours to review the patch.

As far as I can tell, the overall approach is as follows.

- create a new temp heap as non-concurrent does, but with ExclusiveLock on
the matview, so that reader wouldn't be blocked
- with this temp table and the matview, query FULL JOIN and extract
difference between original matview and temp heap (via SPI)
- this operation requires unique index for performance reason (or
correctness reason too)
- run ANALYZE on this diff table
- run UPDATE, INSERT and DELETE via SPI, to do the merge
- close these temp heap

If I don't miss something, the requirement for the CONCURRENTLY option is
to allow simple SELECT reader to read the matview concurrently while the
view is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR
UPDATE/SHARE are still blocked. So, I wonder why it is not possible just
to acquire ExclusiveLock on the matview while populating the data and swap
the relfile by taking small AccessExclusiveLock. This lock escalation is
no dead lock hazard, I suppose, because concurrent operation would block
the other at the point ExclusiveLock is acquired, and ExclusiveLock
conflicts AccessExclusiveLock. Then you don't need the complicated SPI
logic or unique key index dependency.

Assuming I'm asking something wrong and going for the current approach,
here's what I've found in the patch.

- I'm not sure if unique key index requirement is reasonable or not,
because users only add CONCURRENTLY option and not merging or incremental
update.
- This could be an overflow in diffname buffer.
+ quoteRelationName(tempname, tempRel);
+ strcpy(diffname, tempname);
+ strcpy(diffname + strlen(diffname) - 1, "_2\"");
- As others pointed out, quoteOneName can be replaced with quote_identifier
- This looks harmless, but I wonder if you need to change relkind.

*** 665,672 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
OldHeap->rd_rel->relowner,
OldHeapDesc,
NIL,
! OldHeap->rd_rel->relkind,
! OldHeap->rd_rel->relpersistence,
false,
RelationIsMapped(OldHeap),
true,
--- 680,687 ----
OldHeap->rd_rel->relowner,
OldHeapDesc,
NIL,
! RELKIND_RELATION,
! relpersistence,
false,
RelationIsMapped(OldHeap),
true,

Since OldHeap->rd_rel->relkind has been working with 'm', too, not only 'r'?
- I found two additional parameters on make_new_heap ugly, but couldn't
come up with better solution. Maybe we can pass Relation of old heap to
the function instead of Oid..

That's about it from me.

Thanks,
--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hitoshi Harada 2013-06-21 09:32:01 Re: Support for RANGE ... PRECEDING windows in OVER
Previous Message Dimitri Fontaine 2013-06-21 09:20:35 Re: updated emacs configuration