Re: refresh materialized view concurrently

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: refresh materialized view concurrently
Date: 2013-07-09 07:43:40
Message-ID: CAP7Qgm=jb3xkzQXfGtX9STx8fzd8EDDQ-oJ8ekcyeOud+yLCoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jul 6, 2013 at 9:20 AM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
>
>> Oops!
>
> Indeed. Thanks for the careful testing.
>
>> drop materialized view if exists mv;
>> drop table if exists foo;
>> create table foo(a, b) as values(1, 10);
>> create materialized view mv as select * from foo;
>> create unique index on mv(a);
>> insert into foo select * from foo;
>> refresh materialized view mv;
>> refresh materialized view concurrently mv;
>>
>> test=# refresh materialized view mv;
>> ERROR: could not create unique index "mv_a_idx"
>> DETAIL: Key (a)=(1) is duplicated.
>> test=# refresh materialized view concurrently mv;
>> REFRESH MATERIALIZED VIEW
>
> Fixed by scanning the temp table for duplicates before generating
> the diff:
>
> test=# refresh materialized view concurrently mv;
> ERROR: new data for "mv" contains duplicate rows without any NULL columns
> DETAIL: Row: (1,10)
>

I think the point is not check the duplicate rows. It is about unique
key constraint violation. So, if you change the original table foo as
values(1, 10), (1, 20), the issue is still reproduced. In
non-concurrent operation it is checked by reindex_index when swapping
the heap, but apparently we are not doing constraint check in
concurrent mode.

>> [ matview with all columns covered by unique indexes fails ]
>
> Fixed.
>
>> Other than these, I've found index is opened with NoLock, relying
>> on ExclusiveLock of parent matview, and ALTER INDEX SET
>> TABLESPACE or something similar can run concurrently, but it is
>> presumably safe. DROP INDEX, REINDEX would be blocked by the
>> ExclusiveLock.
>
> Since others were also worried that an index definition could be
> modified while another process is holding an ExclusiveLock on its
> table, I changed this.
>

OK, others are resolved. One thing I need to apology
make_temptable_name_n, because I pointed the previous coding would be
a potential memory overrun, but actually the relation name is defined
by make_new_heap, so unless the function generates stupid long name,
there is not possibility to make such big name that overruns
NAMEDATALEN. So +1 for revert make_temptable_name_n, which is also
meaninglessly invented.

I've found another issue, which is regarding
matview_maintenance_depth. If SPI calls failed (pg_cancel_backend is
quite possible) and errors out in the middle of processing, this value
stays above zero, so subsequently we can issue DML on the matview. We
should make sure the value becomes zero before jumping out from this
function.

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Farina 2013-07-09 07:46:11 Re: [9.4 CF] Free VMs for Reviewers & Testers
Previous Message Jesper Krogh 2013-07-09 07:24:49 Re: [HACKERS] [9.4 CF] Free VMs for Reviewers & Testers