Re: how can i prevent materialized views from refreshing during pg_restore

Lists: pgsql-hackers
From: "Kirk Roybal" <kirk(at)webfinish(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: how can i prevent materialized views from refreshing during pg_restore
Date: 2014-06-26 14:36:46
Message-ID: 39bd13e3689572a6fb2bdc1581c420f7.squirrel@apex.websiteinc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I asked a question over on StackOverflow, and Craig Ringer told me to
report it here.

http://stackoverflow.com/questions/24413161/how-can-i-prevent-materialized-views-from-refreshing-during-pg-restore

I have created a dump of the database using pg_dump in "custom" format
(-Fc). This format allows for pg_restore to be invoked with the "jobs"
option (-j8). The jobs options starts 8 processes, and restores the vast
majority of relations in my database within 10 minutes.

I'm left with 4 processes. One of them is the refresh of a materialized
view, and the other 3 are indexes to be applied to 3 tables that the
materialized view uses as data sources. The indexes are "waiting"
according to pg_stat_activity, presumably because the REFRESH of the
materialized view is still accessing the source tables.

When the indexes are in place, the refresh of the view only takes a couple
of minutes. Because the indexes are not in place during the REFRESH, I cut
the REFRESH process off at 17 hours, which made pg_restore fail.

How can I

Force the order of items so the indexes get created first
Turn off the refresh of the materialized view and do it manually later
Manipulate the dump file in custom format to say "WITH NO DATA"
Intercept the REFRESH MATERIALIZED VIEW statement and throw it in the
trash

Or any other solution that gets the job done?

I have a dump file that I'm willing to send to somebody that seems to
reproduce the problem pretty consistently.

Thank you,
/Kirk


From: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: how can i prevent materialized views from refreshing during pg_restore
Date: 2014-06-26 14:49:17
Message-ID: 1403794157042-5809367.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

bithead wrote
> I asked a question over on StackOverflow, and Craig Ringer told me to
> report it here.
>
> http://stackoverflow.com/questions/24413161/how-can-i-prevent-materialized-views-from-refreshing-during-pg-restore
>
> I have created a dump of the database using pg_dump in "custom" format
> (-Fc). This format allows for pg_restore to be invoked with the "jobs"
> option (-j8). The jobs options starts 8 processes, and restores the vast
> majority of relations in my database within 10 minutes.
>
> I'm left with 4 processes. One of them is the refresh of a materialized
> view, and the other 3 are indexes to be applied to 3 tables that the
> materialized view uses as data sources. The indexes are "waiting"
> according to pg_stat_activity, presumably because the REFRESH of the
> materialized view is still accessing the source tables.
>
> When the indexes are in place, the refresh of the view only takes a couple
> of minutes. Because the indexes are not in place during the REFRESH, I cut
> the REFRESH process off at 17 hours, which made pg_restore fail.
>
> How can I
>
> Force the order of items so the indexes get created first
> Turn off the refresh of the materialized view and do it manually later
> Manipulate the dump file in custom format to say "WITH NO DATA"
> Intercept the REFRESH MATERIALIZED VIEW statement and throw it in the
> trash
>
> Or any other solution that gets the job done?
>
> I have a dump file that I'm willing to send to somebody that seems to
> reproduce the problem pretty consistently.

Have/can you try the "-l (el) & -L" options to pg_restore?

http://www.postgresql.org/docs/9.3/static/app-pgrestore.html

(example of usage is toward the bottom of the page)

Basically re-order the command sequence so that the materialize runs as late
as possible, or just disable it altogether.

pg_dump/pg_restore should be taught to handle this better, which is the main
reason why Craig had you post here ASAP, but to get it functional for now
manual intervention will be necessary. In theory the "listing" capabilities
should allow you to do what you need.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/how-can-i-prevent-materialized-views-from-refreshing-during-pg-restore-tp5809364p5809367.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: "Kirk Roybal" <kirk(at)webfinish(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: how can i prevent materialized views from refreshing during pg_restore
Date: 2014-06-26 15:03:27
Message-ID: 31c1d5283d05a4e1cbeb2def30d5bbfd.squirrel@apex.websiteinc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I think this (pg_restore -l | pg_restore -L) will get me where I need to
go for now by inserting a small shell script in between that pushes the
materialized views to the end of the list, but then I will also have to
manage my own dependencies for the items that I re-sort (MatViews of
MatViews).
This pretty seriously limits the usefulness of materialized views for me.
For version 9.3.x, I'm likely to require MatView dependencies no more than
1 deep.
Thanks for the answer, I had no solution before that.

/Kirk

> bithead wrote
>> I asked a question over on StackOverflow, and Craig Ringer told me to
>> report it here.
>>
>> http://stackoverflow.com/questions/24413161/how-can-i-prevent-materialized-views-from-refreshing-during-pg-restore
>>
>> I have created a dump of the database using pg_dump in "custom" format
>> (-Fc). This format allows for pg_restore to be invoked with the "jobs"
>> option (-j8). The jobs options starts 8 processes, and restores the vast
>> majority of relations in my database within 10 minutes.
>>
>> I'm left with 4 processes. One of them is the refresh of a materialized
>> view, and the other 3 are indexes to be applied to 3 tables that the
>> materialized view uses as data sources. The indexes are "waiting"
>> according to pg_stat_activity, presumably because the REFRESH of the
>> materialized view is still accessing the source tables.
>>
>> When the indexes are in place, the refresh of the view only takes a
>> couple
>> of minutes. Because the indexes are not in place during the REFRESH, I
>> cut
>> the REFRESH process off at 17 hours, which made pg_restore fail.
>>
>> How can I
>>
>> Force the order of items so the indexes get created first
>> Turn off the refresh of the materialized view and do it manually
>> later
>> Manipulate the dump file in custom format to say "WITH NO DATA"
>> Intercept the REFRESH MATERIALIZED VIEW statement and throw it in
>> the
>> trash
>>
>> Or any other solution that gets the job done?
>>
>> I have a dump file that I'm willing to send to somebody that seems to
>> reproduce the problem pretty consistently.
>
> Have/can you try the "-l (el) & -L" options to pg_restore?
>
> http://www.postgresql.org/docs/9.3/static/app-pgrestore.html
>
> (example of usage is toward the bottom of the page)
>
> Basically re-order the command sequence so that the materialize runs as
> late
> as possible, or just disable it altogether.
>
> pg_dump/pg_restore should be taught to handle this better, which is the
> main
> reason why Craig had you post here ASAP, but to get it functional for now
> manual intervention will be necessary. In theory the "listing"
> capabilities
> should allow you to do what you need.
>
> David J.
>
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/how-can-i-prevent-materialized-views-from-refreshing-during-pg-restore-tp5809364p5809367.html
> Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
--bithead--