GSoC - Materialized Views - is stale or fresh?

Lists: pgsql-hackers
From: Pavel Baros <baros(dot)p(at)seznam(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: GSoC - Materialized Views - is stale or fresh?
Date: 2010-06-12 14:18:38
Message-ID: 4C13973E.1080902@seznam.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I am curious how could I solve the problem:

During refreshing I would like to know, if MV is stale or fresh? And I
had an idea:

In fact, MV need to know if its last refresh (transaction id) is older
than any INSERT, UPDATE, DELETE transaction launched against source
tables. So if MV has information about last (highest) xmin in source
tables, it could simply compare its own xmin to xmins (xmax for deleted
rows) from source tables and decide, if is stale or fresh.

Whole realization could look like this:
1. Make new column in pg_class (or somewhere in pg_stat* ?):
pg_class.rellastxid (of type xid)

2. After each INSERT, UPDATE, DELETE statement (transaction)
pg_class.rellastxid would be updated. That should not be time- or
memory- consuming (not so much) since pg_class is cached, I guess.

3. When refreshing, as described above, MV rellastxid compared to
source tables rellastxid could answer if MV is stale or still fresh.
Decision, if to run refreshing, would be as simple as it can.

a) Is the idea right?

b) Could appear some cases when it is not true? (except xid wraparound).

c) I was looking for some help with it in pg_stat*, but there is no
information about transactions, last changes in relations or anything.

d) or there are other mechanisms or ideas how to check if MV source
tables are changed from last refresh?

Thanks for replies

Pavel Baros


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Pavel Baros <baros(dot)p(at)seznam(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: GSoC - Materialized Views - is stale or fresh?
Date: 2010-06-14 05:52:25
Message-ID: 4C15C399.4010209@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/06/10 17:18, Pavel Baros wrote:
> I am curious how could I solve the problem:
>
> During refreshing I would like to know, if MV is stale or fresh? And I
> had an idea:
>
> In fact, MV need to know if its last refresh (transaction id) is older
> than any INSERT, UPDATE, DELETE transaction launched against source
> tables. So if MV has information about last (highest) xmin in source
> tables, it could simply compare its own xmin to xmins (xmax for deleted
> rows) from source tables and decide, if is stale or fresh.
>
> Whole realization could look like this:
> 1. Make new column in pg_class (or somewhere in pg_stat* ?):
> pg_class.rellastxid (of type xid)
>
> 2. After each INSERT, UPDATE, DELETE statement (transaction)
> pg_class.rellastxid would be updated. That should not be time- or
> memory- consuming (not so much) since pg_class is cached, I guess.

rellastxid would have to be updated at every insert/update/delete. It
would become a big bottleneck. That's not going to work.

Why do you need to know if a MV is stale?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Pavel Baroš <baros(dot)p(at)seznam(dot)cz>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: GSoC - Materialized Views - is stale or fresh?
Date: 2010-06-14 12:47:16
Message-ID: 4C1624D4.4040103@seznam.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas napsal(a):
> On 12/06/10 17:18, Pavel Baros wrote:
>> I am curious how could I solve the problem:
>>
>> During refreshing I would like to know, if MV is stale or fresh? And I
>> had an idea:
>>
>> In fact, MV need to know if its last refresh (transaction id) is older
>> than any INSERT, UPDATE, DELETE transaction launched against source
>> tables. So if MV has information about last (highest) xmin in source
>> tables, it could simply compare its own xmin to xmins (xmax for deleted
>> rows) from source tables and decide, if is stale or fresh.
>>
>> Whole realization could look like this:
>> 1. Make new column in pg_class (or somewhere in pg_stat* ?):
>> pg_class.rellastxid (of type xid)
>>
>> 2. After each INSERT, UPDATE, DELETE statement (transaction)
>> pg_class.rellastxid would be updated. That should not be time- or
>> memory- consuming (not so much) since pg_class is cached, I guess.
>
> rellastxid would have to be updated at every insert/update/delete. It
> would become a big bottleneck. That's not going to work.
>
> Why do you need to know if a MV is stale?
>

Sorry I did not mention it. If we knew MV is fresh, there is no neeed to
refresh MV and so it would prevent useless rebuilding of MV. So I
thought there is room for saving some work.

Anyway, I realized, this idea do not cover all the cases how to find out
MV is stale or fresh. For example, when updating a row of source table
of MV, that do not participate in MV, in that case refreshing of MV
would be useless too.