Re: Materialized views WIP patch

From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "Marko Tiikkaja" <pgmail(at)joh(dot)to>
Cc: "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2012-12-03 16:31:52
Message-ID: 20121203163152.69320@gmx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Marko Tiikkaja wrote:
> Kevin Grittner <kgrittn(at)mail(dot)com> wrote:
>> Marko Tiikkaja wrote:
>>> <T2 sees an empty table>
>>
>> As far as I know you are the first to notice this behavior.
>> Thanks for pointing it out.
>>
>> I will take a look at the issue; I don't know whether it's
>> something small I can address in this CF or whether it will need
>> to be in the next CF, but I will fix it.
>
> Any news on this front?

On a preliminary look, I think that it's because when the new heap
is visible, it has been populated with rows containing an xmin too
new to be visible to the transaction which was blocked. I'll see if
I can't populate the new heap with tuples which are already frozen
and hinted, which will not only fix this bug, but improve
performance.

>>> I'll get back when I manage to get a better grasp of the code.
>
> The code looks relatively straightforward and good to my eyes. It
> passes my testing and looks to be changing all the necessary
> parts of the code.

Thanks. I'll put together a new version of the patch based on
feedback so far. I need to finish my testing of the fklocks patch
and post a review first, so it will be a few days.

>> Keep in mind that the current behavior of behaving like a
>> regular view when the contents are invalid is not what I had in
>> mind, that was an accidental effect of commenting out the body
>> of the ExecCheckRelationsValid() function right before posting
>> the patch because I noticed a regression. When I noticed current
>> behavior, it struck me that someone might prefer it to the
>> intended behavior of showing an error like this:
>>
>> ereport(ERROR,
>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>> errmsg("materialized view \"%s\" has not been populated",
>> get_rel_name(rte->relid)),
>> errhint("Use the LOAD MATERIALIZED VIEW command.")));
>>
>> I mention it in case someone wants to argue for silently
>> behaving as a regular view when the MV is not populated.
>
> FWIW, I'd prefer an error in this case, but I don't feel strongly
> about it.

Me, too. Since nobody else has spoken up, that's what I'll do.

I haven't heard anyone argue for keeping temporary materialized
views, so those will be dropped from the next version of the patch.

It sounds like most people prefer REFRESH to LOAD. If there is no
further discussion of that, I will make that change in the next
version of the patch; so if you want to argue against adding a new
unreserved keyword for that, now is the time to say so. Also,
should it be just?:

REFRESH matview_name;

or?:

REFRESH MATERIALIZED VIEW matview_name;

I will look at heap_inplace_update() where appropriate for the
pg_class changes.

On the issue of UHLOGGED MVs, I think it's pretty clear that the
right way to handle this is to have something in the init fork of
an unlogged MV that indicates that it is in an invalid state, which
is checked when the MV is opened. I'm not sure of the best way to
store that, but my first instinct would be to put it inot the
"special space" of the initial page, which would then be removed
after forcing the relisvalid flag to false. That feels pretty
kludgy, but its the best idea I've had so far. Anyone else want to
suggest a better way?

I will add regression tests in the next version of the patch.

While I agree that tracking one or more timestamps related to MV
"freshness" is a good idea, that seems like material for a
follow-on patch.

I don't agree that never having been loaded is a form of "stale",
in spite of multiple people whom I hold in high regard expressing
that opinion, so barring strong opposition I intend to keep the
relisvalid column. I just don't feel comfortable about the
fundamental correctness of the feature without it. If I'm reading
things correctly, so far the opposition has been based on being
lukewarm on the value of the column and wanting other features
which might be harder with this column (i.e., UNLOGGED) or which
are seen as alternatives to this column (i.e., a "freshness"
timestamp) rather than actual opposition to throwing an error on an
attempt to use an MV which has not been loaded with data.

I will fix the commented-out test of relisvalid.

I still haven't quite figured out what to do about the case Thom
uncovered where the MV's SELECT was just selecting a literal. I
haven't established what the most general case of that is, nor what
a sensible behavior is. Any opinions or insights welcome.

I didn't see any rebuttal to Tom's concerns about adding
asymetrical COPY support, so I will leave that part as is.

I will try to fill out the dependencies in pg_dump so that MVs
which depend on other MVs will be dumped in the right order.

I think that will probably be enough for the next version.

-Kevin

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2012-12-03 16:59:26 Re: [v9.3] Row-Level Security
Previous Message Tom Lane 2012-12-03 16:31:32 Re: Patch for removng unused targets