Re: record identical operator

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Steve Singer <steve(at)ssinger(dot)info>, Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: record identical operator
Date: 2013-09-20 14:51:46
Message-ID: 20130920145146.GB2706@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

* Kevin Grittner (kgrittn(at)ymail(dot)com) wrote:
> The result of this is that with the patch, an incremental refresh
> of a matview will always match what the query returned at the time
> it was run (there is no *correctness* problem) but if someone uses
> a query with non-deterministic results they may have a lot of
> activity on a concurrent refresh even if there was no change to the
> underlying data -- so you could have a *performance* penalty in
> cases where the query returns something different, compared to
> leaving the old "equal but not the same" results.

You mean 'at the time of the incremental refresh', right? Yet that may
or may not match running that query directly by an end-user as the plan
that a user might get for the entire query could be different than what
is run for an incremental update, or due to statistics updates, etc.

> > Consider a GROUP BY with a citext column as one of the key fields.
> > You're going to get whatever value the aggregate happened to come across
> > first when building the HashAgg.  Having the binary equality operator
> > doesn't help that and seems like it could even result in change storms
> > happening due to a different plan when the actual data didn't change.
>
> Yup.  A person who wants to GROUP BY a citext value for a matview
> might want to fold it to a consistent capitalization to avoid that
> issue.

I'm trying to figure out why that's a perfectly acceptable solution for
users running views with GROUP BYs, but apparently it isn't sufficient
for mat views? In other words, you would suggest telling users "sorry,
you can't rely on the value returned by a GROUP BY on that citext column
using a normal view, but we're going to try and do better for mat
views".

I don't intend the above to imply that we should never update values in
mat views when we can do so in a reliable way to ensure that the value
matches what a view would return. This matches our notion of UPDATE,
where we will still UPDATE a value even if the old value and the new
value are equal according to the type's equality operator, when the
conditional for the UPDATE is using a reliable type (eg: integer).

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-09-20 14:55:06 Re: record identical operator
Previous Message Andres Freund 2013-09-20 14:47:24 Re: Freezing without write I/O