Re: record identical operator

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: record identical operator
Date: 2013-09-16 20:58:21
Message-ID: 20130916205821.GA313832@tornado.leadboat.com
Lists: pgsql-hackers

On Mon, Sep 16, 2013 at 04:28:23PM +0200, Andres Freund wrote:
> On 2013-09-15 19:49:26 -0400, Noah Misch wrote:
> > Type-specific identity operators seem like overkill, anyway. If we find that
> > meaningless variations in a particular data type are causing too many false
> > non-matches for the generic identity operator, the answer is to make the
> > functions generating datums of that type settle on a canonical form. That
> > would be the solution for your example involving array null bitmaps.
>
> I think that's pretty much unrealistic. I am pretty sure that if either
> of us starts looking we will find about a dozen such cases and
> miss the other dozen. Not to speak about external code which is damn
> likely to contain such cases.

It wouldn't be a problem if we missed cases or declined to update known cases.
The array example probably isn't worth changing. Who's going to repeatedly
flip-flop an array column value between the two representations and then
bemoan the resulting MV write traffic?

> And I think that efficiency will often make such normalization expensive
> (consider postgis where Datums afaik can exist with an internal bounding
> box or without).

If it's so difficult to canonicalize between two supposedly-identical
variations, it's likewise a stretch to trust that all credible operations will
fail to distinguish between those variations.

> I think it's far more realistic to implement an identity operator that
> will fall back to a type specific operator iff equals has "strange"
> properties.

Complicating such efforts, the author of a custom identity operator doesn't
have the last word on functions that process the data type. Take your postgis
example; if a second party adds a has_internal_bounding_box() function, an
identity operator ignoring that facet is no longer valid.

memcmp() has served well for HOT and for _equalConst(); why would it suddenly
fall short for MV maintenance?
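
To make the distinction concrete, here is a minimal sketch in C. Everything in it is hypothetical: ToyGeom, toy_make(), toy_equal(), toy_has_internal_bbox() and toy_image_identical() are illustration-only names, not PostgreSQL or PostGIS APIs, and the flat struct merely stands in for a datum that may or may not carry a cached bounding box.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical toy datum: a point plus an optional cached bounding box.
 * All fields are int32_t so the struct has no padding bytes, which keeps
 * a bytewise comparison well defined.
 */
typedef struct ToyGeom
{
    int32_t has_bbox;   /* 1 if bbox[] is populated, else 0 */
    int32_t x;
    int32_t y;
    int32_t bbox[4];    /* xmin, ymin, xmax, ymax; zeroed when absent */
} ToyGeom;

/* Build a value, with or without the cached box. */
static ToyGeom
toy_make(int32_t x, int32_t y, bool with_bbox)
{
    ToyGeom g;

    memset(&g, 0, sizeof(g));   /* start from a fully zeroed image */
    g.x = x;
    g.y = y;
    if (with_bbox)
    {
        g.has_bbox = 1;
        g.bbox[0] = x;
        g.bbox[1] = y;
        g.bbox[2] = x;
        g.bbox[3] = y;
    }
    return g;
}

/* Type-specific equality: looks only at the logical value. */
static bool
toy_equal(const ToyGeom *a, const ToyGeom *b)
{
    return a->x == b->x && a->y == b->y;
}

/* A function a second party might add later; it exposes the cache flag. */
static bool
toy_has_internal_bbox(const ToyGeom *g)
{
    return g->has_bbox != 0;
}

/* Generic identity: compare the stored images byte for byte. */
static bool
toy_image_identical(const ToyGeom *a, const ToyGeom *b)
{
    return memcmp(a, b, sizeof(ToyGeom)) == 0;
}
```

Given a = toy_make(1, 2, false) and b = toy_make(1, 2, true), toy_equal() reports a match while toy_has_internal_bbox() returns different answers for the two values. An identity test that ignored the cache would call them interchangeable even though a visible function can tell them apart; the memcmp()-based test cannot make that mistake, at the cost of an occasional false non-match.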

nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
