Re: Exclusion Constraints on Arrays?

Lists: pgsql-hackers
From: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Exclusion Constraints on Arrays?
Date: 2012-05-13 04:12:11
Message-ID: C7E9B9D1-40B5-4E34-958F-C2E58517AEB2@justatheory.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hackers,

I need a constraint that ensures that a text[] column has only unique values -- that is, that there is no overlap of values between rows. I thought this was a made-to-order for an exclusion constraint. So I tried it:

david=# create table tags (names text[] primary key, exclude using gist (names WITH &&));NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "tags_pkey" for table "tags"
ERROR: data type text[] has no default operator class for access method "gist"
HINT: You must specify an operator class for the index or define a default operator class for the data type.

Rats! It looks like there is only a gin operator family for arrays, not gist, and exclusion constraints support only gist indexes, and I couldn't find an operator class, either. Have I missed something, in my (likely) ignorance? Or are there perhaps some types to consider modifying to support exclusion constraints?

Thanks,

David


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-13 22:45:26
Message-ID: CA+TgmobZhfRJNyz-fyw5kDtRurK0HjWP0vtP5fGZLE6eVSWCQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, May 13, 2012 at 12:12 AM, David E. Wheeler
<david(at)justatheory(dot)com> wrote:
> I need a constraint that ensures that a text[] column has only unique values -- that is, that there is no overlap of values between rows. I thought this was a made-to-order for an exclusion constraint. So I tried it:
>
> david=# create table tags (names text[] primary key, exclude using gist (names WITH &&));NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "tags_pkey" for table "tags"
> ERROR:  data type text[] has no default operator class for access method "gist"
> HINT:  You must specify an operator class for the index or define a default operator class for the data type.
>
> Rats! It looks like there is only a gin operator family for arrays, not gist, and exclusion constraints support only gist indexes, and I couldn't find an operator class, either. Have I missed something, in my (likely) ignorance? Or are there perhaps some types to consider modifying to support exclusion constraints?

Hmm, it looks like GIN can't support exclusive constraints because
amgettuple support is required, and unfortunately that got remove for
GIN in this commit:

commit ff301d6e690bb5581502ea3d8591a1600fd87acc
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Tue Mar 24 20:17:18 2009 +0000

Implement "fastupdate" support for GIN indexes, in which we try to
accumulate
multiple index entries in a holding area before adding them to the
main index
structure. This helps because bulk insert is (usually) significantly faster
than retail insert for GIN.

This patch also removes GIN support for amgettuple-style index scans. The
API defined for amgettuple is difficult to support with fastupdate, and
the previously committed partial-match feature didn't really work with
it either. We might eventually figure a way to put back amgettuple
support, but it won't happen for 8.4.

Code comments explain the problem in more detail:

/*
* First, scan the pending list and collect any matching
entries into the
* bitmap. After we scan a pending item, some other
backend could post it
* into the main index, and so we might visit it a second time
during the
* main scan. This is okay because we'll just re-set the same
bit in the
* bitmap. (The possibility of duplicate visits is a
major reason why GIN
* can't support the amgettuple API, however.) Note that it would not do
* to scan the main index before the pending list, since concurrent
* cleanup could then make us miss entries entirely.
*/
scanPendingInsert(scan, tbm, &ntids);

It seems like maybe we could work around this by remembering the
contents of the pending list throughout the scan. Every time we hit a
TID while scanning the main index, we check whether we already
returned it from the pending list; if so, we skip it, but if not, we
return it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-22 16:28:40
Message-ID: 708C6F18-D550-4FB4-BADD-10FD7625D413@justatheory.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On May 13, 2012, at 3:45 PM, Robert Haas wrote:

> It seems like maybe we could work around this by remembering the
> contents of the pending list throughout the scan. Every time we hit a
> TID while scanning the main index, we check whether we already
> returned it from the pending list; if so, we skip it, but if not, we
> return it.

Should this go onto the To-Do list, then?

Best,

David


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-22 16:42:53
Message-ID: CA+TgmoYFyQr7qkK6wdpJMXv1jO0yLoOSeVLZqmcUsFDayEDBrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, May 22, 2012 at 12:28 PM, David E. Wheeler
<david(at)justatheory(dot)com> wrote:
> On May 13, 2012, at 3:45 PM, Robert Haas wrote:
>
>> It seems like maybe we could work around this by remembering the
>> contents of the pending list throughout the scan.  Every time we hit a
>> TID while scanning the main index, we check whether we already
>> returned it from the pending list; if so, we skip it, but if not, we
>> return it.
>
> Should this go onto the To-Do list, then?

If someone other than me can confirm that it's not a stupid approach,
I would say yes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-22 16:56:13
Message-ID: 3152.1337705773@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, May 22, 2012 at 12:28 PM, David E. Wheeler
> <david(at)justatheory(dot)com> wrote:
>> On May 13, 2012, at 3:45 PM, Robert Haas wrote:
>>> It seems like maybe we could work around this by remembering the
>>> contents of the pending list throughout the scan. Every time we hit a
>>> TID while scanning the main index, we check whether we already
>>> returned it from the pending list; if so, we skip it, but if not, we
>>> return it.

>> Should this go onto the To-Do list, then?

> If someone other than me can confirm that it's not a stupid approach,
> I would say yes.

It seems probably workable given that we expect the pending list to be
of fairly constrained size. However, the commit message referenced
upthread also muttered darkly about GIN's partial match logic not working
in amgettuple. I do not recall the details of that issue, but unless we
can solve that one too, there's not much use in fixing this one.

regards, tom lane


From: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-22 18:58:04
Message-ID: 4DC8D284-05CF-4E3D-9670-AC9A32C37A36@justatheory.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On May 22, 2012, at 9:56 AM, Tom Lane wrote:

> It seems probably workable given that we expect the pending list to be
> of fairly constrained size. However, the commit message referenced
> upthread also muttered darkly about GIN's partial match logic not working
> in amgettuple. I do not recall the details of that issue, but unless we
> can solve that one too, there's not much use in fixing this one.

Well, what about a GiST operator family/class for arrays?

Best,

David


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-23 17:34:01
Message-ID: CAEYLb_VDAcedms-hvUpCLWq94P+omNnL_K+3YMv6j_0=v-81YA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13 May 2012 23:45, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Hmm, it looks like GIN can't support exclusive constraints because
> amgettuple support is required, and unfortunately that got remove for
> GIN in this commit:
>
> commit ff301d6e690bb5581502ea3d8591a1600fd87acc

That seems very unfortunate, given that the 9.0 release notes say:

"Exclusion constraints generalize uniqueness constraints by allowing
arbitrary comparison operators, not just equality....This is useful
for time periods and other ranges, as well as arrays."

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-23 17:35:24
Message-ID: CA+TgmoakBdj-i8DfXJKndpz=Jhut2ApWWTzVpcDSqHk3gsYidw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, May 23, 2012 at 1:34 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> On 13 May 2012 23:45, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Hmm, it looks like GIN can't support exclusive constraints because
>> amgettuple support is required, and unfortunately that got remove for
>> GIN in this commit:
>>
>> commit ff301d6e690bb5581502ea3d8591a1600fd87acc
>
> That seems very unfortunate, given that the 9.0 release notes say:
>
> "Exclusion constraints generalize uniqueness constraints by allowing
> arbitrary comparison operators, not just equality....This is useful
> for time periods and other ranges, as well as arrays."

I objected to it at the time, but lost the argument.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-23 17:38:33
Message-ID: 53A5174B-13D6-411B-9F5A-1EEF312D2EE2@justatheory.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On May 23, 2012, at 10:35 AM, Robert Haas wrote:

>> "Exclusion constraints generalize uniqueness constraints by allowing
>> arbitrary comparison operators, not just equality....This is useful
>> for time periods and other ranges, as well as arrays."
>
> I objected to it at the time, but lost the argument.

It does rather seem to be a mistake to say that they can be used with arrays when they cannot. Or can they in some other way I haven’t noticed?

David