Re: Exclusion Constraints on Arrays?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Exclusion Constraints on Arrays?
Date: 2012-05-13 22:45:26
Message-ID: CA+TgmobZhfRJNyz-fyw5kDtRurK0HjWP0vtP5fGZLE6eVSWCQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, May 13, 2012 at 12:12 AM, David E. Wheeler
<david(at)justatheory(dot)com> wrote:
> I need a constraint that ensures that a text[] column has only unique values -- that is, that there is no overlap of values between rows. I thought this was a made-to-order for an exclusion constraint. So I tried it:
>
> david=# create table tags (names text[] primary key, exclude using gist (names WITH &&));NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "tags_pkey" for table "tags"
> ERROR:  data type text[] has no default operator class for access method "gist"
> HINT:  You must specify an operator class for the index or define a default operator class for the data type.
>
> Rats! It looks like there is only a gin operator family for arrays, not gist, and exclusion constraints support only gist indexes, and I couldn't find an operator class, either. Have I missed something, in my (likely) ignorance? Or are there perhaps some types to consider modifying to support exclusion constraints?

Hmm, it looks like GIN can't support exclusive constraints because
amgettuple support is required, and unfortunately that got remove for
GIN in this commit:

commit ff301d6e690bb5581502ea3d8591a1600fd87acc
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Tue Mar 24 20:17:18 2009 +0000

Implement "fastupdate" support for GIN indexes, in which we try to
accumulate
multiple index entries in a holding area before adding them to the
main index
structure. This helps because bulk insert is (usually) significantly faster
than retail insert for GIN.

This patch also removes GIN support for amgettuple-style index scans. The
API defined for amgettuple is difficult to support with fastupdate, and
the previously committed partial-match feature didn't really work with
it either. We might eventually figure a way to put back amgettuple
support, but it won't happen for 8.4.

Code comments explain the problem in more detail:

/*
* First, scan the pending list and collect any matching
entries into the
* bitmap. After we scan a pending item, some other
backend could post it
* into the main index, and so we might visit it a second time
during the
* main scan. This is okay because we'll just re-set the same
bit in the
* bitmap. (The possibility of duplicate visits is a
major reason why GIN
* can't support the amgettuple API, however.) Note that it would not do
* to scan the main index before the pending list, since concurrent
* cleanup could then make us miss entries entirely.
*/
scanPendingInsert(scan, tbm, &ntids);

It seems like maybe we could work around this by remembering the
contents of the pending list throughout the scan. Every time we hit a
TID while scanning the main index, we check whether we already
returned it from the pending list; if so, we skip it, but if not, we
return it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2012-05-13 23:17:59 Why do we still have commit_delay and commit_siblings?
Previous Message Noah Misch 2012-05-13 20:52:31 Update comments for PGPROC/PGXACT split