@> and <@ (contains and is contained by) operations on large arrays

From: Allan Kamau <kamauallan(at)gmail(dot)com>
To: Postgres General Postgres General <pgsql-general(at)postgresql(dot)org>
Cc: kamauallan(at)gmail(dot)com
Subject: @> and <@ (contains and is contained by) operations on large arrays
Date: 2010-04-10 09:54:46
Message-ID: p2lab1ea6541004100254v3c6240ees478181b07d9c3b3a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I am performing some array membership operations ( namely @> or <@ )
on large arrays.

One of the arrays in this pair of arrays being compared is contained
in a database field. The other array of this pair will be dynamically
generated from an array intersection activity in another part of the
larger query. I would like to improve performance by somehow avoiding
a sequential table scan for the given array field of all the records
in this table while there is a possibility that not all these records
will produce positive matches.

I would like to index this column of array type, however I get the
“ERROR: index row requires 8776 bytes, maximum size is 8191” error
indicating that one or more of the records may have exceeded the
maximum allowed index size.

Is there a work-around?
If I breakup the field into multiple records, it seems quite
challenging use the “contained in” or “contains” operations.

Or maybe (I could be seriously wrong here) assuming tsearch will allow
for large values, I could use full text search where I could convert
the “contains” array field to tsvector and the array which I am
searching for “containment” I could convert it to tsquery as follows.

SELECT ARRAY['1','2','3','7']@>ARRAY['2','7'];

to

SELECT tsvector(array_to_string(ARRAY['1','2','3','7'],'
'))@@tsquery(array_to_string(ARRAY['2','7'],' & '));

Allan.

Browse pgsql-general by date

  From Date Subject
Next Message Alban Hertroys 2010-04-10 10:51:41 Re: When is an explicit cast necessary?
Previous Message Josh Kupershmidt 2010-04-09 22:01:34 psql's \d display of unique index vs. constraint