B-Tree emulation for GIN

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: B-Tree emulation for GIN
Date: 2008-10-31 16:50:26
Message-ID: 490B3752.3010800@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We'd like to improve usefulness of a new multicolumn feature of GIN index (
already in CVS HEAD ), which is currently a bit limited because of lack of GIN
support for most data types. For example, it'd be very useful to combine native
GIN-supported data types, like text search, hstore, arrays with other ones, for
example, date, integers, etc. Right now, in the best case, we'll have BitmapAnd
scan over *two* indexes - GIN and btree.

We suggest to emulate btree index by GIN to make possible to use only one index
scan. For that reason we developed btree_gin contrib module, as we do
btree_gist for GIN index. Current implementation provides support for int4 and
text data types, but it's straightforward and easy to add support for other data
types.

However, GIN interface should be extended for effective realization of such
emulation. We've already changed GIN interface, but it was done during 8.4
development cycle, so the new changes can be done safely.

New interface (only changed functions is shown):
Datum *extractQuery(Datum query, int32 *nkeys,
StrategyNumber n, bool *pmatch[],
void ** extra_data[])
bool consistent(bool check[],
StrategyNumber n, Datum query, bool *recheck,
void *extra_data[])
int comparePartial(Datum partial_key, Datum key, void *extra_data)

extractQuery() might allocate an array (with nkeys elements) of any data,
GIN provides that array as an 5-th argument for consistent() and corresponding
element for comparePartial().

Patch also includes changes in pg_trgm to cache the number of query's trigrams
in extra_data instead of fcinfo->flinfo->fn_extra usage in consistent(), to
improve situation when the same index can be used for several conditions, like (
foo.bar % 'book' and foo.bar 'look' )

TODO/issues
- add support of other data types to btree_gin
- current bulk insert algorithm (used in ginbuild and fast_insert_gin patch)
doesn't optimized for ordered or near ordered table, so time of btree_gin's
index creation is much worse than that for btree.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
gin_extra-0.1.gz application/x-tar 4.4 KB
btree_gin-0.3.tgz application/x-gtar 2.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hitoshi Harada 2008-10-31 17:10:55 Re: Window Functions: patch for CommitFest:Nov.
Previous Message Tom Lane 2008-10-31 16:39:34 Changing the result of ExecutorRun