Hstore: Query speedups with Gin index

From: Blake Smith <blakesmith0(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Hstore: Query speedups with Gin index
Date: 2013-08-22 14:55:45
Message-ID: CAPxT4eEe-MNs4FbVN1Le_nWsTgyw3N38tENbN5W7GPBrWB7cJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hey everyone,

I'm looking for feedback on a contrib/hstore patch.

We've been experiencing slow "@>" queries involving an hstore column that's
covered by a Gin index. At the current postgresql git HEAD, the hstore <->
gin interface produces the following text items to be indexed:

hstore: "'a'=>'1234', 'b'=>'test'"
Produces indexed text items: "Ka", "V1234", "Kb", "Vtest"

For the size of our production table (10s of millions of rows), I observed
significant query speedups by changing the index strategy to the following:

hstore: "'a'=>'1234', 'b'=>'test'"
Produces indexed text items: "Ka", "KaV1234", "Kb", "KbVtest"

The combined entry is used to support "contains (@>)" queries, and the key
only item is used to support "key contains (?)" queries. This change seems
to help especially with hstore keys that have high cardinalities. Downsides
of this change is that it requires an index rebuild, and the index will be
larger in size.

Patch attached. Any thoughts on this change?

Thanks,

Blake

Attachment Content-Type Size
hstore_gin_speedup.patch application/octet-stream 3.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2013-08-22 15:45:38 Re: pg_system_identifier()
Previous Message bricklen 2013-08-22 14:47:50 Re: pg_system_identifier()