Re: GIN index build speed

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN index build speed
Date: 2008-12-22 00:22:20
Message-ID: 1229905340.2285.39.camel@jdavis
Lists: pgsql-hackers

On Tue, 2008-12-02 at 12:12 +0200, Heikki Linnakangas wrote:
> CREATE TABLE foo (bar tsvector);
> INSERT INTO foo SELECT to_tsvector('foo' || a) FROM generate_series(1,
> 200000) a;
> CREATE INDEX foogin ON foo USING gin (bar);
>
> The CREATE INDEX step takes about 40 seconds on my laptop, which seems
> excessive.
>

There seems to be a performance cliff right around the value you chose.
On my system:

  rows    CREATE INDEX time
100000      2 s
125000      9 s
135000     22 s
150000     56 s

I suppose that makes sense, but I was a little surprised the drop-off
was so sharp.
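
For anyone who wants to look for the cliff on their own hardware, here is a rough sketch of how the test case could be repeated at different sizes. It is only the quoted test case wrapped in a psql variable; the variable name n and the use of \timing are my assumptions for illustration, not necessarily how the numbers above were collected, and the timings will differ from machine to machine.

```sql
-- Sketch only: rerun the quoted test case at one table size per pass.
-- Assumes psql; \timing and \set are psql meta-commands, not server SQL.
\timing on

-- Hypothetical variable: the number of rows to generate.
-- Change it to 125000, 135000, 150000, ... and rerun the script.
\set n 100000

-- GIN index builds are bounded by maintenance_work_mem, so record the
-- setting next to the timing; it may affect where the cliff shows up.
SHOW maintenance_work_mem;

DROP TABLE IF EXISTS foo;
CREATE TABLE foo (bar tsvector);

-- Same data shape as the original test case: one distinct lexeme per row,
-- inserted in increasing order.
INSERT INTO foo
SELECT to_tsvector('foo' || a) FROM generate_series(1, :n) a;

-- The elapsed time printed for this statement is the number of interest.
CREATE INDEX foogin ON foo USING gin (bar);
```

Dropping and recreating the table on each pass keeps the runs independent, so each timing reflects a fresh build over exactly n rows.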

Seems like it would be a useful patch for the next version. It may not be
useful for text search in normal situations (as Teodor mentioned), but
it may be useful for indexing arrays, which might be more likely to be
inserted in order.

Regards,
Jeff Davis
