Quick Links

Very long times to build hash indexes

Lists:	pgsql-novice

From:	"David Monarchi" <david(dot)e(dot)monarchi(at)gmail(dot)com>
To:	pgsql-novice(at)postgresql(dot)org
Subject:	Very long times to build hash indexes
Date:	2007-03-20 17:51:48
Message-ID:	eea51fdb0703201051w7b557752i12502e0f4cd5a9c6@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-novice

Hello -

We have a database of about 250GB. The core table contains about 140M rows
that are all integers and small integers. Aside from the key field, the
rest are all foreign keys. Virtually all of our queries use equalities
rather than inequalities/ranges. The database changes at a fairly even rate
due to insertions and deletions. Updates are rare. Insertions dominate the
environment, and we expect to have about 400M rows in the core table by the
end of the year.

We need to build indexes on 10 foreign key fields in the core table. Based
on the type of queries and the fact that insertions in it are fast, we are
building hash indexes on those fields. We have successfully built 5 of the
10 hash indexes. Each one required about 20 hours to construct.

When we got to the 6th field, we found that the indexing process would not
terminate even after 70 hours. We then tried the 7th field with the same
result. Is there something that we've overlooked? Is there a limit of some
kind that we've missed?

Any advice/suggestions would be very much appreciated.

Thank you.

david

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"David Monarchi" <david(dot)e(dot)monarchi(at)gmail(dot)com>
Cc:	pgsql-novice(at)postgresql(dot)org
Subject:	Re: Very long times to build hash indexes
Date:	2007-03-20 18:22:03
Message-ID:	15977.1174414923@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-novice

"David Monarchi" <david(dot)e(dot)monarchi(at)gmail(dot)com> writes:
> We need to build indexes on 10 foreign key fields in the core table. Based
> on the type of queries and the fact that insertions in it are fast, we are
> building hash indexes on those fields. We have successfully built 5 of the
> 10 hash indexes. Each one required about 20 hours to construct.

> When we got to the 6th field, we found that the indexing process would not
> terminate even after 70 hours. We then tried the 7th field with the same
> result. Is there something that we've overlooked?

The short answer is that Postgres' hash indexes suck. The degree of
suckiness varies by PG version (which you failed to mention) but there
is no release currently in which I would use them in preference to a
btree index. The lack of WAL support is alone a sufficient reason why
they're unacceptable for production use, but on top of that they don't
actually have any performance advantage in any tests that I've seen.

regards, tom lane