Re: Index question on postgres

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "akp geek" <akpgeek(at)gmail(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Index question on postgres
Date: 2010-01-08 19:43:11
Message-ID: D425483C2C5C9F49B5B7A41F89441547029626E2@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

From: akp geek [mailto:akpgeek(at)gmail(dot)com]
Sent: Thursday, January 07, 2010 9:04 PM
To: Dann Corbit
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Index question on postgres

{snip}

Why would the index I have created not being used?

>>

The index you have created will not be used in several circumstances.
For instance:

1. It is faster to do a table scan than to use the index, despite
up to date statistics

2. The distribution of the data has changed since the last time
you analyzed the database

Consider a truly horrible case, an index on a single character. As it
turns out, this field contains exactly two values: 'M' or 'F' for male
or female. About 50% of the data is 'M' and about 50% is 'F'. If we
were to use this index to scan the data, we will be loading the index
pages, and then popping all over the data pages following the index. It
will truly be an awful sight. We would spend far more effort than
simply doing a table scan. Fortunately, we have statistics which have
come to our rescue. They will tell the optimizer to simply ignore the
horribly defined index file and never use it in any circumstance.

Consider an even more horrible case, the same index, but we have not
updated statistics in months and we have automatic stats and vacuum
disabled. The only time statistics was run, there was a single 'F' in
the index and 44 'M' values. A query comes along looking for "sex =
'F'" and the optimizer decides to use the index. We can't blame the
poor optimizer. It's not his fault that statistical collection was
disabled. So he merrily informs the query planner to follow the index
to collect the data, and the query takes eons to complete.

In short, using the index is not always a good idea. It's a good idea
to use an index when it is faster than not using an index. If you were
to post the explain analyze output, experts here could tell you exactly
why the decisions were made to use an index or not to use an index. And
if an index should have been used, they can tell you what to do so that
the index will be used next time.

<<

In response to

Browse pgsql-general by date

  From Date Subject
Next Message glaucomag 2010-01-08 20:22:03 pgadmin save password
Previous Message akp geek 2010-01-08 19:41:56 Re: Index question on postgres