manually force planner to use of index A vs index B

Lists: pgsql-performance
From: MirrorX <mirrorx(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: manually force planner to use of index A vs index B
Date: 2011-12-03 14:34:29
Message-ID: 1322922869210-5044616.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

hello to all,

the situation i am facing is this->
table X-> 200 mil rows
index A (date, columnA)
index B (date,columnB,columnC)

the query planner is working properly and for a specific query that selects
from table X where all 3 columns of index B are set, it uses index B.

but, at some point there are some bulk inserts with a different date. when
this happens and i run the query mentioned above the planner is using index
A and not index B.
i guess this happens b/c the planner due to the last analyze statistics has
no values of the new date and so it thinks that it is faster to use index A
than index B since the rows that it will search are few. but that's not the
case so this query takes much longer to finish than it would take if it used
the index B.

i have thought of some work-arounds to resolve this situation. for example i
could change the definition of index A to (columnA,date) and i could also
run an analyze command after every bulk insert. Another option would be to
reduce autovacuum_analyze_scale_factor to a very low value so that analyze
would be forced to be made much more often.

but, instead of these solutions, is there anything else that could lead to a
'better' query plan for this specific case? thx in advance

--
View this message in context: http://postgresql.1045698.n5.nabble.com/manually-force-planner-to-use-of-index-A-vs-index-B-tp5044616p5044616.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: manually force planner to use of index A vs index B
Date: 2011-12-03 15:10:05
Message-ID: 4EDA3BCD.30507@fuzzy.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On 3.12.2011 15:34, MirrorX wrote:
>
> but, at some point there are some bulk inserts with a different date. when
> this happens and i run the query mentioned above the planner is using index
> A and not index B.
> i guess this happens b/c the planner due to the last analyze statistics has
> no values of the new date and so it thinks that it is faster to use index A
> than index B since the rows that it will search are few. but that's not the
> case so this query takes much longer to finish than it would take if it used
> the index B.

Probably. But you haven't posted any explain plans and I've broken my
crystall ball yesterday, so I can only guess.

Do this:

1) get EXPLAIN ANALYZE of the query running fine
2) do the bulk update
3) get EXPLAIN ANALYZE of the query (this time it uses the wrong index)
4) run ANALYZE on the table
5) get EXPLAIN ANALYZE of the query (should be using the right index)

and post the tree explain plans.

> i have thought of some work-arounds to resolve this situation. for example i
> could change the definition of index A to (columnA,date) and i could also
> run an analyze command after every bulk insert. Another option would be to
> reduce autovacuum_analyze_scale_factor to a very low value so that analyze
> would be forced to be made much more often.

That is not a workaround, that is a solution. The database needs
reasonably accurate statistics to prepare good plans, that's how it works.

If you know that the bulk insert is going to make the statistics
inaccurate, you should run ANALYZE manually at the end. Or you might let
autovacuum take care of that. But the autovacuum won't fix that
immediately - it's running each minute or so, so the queries executed
before that will see the stale stats.

You may lower the autocacuum naptime, you may make it more aggressive,
but that means more overhead.

Tomas


From: MirrorX <mirrorx(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: manually force planner to use of index A vs index B
Date: 2011-12-03 15:15:51
Message-ID: 1322925351487-5044691.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

thx a lot for the reply. i will post the query plans when a new bulk insert
will take place :)

--
View this message in context: http://postgresql.1045698.n5.nabble.com/manually-force-planner-to-use-of-index-A-vs-index-B-tp5044616p5044691.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.