Re: query planner not using the correct index

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joshua Shanks" <jjshanks(at)gmail(dot)com>
Cc: "Craig Ringer" <craig(at)postnewspapers(dot)com(dot)au>, pgsql-performance(at)postgresql(dot)org
Subject: Re: query planner not using the correct index
Date: 2008-08-07 23:48:11
Message-ID: 29507.1218152891@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Joshua Shanks" <jjshanks(at)gmail(dot)com> writes:
> SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM
> pg_stats WHERE tablename = 'bars' AND attname='bars_id';
> null_frac | n_distinct | most_common_vals | most_common_freqs
> -----------+------------+----------------------+---------------------------
> 0 | 14 | {145823,47063,24895} | {0.484667,0.257333,0.242}

> Those 3 values in reality and in the stats account for 98% of the
> rows. actual distinct values are around 350

So you need to increase the stats target for this column. With those
numbers the planner is going to assume that any value that's not one
of the big three appears about (1 - (0.484667+0.257333+0.242)) / 11
of the time, or several hundred times in 300K rows. If n_distinct were
up around 350 it would be estimating just a dozen or so occurrences,
which should push the join plan into the shape you want. It's likely
that it won't bother to include any more entries in most_common_vals
no matter how much you raise the target; but a larger sample should
definitely give it a better clue about n_distinct.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Joshua Shanks 2008-08-07 23:55:42 Re: query planner not using the correct index
Previous Message Joshua Shanks 2008-08-07 23:04:58 Re: query planner not using the correct index