Re: BETWEEN optimizer problems with single-value

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>, "Andreas Kretschmer" <akretschmer(at)spamfence(dot)net>
Subject: Re: BETWEEN optimizer problems with single-value
Date: 2006-03-15 20:25:58
Message-ID: 441823F5.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

>>> On Wed, Mar 15, 2006 at 1:17 pm, in message
<28798(dot)1142450270(at)sss(dot)pgh(dot)pa(dot)us>,
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> 8.1 is certainly capable of devising the plan you want, for example
> in the regression database:
>
> regression=# explain select * from tenk1 where thousand = 10 and
tenthous
> between 42 and 144;
> QUERY PLAN
>
------------------------------------------------------------------------------------
> Index Scan using tenk1_thous_tenthous on tenk1 (cost=0.00..6.01
rows=1
> width=244)
> Index Cond: ((thousand = 10) AND (tenthous >= 42) AND (tenthous <=
144))
> (2 rows)

That matches one of the examples where it optimized well. I only saw
the bad plan when low and high ends of the BETWEEN range were equal.

> It looks to me like this is a matter of bad cost estimation, ie,
it's
> thinking the other index is cheaper to use. Why that is is not
clear.
> Can we see the pg_stats rows for ctofcNo and calDate?

schemaname | tablename | attname | null_frac | avg_width | n_distinct
| most_common_vals
|
most_common_freqs |
histogram_bounds
| correlation
------------+-----------+---------+-----------+-----------+------------+-----------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------+-------------
public | Cal | calDate | 0 | 4 | 2114
|
{2003-06-02,2000-06-20,2001-04-16,2003-06-17,2003-12-01,2004-10-12,2001-04-23,2001-10-15,2002-03-06,2002-05-03}
|
{0.00333333,0.00233333,0.00233333,0.00233333,0.00233333,0.00233333,0.002,0.002,0.002,0.002}
|
{1986-03-14,1999-06-11,2000-07-14,2001-05-18,2002-03-21,2002-12-04,2003-08-12,2004-05-13,2005-02-01,2005-09-28,2080-12-31}
| 0.0545768
public | Cal | ctofcNo | 0 | 8 | 669
| {0793,1252,1571,0964,0894,1310,"DA ",0944,1668,0400}
|
{0.024,0.019,0.015,0.0123333,0.012,0.011,0.0106667,0.01,0.00966667,0.00866667}
| {0000,0507,0733,0878,1203,1336,14AG,1633,1971,3705,YVJO}
|
-0.0179665
(2 rows)

> Also, try to force it to generate the plan you want, so we can see
what
> it thinks the cost is for that. If you temporarily drop the wrong
index
> you should be able to get there:
>
> begin;
> drop index "Cal_CalDate";
> explain analyze select ... ;
> -- repeat as needed if it chooses some other wrong index
> rollback;

Sort (cost=4.03..4.03 rows=1 width=12) (actual time=48.484..48.486
rows=4 loops=1)
Sort Key: "calDate", "startTime"
-> Index Scan using "Cal_CtofcNo" on "Cal" "CA" (cost=0.00..4.02
rows=1 width=12) (actual time=36.750..48.228 rows=4 loops=1)
Index Cond: ((("ctofcNo")::bpchar = '2192'::bpchar) AND
(("calDate")::date >= '2006-03-15'::date) AND (("calDate")::date <=
'2006-03-15'::date))
Total runtime: 56.616 ms

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Kretschmer 2006-03-15 20:47:00 Re: BETWEEN optimizer problems with single-value range
Previous Message Merlin Moncure 2006-03-15 19:36:51 Re: [PERFORM] BETWEEN optimizer problems with single-value range

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2006-03-15 20:43:01 Re: Background writer configuration
Previous Message Joshua D. Drake 2006-03-15 19:54:33 Re: Background writer configuration