Re: index & Bitmap Heap Scan

Lists: pgsql-performance
From: Paul <paul(at)wayr(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: index & Bitmap Heap Scan
Date: 2007-08-28 08:49:09
Message-ID: 1188290949.29582.9.camel@thory
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hello,

I have a table (stats.tickets) with 2288965 rows (51 columns) and
indexes like:
ind_ti_stats_numero btree (tday, tmonth, tyear, r_cat, r_numero)
ind_ti_stats_service btree (tday, tmonth, tyear, r_cat, r_service)
ind_ti_stats_tmp_service btree (r_service, tyear, tmonth)
ind_ti_stats_tmp_service2 btree (r_service, tyear, tmonth, r_cat)

Now if i do :
1°)# explain analyze SELECT tday AS n, '' AS class, a.r_cat AS cat,
COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1
END) AS p, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END)
AS np, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree
END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE
t_duree1 END),0) AS tmc FROM stats.tickets AS a WHERE
a.r_numero='99084040' AND tyear = 2007 AND tmonth = 8 GROUP BY tyear,
tmonth, tday, a.r_cat;

QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=45412.96..45412.99 rows=1 width=34) (actual
time=649.944..650.178 rows=50 loops=1)
-> Index Scan using ind_ti_stats_numero on tickets a
(cost=0.00..45385.46 rows=1222 width=34) (actual time=15.697..642.570
rows=1043 loops=1)
Index Cond: ((tmonth = 8) AND (tyear = 2007) AND
((r_numero)::text = '99084040'::text))
Total runtime: 650.342 ms
(4 lignes)

Temps : 652,234 ms

2°)
# explain analyze SELECT tday AS n, '' AS class, a.r_cat AS cat,
COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1
END) AS p, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END)
AS np, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree
END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE
t_duree1 END),0) AS tmc FROM stats.tickets AS a WHERE a.r_service=95
AND tyear = 2007 AND tmonth = 8 GROUP BY tyear, tmonth, tday, a.r_cat;

QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=193969.97..193970.88 rows=26 width=34) (actual
time=20834.559..20834.694 rows=27 loops=1)
-> Bitmap Heap Scan on tickets a (cost=3714.84..186913.32
rows=313629 width=34) (actual time=889.880..19028.315 rows=321395
loops=1)
Recheck Cond: ((r_service = 95) AND (tyear = 2007) AND (tmonth
= 8))
-> Bitmap Index Scan on ind_ti_stats_tmp_service
(cost=0.00..3714.84 rows=313629 width=0) (actual time=836.181..836.181
rows=321395 loops=1)
Index Cond: ((r_service = 95) AND (tyear = 2007) AND
(tmonth = 8))
Total runtime: 20835.191 ms
(6 lignes)

Temps : 20838,798 ms

\d stats.tickets
[...]
r_numero | character varying(17) | not null
r_service | integer | not null default 0
[...]
stats.tickets has 173351 relpages , 2.30996e+06 reltuples.

Why in the first case, pgsql uses the "better" index and if i search
r_service instead of r_numero pgsql does a "Bitmap Heap scan" first ?
There ara too much rows in this table ?

PS: sorry for my english, i'm french.

--
Paul.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Paul <paul(at)wayr(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: index & Bitmap Heap Scan
Date: 2007-08-28 16:55:58
Message-ID: 29055.1188320158@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Paul <paul(at)wayr(dot)org> writes:
> Why in the first case, pgsql uses the "better" index and if i search
> r_service instead of r_numero pgsql does a "Bitmap Heap scan" first ?

Given the difference in the number of rows to be fetched, both plan
choices look pretty reasonable to me. If you want to experiment,
you can try forcing the other choice in each case (use enable_indexscan
and enable_bitmapscan) and see how fast it is, but I suspect the planner
got it right.

Beware of cache effects when trying two plans in quick succession ---
the second one might go faster just because all the data is already
swapped in.

regards, tom lane


From: Paul <paul(at)wayr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: index & Bitmap Heap Scan
Date: 2007-08-29 08:16:45
Message-ID: 1188375405.9028.3.camel@thory
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Thank you for your answer.
Now i ve to find how to reduce the size of the table.

Paul.

Le mardi 28 août 2007 à 12:55 -0400, Tom Lane a écrit :
> Paul <paul(at)wayr(dot)org> writes:
> > Why in the first case, pgsql uses the "better" index and if i search
> > r_service instead of r_numero pgsql does a "Bitmap Heap scan" first ?
>
> Given the difference in the number of rows to be fetched, both plan
> choices look pretty reasonable to me. If you want to experiment,
> you can try forcing the other choice in each case (use enable_indexscan
> and enable_bitmapscan) and see how fast it is, but I suspect the planner
> got it right.
>
> Beware of cache effects when trying two plans in quick succession ---
> the second one might go faster just because all the data is already
> swapped in.
>
> regards, tom lane
>