Re: Bitmap Index Scan optimization opportunity

Lists: pgsql-performance
From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Bitmap Index Scan optimization opportunity
Date: 2007-08-10 19:33:28
Message-ID: 46BC7737.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

These query times are the "fully cached" times for both, from doing a previous run of the same query. (The first one took 193.772 ms on its first run; I don't have a good "uncached" timing for the second one at this point.)

It seems like the first query could move the searchName filter to the Bitmap Index Scan phase, and save 97.5% of the page retrievals in the Bitmap Heap Scan.

-Kevin


cc=> explain analyze select * from "Warrant" where "soundex" = 'S530' and "searchName" like '%,G%' and "countyNo" = 40;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on "Warrant" (cost=55.37..1202.35 rows=841 width=123) (actual time=2.625..8.602 rows=112 loops=1)
Recheck Cond: (((soundex)::text = 'S530'::text) AND (("countyNo")::smallint = 40))
Filter: (("searchName")::text ~~ '%,G%'::text)
-> Bitmap Index Scan on "Warrant_WarrantSoundex" (cost=0.00..55.16 rows=4240 width=0) (actual time=1.911..1.911 rows=4492 loops=1)
Index Cond: (((soundex)::text = 'S530'::text) AND (("countyNo")::smallint = 40))
Total runtime: 8.739 ms
(6 rows)

cc=> explain analyze select * from "Warrant" where "soundex" = 'S530' and "searchName" like 'SMITH,G%' and "countyNo" = 40;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using "Warrant_WarrantName" on "Warrant" (cost=0.00..1.28 rows=1 width=123) (actual time=0.099..0.397 rows=112 loops=1)
Index Cond: ((("searchName")::text >= 'SMITH,G'::character varying) AND (("searchName")::text < 'SMITH,H'::character varying) AND (("countyNo")::smallint = 40))
Filter: (((soundex)::text = 'S530'::text) AND (("searchName")::text ~~ 'SMITH,G%'::text))
Total runtime: 0.510 ms
(4 rows)

cc=> \d "Warrant"
Table "public.Warrant"
Column | Type | Modifiers
----------------+-----------------+-----------
warrantSeqNo | "WarrantSeqNoT" | not null
countyNo | "CountyNoT" | not null
caseNo | "CaseNoT" | not null
nameL | "LastNameT" | not null
partyNo | "PartyNoT" | not null
searchName | "SearchNameT" | not null
soundex | "SoundexT" | not null
authSeqNo | "HistSeqNoT" |
dateAuthorized | "DateT" |
dateDisposed | "DateT" |
dateIssued | "DateT" |
dispoMethod | "EventTypeT" |
dispSeqNo | "HistSeqNoT" |
histSeqNo | "HistSeqNoT" |
nameF | "FirstNameT" |
nameM | "MiddleNameT" |
stayDate | "DateT" |
stayTime | "TimeT" |
suffix | "NameSuffixT" |
warrantDob | "DateT" |
Indexes:
"Warrant_pkey" PRIMARY KEY, btree ("warrantSeqNo", "countyNo")
"Warrant_HistSeqNo" UNIQUE, btree ("caseNo", "histSeqNo", "countyNo", "warrantSeqNo")
"Warrant_AuthSeqNo" btree ("caseNo", "authSeqNo", "countyNo")
"Warrant_CaseNo" btree ("caseNo", "partyNo", "countyNo")
"Warrant_DispSeqNo" btree ("caseNo", "dispSeqNo", "countyNo")
"Warrant_WarrantName" btree ("searchName", "countyNo")
"Warrant_WarrantSoundex" btree (soundex, "searchName", "countyNo")

cc=> select version();
version
-------------------------------------------------------------------------------------
PostgreSQL 8.2.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux)
(1 row)


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Bitmap Index Scan optimization opportunity
Date: 2007-08-10 20:05:13
Message-ID: 46BCC4F9.70903@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Kevin Grittner wrote:
> These query times are the "fully cached" times for both, from doing a previous run of the same query. (The first one took 193.772 ms on its first run; I don't have a good "uncached" timing for the second one at this point.)
>
> It seems like the first query could move the searchName filter to the Bitmap Index Scan phase, and save 97.5% of the page retrievals in the Bitmap Heap Scan.

Yes it could in theory, but unfortunately the planner/executor doesn't
have the capability to do that. An indexed value is never handed back
from the index; the indexed values are only used to satisfy index
conditions, not filters. It's been discussed before (see
http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php),
but it's not easy to implement so no one's done it yet.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com