Problems with index-scan on regexp in 8.1

Lists: pgsql-hackers
From: Lars Kanis <kanis(at)comcard(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Problems with index-scan on regexp in 8.1
Date: 2005-11-07 06:50:20
Message-ID: 200511070750.21070.kanis@comcard.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

We're using Postgres 8.0.2 on SuSE10.0 (64-Bit). Tests on 8.1 beta 4 have
shown no problems but this one:

SELECT * FROM mitglieder WHERE lower(vorname::text)='lars'

does a bitmap-index-scan like this:

Bitmap Heap Scan on mitglieder (cost=10.68..3770.52 rows=1051 width=226)
Recheck Cond: (lower((vorname)::text) = 'lars'::text)
-> Bitmap Index Scan on mitgl_lower_namen_idx (cost=0.00..10.68
rows=1051 width=0)
Index Cond: (lower((vorname)::text) = 'lars'::text)

but a regular expression always results in a seqscan:

SELECT * FROM mitglieder WHERE lower(vorname::text)~'^lars'

Seq Scan on mitglieder (cost=0.00..79703.73 rows=1 width=226)
Filter: (lower((vorname)::text) ~ '^lars'::text)

whereas V8.0.2 does a proper index-scan:

Index Scan using mitgl_lower_namen_idx on mitglieder (cost=0.01..18.05
rows=4 width=225)
Index Cond: ((lower((vorname)::text) >= 'lars'::text) AND
(lower((vorname)::text) < 'lart'::text))
Filter: (lower((vorname)::text) ~ '^lars'::text)

The use of indexes for regexp is quite important for the search in our
interactive frontend.

kind regards
Lars Kanis


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Lars Kanis <kanis(at)comcard(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problems with index-scan on regexp in 8.1
Date: 2005-11-07 13:13:13
Message-ID: 20051107131313.GC841@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 07, 2005 at 07:50:20AM +0100, Lars Kanis wrote:
> We're using Postgres 8.0.2 on SuSE10.0 (64-Bit). Tests on 8.1 beta 4 have
> shown no problems but this one:
>
> SELECT * FROM mitglieder WHERE lower(vorname::text)='lars'
>
> does a bitmap-index-scan like this:

Check your locales. For non-ASCII locales the normal shortcuts for
regex optimisation can't apply. Evidently your old installation uses a
different locale from your new one.

You should be able to make this work by declaring your index with
"text_pattern_ops", like so:

CREATE INDEX myindex ON mytable(mycolumn text_pattern_ops);

Hope this helps,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Lars Kanis <kanis(at)comcard(dot)de>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problems with index-scan on regexp in 8.1
Date: 2005-11-08 08:23:32
Message-ID: 200511080923.42109.kanis@comcard.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Montag, 7. November 2005 14:13 schrieb Martijn van Oosterhout:
> On Mon, Nov 07, 2005 at 07:50:20AM +0100, Lars Kanis wrote:
> > We're using Postgres 8.0.2 on SuSE10.0 (64-Bit). Tests on 8.1 beta 4 have
> > shown no problems but this one:
> >
> > SELECT * FROM mitglieder WHERE lower(vorname::text)='lars'
> >
> > does a bitmap-index-scan like this:
>
> Check your locales. For non-ASCII locales the normal shortcuts for
> regex optimisation can't apply. Evidently your old installation uses a
> different locale from your new one.
>
> You should be able to make this work by declaring your index with
> "text_pattern_ops", like so:
>
> CREATE INDEX myindex ON mytable(mycolumn text_pattern_ops);
>
> Hope this helps,

Thank you much, it helps.
The initdb-locales were different. pattern_ops work quite fine with the
regexps.

So I don't have any complaints to 8.1.

kind regards
Lars Kanis