text_pattern_ops and complex regexps

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: text_pattern_ops and complex regexps
Date: 2009-05-06 14:45:44
Message-ID: 20090506144544.GU8123@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

I've run into an annoying issue which I would think could be handled
better. Basically, indexes using text_pattern_ops don't work with
some complex regexps even when they (imv anyway) could. I'm willing
to believe I'm wrong about the potential to use them, or that my
regexp is wrong, but I don't see it.

Test case:

create table text_test (name text);
insert into text_test values ('North');
insert into text_test values ('North West');
create index text_test_name_idx on text_test using btree (name text_pattern_ops);
set enable_seqscan = false; -- just to show the test
-- works fine
explain analyze select * from text_test where name ~ '^(North)';
-- works fine
explain analyze select * from text_test where name ~ '^(North)( West)';
-- doesn't work
explain analyze select * from text_test where name ~ '^(North)(| West)';

Results:

CREATE TABLE
INSERT 0 1
INSERT 0 1
CREATE INDEX
SET
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Scan using text_test_name_idx on text_test (cost=0.00..8.27 rows=1 width=32) (actual time=0.071..0.077 rows=2 loops=1)
Index Cond: ((name ~>=~ 'North'::text) AND (name ~<~ 'Norti'::text))
Filter: (name ~ '^(North)'::text)
Total runtime: 0.121 ms
(4 rows)

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Scan using text_test_name_idx on text_test (cost=0.00..8.27 rows=1 width=32) (actual time=0.176..0.178 rows=1 loops=1)
Index Cond: ((name ~>=~ 'North'::text) AND (name ~<~ 'Norti'::text))
Filter: (name ~ '^(North)( West)'::text)
Total runtime: 0.209 ms
(4 rows)

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Seq Scan on text_test (cost=100000000.00..100000001.03 rows=1 width=32) (actual time=0.013..0.019 rows=2 loops=1)
Filter: (name ~ '^(North)(| West)'::text)
Total runtime: 0.045 ms
(3 rows)

I don't see why the last case can't use the index. Obviously, for
this example case, doing a Seq Scan is fine but with the real data set
there are cases where an index could help.

Any help would be greatly appreciated.

Thanks,

Stephen

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Олег Царев 2009-05-06 14:54:54 Some questions about PostgreSQL source code
Previous Message Tom Lane 2009-05-06 14:19:31 Re: bytea vs. pg_dump