Re: integrated tsearch doesn't work with non utf8 database

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: integrated tsearch doesn't work with non utf8 database
Date: 2007-09-10 12:12:09
Message-ID: 46E53499.1010904@sigaev.ru
Lists: pgsql-hackers

> Note the Seq Scan on pg_ts_config_map, with filter on ts_lexize(mapdict,
> $1). That means that it will call ts_lexize on every dictionary, which
> will try to load every dictionary. And loading danish_stem dictionary
> fails in latin2 encoding, because of the problem with the stopword file.

The attached patch should fix it, I hope.

New plan:
 Hash Join  (cost=2.80..1073.85 rows=80 width=100)
   Hash Cond: (parse.tokid = tt.tokid)
   InitPlan
     ->  Seq Scan on pg_ts_config  (cost=0.00..1.20 rows=1 width=4)
           Filter: (oid = 11308::oid)
     ->  Seq Scan on pg_ts_config  (cost=0.00..1.20 rows=1 width=4)
           Filter: (oid = 11308::oid)
   ->  Function Scan on ts_parse parse  (cost=0.00..12.50 rows=1000 width=36)
   ->  Hash  (cost=0.20..0.20 rows=16 width=68)
         ->  Function Scan on ts_token_type tt  (cost=0.00..0.20 rows=16 width=68)
   SubPlan
     ->  Limit  (cost=6.57..6.60 rows=1 width=36)
           ->  Subquery Scan dl  (cost=6.57..6.60 rows=1 width=36)
                 ->  Sort  (cost=6.57..6.58 rows=1 width=8)
                       Sort Key: ((ts_lexize(m.mapdict, $1) IS NULL)), m.mapseqno
                       ->  Seq Scan on pg_ts_config_map m  (cost=0.00..6.56 rows=1 width=8)
                             Filter: ((mapcfg = 11308::oid) AND (maptokentype = $0))
     ->  Sort  (cost=6.57..6.57 rows=1 width=8)
           Sort Key: m.mapseqno
           ->  Seq Scan on pg_ts_config_map m  (cost=0.00..6.56 rows=1 width=8)
                 Filter: ((mapcfg = 11308::oid) AND (maptokentype = $0))

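At least it now calls ts_lexize only for the dictionaries mapped to the
given configuration and token type, so unrelated dictionaries are not loaded.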
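
To illustrate why the order of evaluation matters, here is a toy model in
Python (not PostgreSQL code). Everything in it is hypothetical: the rows, the
stand-in ts_lexize function, and the two pick_dictionary_* helpers exist only
to mimic the two plan shapes. The "old" helper follows the behaviour described
in the quoted message (ts_lexize evaluated for every row of the map); the
"new" helper filters on mapcfg and maptokentype first and only then calls
ts_lexize in the sort key.

```python
# Toy model only: all names and rows here are hypothetical.
loaded = []  # records which dictionaries were "loaded", in call order


def ts_lexize(dict_name, token):
    """Stand-in for ts_lexize: using a dictionary forces it to load."""
    if dict_name == "danish_stem":
        # Models a dictionary whose stopword file is unusable in this encoding.
        raise ValueError("cannot load danish_stem in this encoding")
    loaded.append(dict_name)
    return [token.lower()]


# (mapcfg, maptokentype, mapseqno, mapdict), loosely like pg_ts_config_map
config_map = [
    (1, 1, 1, "danish_stem"),   # belongs to a different configuration
    (2, 1, 1, "english_stem"),
    (2, 2, 1, "simple"),
]


def pick_dictionary_old(cfg, tokentype, token):
    # Old shape: ts_lexize is applied to every row before the cheap filter.
    recognized = [r for r in config_map if ts_lexize(r[3], token) is not None]
    rows = [r for r in recognized if r[0] == cfg and r[1] == tokentype]
    rows.sort(key=lambda r: r[2])
    return rows[0][3] if rows else None


def pick_dictionary_new(cfg, tokentype, token):
    # New shape: filter on cfg/token type first, then call ts_lexize
    # only for the surviving rows, inside the sort key.
    rows = [r for r in config_map if r[0] == cfg and r[1] == tokentype]
    rows.sort(key=lambda r: (ts_lexize(r[3], token) is None, r[2]))
    return rows[0][3] if rows else None
```

In this model, pick_dictionary_new(2, 1, "Cats") touches only english_stem,
while pick_dictionary_old(2, 1, "Cats") fails as soon as it reaches the
danish_stem row, even though that row belongs to another configuration.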

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
patch text/plain 916 bytes
