From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | tsearch_core for inclusion |
Date: | 2007-03-23 11:58:10 |
Message-ID: | 4603C0D2.90701@sigaev.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
http://www.sigaev.ru/misc/tsearch_core-0.41.gz
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/
Changes
1) added command
ALTER FULLTEXT MAPPING ON cfgname [FOR lexemetypename[, ...]] REPLACE
olddictname TO newdictname;
2) added operator class for text and varchar
CREATE INDEX idxname ON tblname USING GIN ( textcolumn );
3) changed definition of @@ operation: {tsvector|varchar|text} @@ {text|tsquery}
SELECT * FROM tblname WHERE textcolumn @@ text;
We have two questions:
1. pg_catalog schema, if not explicitly specified in search_path, implicitly
placed as the first schema to browse. To what extent it is intentioned ?
2. At present, visibility of FTS objects conforms to the standard PostgreSQL
rule and defined by search_path variable.
For given schema and server's locale, it's possible to have several FTS
configurations, but the only one (with special flag enabled)
could be used as default. Current (active) FTS configuration contains
in GUC variable tsearch_conf_name. If it's not defined, then FTS configuration
is looked in search_path to match server's locale with default flag enabled.
By default, the first visible schema is the pg_catalog, so that system FTS
objects always mask users. To change that, one need explicitly specify
pg_catalog in the search_path.
This can confuse people, especially unexperienced users. Imagine, she creates
public.fts configuration for ru_RU.UTF-8 locale and enabled it as default.
CREATE FULLTEXT CONFIGURATION public.fts LIKE pg_catalog.russian_utf8 AS DEFAULT;
but with default search_path default configuration will be still
pg_catalog.russian_utf8 and she should redefine search_path to use
public.fts. Then, she can creates index for "simple" (without creating
tsvector column) search on TEXT column
CREATE INDEX pgweb_idx ON pgweb USING gin(body);
Notice, there is no way to specify fts configuration, so CREATE INDEX will use
pg_catalog.russian_utf8 configuration and, consequently, specific dictionaries,
stop-words, etc. Next time, she should remember about search_path, else she will
be very confused, because pg_catalog.russian_utf8 will be used in
SELECT title FROM pgweb WHERE body @@ plainto_tsquery('create table');
Of course, there are several ways to avoid such kind of errors, but
we want to minimize this possible source of confusions and ask community
if it's worth to make user-created fts configuration to be visible prior to
system configurations in pg_catalog, if pg_catalog was not *explicitly*
specified in the search_path ?
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
From | Date | Subject | |
---|---|---|---|
Next Message | Florian G. Pflug | 2007-03-23 12:49:42 | Re: tsearch_core for inclusion |
Previous Message | Pavan Deolasee | 2007-03-23 10:19:16 | Re: CREATE INDEX and HOT - revised design |