Full-text search default vs specified configuration

From: Richard Huxton <dev(at)archonet(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Full-text search default vs specified configuration
Date: 2008-02-22 10:36:38
Message-ID: 47BEA5B6.10000@archonet.com
Lists: pgsql-hackers

I've been looking at a problem someone encountered with ts_headline:
http://archives.postgresql.org/pgsql-general/2008-02/msg01035.php

It turns out the problem was mixing ts_headline(<no specified config>)
with to_tsquery(<specified config>) where <specified config> wasn't the
default.
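
To illustrate the mismatch, here is a toy Python model (not PostgreSQL
code). The normalisers simple_config and english_like_config and the
headline helper are hypothetical stand-ins for a text search
configuration and ts_headline; the crude plural stripping only
approximates what a real stemmer does.

```python
# Toy model only: each "configuration" is just a function that
# normalises a word into a lexeme.

def simple_config(word):
    # Stand-in for a non-default config that only lowercases.
    return word.lower()

def english_like_config(word):
    # Stand-in for a stemming default config: strip a trailing "es" or "s".
    w = word.lower()
    if w.endswith("es"):
        return w[:-2]
    if w.endswith("s"):
        return w[:-1]
    return w

def to_query(config, text):
    # Build the set of lexemes a query would contain under this config.
    return {config(w) for w in text.split()}

def headline(config, document, query_lexemes):
    # Mark each document word whose lexeme (under `config`) is in the query.
    marked = []
    for word in document.split():
        if config(word) in query_lexemes:
            marked.append(f"<b>{word}</b>")
        else:
            marked.append(word)
    return " ".join(marked)

doc = "quick brown foxes"
query = to_query(simple_config, "foxes")  # query built with the specified config

# Headline falls back to the "default" config: lexemes differ, nothing is marked.
mismatched = headline(english_like_config, doc, query)

# Headline uses the same config as the query: the word is marked.
matched = headline(simple_config, doc, query)

print(mismatched)
print(matched)
```

In this model the query holds the lexeme "foxes" while the default
normaliser turns the document word into "fox", so the two never meet and
no error is raised.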

Fair enough, and in retrospect it's obvious. However, I fear it's going
to be a pretty common error. It's also one that's not easy to catch:
you can test a configuration, but as far as I can tell you can't see
which configuration generated a particular tsvector / tsquery.

I realise there was a lot of discussion during 8.3 development about
what was wanted from a default config, and I'm guessing there's nothing
that can be done for 8.3.x.

Would there be any support for two changes in 8.4 though?

1. Tag tsvectors/tsqueries with the (oid of) their configuration?
This could then generate a warning/error if you are running a tsquery
against the wrong tsvector, combining two incompatible tsvectors, etc.

2. Either warn or require CASCADE on changes to a
configuration/dictionary that could impact existing indexes etc.
I've done it once myself where a stopword dictionary was changed from
accept=true to accept=false. That change is OK (as long as you don't
mind rogue tokens in your tsvectors) but others are probably not.
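
A rough sketch of what the first proposal might look like, again as a
toy Python model rather than anything that exists in PostgreSQL. The
Tagged type, the ConfigMismatch exception, the matches helper and the
numeric ids are all made up for illustration; the ids merely stand in
for a configuration's oid.

```python
from dataclasses import dataclass

class ConfigMismatch(Exception):
    """Raised when a vector and a query come from different configurations."""

@dataclass(frozen=True)
class Tagged:
    # Hypothetical tagged value: lexemes plus the id of the config that made them.
    config_id: int
    lexemes: frozenset

def matches(vector, query):
    # Refuse to compare values produced by different configurations.
    if vector.config_id != query.config_id:
        raise ConfigMismatch(
            f"vector built with config {vector.config_id}, "
            f"query built with config {query.config_id}"
        )
    # Same configuration: match if any lexeme is shared.
    return not vector.lexemes.isdisjoint(query.lexemes)

ENGLISH, SIMPLE = 1, 2  # made-up ids standing in for configuration oids

vec = Tagged(ENGLISH, frozenset({"quick", "brown", "fox"}))

# Same configuration on both sides: the comparison goes ahead.
ok = matches(vec, Tagged(ENGLISH, frozenset({"fox"})))

# Different configurations: the mismatch is reported instead of silently failing.
try:
    matches(vec, Tagged(SIMPLE, frozenset({"foxes"})))
    caught = False
except ConfigMismatch:
    caught = True

print(ok, caught)
```

Whether a mismatch should be an error or only a warning is left open
here; the sketch raises simply to make the check visible.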

--
Richard Huxton
Archonet Ltd
