Re: Include Lists for Text Search

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Include Lists for Text Search
Date: 2007-09-10 13:27:59
Message-ID: 1189430879.4281.247.camel@ebony.site
Lists: pgsql-hackers pgsql-patches

On Mon, 2007-09-10 at 16:35 +0400, Oleg Bartunov wrote:
> On Mon, 10 Sep 2007, Simon Riggs wrote:
>
> > On Mon, 2007-09-10 at 16:10 +0400, Oleg Bartunov wrote:
> >> On Mon, 10 Sep 2007, Simon Riggs wrote:
> >>
> >>> It seems possible to write your own functions to support various
> >>> possibilities with text search.
> >>>
> >>> One of the more common thoughts is to have a list of words that you
> >>> would like to include, i.e. the opposite of a stop word list.
> >>>
> >>> There are clear indications that indexing too many words is a problem
> >>> for both GIN and GIST. If people already know what they'll be looking
> >>> for and what they will never be looking for, it seems easier to supply
> >>> that list up front, rather than hide it behind lots of hand-crafted
> >>> code.
> >>>
> >>> Can we include that functionality now?
> >>
> >> This could be realized very easily using dict_strict, which returns
> >> only known words, with a mapping that contains only this dictionary.
> >> So, feel free to write it and submit.
> >
> > So there isn't one yet, but you think it will be easy to write and that
> > we should call it dict_strict?
>
> We have dict_synonym already, and if your list is not big you'll be happy.

So I need to do something like

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    synonym = include_only_these_words
);

which will then look for a file called include_only_these_words.syn?

I would prefer to be able to do something like this

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    include = justthese
);

...which makes more sense to anyone reading it. I would also like to make
the comparison case insensitive.

Would it be better to
1. include a new dictionary file (dict_strict, as you suggest), or
2. a) allow case sensitivity as another option in dictionaries, and
   b) allow "include" as another word for "stoplist", but with the
      meaning reversed?

e.g.

CREATE TEXT SEARCH DICTIONARY my_diction (
    template = snowball,
    include = justthese,
    case_sensitive = true
);
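
To make the intended semantics concrete, here is a minimal sketch in
Python. It is not PostgreSQL code and does not correspond to any existing
dictionary template; the function name apply_include_list and its
parameters are hypothetical, chosen only for illustration. It shows an
include list behaving as the reverse of a stop list (only listed words
survive), with an optional case_sensitive switch along the lines of the
option proposed above.

```python
def apply_include_list(tokens, include_words, case_sensitive=False):
    """Keep only tokens that appear in include_words; drop everything else.

    This is the reverse of a stop list, which drops listed words and
    keeps the rest. All names here are hypothetical.
    """
    if case_sensitive:
        # Exact match: "GIN" and "gin" are treated as different words.
        allowed = set(include_words)
        return [t for t in tokens if t in allowed]

    # Case-insensitive match: fold both the list and the tokens to lower case.
    allowed = {w.lower() for w in include_words}
    return [t.lower() for t in tokens if t.lower() in allowed]


if __name__ == "__main__":
    words = ["postgres", "GIN", "GiST"]
    # prints ['postgres', 'gin']
    print(apply_include_list(["Postgres", "uses", "gin", "indexes"], words))
```

With the default case-insensitive setting, "Postgres" and "gin" are kept
(folded to lower case) and every unlisted word is discarded, so only the
words known up front would ever reach the index.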

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
