Re: Tsearch2 Dutch snowball stemmer in PG8.1

From: Alban Hertroys <a(dot)hertroys(at)magproductions(dot)nl>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Tsearch2 Dutch snowball stemmer in PG8.1
Date: 2007-10-03 14:05:57
Message-ID: 4703A1C5.4020601@magproductions.nl
Lists: pgsql-general

Alban Hertroys wrote:
> The only odd thing is that to_tsvector('dutch', 'some dutch text') now
> returns '|' for stop words...
>
> For example:
> select to_tsvector('nederlands', 'De beste stuurlui staan aan wal');
> to_tsvector
> ------------------------------------------------
> '|':1,5 'bes':2 'wal':6 'staan':4 'stuurlui':3

I found the cause. The stop words list I found contained comments
prefixed by '|' signs. Removing the contents and recreating the database
solved the problem. Just updating the reference didn't seem to help...

There's undoubtedly some cleaner way to replace the stop words list, but
at the current stage of our project this was the simplest to achieve.
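
For reference, a minimal sketch of how the '|' comments could be stripped from such a list before tsearch2 reads it. It assumes the Snowball convention that '|' starts a comment running to the end of the line, and that the dictionary wants a plain list of one word per line. The function name and the sample text are hypothetical; this is not the procedure that was actually used here.

```python
def strip_stopword_comments(text: str) -> list[str]:
    """Return the stop words in `text`, dropping '|' comments.

    Assumption: everything from the first '|' to the end of a line
    is a comment, as in the Snowball stop word lists.
    """
    words = []
    for line in text.splitlines():
        # Keep only the part before the first '|', if there is one.
        content = line.split("|", 1)[0]
        # A line may be empty, comment-only, or hold one or more words.
        words.extend(content.split())
    return words


if __name__ == "__main__":
    # Hypothetical excerpt in the style of a Snowball stop word list.
    sample = (
        " | A Dutch stop word list\n"
        "\n"
        "de             |  the\n"
        "aan            |  on, to\n"
        "wal\n"
    )
    # Print one word per line, ready to be saved as the new list.
    print("\n".join(strip_stopword_comments(sample)))
```

The output can be written to a new file that the dictionary configuration then points at. Whether a running database picks up the change without being recreated is not something this sketch addresses.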

--
Alban Hertroys
a(dot)hertroys(at)magproductions(dot)nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //
