Re: BUG #6654: Full text search doesn't find europe

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6654: Full text search doesn't find europe
Date: 2012-05-21 23:01:57
Message-ID: 9098.1337641317@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Andres Freund <andres(at)anarazel(dot)de> writes:
> On Monday, May 21, 2012 07:26:38 PM wbrana(at)gmail(dot)com wrote:
>> CREATE INDEX idx_post_text ON posts USING gin
>> (to_tsvector('english'::regconfig, post_text::text))
>> select * from v_search WHERE to_tsvector('english', post_text) @@ 'europe'
>> returns no rows, but
>> select * from v_search WHERE to_tsvector('english', post_text) @@ 'japan'
>> returns row with "Japan and Europe"

> The problem is that youre using to_tsvector('english' for parsing the text but
> don't specify the text yearch configuration for the query. The default english
> configuration does stemming, the default_text_search_configuration obviously
> not.
> Try ... to_tsvector('english', post_text) @@ to_tsquery('english', 'europe')

BTW, a good way to debug this sort of issue is to look at the actual
tsvector and tsquery values.

regression=# select to_tsvector('english', 'Japan and Europe');
to_tsvector
---------------------
'europ':3 'japan':1
(1 row)

regression=# select to_tsquery('english', 'Japan');
to_tsquery
------------
'japan'
(1 row)

regression=# select to_tsquery('english', 'Europe');
to_tsquery
------------
'europ'
(1 row)

If you just cast 'europe' directly to tsquery, which is what's going to
happen in the first example, you get the lexeme 'europe' which doesn't
match 'europ'.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2012-05-21 23:13:28 Re: BUG #6656: Wrong timestamptz + interval calculation
Previous Message Andres Freund 2012-05-21 22:31:21 Re: BUG #6654: Full text search doesn't find europe