Lists: | pgsql-bugs |
---|
From: | wbrana(at)gmail(dot)com |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #6654: Full text search doesn't find europe |
Date: | 2012-05-21 17:26:38 |
Message-ID: | E1SWWNS-0007oz-0L@wrigleys.postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 6654
Logged by: wbrana
Email address: wbrana(at)gmail(dot)com
PostgreSQL version: 9.1.3
Operating system: Linux
Description:
CREATE INDEX idx_post_text ON posts USING gin
(to_tsvector('english'::regconfig, post_text::text))
select * from v_search WHERE to_tsvector('english', post_text) @@ 'europe'
returns no rows, but
select * from v_search WHERE to_tsvector('english', post_text) @@ 'japan'
returns row with "Japan and Europe"
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #6654: Full text search doesn't find europe |
Date: | 2012-05-21 22:31:21 |
Message-ID: | 201205220031.22557.andres@anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Monday, May 21, 2012 07:26:38 PM wbrana(at)gmail(dot)com wrote:
> The following bug has been logged on the website:
>
> Bug reference: 6654
> Logged by: wbrana
> Email address: wbrana(at)gmail(dot)com
> PostgreSQL version: 9.1.3
> Operating system: Linux
> Description:
>
> CREATE INDEX idx_post_text ON posts USING gin
> (to_tsvector('english'::regconfig, post_text::text))
> select * from v_search WHERE to_tsvector('english', post_text) @@ 'europe'
> returns no rows, but
> select * from v_search WHERE to_tsvector('english', post_text) @@ 'japan'
> returns row with "Japan and Europe"
The problem is that youre using to_tsvector('english' for parsing the text but
don't specify the text yearch configuration for the query. The default english
configuration does stemming, the default_text_search_configuration obviously
not.
Try ... to_tsvector('english', post_text) @@ to_tsquery('english', 'europe')
Andres
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #6654: Full text search doesn't find europe |
Date: | 2012-05-21 23:01:57 |
Message-ID: | 9098.1337641317@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On Monday, May 21, 2012 07:26:38 PM wbrana(at)gmail(dot)com wrote:
>> CREATE INDEX idx_post_text ON posts USING gin
>> (to_tsvector('english'::regconfig, post_text::text))
>> select * from v_search WHERE to_tsvector('english', post_text) @@ 'europe'
>> returns no rows, but
>> select * from v_search WHERE to_tsvector('english', post_text) @@ 'japan'
>> returns row with "Japan and Europe"
> The problem is that youre using to_tsvector('english' for parsing the text but
> don't specify the text yearch configuration for the query. The default english
> configuration does stemming, the default_text_search_configuration obviously
> not.
> Try ... to_tsvector('english', post_text) @@ to_tsquery('english', 'europe')
BTW, a good way to debug this sort of issue is to look at the actual
tsvector and tsquery values.
regression=# select to_tsvector('english', 'Japan and Europe');
to_tsvector
---------------------
'europ':3 'japan':1
(1 row)
regression=# select to_tsquery('english', 'Japan');
to_tsquery
------------
'japan'
(1 row)
regression=# select to_tsquery('english', 'Europe');
to_tsquery
------------
'europ'
(1 row)
If you just cast 'europe' directly to tsquery, which is what's going to
happen in the first example, you get the lexeme 'europe' which doesn't
match 'europ'.
regards, tom lane