Re: bug in ts_rank_cd

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sushant Sinha <sushant354(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: bug in ts_rank_cd
Date: 2010-12-22 04:03:55
Message-ID: 23981.1292990635@sss.pgh.pa.us
Lists: pgsql-hackers

Sushant Sinha <sushant354(at)gmail(dot)com> writes:
> There is a bug in ts_rank_cd. It does not correctly give rank when the
> query lexeme is the first one in the tsvector.

Hmm ... I cannot reproduce the behavior you're complaining of.
You say

> select ts_rank_cd(to_tsvector('english', 'abc sdd'),
> plainto_tsquery('english', 'abc'));
> ts_rank_cd
> ------------
> 0

but I get

regression=# select ts_rank_cd(to_tsvector('english', 'abc sdd'),
regression(# plainto_tsquery('english', 'abc'));
 ts_rank_cd
------------
        0.1
(1 row)

> The problem is that the Cover finding algorithm ignores the lexeme at
> the 0th position,

As far as I can tell, there is no "0th position" --- tsvector counts
positions from one. The only way to see pos == 0 in the input to
Cover() is if the tsvector has been stripped of position information.
ts_rank_cd is documented to return 0 in that situation. Your patch
would have the effect of causing it to return some nonzero, but quite
bogus, ranking.
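
Both points can be checked directly. The following is only a sketch, assuming the stock 'english' text search configuration and the built-in strip() function; the exact display may differ between versions.

```sql
-- Show the positions stored in the tsvector; they are expected to
-- start at 1, not 0.
select to_tsvector('english', 'abc sdd');

-- strip() removes position information from the tsvector.  ts_rank_cd
-- is documented to return 0 when no positions are available.
select ts_rank_cd(strip(to_tsvector('english', 'abc sdd')),
                  plainto_tsquery('english', 'abc'));
```

If the second query returns zero while the unstripped version returns a nonzero rank, that is the documented behavior rather than a bug in Cover().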

regards, tom lane
