Re: english parser in text search: support for multiple words in the same position

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: sushant354(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: english parser in text search: support for multiple words in the same position
Date: 2010-08-02 07:36:24
Message-ID: 4C567578.5030501@bluegap.ch
Lists: pgsql-hackers

Hi,

On 08/01/2010 08:04 PM, Sushant Sinha wrote:
> 1. We do not have separate tokens "wikipedia" and "org"
> 2. If we have the two tokens we should have them at adjacent position so
> that a phrase search for "wikipedia org" should work.

This would needlessly increase the number of tokens. Instead, you'd be
better off making it work like compound word support, with just
"wikipedia" and "org" as tokens.

Searching for "wikipedia.org" or "wikipedia org" should then result in
the same search query with the two tokens: "wikipedia" and "org".
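
To make that concrete, here is a minimal sketch of the behaviour I mean. It is
plain Python, not the actual text search parser, and the `tokenize` helper is
purely hypothetical: it assumes every run of non-alphanumeric characters
(space, dot, slash) acts as a separator and that positions count up from 1.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens with consecutive positions.

    Hypothetical helper for illustration only: any run of characters
    other than a-z and 0-9 is treated as a separator.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [(pos, word) for pos, word in enumerate(words, start=1)]

# Document side: the host name yields two adjacent word tokens.
doc_tokens = tokenize("wikipedia.org")

# Query side: both spellings reduce to the same list of words.
query_dotted = [word for _, word in tokenize("wikipedia.org")]
query_spaced = [word for _, word in tokenize("wikipedia org")]
```

Since the two tokens end up at adjacent positions, a phrase search for
"wikipedia org" would match as well, without any additional URL token.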

> position 0: WORD(wikipedia), URL(wikipedia.org/search?q=sushant)

IMO the distinction between WORDs and URLs is not something the text
search engine should have to care much about. Let it just do the
searching and make it do that well.

What does a token "wikipedia.org/search?q=sushant" buy you in terms of
text searching? Or even result highlighting? I wouldn't expect anybody
to want to search for a full URL, would you?

Regards

Markus Wanner
