Re: Tsearch2 lexeme position

Lists: pgsql-general
From: Alexander Rüegg <arueegg(at)uni-bielefeld(dot)de>
To: pgsql-general(at)postgresql(dot)org
Cc: teodor(at)sigaev(dot)ru, oleg(at)sai(dot)msu(dot)su
Subject: Tsearch2 lexeme position
Date: 2003-08-13 09:15:20
Message-ID: 3F3A01A8.1030006@uni-bielefeld.de

Hi,

Is it possible to get all the positions of a lexeme in a result-set of a
query? For example, we have the table

TEXT                                             TEXT_IDX
'TSearch2 is very cool'                          ...
'It would be much cooler with lexeme positions'

Our query is
SELECT text, position FROM thetable WHERE text_idx @@ 'cool'::tsquery;
             ^^^^^^^^
The result should be something like:
'TSearch2 is very cool', 4
'It would be much cooler with lexeme positions', 5

If not, is there a function that returns the positions of a lexeme in a
single entry?

thanks
Alex

--
Dipl.-Inform. Alexander Rueegg
Bioinformatics Department Faculty of Technology
Bielefeld University
Phone: +49 (0)521-106-3541
Fax: +49 (0)521-106-6488
Room: C02-206
Email: arueegg(at)techfak(dot)uni-bielefeld(dot)de


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Alexander Rüegg <arueegg(at)uni-bielefeld(dot)de>
Cc: pgsql-general(at)postgresql(dot)org, oleg(at)sai(dot)msu(dot)su
Subject: Re: Tsearch2 lexeme position
Date: 2003-08-13 14:59:54
Message-ID: 3F3A526A.5080401@sigaev.ru

Alexander Rüegg wrote:
> Hi,
>
> Is it possible to get all the positions of a lexeme in a result-set of a
> query? For example, we have the table
>
> TEXT                                             TEXT_IDX
> 'TSearch2 is very cool'                          ...
> 'It would be much cooler with lexeme positions'
>
> Our query is
> SELECT text, position FROM thetable WHERE text_idx @@ 'cool'::tsquery;
>              ^^^^^^^^
> The result should be something like:
> 'TSearch2 is very cool', 4
> 'It would be much cooler with lexeme positions', 5
>
> If not, is there a function that returns the positions of a lexeme in a
> single entry?
>

You can write such a function, but why do you need it? Maybe there is a simpler
way to solve your problem?

BTW, a lexeme can have more than one position...
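
To illustrate the point about multiple positions, here is a minimal Python sketch. It is not tsearch2's parser: it simply lowercases the text, splits on whitespace, and numbers the words from 1, with no stemming or stop-word handling. The function name `word_positions` is made up for this illustration.

```python
from collections import defaultdict


def word_positions(text):
    """Map each lowercased word to the list of its 1-based positions.

    Simplified stand-in for a real text-search parser: words are split on
    whitespace only, and no stemming or stop-word removal is applied.
    """
    positions = defaultdict(list)
    for pos, word in enumerate(text.lower().split(), start=1):
        positions[word].append(pos)
    return dict(positions)


if __name__ == "__main__":
    # "cool" occurs twice, so it maps to two positions.
    print(word_positions("cool tools stay cool")["cool"])
```

Because a word can map to several positions, any function answering the original question would have to return a list (or one row per position) rather than a single number.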

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru


From: Alexander Rüegg <arueegg(at)uni-bielefeld(dot)de>
To: pgsql-general(at)postgresql(dot)org
Cc: teodor(at)sigaev(dot)ru, oleg(at)sai(dot)msu(dot)su
Subject: Re: Tsearch2 lexeme position
Date: 2003-08-13 16:02:05
Message-ID: 3F3A60FD.3090405@uni-bielefeld.de

Thank you for your response.
We want to know the distance between words, or their order, in a set of
text entries. First we retrieve the text entries in which the words
appear, using the tsearch index. After that we want to calculate the
positions of the words in each entry, e.g. by parsing the index column of
the retrieved entries.
Maybe there is a function, or an easier/cheaper way, to get this
information (one that takes into account that a word may occur more than
once).
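
Once the positions of two words are known, the distance and order described above could be computed along these lines. This is only a sketch in Python: the helper names `min_distance` and `occurs_before` are hypothetical, and the inputs are assumed to be plain lists of integer positions, one list per word.

```python
def min_distance(positions_a, positions_b):
    """Smallest absolute gap between any position of word A and any of word B.

    Returns None if either word has no positions.
    """
    a, b = sorted(positions_a), sorted(positions_b)
    if not a or not b:
        return None
    i = j = 0
    best = abs(a[0] - b[0])
    # Walk both sorted lists in step, always advancing the smaller value.
    while i < len(a) and j < len(b):
        best = min(best, abs(a[i] - b[j]))
        if a[i] < b[j]:
            i += 1
        else:
            j += 1
    return best


def occurs_before(positions_a, positions_b):
    """True if some occurrence of word A precedes some occurrence of word B."""
    if not positions_a or not positions_b:
        return False
    return min(positions_a) < max(positions_b)


if __name__ == "__main__":
    print(min_distance([1, 10], [4, 12]))
    print(occurs_before([5], [2, 3]))
```

Taking lists as input means the case where a word occurs more than once is handled without special treatment.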

thanks,
Alex

--

Alexander Rueegg
Email: arueegg(at)uni-bielefeld(dot)de


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Alexander Rüegg <arueegg(at)uni-bielefeld(dot)de>
Cc: pgsql-general(at)postgresql(dot)org, oleg(at)sai(dot)msu(dot)su
Subject: Re: Tsearch2 lexeme position
Date: 2003-08-14 15:00:40
Message-ID: 3F3BA418.5010207@sigaev.ru

Alexander Rüegg wrote:
> Thank you for your response.
> We want to know the distance or sequence of words in a set of
> text-entries. So first we try to retrieve the text-entries in which the
> words appear using tsearch indexing. After that we want to calculate the
> positions of the words in each entry, e.g. parsing the index column of
> the retrieved text-entries.
> Maybe there exists a function or an easier/cheaper way to get this
> information (and which considers that the words may occur more than
> once).

No, such a function does not exist. The easiest way is to extract this
information from the tsvector value.
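
One way to do that extraction on the client side is sketched below in Python. It assumes the tsvector arrives as text in the form `'lexeme':pos,pos ...`, with an optional weight letter A-D after a position and a doubled quote for a literal quote inside a lexeme; backslash escapes are not handled. The name `parse_tsvector` is made up for this example and is not part of tsearch2.

```python
import re

# One entry of the assumed text form: a single-quoted lexeme, optionally
# followed by ":pos[,pos...]", where each position may end in a weight A-D.
_ENTRY = re.compile(r"'((?:[^']|'')*)'(?::([0-9A-D,]+))?")


def parse_tsvector(text):
    """Return {lexeme: [positions]} parsed from a tsvector's text form."""
    result = {}
    for lexeme, poslist in _ENTRY.findall(text):
        lexeme = lexeme.replace("''", "'")  # undo quote doubling
        if poslist:
            # Drop any trailing weight letter before converting to int.
            positions = [int(p.rstrip("ABCD")) for p in poslist.split(",")]
        else:
            positions = []  # lexeme stored without position information
        result[lexeme] = positions
    return result


if __name__ == "__main__":
    vec = "'cool':4,9 'tsearch2':1 'veri':3"
    print(parse_tsvector(vec)["cool"])
```

The keys are the stored lexemes, so a search word has to be normalized the same way as the index before it is looked up in the result.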


--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Alexander Rüegg <arueegg(at)uni-bielefeld(dot)de>, pgsql-general(at)postgresql(dot)org
Subject: Re: Tsearch2 lexeme position
Date: 2003-08-14 15:04:16
Message-ID: Pine.GSO.4.56.0308141902150.2198@ra.sai.msu.su

Alexander,

We'd be glad to add such a function to tsearch2 if it would be useful
to many people, not just to you.

Oleg

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83