Re: tsearch2api .. wrapper for integrated fultext

Lists: pgsql-patches
From: "Magnus Hagander" <magnus(at)hagander(dot)net>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-09 06:32:52
Message-ID: 200711090732520000@2881919696
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

A thought on this - should it not go in contrib/tsearch2 replacing the old deprecated stuff, instead of creating yet aother contrib dir?

/Magnus

> ------- Original Message -------
> From: Bruce Momjian <bruce(at)momjian(dot)us>
> To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> Sent: 07-11-09, 01:35:49
> Subject: Re: [PATCHES] tsearch2api .. wrapper for integrated fultext
>
> Should I apply this or wait for a final version?
>
> ---------------------------------------------------------------------------
>
> Pavel Stehule wrote:
> > Hello
> >
> > I am sending light wrapper of integrated fulltext. API is compatible
> > with tsearch2.
> >
> > Regards
> > Pavel Stehule
> >
> > ToDo: rewrite functions
>
> [ Attachment, skipping... ]
>
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: Don't 'kill -9' the postmaster
>
> --
> Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
> EnterpriseDB http://postgres.enterprisedb.com
>
> + If your life is a hard drive, Christ can be your backup. +
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: "Bruce Momjian" <bruce(at)momjian(dot)us>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-09 07:17:22
Message-ID: 1792.1194592642@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Magnus Hagander" <magnus(at)hagander(dot)net> writes:
> A thought on this - should it not go in contrib/tsearch2 replacing the old deprecated stuff, instead of creating yet aother contrib dir?

That was the idea, I thought.

This proposed patch is in need of review, which I'd been hoping to get
to tomorrow; but since Pavel says he's about to send in a revised
version, I will wait for that ...

regards, tom lane


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-09 07:25:34
Message-ID: 162867790711082325n4244637nbede5e943604b97a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On 09/11/2007, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Magnus Hagander" <magnus(at)hagander(dot)net> writes:
> > A thought on this - should it not go in contrib/tsearch2 replacing the old deprecated stuff, instead of creating yet aother contrib dir?

It can be moved - no problem.

>
> That was the idea, I thought.
>
> This proposed patch is in need of review, which I'd been hoping to get
> to tomorrow; but since Pavel says he's about to send in a revised
> version, I will wait for that ...
>
> regards, tom lane
>


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-09 23:50:00
Message-ID: 162867790711091550v4cee1685qd197ac09da185b8a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hello

this is revised version

I can not remove files with diff, so patch add files to tsearch2api
still. Please change it.

Regards
Pavel Stehule

On 09/11/2007, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> On 09/11/2007, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > "Magnus Hagander" <magnus(at)hagander(dot)net> writes:
> > > A thought on this - should it not go in contrib/tsearch2 replacing the old deprecated stuff, instead of creating yet aother contrib dir?
>
> It can be moved - no problem.
>
> >
> > That was the idea, I thought.
> >
> > This proposed patch is in need of review, which I'd been hoping to get
> > to tomorrow; but since Pavel says he's about to send in a revised
> > version, I will wait for that ...
> >
> > regards, tom lane
> >
>

Attachment Content-Type Size
tsearch2api.diff text/x-patch 43.4 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-10 18:19:31
Message-ID: 17640.1194718771@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> writes:
> this is revised version

A couple of thoughts here:

* What is the point of creating stub functions for things that users
won't attempt to call directly, such as opclass support functions and
the old dictionary support functions? Couldn't we just leave those out
and save some code?

* The WRAPPER_FUNCTION stuff seems unnecessarily inefficient --- can't
we just declare those as LANGUAGE INTERNAL and link the SQL definition
directly to the built-in function?

* The SQL file doesn't create any of the old types (public.tsvector
etc) so it seems still a long ways short of ensuring that an old
dump file can be reloaded. Maybe I don't understand exactly how you
intend it to interact with the definitions that will be in the dump
file.

regards, tom lane


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-10 18:45:06
Message-ID: 162867790711101045g6520664bicfbf625f2796ee62@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hello

On 10/11/2007, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> writes:
> > this is revised version
>
> A couple of thoughts here:
>
> * What is the point of creating stub functions for things that users
> won't attempt to call directly, such as opclass support functions and
> the old dictionary support functions? Couldn't we just leave those out
> and save some code?
>

we can. I don't understand to tsearch2 well, so this wrapper is
complete. There are not necessary all unsupported functions. But these
improvisation can be done simply in C preprocessor.

> * The WRAPPER_FUNCTION stuff seems unnecessarily inefficient --- can't
> we just declare those as LANGUAGE INTERNAL and link the SQL definition
> directly to the built-in function?
>

it's little bit inefficient, but it's more consistent and readable. So
it's reason.

> * The SQL file doesn't create any of the old types (public.tsvector
> etc) so it seems still a long ways short of ensuring that an old
> dump file can be reloaded. Maybe I don't understand exactly how you
> intend it to interact with the definitions that will be in the dump
> file.
>

with this version of wrapper you cannot load old dumps. It allows
application compatibility. Dump was readable with older variant which
is really ugly and that is on pgfoundry.

Steps with this wrapper:

a) uninstall tsearch2
b) dump
c) install 8.3 and configure fulltext
d) load
e) load wrapper

Pavel Stehule

> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-10 19:09:30
Message-ID: 18749.1194721770@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> writes:
> On 10/11/2007, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> * The SQL file doesn't create any of the old types (public.tsvector
>> etc) so it seems still a long ways short of ensuring that an old
>> dump file can be reloaded. Maybe I don't understand exactly how you
>> intend it to interact with the definitions that will be in the dump
>> file.

> with this version of wrapper you cannot load old dumps. It allows
> application compatibility. Dump was readable with older variant which
> is really ugly and that is on pgfoundry.

> Steps with this wrapper:

> a) uninstall tsearch2
> b) dump
> c) install 8.3 and configure fulltext
> d) load
> e) load wrapper

That seems like a non-starter. Existing tsearch2 installations will
have tsvector columns in their tables, so I don't see how they are
going to "uninstall tsearch2" in the existing database. The other
problem is that I think we have to provide a migration path for people
who already have dump files (and, maybe, no longer have the original
installation).

For people who have custom or tar-format dumps, the previously posted
script to strip out the old tsearch2 objects during pg_restore would
help ... but it's useless if you used text dump (notably including
pg_dumpall output).

Another problem that was already noted was that the dump might contain
explicit references to "public.tsvector", or some other schema that you
put the tsearch2 objects in.

The approach that I was hoping to see was

a) dump
b) install 8.3 and configure fulltext
c) load wrapper (into same schema as you used for tsearch2 before)
d) load dump file

Since pg_dump doesn't do CREATE OR REPLACE, step (d) would result in a
lot of error messages, but it wouldn't overwrite any of the function
definitions installed by the wrapper. We could possibly deal with the
schema issue by having the wrapper create public.tsvector as a domain
for pg_catalog.tsvector, etc.

Thoughts?

regards, tom lane


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-10 20:20:52
Message-ID: 162867790711101220g25c1a0cbg214250ea7a239bd7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On 10/11/2007, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> writes:
> > On 10/11/2007, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> * The SQL file doesn't create any of the old types (public.tsvector
> >> etc) so it seems still a long ways short of ensuring that an old
> >> dump file can be reloaded. Maybe I don't understand exactly how you
> >> intend it to interact with the definitions that will be in the dump
> >> file.
>
> > with this version of wrapper you cannot load old dumps. It allows
> > application compatibility. Dump was readable with older variant which
> > is really ugly and that is on pgfoundry.
>
> > Steps with this wrapper:
>
> > a) uninstall tsearch2
> > b) dump
> > c) install 8.3 and configure fulltext
> > d) load
> > e) load wrapper
>
> That seems like a non-starter. Existing tsearch2 installations will
> have tsvector columns in their tables, so I don't see how they are
> going to "uninstall tsearch2" in the existing database. The other
> problem is that I think we have to provide a migration path for people
> who already have dump files (and, maybe, no longer have the original
> installation).
>
> For people who have custom or tar-format dumps, the previously posted
> script to strip out the old tsearch2 objects during pg_restore would
> help ... but it's useless if you used text dump (notably including
> pg_dumpall output).
>
> Another problem that was already noted was that the dump might contain
> explicit references to "public.tsvector", or some other schema that you
> put the tsearch2 objects in.
>

I forgot. With perl script, dump can be transformed to readable form.

I see forms of port TSearch to integrated full text:

1. clean and prefered .. dump and application are modified
2. dump is modified, application works via some api
3. full wrapper with known issues
4. special smart mode where TSearch2 API functions are dynamicly
converted to fulltext api

like
get function name
if it some from tsearch2 api transform it or forgot it
else create function

but perl script do it well and outside

> The approach that I was hoping to see was
>
> a) dump
> b) install 8.3 and configure fulltext
> c) load wrapper (into same schema as you used for tsearch2 before)
> d) load dump file

It was my original goal. But there is lot of issues.

I found different problem, that have to be solved if wrapper have to
be dump compatible.

there is about five functions with same name, and I have problem with
wrapping, because I create recursive calling. I am not able select
integrated functions. That was reason for use prefix tsa.

>
> Since pg_dump doesn't do CREATE OR REPLACE, step (d) would result in a
> lot of error messages, but it wouldn't overwrite any of the function
> definitions installed by the wrapper. We could possibly deal with the
> schema issue by having the wrapper create public.tsvector as a domain
> for pg_catalog.tsvector, etc.
>
> Thoughts?
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-10 22:44:25
Message-ID: 21137.1194734665@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> writes:
> I found different problem, that have to be solved if wrapper have to
> be dump compatible.

> there is about five functions with same name, and I have problem with
> wrapping, because I create recursive calling. I am not able select
> integrated functions. That was reason for use prefix tsa.

I don't follow. The functions all have distinct names at the C-code
level, so what's the problem?

regards, tom lane


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-11 07:27:03
Message-ID: 162867790711102327k290ccc6evf36ff9f18c9f71d8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On 10/11/2007, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> writes:
> > I found different problem, that have to be solved if wrapper have to
> > be dump compatible.
>
> > there is about five functions with same name, and I have problem with
> > wrapping, because I create recursive calling. I am not able select
> > integrated functions. That was reason for use prefix tsa.
>
> I don't follow. The functions all have distinct names at the C-code
> level, so what's the problem?

they don't have distinct names

tsearch_length
rewrite_query
...

>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-12 00:58:58
Message-ID: 10490.1194829138@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

I wrote:
> The approach that I was hoping to see was

> a) dump
> b) install 8.3 and configure fulltext
> c) load wrapper (into same schema as you used for tsearch2 before)
> d) load dump file

I've modified Pavel's version into something that seems to support this
approach --- at least I can load the 8.2 tsearch regression test
database into 8.3 after loading this. Still needs some polishing
probably, and some more testing. Comments?

regards, tom lane

Attachment Content-Type Size
tsearch2api.tar.gz application/octet-stream 5.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-13 21:14:55
Message-ID: 23989.1194988495@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

I wrote:
> I've modified Pavel's version into something that seems to support this
> approach --- at least I can load the 8.2 tsearch regression test
> database into 8.3 after loading this. Still needs some polishing
> probably, and some more testing. Comments?

I've committed this, replacing the old contrib/tsearch2 code.

It successfully runs most of the old module's regression test, after
some minor adjustments for default configuration names and suchlike.
One large omission is that the rewrite(ARRAY[]) aggregate isn't there.
AFAIR, we removed that just because it seemed a poorly designed API,
not because it didn't work. I'm thinking we should probably pull the
code for it out of the CVS history and stick it into contrib/tsearch2.
Any thoughts pro or con?

regards, tom lane


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2api .. wrapper for integrated fultext
Date: 2007-11-13 21:53:52
Message-ID: 162867790711131353r5cb7d51ch4b02a1228a3e18a6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On 13/11/2007, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
> > I've modified Pavel's version into something that seems to support this
> > approach --- at least I can load the 8.2 tsearch regression test
> > database into 8.3 after loading this. Still needs some polishing
> > probably, and some more testing. Comments?
>
> I've committed this, replacing the old contrib/tsearch2 code.
>
> It successfully runs most of the old module's regression test, after
> some minor adjustments for default configuration names and suchlike.
> One large omission is that the rewrite(ARRAY[]) aggregate isn't there.
> AFAIR, we removed that just because it seemed a poorly designed API,
> not because it didn't work. I'm thinking we should probably pull the
> code for it out of the CVS history and stick it into contrib/tsearch2.
> Any thoughts pro or con?
>

+1

Pavel
> regards, tom lane
>