full text search in 8.3

Lists: pgsql-hackers
From: Andy Colson <andy(at)camavision(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: full text search in 8.3
Date: 2007-10-10 17:17:11
Message-ID: 470D0917.70508@camavision.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi All,

You knew it was coming....

I have an 8.2 database that has full text searching. I tried to
backup/restore it to 8.3 but got lots of errors:

<snip>
ERROR: could not access file "$libdir/tsearch2": No such file or directory
ERROR: function public.gtsq_in(cstring) does not exist
ERROR: could not access file "$libdir/tsearch2": No such file or directory
ERROR: function public.gtsq_out(gtsq) does not exist
ERROR: function gtsq_in(cstring) does not exist
<snip>
ERROR: type "tsvector" is only a shell
ERROR: type "public.tsdebug" does not exist
<snip>
etc...

I didn't really expect it to totally work, but I'm not sure how to move
my db. Any pointers would be appreciated.

I used the 8.3 pg_dump on my laptop to backup the 8.2 db from the
server, and tried to restore on my laptop.

I tried both pg_dump -Fc, and just a pg_dump.

Am I going to need to backup the the data, and nothing else (pg_dump
--data-only ). Will the tsvector column be ok?

I tried doing a pg_dump --schema-only and restoring just that, but still
got a bunch of errors (those above). If I clean that up of all the old
text search stuff, and then run it, then do the data, will that work ok?

One more dumb question: I don't have to enable or install anything, do I?

Thanks,

-Andy


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-10 19:26:38
Message-ID: 470D276E.9080409@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andy Colson wrote:
> Hi All,
>
> You knew it was coming....
>
>
> I tried doing a pg_dump --schema-only and restoring just that, but still
> got a bunch of errors (those above). If I clean that up of all the old
> text search stuff, and then run it, then do the data, will that work ok?
>

Further to this, I edited the schema, and created it ok, then dumped the
data across (after editing out all the tsearch stuff). So far so good.

However, my trigger failed:

CREATE TRIGGER fulltext_update
BEFORE INSERT OR UPDATE ON times
FOR EACH ROW
EXECUTE PROCEDURE tsearch2('vectors', 'remarks');

Looks like we renamed/removed the tsearch2 function? Is
setweight(to_tsvector()) the new way?

-Andy


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andy Colson <andy(at)camavision(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-10 20:00:36
Message-ID: 23067.1192046436@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andy Colson <andy(at)camavision(dot)com> writes:
> I have an 8.2 database that has full text searching. I tried to
> backup/restore it to 8.3 but got lots of errors:
> ...
> I didn't really expect it to totally work, but I'm not sure how to move
> my db.

Did the data transfer over? The declarations of the former contrib
functions would of course fail, but type tsvector is still there.
I would like to think that ignoring pg_restore's whining would get
you most of the way there.

(As noted in the beta release notes, somebody really really needs to
write a migration guide for existing tsearch2 users.)

regards, tom lane


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-10 20:31:35
Message-ID: 470D36A7.9040201@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Andy Colson <andy(at)camavision(dot)com> writes:
>> I have an 8.2 database that has full text searching. I tried to
>> backup/restore it to 8.3 but got lots of errors:
>> ...
>> I didn't really expect it to totally work, but I'm not sure how to move
>> my db.
>
> Did the data transfer over? The declarations of the former contrib
> functions would of course fail, but type tsvector is still there.
> I would like to think that ignoring pg_restore's whining would get
> you most of the way there.
>
> (As noted in the beta release notes, somebody really really needs to
> write a migration guide for existing tsearch2 users.)
>
> regards, tom lane

Ahh.. Ya know, I didn't even look. I saw all the errors and just
stopped it.

So I tried again: The long answer is no, the table with the tsvector
did not get created, and thus, not copied:

pg_restore: [archiver (db)] could not execute query: ERROR: type
"tsvector" is only a shell
LINE 11: vectors tsvector
^
Command was: CREATE TABLE times (
rowid integer NOT NULL,
empid integer NOT NULL,
idate timestamp without time zone,
jobi...
pg_restore: [archiver (db)] could not execute query: ERROR: relation
"public.times" does not exist
Command was: ALTER TABLE public.times OWNER TO andy;

In the backup script, the tsvector stuff is recreated, I assume that's why:

<snip>
CREATE TYPE tsquery (
INTERNALLENGTH = variable,
INPUT = tsquery_in,
OUTPUT = tsquery_out,
ALIGNMENT = int4,
STORAGE = plain
);

ALTER TYPE public.tsquery OWNER TO andy;

--
-- Name: tsvector; Type: SHELL TYPE; Schema: public; Owner: andy
--

CREATE TYPE tsvector;

--
-- Name: tsvector_in(cstring); Type: FUNCTION; Schema: public; Owner: andy
--

CREATE FUNCTION tsvector_in(cstring) RETURNS tsvector
AS '$libdir/tsearch2', 'tsvector_in'
LANGUAGE c STRICT;
<snip>

The rest of the tables, however, did get created/copied ok.

-Andy


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-10 21:31:27
Message-ID: 470D44AF.1090600@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

andy wrote:
> Andy Colson wrote:
>> Hi All,
>>
>> You knew it was coming....
>>
>>
>> I tried doing a pg_dump --schema-only and restoring just that, but
>> still got a bunch of errors (those above). If I clean that up of all
>> the old text search stuff, and then run it, then do the data, will
>> that work ok?
>>
>
> Further to this, I edited the schema, and created it ok, then dumped the
> data across (after editing out all the tsearch stuff). So far so good.
>
> However, my trigger failed:
>
> CREATE TRIGGER fulltext_update
> BEFORE INSERT OR UPDATE ON times
> FOR EACH ROW
> EXECUTE PROCEDURE tsearch2('vectors', 'remarks');
>
> Looks like we renamed/removed the tsearch2 function? Is
> setweight(to_tsvector()) the new way?
>
> -Andy

Ah, found it.. in the doc's no less :-)

CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON tblMessages FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(tsv, 'pg_catalog.english', strMessage);

-Andy


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-10 21:50:27
Message-ID: 24958.1192053027@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

andy <andy(at)squeakycode(dot)net> writes:
> Tom Lane wrote:
>> Did the data transfer over? The declarations of the former contrib
>> functions would of course fail, but type tsvector is still there.
>> I would like to think that ignoring pg_restore's whining would get
>> you most of the way there.

> So I tried again: The long answer is no, the table with the tsvector
> did not get created, and thus, not copied:

> pg_restore: [archiver (db)] could not execute query: ERROR: type
> "tsvector" is only a shell
> LINE 11: vectors tsvector
> ^
> Command was: CREATE TABLE times (

Hmph, that's annoying. I suppose the problem is that the script has
just set the search path to "public, pg_catalog", and so the failed
shell tsvector type in public is found in preference to the one in
pg_catalog.

What you could probably do as a workaround for testing is to create a
dummy type entry to block the creation of the shell type, say

create domain public.tsvector as pg_catalog.tsvector;

and then run pg_restore. This seems pretty ugly though ... anyone
have a better idea?

(Knew we should have insisted on seeing a migration plan sooner.
Oh well.)

regards, tom lane


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-10 21:57:39
Message-ID: 470D4AD3.9090502@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> andy <andy(at)squeakycode(dot)net> writes:
>> Tom Lane wrote:
>>> Did the data transfer over? The declarations of the former contrib
>>> functions would of course fail, but type tsvector is still there.
>>> I would like to think that ignoring pg_restore's whining would get
>>> you most of the way there.
>
>> So I tried again: The long answer is no, the table with the tsvector
>> did not get created, and thus, not copied:
>
>> pg_restore: [archiver (db)] could not execute query: ERROR: type
>> "tsvector" is only a shell
>> LINE 11: vectors tsvector
>> ^
>> Command was: CREATE TABLE times (
>
> Hmph, that's annoying. I suppose the problem is that the script has
> just set the search path to "public, pg_catalog", and so the failed
> shell tsvector type in public is found in preference to the one in
> pg_catalog.
>
> What you could probably do as a workaround for testing is to create a
> dummy type entry to block the creation of the shell type, say
>
> create domain public.tsvector as pg_catalog.tsvector;
>
> and then run pg_restore. This seems pretty ugly though ... anyone
> have a better idea?
>
> (Knew we should have insisted on seeing a migration plan sooner.
> Oh well.)
>
> regards, tom lane

I have the sql script to create the tables (I was going to pg_dump
--schema-only and edit, but there is a lot of stuff in there). I just
ran it to create empty tables, then did the pg_dump --data-only, and
scripted it in.

Worked fine.

Changed the trigger from calling tsearch2 to tsvector_update_trigger.

Then in my php I changed:
to_tsquery('default', '$fulltext')
to
to_tsquery('$fulltext')

and renamed rank to ts_rank, and my site (well, the test site) is happy
on 8.3.

-Andy


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 05:09:05
Message-ID: 14efcc4600038755@previdencia.gov.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andy,

seems you're a right person for writing migration guide.

Oleg
On Wed, 10 Oct 2007, andy wrote:

> andy wrote:
>> Andy Colson wrote:
>>> Hi All,
>>>
>>> You knew it was coming....
>>>
>>>
>>> I tried doing a pg_dump --schema-only and restoring just that, but still
>>> got a bunch of errors (those above). If I clean that up of all the old
>>> text search stuff, and then run it, then do the data, will that work ok?
>>>
>>
>> Further to this, I edited the schema, and created it ok, then dumped the
>> data across (after editing out all the tsearch stuff). So far so good.
>>
>> However, my trigger failed:
>>
>> CREATE TRIGGER fulltext_update
>> BEFORE INSERT OR UPDATE ON times
>> FOR EACH ROW
>> EXECUTE PROCEDURE tsearch2('vectors', 'remarks');
>>
>> Looks like we renamed/removed the tsearch2 function? Is
>> setweight(to_tsvector()) the new way?
>>
>> -Andy
>
> Ah, found it.. in the doc's no less :-)
>
> CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
> ON tblMessages FOR EACH ROW EXECUTE PROCEDURE
> tsvector_update_trigger(tsv, 'pg_catalog.english', strMessage);
>
> -Andy
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 05:09:05
Message-ID: Pine.LNX.4.64.0710110907530.3304@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andy,

seems you're a right person for writing migration guide.

Oleg
On Wed, 10 Oct 2007, andy wrote:

> andy wrote:
>> Andy Colson wrote:
>>> Hi All,
>>>
>>> You knew it was coming....
>>>
>>>
>>> I tried doing a pg_dump --schema-only and restoring just that, but still
>>> got a bunch of errors (those above). If I clean that up of all the old
>>> text search stuff, and then run it, then do the data, will that work ok?
>>>
>>
>> Further to this, I edited the schema, and created it ok, then dumped the
>> data across (after editing out all the tsearch stuff). So far so good.
>>
>> However, my trigger failed:
>>
>> CREATE TRIGGER fulltext_update
>> BEFORE INSERT OR UPDATE ON times
>> FOR EACH ROW
>> EXECUTE PROCEDURE tsearch2('vectors', 'remarks');
>>
>> Looks like we renamed/removed the tsearch2 function? Is
>> setweight(to_tsvector()) the new way?
>>
>> -Andy
>
> Ah, found it.. in the doc's no less :-)
>
> CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
> ON tblMessages FOR EACH ROW EXECUTE PROCEDURE
> tsvector_update_trigger(tsv, 'pg_catalog.english', strMessage);
>
> -Andy
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 13:41:47
Message-ID: 470E281B.3090007@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oleg Bartunov wrote:
> Andy,
>
> seems you're a right person for writing migration guide.
>
> Oleg
> On Wed, 10 Oct 2007, andy wrote:
>

Where would be an easy place to find all the renamed functions?

My experience with fts is limited to one project, and I just used all
the default dictionaries, so I've never even played with any of that.

That being said, I'd be happy to write up what I can (and I can try to
put together a list of the renamed functions).

Is there any chance there is an easier way to backup/restore? On one
hand, its not too bad, and it'll only be once (correct?). Now that fts
is in core future backup/restores will work, right? I think it's
analogous to telling someone they are updating from tsearch2 to
tsearch3, and it might be a little more painful than just a backup/restore.

On the other hand I think a backup/restore will pollute the new db with
a bunch of functions and types that wont ever be used, so it's so much
cleaner to build it by hand.

Are there other fts users that might have opinions on that?

-Andy


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 13:41:47
Message-ID: 16c4e7d400001221@previdencia.gov.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oleg Bartunov wrote:
> Andy,
>
> seems you're a right person for writing migration guide.
>
> Oleg
> On Wed, 10 Oct 2007, andy wrote:
>

Where would be an easy place to find all the renamed functions?

My experience with fts is limited to one project, and I just used all
the default dictionaries, so I've never even played with any of that.

That being said, I'd be happy to write up what I can (and I can try to
put together a list of the renamed functions).

Is there any chance there is an easier way to backup/restore? On one
hand, its not too bad, and it'll only be once (correct?). Now that fts
is in core future backup/restores will work, right? I think it's
analogous to telling someone they are updating from tsearch2 to
tsearch3, and it might be a little more painful than just a backup/restore.

On the other hand I think a backup/restore will pollute the new db with
a bunch of functions and types that wont ever be used, so it's so much
cleaner to build it by hand.

Are there other fts users that might have opinions on that?

-Andy

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 13:57:15
Message-ID: 470E2BBB.5000703@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

andy wrote:
> Is there any chance there is an easier way to backup/restore? On one
> hand, its not too bad, and it'll only be once (correct?). Now that fts
> is in core future backup/restores will work, right? I think it's
> analogous to telling someone they are updating from tsearch2 to
> tsearch3, and it might be a little more painful than just a backup/restore.
>
> On the other hand I think a backup/restore will pollute the new db with
> a bunch of functions and types that wont ever be used, so it's so much
> cleaner to build it by hand.
>
> Are there other fts users that might have opinions on that?

I'm not really a tsearch user (just played with it a bit once). But I wondered
if you are aware that you can prevent certain objects from being restored
quite easiy if you use pg_dump and pg_restore together with "custom format"
(-Fc). There is some option to pg_restore that reads the dump, and ouputs a
table of contents. You can then remove some entries from that list, and pass the
modified list to pg_restore which will skip entries that do not show up on your
modified list.

Maybe we could document some regexp, awk script, or similar that strips the
tsearch stuff from such a table of contents?

regards, Florian Pflug


From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 13:57:15
Message-ID: 16d3556e00001b26@previdencia.gov.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

andy wrote:
> Is there any chance there is an easier way to backup/restore? On one
> hand, its not too bad, and it'll only be once (correct?). Now that fts
> is in core future backup/restores will work, right? I think it's
> analogous to telling someone they are updating from tsearch2 to
> tsearch3, and it might be a little more painful than just a backup/restore.
>
> On the other hand I think a backup/restore will pollute the new db with
> a bunch of functions and types that wont ever be used, so it's so much
> cleaner to build it by hand.
>
> Are there other fts users that might have opinions on that?

I'm not really a tsearch user (just played with it a bit once). But I wondered
if you are aware that you can prevent certain objects from being restored
quite easiy if you use pg_dump and pg_restore together with "custom format"
(-Fc). There is some option to pg_restore that reads the dump, and ouputs a
table of contents. You can then remove some entries from that list, and pass the
modified list to pg_restore which will skip entries that do not show up on your
modified list.

Maybe we could document some regexp, awk script, or similar that strips the
tsearch stuff from such a table of contents?

regards, Florian Pflug

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 14:06:04
Message-ID: 470E2DCC.8060506@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian G. Pflug wrote:
> andy wrote:
>> Is there any chance there is an easier way to backup/restore? On one
>> hand, its not too bad, and it'll only be once (correct?). Now that
>> fts is in core future backup/restores will work, right? I think it's
>> analogous to telling someone they are updating from tsearch2 to
>> tsearch3, and it might be a little more painful than just a
>> backup/restore.
>>
>> On the other hand I think a backup/restore will pollute the new db
>> with a bunch of functions and types that wont ever be used, so it's so
>> much cleaner to build it by hand.
>>
>> Are there other fts users that might have opinions on that?
>
> I'm not really a tsearch user (just played with it a bit once). But I
> wondered if you are aware that you can prevent certain objects from
> being restored
> quite easiy if you use pg_dump and pg_restore together with "custom
> format" (-Fc). There is some option to pg_restore that reads the dump,
> and ouputs a table of contents. You can then remove some entries from
> that list, and pass the modified list to pg_restore which will skip
> entries that do not show up on your modified list.
>
> Maybe we could document some regexp, awk script, or similar that strips
> the tsearch stuff from such a table of contents?
>
> regards, Florian Pflug
>

Ahh, I did not know that... I'll try that out and see if I can come up
with something. Thanks!

-Andy


From: Richard Huxton <dev(at)archonet(dot)com>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 14:13:53
Message-ID: 470E2FA1.2050000@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

andy wrote:
> Florian G. Pflug wrote:
>> Maybe we could document some regexp, awk script, or similar that
>> strips the tsearch stuff from such a table of contents?
>
> Ahh, I did not know that... I'll try that out and see if I can come up
> with something. Thanks!

If you hack the old tsearch2.sql install script you can change the
schema it installs to. That should make it easier to identify everything
it installs.

--
Richard Huxton
Archonet Ltd


From: "Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: andy <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 14:20:42
Message-ID: e431ff4c0710110720q2e3e7768g7f57bd7ad794bf1b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/11/07, Florian G. Pflug <fgp(at)phlo(dot)org> wrote:
>
> Maybe we could document some regexp, awk script, or similar that strips the
> tsearch stuff from such a table of contents?

Just my .02c for those who will work on migration manual.

In my case, all tsearch2 stuff was kept (before 8.3) in separate
schema, namely "tsearch2". So, in 8.2, I had tsearch2.tsvector and
tsearch2.tsquery data types and so on.

During restoration to 8.3 I've catched segfaults -- during INSERTs
into tables with "tsearch2"."tsvector" columns. What helped me is the
following procedure:
1. restore schema only;
2. restore data with replacing "tsearch2"."tsvector" datatype to "tsvector":
sed -e 's/tsearch2\.tsvector/tsvector/g' DATADUMP | psql DBNAME
2>restoration_errors.log
3. drop "tsearch2" schema since it isn't needed anymore.

After that -- everything works normally.
My case is specific since I use separate schemas for every single
contrib module.
--
Best regards,
Nikolay


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 14:24:22
Message-ID: Pine.LNX.4.64.0710111819300.23412@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Oct 2007, andy wrote:

> Oleg Bartunov wrote:
>> Andy,
>>
>> seems you're a right person for writing migration guide.
>>
>> Oleg
>> On Wed, 10 Oct 2007, andy wrote:
>>
>
> Where would be an easy place to find all the renamed functions?

My incomplete list:
http://www.sai.msu.su/~megera/wiki/Tsearch2_83_changes

psql \h shows syntax of text search commands

\h create text search
\h alter text search
\h drop text search

>
> My experience with fts is limited to one project, and I just used all the
> default dictionaries, so I've never even played with any of that.
>
> That being said, I'd be happy to write up what I can (and I can try to put
> together a list of the renamed functions).
>
> Is there any chance there is an easier way to backup/restore? On one hand,
> its not too bad, and it'll only be once (correct?). Now that fts is in core
> future backup/restores will work, right? I think it's analogous to telling
> someone they are updating from tsearch2 to tsearch3, and it might be a little
> more painful than just a backup/restore.
>
> On the other hand I think a backup/restore will pollute the new db with a
> bunch of functions and types that wont ever be used, so it's so much cleaner
> to build it by hand.
>
> Are there other fts users that might have opinions on that?
>
> -Andy
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: nikolay(at)samokhvalov(dot)com
Cc: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, andy <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 14:38:04
Message-ID: 162867790710110738u2beb7983sdd4e478faee3a6b2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I working on binary compatible library with tsearch2, which should be
usable for all users who has default configuration of tsearch2. I
hope, I send patch before morning

Pavel


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: nikolay(at)samokhvalov(dot)com
Cc: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, andy <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 15:07:20
Message-ID: 13951.1192115240@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com> writes:
> During restoration to 8.3 I've catched segfaults -- during INSERTs
> into tables with "tsearch2"."tsvector" columns.

Segfaults? That shouldn't happen. Please show a test case.

regards, tom lane


From: "Nikolay Samokhvalov" <nikolay(at)samokhvalov(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, andy <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 15:35:43
Message-ID: e431ff4c0710110835p3091ff69m1a1da02867244cd6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/11/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com> writes:
> > During restoration to 8.3 I've catched segfaults -- during INSERTs
> > into tables with "tsearch2"."tsvector" columns.
>
> Segfaults? That shouldn't happen. Please show a test case.

Test case: use old tsearch2.so to register all tsearch2 functions to
"tsearch2" schema (old fashioned way). Then try:

nikolay=# select '111'::tsearch2.tsvector;
ERROR: invalid memory alloc request size 1878491112
nikolay=# select '111'::tsearch2.tsvector;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

Backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x082c2d91 in gettoken_tsvector (state=0xbfae77cc, strval=0x100,
lenval=0x0, pos_ptr=0xbfae7a1c, poslen=0x0, endptr=0x0) at
tsvector_parser.c:209
209 RETURN_TOKEN;
(gdb) bt
#0 0x082c2d91 in gettoken_tsvector (state=0xbfae77cc, strval=0x100,
lenval=0x0, pos_ptr=0xbfae7a1c, poslen=0x0, endptr=0x0) at
tsvector_parser.c:209
#1 0xa730d85f in tsvector_in () from /test/lib/postgresql/tsearch2.so
#2 0x082dda5f in InputFunctionCall (flinfo=0x0, str=0x8479c00 "111",
typioparam=61357, typmod=-1) at fmgr.c:1835
#3 0x082dfe26 in OidInputFunctionCall (functionId=61358,
str=0x8479c00 "111", typioparam=61357, typmod=-1) at fmgr.c:1939
#4 0x081317a7 in stringTypeDatum (tp=0xa732eef8, string=0x8479c00
"111", atttypmod=-1) at parse_type.c:462
#5 0x081343bd in coerce_type (pstate=0x8479bb4, node=0x8479ec0,
inputTypeId=705, targetTypeId=61357, targetTypeMod=-1,
ccontext=COERCION_EXPLICIT,
cformat=COERCE_EXPLICIT_CAST) at parse_coerce.c:210
#6 0x08134b8c in coerce_to_target_type (pstate=0x8479bb4,
expr=0x8479ec0, exprtype=705, targettype=61357, targettypmod=-1,
ccontext=COERCION_EXPLICIT,
cformat=COERCE_EXPLICIT_CAST) at parse_coerce.c:81
#7 0x081279d3 in typecast_expression (pstate=0x8479bb4,
expr=0x8479ec0, typename=0x8479cd4) at parse_expr.c:2221
#8 0x0812872b in transformExpr (pstate=0x8479bb4, expr=0x8479d00) at
parse_expr.c:150
#9 0x081369fc in transformTargetEntry (pstate=0x8479bb4,
node=0x8479d00, expr=0x0, colname=0x0, resjunk=0 '\0') at
parse_target.c:74
#10 0x08136ed4 in transformTargetList (pstate=0x8479bb4,
targetlist=0x8479d5c) at parse_target.c:146
#11 0x0810f188 in transformStmt (pstate=0x8479bb4,
parseTree=0x8479d78) at analyze.c:695
#12 0x0811103f in parse_analyze (parseTree=0x8479d78,
sourceText=0x847939c "select '111'::tsearch2.tsvector;",
paramTypes=0x0, numParams=0) at analyze.c:96
#13 0x0822e00e in pg_analyze_and_rewrite (parsetree=0x8479d78,
query_string=0x847939c "select '111'::tsearch2.tsvector;",
paramTypes=0x0, numParams=0)
at postgres.c:596
#14 0x0822e1b9 in exec_simple_query (query_string=0x847939c "select
'111'::tsearch2.tsvector;") at postgres.c:899
#15 0x0822fde6 in PostgresMain (argc=4, argv=<value optimized out>,
username=0x841f508 "nikolay") at postgres.c:3529
#16 0x081ff938 in ServerLoop () at postmaster.c:3181
#17 0x08200656 in PostmasterMain (argc=3, argv=0x841b878) at postmaster.c:1027
#18 0x081b34c0 in main (argc=3, argv=0xbfae7a4c) at main.c:188

I don't think that it's abnormal, because segfault was caused by old
.so. Nothing wrong, right? But what we should worry about is the fact
that some users will follow the same way I did and will have this
segfault too... Maybe we should notice that one should remove old
tsearch.so during upgrade process (the case when one runs 8.2 and 8.3
on the same machine).

--
Best regards,
Nikolay


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Nikolay Samokhvalov" <nikolay(at)samokhvalov(dot)com>
Cc: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, andy <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 15:56:48
Message-ID: 22326.1192118208@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Nikolay Samokhvalov" <nikolay(at)samokhvalov(dot)com> writes:
> On 10/11/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Segfaults? That shouldn't happen. Please show a test case.

> Test case: use old tsearch2.so to register all tsearch2 functions to
> "tsearch2" schema (old fashioned way). Then try:

How did you get 8.3 to load the old .so at all? It should have the
wrong PG_MODULE_MAGIC values.

regards, tom lane


From: "Nikolay Samokhvalov" <nikolay(at)samokhvalov(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, andy <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 16:04:16
Message-ID: e431ff4c0710110904qe2c52f6ya5441b23ec8bc8d5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/11/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Nikolay Samokhvalov" <nikolay(at)samokhvalov(dot)com> writes:
> > On 10/11/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Segfaults? That shouldn't happen. Please show a test case.
>
> > Test case: use old tsearch2.so to register all tsearch2 functions to
> > "tsearch2" schema (old fashioned way). Then try:
>
> How did you get 8.3 to load the old .so at all? It should have the
> wrong PG_MODULE_MAGIC values.

Ooops, it's my mistake, sorry. It was CVS version of contrib/tsearch2.
So, users shouldn't encounter with this problem.

--
Best regards,
Nikolay


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 18:13:14
Message-ID: 470E67BA.7050900@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian G. Pflug wrote:
> I'm not really a tsearch user (just played with it a bit once). But I
> wondered if you are aware that you can prevent certain objects from
> being restored
> quite easiy if you use pg_dump and pg_restore together with "custom
> format" (-Fc). There is some option to pg_restore that reads the dump,
> and ouputs a table of contents. You can then remove some entries from
> that list, and pass the modified list to pg_restore which will skip
> entries that do not show up on your modified list.
>
> Maybe we could document some regexp, awk script, or similar that strips
> the tsearch stuff from such a table of contents?
>
> regards, Florian Pflug
>

This has worked well. I have a little sed script that deletes all the
tsearch2 looking stuff, then the restore works ok (except for the
trigger, which is ok).

However, am I safe not restoring all these operators?

1126; 2617 98028 OPERATOR public !! andy
1124; 2617 98024 OPERATOR public && andy
1112; 2617 98003 OPERATOR public < andy
1118; 2617 98017 OPERATOR public < andy
1113; 2617 98004 OPERATOR public <= andy
1119; 2617 98018 OPERATOR public <= andy
1117; 2617 98005 OPERATOR public <> andy
1123; 2617 98019 OPERATOR public <> andy
1128; 2617 98036 OPERATOR public <@ andy
1116; 2617 98006 OPERATOR public = andy
1122; 2617 98020 OPERATOR public = andy
1115; 2617 98001 OPERATOR public > andy
1121; 2617 98015 OPERATOR public > andy
1114; 2617 98002 OPERATOR public >= andy
1120; 2617 98016 OPERATOR public >= andy
1129; 2617 98039 OPERATOR public @ andy
1127; 2617 98037 OPERATOR public @> andy
1111; 2617 97955 OPERATOR public @@ andy
1110; 2617 97956 OPERATOR public @@ andy
1132; 2617 98055 OPERATOR public @@@ andy
1131; 2617 98056 OPERATOR public @@@ andy
1109; 2617 97941 OPERATOR public || andy
1125; 2617 98026 OPERATOR public || andy
1130; 2617 98038 OPERATOR public ~ andy

the operator = is not the 'normal =' is it? Its the 'tsearch2 =', right?

Do I need to worry about sed with window's users?

-Andy


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 18:40:02
Message-ID: 4005.1192128002@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

andy <andy(at)squeakycode(dot)net> writes:
> the operator = is not the 'normal =' is it? Its the 'tsearch2 =', right?

That one probably is, but how is your sed script going to distinguish it
from other user-defined '=' operators that might be in the dump?

> Do I need to worry about sed with window's users?

I think sed is available but not normally installed on Windows.
Unfortunately the same could be said of any other tool you might choose,
so that's probably not a reason not to use it.

regards, tom lane


From: andy <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 18:49:23
Message-ID: 470E7033.7040402@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> andy <andy(at)squeakycode(dot)net> writes:
>> the operator = is not the 'normal =' is it? Its the 'tsearch2 =', right?
>
> That one probably is, but how is your sed script going to distinguish it
> from other user-defined '=' operators that might be in the dump?
>
>> Do I need to worry about sed with window's users?
>
> I think sed is available but not normally installed on Windows.
> Unfortunately the same could be said of any other tool you might choose,
> so that's probably not a reason not to use it.
>
> regards, tom lane

Oh man... Ok, do you want to go as far as extracting just one operator,
and pulling out its PROCEDURE name?

For one of the ='s, I put just its line to the file x:

1122; 2617 98020 OPERATOR public = andy

Then ran:
andy(at)slacker:/pub/back$ pg_restore -Fc -L x vcstimes.bak
--
-- PostgreSQL database dump
--

SET client_encoding = 'LATIN1';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

SET search_path = public, pg_catalog;

--
-- Name: =; Type: OPERATOR; Schema: public; Owner: andy
--

CREATE OPERATOR = (
PROCEDURE = tsquery_eq,
LEFTARG = tsquery,
RIGHTARG = tsquery,
COMMUTATOR = =,
NEGATOR = <>,
MERGES,
RESTRICT = eqsel,
JOIN = eqjoinsel
);

ALTER OPERATOR public.= (tsquery, tsquery) OWNER TO andy;

--
-- PostgreSQL database dump complete
--

I could grep out the PROCEDURE line and see if it looks tsearch2'ish.
If you want to go that route, its starting to sound beyond sed, would
perl be ok?

-Andy


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: andy <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: full text search in 8.3
Date: 2007-10-11 18:59:13
Message-ID: 470E7281.3010606@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

andy wrote:
>
> Do I need to worry about sed with window's users?
>
>

yes.

Perl is probably more common in Windows, and should be able to do
everything sed can. It's also required for doing Windows/MSVC builds.

cheers

andrew