Lists: | pgsql-general |
---|
From: | Felipe de Jesús Molina Bravo <felipe(dot)molina(at)inegi(dot)gob(dot)mx> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Cc: | Felipe(dot)molina(at)inegi(dot)gob(dot)mx |
Subject: | Tsearch2 - spanish |
Date: | 2007-09-17 21:23:41 |
Message-ID: | 1190064221.6856.35.camel@fjmb |
Hi
I have installed postgresql-8.2.4 and tsearch2 with a Spanish dictionary.
My problem is:
prueba=# select to_tsvector('espanol','melón');
ERROR: Affix parse error at 506 line
And if I execute:
prueba=# select lexize('sp','melón');
lexize
---------
{melon}
(1 row)
I tried many dictionaries with the same results. I also changed the
codeset of the aff and dict files (from "latin1 to utf8" and "utf8 to
iso88591") and got the same error.
Where can I look to resolve this problem?
Line 506 of my dictionary has:
flag *J: # isimo
E > -E, ÍSIMO # grande grandísimo
E > -E, ÍSIMOS # grande grandísimos
E > -E, ÍSIMA # grande grandísima
E > -E, ÍSIMAS # grande grandísimas
O > -O, ÍSIMO # tonto tontísimo
O > -O, ÍSIMA # tonto tontísima
O > -O, ÍSIMOS # tonto tontísimos
O > -O, ÍSIMAS # tonto tontísimas
L > ÍSIMO # formal formalísimo
L > ÍSIMA # formal formalísima
L > ÍSIMOS # formal formalísimos
L > ÍSIMAS # formal formalísimas
If I remove the "Í" the error goes away, but the lexeme is incorrect.
I saw the post
http://archives.postgresql.org/pgsql-general/2007-07/msg00888.php
Maybe Marcelo has resolved the problem; can you tell me your
tsearch2 configuration?
Best regards
PS: I need to resolve this for my work.
From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Felipe de Jesús Molina Bravo <felipe(dot)molina(at)inegi(dot)gob(dot)mx> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Tsearch2 - spanish |
Date: | 2007-09-18 15:19:00 |
Message-ID: | 46EFEC64.1090207@sigaev.ru |
> prueba=# select to_tsvector('espanol','melón');
> ERROR: Affix parse error at 506 line
and
> prueba=# select lexize('sp','melón');
> lexize
> ---------
> {melon}
> (1 row)
That looks very strange. Can you provide the list of dictionaries and the configuration map?
> I tried many dictionaries with the same results. I also changed the
> codeset of the aff and dict files (from "latin1 to utf8" and "utf8 to
> iso88591") and got the same error.
>
> Where can I look to resolve this problem?
>
> Line 506 of my dictionary has:
Where did you get this file? And what are the encoding and locale settings of your db?
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
From: | Felipe de Jesús Molina Bravo <felipe(dot)molina(at)inegi(dot)gob(dot)mx> |
---|---|
To: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Tsearch2 - spanish |
Date: | 2007-09-18 19:47:15 |
Message-ID: | 1190144835.6821.55.camel@fjmb |
Hi
You are right, the output of "show lc_ctype;" is C.
Now I have:
prueba1=# show lc_ctype;
lc_ctype
-----------------
es_MX.ISO8859-1
(1 row)
after running
% initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1
(as you suggested)
then "createdb -E iso8859-1 prueba1", and finally installing tsearch2.
The original problem is resolved:
prueba1=# select to_tsvector('espanol','melón');
to_tsvector
-------------
'melón':1
(1 row)
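Why the new locale makes a difference, as a rough illustration: under the C locale only ASCII letters count as alphabetic, so the Latin-1 byte for ó does not. The sketch below is a Python analogy, not PostgreSQL or tsearch2 code. It assumes that `bytes.isalpha()`, which is ASCII-only, is a fair stand-in for the C-locale check; the helper name `non_alpha_bytes` is made up for this example.

```python
# Analogy only: bytes.isalpha() knows ASCII letters and nothing else,
# much like character classification under the C locale.
word = "melón"
latin1 = word.encode("latin-1")  # the accented letter becomes one byte, 0xF3

ascii_only_view = latin1.isalpha()  # byte-wise, ASCII-only check
unicode_view = word.isalpha()       # Unicode-aware check on the text itself


def non_alpha_bytes(data: bytes) -> list:
    """Return the byte values that the ASCII-only check rejects."""
    return [b for b in data if not bytes([b]).isalpha()]


print(ascii_only_view, unicode_view, non_alpha_bytes(latin1))
```

The byte-wise check rejects the word because of a single byte, which matches the diagnosis that the ispell affix file cannot be parsed when accented characters are not treated as letters.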
But if I change the sentence to this:
prueba1=# select to_tsvector('espanol','melón perro mordelón');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
The connection is lost, but the server is still up. Any idea?
The position of the synonym dictionary is intentional.
Thanks in advance.
On Tue, 18-09-2007 at 21:40 +0400, Teodor Sigaev wrote:
> > LC_CTYPE="POSIX"
>
>
> Please send the output of the "show lc_ctype;" command. If it's the C locale then I can identify
> the problem: a character with a diacritical mark (such as ó) is not an alpha character there, and
> the ispell dictionary will fail. To fix that you should run initdb with these options:
> % initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1
> or
> % initdb -D /YOUR/PATH -E UTF8 --locale es_ES.UTF8
>
> In the latter case you should also recode all of the dictionary's data files to UTF-8.
>
> >>> prueba=# select to_tsvector('espanol','melón');
> >>> ERROR: Affix parse error at 506 line
> >> and
> >>> prueba=# select lexize('sp','melón');
> >>> lexize
> >>> ---------
> >>> {melon}
> >>> (1 row)
> sp is a Snowball stemmer; it doesn't require an affix file, so it works.
>
> By the way, why is the synonym dictionary placed after ispell? Is it intentional?
> Usually the synonym dictionary goes first, then ispell, and after all of them snowball.
>
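The remark about dictionary order can be sketched as follows. This is a toy model in Python, assuming that dictionaries are consulted in order and the first one that recognizes a token wins. The word lists, the synonym entry and the function names (`synonym`, `ispell_like`, `stemmer_like`, `lexize_chain`) are all invented for illustration and are not tsearch2 APIs.

```python
from typing import Callable, List, Optional

Dictionary = Callable[[str], Optional[List[str]]]

SYNONYMS = {"can": "perro"}          # hypothetical synonym entry
KNOWN = {"can", "perro", "melón"}    # hypothetical word list standing in for ispell


def synonym(token: str) -> Optional[List[str]]:
    """Return the replacement word, or None if the token is not listed."""
    return [SYNONYMS[token]] if token in SYNONYMS else None


def ispell_like(token: str) -> Optional[List[str]]:
    """Recognize only words from the known list."""
    return [token] if token in KNOWN else None


def stemmer_like(token: str) -> Optional[List[str]]:
    """Crude stand-in for a stemmer: accepts every token, so it belongs last."""
    return [token.rstrip("s")]


def lexize_chain(token: str, chain: List[Dictionary]) -> Optional[List[str]]:
    """Try each dictionary in order; the first non-None answer wins."""
    for dictionary in chain:
        result = dictionary(token)
        if result is not None:
            return result
    return None


usual = [synonym, ispell_like, stemmer_like]
synonym_after_ispell = [ispell_like, synonym, stemmer_like]

print(lexize_chain("can", usual))
print(lexize_chain("can", synonym_after_ispell))
print(lexize_chain("gatos", usual))
```

In this model, placing the synonym dictionary after ispell means a word that ispell already knows never reaches the synonym table, which is why the order is worth a question.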
From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Felipe de Jesús Molina Bravo <felipe(dot)molina(at)inegi(dot)gob(dot)mx> |
Cc: | PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Tsearch2 - spanish |
Date: | 2007-09-19 16:30:42 |
Message-ID: | 46F14EB2.2060506@sigaev.ru |
> prueba1=# select to_tsvector('espanol','melón perro mordelón');
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
Hmm, can you provide a backtrace?
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
From: | marcelo Cortez <jmdc_marcelo(at)yahoo(dot)com(dot)ar> |
---|---|
To: | Felipe de Jesús Molina Bravo <felipe(dot)molina(at)inegi(dot)gob(dot)mx>, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Tsearch2 - spanish |
Date: | 2007-09-20 12:13:18 |
Message-ID: | 694124.69149.qm@web32110.mail.mud.yahoo.com |
Felipe
--- Felipe de Jesús Molina Bravo
<felipe(dot)molina(at)inegi(dot)gob(dot)mx> wrote:
> Hi
>
> You are right, the output of "show lc_ctype;" is C.
>
> Now I have:
>
> prueba1=# show lc_ctype;
> lc_ctype
> -----------------
> es_MX.ISO8859-1
> (1 row)
>
> after running
>
> % initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1
>
> (as you suggested)
>
> then "createdb -E iso8859-1 prueba1", and finally installing tsearch2.
>
> The original problem is resolved:
>
> prueba1=# select to_tsvector('espanol','melón');
> to_tsvector
> -------------
> 'melón':1
> (1 row)
>
>
> But if I change the sentence to this:
>
> prueba1=# select to_tsvector('espanol','melón perro mordelón');
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
The same thing happened to me the first time I used Tsearch2 with Spanish.
I think you need to patch snowball with the tsearch_snowball_82 file;
googling, you will find instructions on how to do it.
Best regards
mdc
From: | "MOLINA BRAVO FELIPE DE JESUS" <felipe(dot)molina(at)inegi(dot)gob(dot)mx> |
---|---|
To: | "marcelo Cortez" <jmdc_marcelo(at)yahoo(dot)com(dot)ar>, "Teodor Sigaev" <teodor(at)sigaev(dot)ru> |
Cc: | "PostgreSQL General" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Tsearch2 - spanish |
Date: | 2007-09-20 16:51:25 |
Message-ID: | 5CE6C20D880B514E88D5A05E9949E15628F44A@CORREOAGS03.inegi.gob.mx |
Hi
Thanks, Teodor and Marcelo.
The problem is solved.
Regards
From: | "madhtr" <madhtr(at)schif(dot)org> |
---|---|
To: | "PostgreSQL General" <pgsql-general(at)postgresql(dot)org> |
Subject: | How to clear bits? |
Date: | 2007-09-20 17:01:47 |
Message-ID: | 009001c7fba7$ee8544d0$7b55503f@useronewin2klt |
Hello group :)
How do I clear bits in a number in PostgreSQL?
In C++ it's:
0xffffff00 &~ 0x0000ffff
What is it in PostgreSQL, from the psql command-line app?
select ...
Thanks :)
From: | "madhtr" <madhtr(at)schif(dot)org> |
---|---|
To: | "madhtr" <madhtr(at)schif(dot)org>, "PostgreSQL General" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: How to clear bits? |
Date: | 2007-09-20 18:10:00 |
Message-ID: | 00d101c7fbb1$76553fb0$7b55503f@useronewin2klt |
Never mind, I figured it out ...
fails:
0xffffff00 &~ 0x0000ffff
succeeds:
0xffffff00 & ~ 0x0000ffff
I had to add a space.
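A likely explanation, offered as an assumption rather than something confirmed in this thread: PostgreSQL's lexer reads `&~` as one operator name, and no such operator exists for integers, so the space is needed to get `&` followed by unary `~`. The arithmetic itself can be checked outside the database. The Python sketch below uses a hypothetical helper, `clear_bits`, and masks to 32 bits because Python integers are unbounded.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as a C++ unsigned int would


def clear_bits(value: int, bits: int) -> int:
    """Clear in `value` every bit that is set in `bits`."""
    return value & ~bits & MASK32


result = clear_bits(0xFFFFFF00, 0x0000FFFF)
print(hex(result))
```

AND with the complement of the mask keeps only the bits of the value that the mask does not cover, so here the low 16 bits are cleared and the high 16 bits survive.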
----- Original Message -----
From: "madhtr" <madhtr(at)schif(dot)org>
To: "PostgreSQL General" <pgsql-general(at)postgresql(dot)org>
Sent: Thursday, September 20, 2007 13:01
Subject: [GENERAL] How to clear bits?
> Hello group :)
>
> How do I clear bits in a number in PostgreSQL?
>
> In C++ it's:
>
> 0xffffff00 &~ 0x0000ffff
>
> What is it in PostgreSQL, from the psql command-line app?
>
> select ...
>
> Thanks :)