Re: [GENERAL] Number format problem

Lists: pgsql-generalpgsql-patches
From: Stéphane SCHILDKNECHT <sas(at)atosc(dot)org>
To: postgresql-general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Number format problem
Date: 2006-02-03 15:36:23
Message-ID: 43E37877.8050801@atosc.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Hi,

There seems to be some tricky behaviour with number formating and french
locale.

I tried the following request:
select to_char(1485.12, '9G999D99');

I was expecting to get: 1 485,12

But, surprinsingly, I got 1,1485,12.

My postgresql server is an 8.1.2 version. The same problem occurs under
Ubuntu Breezy and Debian Testing.
My current configuration is
LC_NUMERIC=fr_FR(at)euro
client_encoding=LATIN9
server_encoding=LATIN9

I tried to reconfigure locales and restart the server, but I can't get
the result I expect.

I really don't know what else I could do.

Sincerely,

--
Stéphane SCHILDKNECHT
Président de PostgreSQLFr
http://www.postgresqlfr.org


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Cc: Stéphane SCHILDKNECHT <sas(at)atosc(dot)org>
Subject: Re: Number format problem
Date: 2006-02-04 01:04:00
Message-ID: 200602040204.00676.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Stéphane SCHILDKNECHT wrote:
> select to_char(1485.12, '9G999D99');

> But, surprinsingly, I got 1,1485,12.

The fr_FR locale is broken. You should report this to glibc.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Number format problem
Date: 2006-02-06 11:15:52
Message-ID: 20060206121927.827211@crocodudule.admin.clubint.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Peter Eisentraut wrote:

> Stéphane SCHILDKNECHT wrote:
> > select to_char(1485.12, '9G999D99');
>
> > But, surprinsingly, I got 1,1485,12.
>
> The fr_FR locale is broken. You should report this to glibc.

On my debian sarge with LC_NUMERIC set to fr_FR(at)euro, a
printf("%'g\n", 1485.12);
produces 1485,12 with which seems to be correct given that the
'thousands_sep' locale entry is set to "" (empty string) and
'decimal_point' to U002C

On the other hand, what postgres produces is:
test=> set lc_numeric='fr_FR(at)euro';
SET
test=> select to_char(1485.12, '9G999D99');
to_char
-----------
1,485,12
(1 row)

which is wrong with regard to thousands_sep="".

In fact, grep'ing the source code reveals that, when 'thousands_sep' is set to
an empty string, it gets ignored and a comma is used instead.
I'm referring to backend/utils/adt/formatting.c, NUM_prepare_locale() in 8.1.2:

/*
* Number thousands separator
*/
if (lconv->thousands_sep && *lconv->thousands_sep)
Np->L_thousands_sep = lconv->thousands_sep;
else
Np->L_thousands_sep = ",";

What's wrong with lconv->thousands_sep being set to an empty string?
Shouldn't it be used nonetheless?

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [GENERAL] Number format problem
Date: 2006-02-12 19:51:22
Message-ID: 200602121951.k1CJpMB09061@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches


Interesting. We should be able to handle a "" thousands separator. In
fact, psql does, but to_char() does not.

Looking at to_char(), it seems that we should be able to handle "" for
thousands and plus sign.

I have applied the following patch for 8.2. I have not applied to 8.1.X
because it is a behavior change.

Thanks for the report.

---------------------------------------------------------------------------

Daniel Verite wrote:
> Peter Eisentraut wrote:
>
> > St?phane SCHILDKNECHT wrote:
> > > select to_char(1485.12, '9G999D99');
> >
> > > But, surprinsingly, I got 1,1485,12.
> >
> > The fr_FR locale is broken. You should report this to glibc.
>
> On my debian sarge with LC_NUMERIC set to fr_FR(at)euro, a
> printf("%'g\n", 1485.12);
> produces 1485,12 with which seems to be correct given that the
> 'thousands_sep' locale entry is set to "" (empty string) and
> 'decimal_point' to U002C
>
> On the other hand, what postgres produces is:
> test=> set lc_numeric='fr_FR(at)euro';
> SET
> test=> select to_char(1485.12, '9G999D99');
> to_char
> -----------
> 1,485,12
> (1 row)
>
> which is wrong with regard to thousands_sep="".
>
> In fact, grep'ing the source code reveals that, when 'thousands_sep' is set to
> an empty string, it gets ignored and a comma is used instead.
> I'm referring to backend/utils/adt/formatting.c, NUM_prepare_locale() in 8.1.2:
>
> /*
> * Number thousands separator
> */
> if (lconv->thousands_sep && *lconv->thousands_sep)
> Np->L_thousands_sep = lconv->thousands_sep;
> else
> Np->L_thousands_sep = ",";
>
> What's wrong with lconv->thousands_sep being set to an empty string?
> Shouldn't it be used nonetheless?
>
> --
> Daniel
> PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Attachment Content-Type Size
unknown_filename text/plain 1.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [GENERAL] Number format problem
Date: 2006-02-12 23:24:52
Message-ID: 25558.1139786692@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> I have applied the following patch for 8.2. I have not applied to 8.1.X
> because it is a behavior change.

This patch broke the regression tests. It seems to change the C-locale
behavior on many platforms.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [GENERAL] Number format problem
Date: 2006-02-12 23:45:12
Message-ID: 200602122345.k1CNjCD12720@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > I have applied the following patch for 8.2. I have not applied to 8.1.X
> > because it is a behavior change.
>
> This patch broke the regression tests. It seems to change the C-locale
> behavior on many platforms.

Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
disinguish "" as "don't have" from "" as "don't want". Will revert.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [GENERAL] Number format problem
Date: 2006-02-12 23:51:00
Message-ID: 25834.1139788260@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Tom Lane wrote:
>> This patch broke the regression tests. It seems to change the C-locale
>> behavior on many platforms.

> Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
> disinguish "" as "don't have" from "" as "don't want". Will revert.

Possibly we could special-case C locale, and allow an empty string only
when not C locale. However, it could be that there are other locales
where the change would surprise people, too :-(

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [GENERAL] Number format problem
Date: 2006-02-12 23:56:15
Message-ID: 200602122356.k1CNuFj14534@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> This patch broke the regression tests. It seems to change the C-locale
> >> behavior on many platforms.
>
> > Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
> > disinguish "" as "don't have" from "" as "don't want". Will revert.
>
> Possibly we could special-case C locale, and allow an empty string only
> when not C locale. However, it could be that there are other locales
> where the change would surprise people, too :-(

Not sure. Peter stated the French locale was wrong for the Debian they
were using, and he should know, so I am going to wait for more feeback.

One thing that has me confused is that psql/print.c doesn't check for a
zero-length string for thousands_sep in setDecimalLocale, so I am not
thinking that is wrong and should skip "" like we do in formatting.c.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "PostgreSQL-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [GENERAL] Number format problem
Date: 2006-02-13 02:26:17
Message-ID: 20060213032606.2165832@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Bruce Momjian wrote:

> > > Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
> > > disinguish "" as "don't have" from "" as "don't want". Will revert.
> >
> > Possibly we could special-case C locale, and allow an empty string only
> > when not C locale. However, it could be that there are other locales
> > where the change would surprise people, too :-(
>
> Not sure. Peter stated the French locale was wrong for the Debian they
> were using, and he should know, so I am going to wait for more feeback.

The baseline glibc sources from gnu.org have
(in glibc-2.3.6/localedata/locales/fr_FR):

LC_NUMERIC
decimal_point "<U002C>"
thousands_sep ""
grouping 0;0
END LC_NUMERIC

So at first sight it's more a glibc issue rather than debian-specific.

However, this was filed as a bug in debian long ago. Incidentally, it got fixed
just this week, meaning that the future debian-glibc should use a space instead
of the empty string, as reported here:
http://lists.debian.org/debian-glibc/2006/02/msg00168.html
Which is consistent with Peter's statement that it was wrong up to now, and also
with what the OP expected in the first place.

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [GENERAL] Number format problem
Date: 2006-02-13 02:58:50
Message-ID: 200602130258.k1D2wo209084@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-patches

Daniel Verite wrote:
> Bruce Momjian wrote:
>
> > > > Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
> > > > disinguish "" as "don't have" from "" as "don't want". Will revert.
> > >
> > > Possibly we could special-case C locale, and allow an empty string only
> > > when not C locale. However, it could be that there are other locales
> > > where the change would surprise people, too :-(
> >
> > Not sure. Peter stated the French locale was wrong for the Debian they
> > were using, and he should know, so I am going to wait for more feeback.
>
> The baseline glibc sources from gnu.org have
> (in glibc-2.3.6/localedata/locales/fr_FR):
>
> LC_NUMERIC
> decimal_point "<U002C>"
> thousands_sep ""
> grouping 0;0
> END LC_NUMERIC
>
> So at first sight it's more a glibc issue rather than debian-specific.
>
> However, this was filed as a bug in debian long ago. Incidentally, it got fixed
> just this week, meaning that the future debian-glibc should use a space instead
> of the empty string, as reported here:
> http://lists.debian.org/debian-glibc/2006/02/msg00168.html
> Which is consistent with Peter's statement that it was wrong up to now, and also
> with what the OP expected in the first place.

Oh, good. I looked and now realize that psql is already checking for a
non-"" string:

if (*extlconv->thousands_sep)

so we are consistent.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073