Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

Lists: pgsql-committerspgsql-hackers
From: mha(at)postgresql(dot)org (Magnus Hagander)
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-05-27 12:24:42
Message-ID: 20080527122442.A996C754C54@cvs.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Log Message:
-----------
Explicitly bind gettext() to the UTF8 locale when in use.
This is required on Windows due to the special locale
handling for UTF8 that doesn't change the full environment.

Fixes crash with translated error messages per bugs 4180
and 4196.

Tom Lane

Modified Files:
--------------
pgsql/src/backend/utils/mb:
mbutils.c (r1.70 -> r1.71)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/mb/mbutils.c?r1=1.70&r2=1.71)


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: mha(at)postgresql(dot)org
Cc: pgsql-committers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-24 01:34:17
Message-ID: 492A0499.1010601@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hi Magnus and all,

Magnus Hagander wrote:
> Log Message:
> -----------
> Explicitly bind gettext() to the UTF8 locale when in use.
> This is required on Windows due to the special locale
> handling for UTF8 that doesn't change the full environment.

Thanks to this change UTF-8 case was solved but Japanese users
are still unhappy with Windows databases with EUC_JP encoding.
Shift_JIS which is a Japanese encoding under Windows doesn't
match any server encoding and causes a crash with the use of
gettext. So Saito-san removed ja message catalog just before
the 8.3 release.

Attached is a simple patch to avoid the crash and enable the
use of Japanese message catalog.
Please apply the patch if there's no problem.

regards,
Hiroshi Inoue

> Fixes crash with translated error messages per bugs 4180
> and 4196.
>
> Tom Lane
>
> Modified Files:
> --------------
> pgsql/src/backend/utils/mb:
> mbutils.c (r1.70 -> r1.71)
> (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/mb/mbutils.c?r1=1.70&r2=1.71)

Attachment Content-Type Size
mbutils.patch text/plain 904 bytes

From: "Hiroshi Saito" <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
To: "Hiroshi Inoue" <inoue(at)tpf(dot)co(dot)jp>, <mha(at)postgresql(dot)org>
Cc: <pgsql-committers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-24 07:17:40
Message-ID: E4D091955F7743BF900606C70E5CABA2@HIRO57887DE653
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hi.

It was comfortable in my environment. Then, as for me, to be contained
in the next release is desire.

BTW, I was wondering that it did not look at a problem in any countries
other than Japan. Probably, the environment with the runs locale which
can't be specified to be SERVER_ENCODING will be required.

Regards,
Hiroshi Saito

----- Original Message -----
From: "Hiroshi Inoue" <inoue(at)tpf(dot)co(dot)jp>

> Hi Magnus and all,
>
> Magnus Hagander wrote:
>> Log Message:
>> -----------
>> Explicitly bind gettext() to the UTF8 locale when in use.
>> This is required on Windows due to the special locale
>> handling for UTF8 that doesn't change the full environment.
>
> Thanks to this change UTF-8 case was solved but Japanese users
> are still unhappy with Windows databases with EUC_JP encoding.
> Shift_JIS which is a Japanese encoding under Windows doesn't
> match any server encoding and causes a crash with the use of
> gettext. So Saito-san removed ja message catalog just before
> the 8.3 release.
>
> Attached is a simple patch to avoid the crash and enable the
> use of Japanese message catalog.
> Please apply the patch if there's no problem.
>
> regards,
> Hiroshi Inoue
>
>> Fixes crash with translated error messages per bugs 4180
>> and 4196.
>>
>> Tom Lane
>>
>> Modified Files:
>> --------------
>> pgsql/src/backend/utils/mb:
>> mbutils.c (r1.70 -> r1.71)
>>
>> (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/mb/mbutils.c?r1=1.70&r2=1.71)
>
>
>

--------------------------------------------------------------------------------

> *** mbutils.c.orig Sun Nov 23 08:42:57 2008
> --- mbutils.c Sun Nov 23 13:52:41 2008
> ***************
> *** 825,830 ****
> --- 825,832 ----
> void
> SetDatabaseEncoding(int encoding)
> {
> + const char *target_codeset = NULL;
> +
> if (!PG_VALID_BE_ENCODING(encoding))
> elog(ERROR, "invalid database encoding: %d", encoding);
>
> ***************
> *** 846,852 ****
> */
> #ifdef ENABLE_NLS
> if (encoding == PG_UTF8)
> ! if (bind_textdomain_codeset("postgres", "UTF-8") == NULL)
> elog(LOG, "bind_textdomain_codeset failed");
> #endif
> }
> --- 848,860 ----
> */
> #ifdef ENABLE_NLS
> if (encoding == PG_UTF8)
> ! target_codeset = "UTF-8";
> ! #ifdef WIN32
> ! else if (encoding == PG_EUC_JP)
> ! target_codeset = "EUC_JP";
> ! #endif /* WIN32 */
> ! if (NULL != target_codeset)
> ! if (bind_textdomain_codeset("postgres", target_codeset) == NULL)
> elog(LOG, "bind_textdomain_codeset failed");
> #endif
> }
>

--------------------------------------------------------------------------------

>
> --
> Sent via pgsql-committers mailing list (pgsql-committers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-committers
>


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-24 12:28:01
Message-ID: 492A9DD1.6070502@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue wrote:
> Hi Magnus and all,
>
> Magnus Hagander wrote:
>> Log Message:
>> -----------
>> Explicitly bind gettext() to the UTF8 locale when in use.
>> This is required on Windows due to the special locale
>> handling for UTF8 that doesn't change the full environment.
>
> Thanks to this change UTF-8 case was solved but Japanese users
> are still unhappy with Windows databases with EUC_JP encoding.
> Shift_JIS which is a Japanese encoding under Windows doesn't
> match any server encoding and causes a crash with the use of
> gettext. So Saito-san removed ja message catalog just before
> the 8.3 release.
>
> Attached is a simple patch to avoid the crash and enable the
> use of Japanese message catalog.
> Please apply the patch if there's no problem.

Hi!

It will clearly also need an update to the comment, but I can take care
of that.

I assume you have tested this? The comment says that it works because we
are handling UTF8 on a special way on Windows, but AFAIK we *don't*
handle EUC_JP in a special way there?

If your database is in EUC_JP, I don't see why gettext() isn't picking
it up properly in the first place..

And why do we need that on Windows only, and not on other platforms?

//Magnus


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-24 13:57:26
Message-ID: 492AB2C6.9090709@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Magnus Hagander wrote:
> Hiroshi Inoue wrote:
>> Hi Magnus and all,
>>
>> Magnus Hagander wrote:
>>> Log Message:
>>> -----------
>>> Explicitly bind gettext() to the UTF8 locale when in use.
>>> This is required on Windows due to the special locale
>>> handling for UTF8 that doesn't change the full environment.
>> Thanks to this change UTF-8 case was solved but Japanese users
>> are still unhappy with Windows databases with EUC_JP encoding.
>> Shift_JIS which is a Japanese encoding under Windows doesn't
>> match any server encoding and causes a crash with the use of
>> gettext. So Saito-san removed ja message catalog just before
>> the 8.3 release.
>>
>> Attached is a simple patch to avoid the crash and enable the
>> use of Japanese message catalog.
>> Please apply the patch if there's no problem.
>
> Hi!
>
> It will clearly also need an update to the comment, but I can take care
> of that.
>
> I assume you have tested this?

Though I myself didn't test it, Saito-san tested it.

> The comment says that it works because we
> are handling UTF8 on a special way on Windows,

ISTM UTF-8 isn't a special case.
In fact the comment also mentions the following.

* In future we might want to call bind_textdomain_codeset
* unconditionally, but that....

> If your database is in EUC_JP, I don't see why gettext() isn't picking
> it up properly in the first place..

In Japan 2 encodings (EUC_JP and Shift_JIS) are often used.
EUC_JP is mainly used on *nix and on Windows Shift_JIS is
used. We use EUC_JP not Shift_JIS as the server encoding

> And why do we need that on Windows only, and not on other platforms?

because Shift_JIS isn't allowed as a server encoding. So
the Japanese Windows native message encoding Shift_JIS never
matches the server encoding EUC_JP and a conversion between
Shitt_jis and EUC_JP is necessarily needed.

regards,
Hiroshi Inoue


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-24 14:09:48
Message-ID: 492AB5AC.7060709@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue wrote:
> Magnus Hagander wrote:
>> Hiroshi Inoue wrote:
>>> Hi Magnus and all,
>>>
>>> Magnus Hagander wrote:
>>>> Log Message:
>>>> -----------
>>>> Explicitly bind gettext() to the UTF8 locale when in use.
>>>> This is required on Windows due to the special locale
>>>> handling for UTF8 that doesn't change the full environment.
>>> Thanks to this change UTF-8 case was solved but Japanese users
>>> are still unhappy with Windows databases with EUC_JP encoding.
>>> Shift_JIS which is a Japanese encoding under Windows doesn't
>>> match any server encoding and causes a crash with the use of
>>> gettext. So Saito-san removed ja message catalog just before
>>> the 8.3 release.
>>>
>>> Attached is a simple patch to avoid the crash and enable the
>>> use of Japanese message catalog.
>>> Please apply the patch if there's no problem.
>> Hi!
>>
>> It will clearly also need an update to the comment, but I can take care
>> of that.
>>
>> I assume you have tested this?
>
> Though I myself didn't test it, Saito-san tested it.

Ok, good.

>> The comment says that it works because we
>> are handling UTF8 on a special way on Windows,
>
> ISTM UTF-8 isn't a special case.
> In fact the comment also mentions the following.
>
> * In future we might want to call bind_textdomain_codeset
> * unconditionally, but that....

I think that's partially unrelated. UTF8 is special in that the
environment that the backend runs in is different from SERVER_ENCODING
in this case. For other encodings, we have setlocale():d to the same
encoding.

>> If your database is in EUC_JP, I don't see why gettext() isn't picking
>> it up properly in the first place..
>
> In Japan 2 encodings (EUC_JP and Shift_JIS) are often used.
> EUC_JP is mainly used on *nix and on Windows Shift_JIS is
> used. We use EUC_JP not Shift_JIS as the server encoding
>
>> And why do we need that on Windows only, and not on other platforms?
>
> because Shift_JIS isn't allowed as a server encoding. So
> the Japanese Windows native message encoding Shift_JIS never
> matches the server encoding EUC_JP and a conversion between
> Shitt_jis and EUC_JP is necessarily needed.

Ah, so we're basically hardcoding that information? The system will go
up in SJIS, but since we can't deal with it, we switch it to EUC_JP?

Ok, I think I understand. I've made some minor stylistic changes (we
don't normally use if (NULL != <whatever>) in the pg sources), and will
apply with those.

This is for HEAD only, correct? Or is it something we should backpatch?

//Magnus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-24 16:26:04
Message-ID: 2830.1227543964@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> Hiroshi Inoue wrote:
>> because Shift_JIS isn't allowed as a server encoding. So
>> the Japanese Windows native message encoding Shift_JIS never
>> matches the server encoding EUC_JP and a conversion between
>> Shitt_jis and EUC_JP is necessarily needed.

> Ah, so we're basically hardcoding that information? The system will go
> up in SJIS, but since we can't deal with it, we switch it to EUC_JP?

I'm not following this either. If the patch is really necessary then it
seems it must be working around a bug in the Windows version of gettext,
ie failure to distinguish CP932 from CP20932. Is that correct?

> Ok, I think I understand. I've made some minor stylistic changes (we
> don't normally use if (NULL != <whatever>) in the pg sources), and will
> apply with those.

It definitely needs a comment explaining why this is needed.

regards, tom lane


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-25 03:00:37
Message-ID: 492B6A55.6010308@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Tom Lane wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> Hiroshi Inoue wrote:
>>> because Shift_JIS isn't allowed as a server encoding. So
>>> the Japanese Windows native message encoding Shift_JIS never
>>> matches the server encoding EUC_JP and a conversion between
>>> Shitt_jis and EUC_JP is necessarily needed.
>
>> Ah, so we're basically hardcoding that information? The system will go
>> up in SJIS, but since we can't deal with it, we switch it to EUC_JP?
>
> I'm not following this either. If the patch is really necessary then it
> seems it must be working around a bug in the Windows version of gettext,
> ie failure to distinguish CP932 from CP20932. Is that correct?

I'm afraid I don't understand what you mean exactly.
AFAIK the output encoding of Windows gettext is detemined by the
ANSI system code page which is usualy CP932(Shift_JIS) in Japan and
unrelated to the locale settings.
In addition CP20932 is rarely used in Japan IIRC. I've never used it
and don't know what it is correctly (maybe a kind of EUC_JP).

regards,
Hiroshi Inoue


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-25 03:11:19
Message-ID: 9559.1227582679@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
> Tom Lane wrote:
>> I'm not following this either. If the patch is really necessary then it
>> seems it must be working around a bug in the Windows version of gettext,
>> ie failure to distinguish CP932 from CP20932. Is that correct?

> I'm afraid I don't understand what you mean exactly.
> AFAIK the output encoding of Windows gettext is detemined by the
> ANSI system code page which is usualy CP932(Shift_JIS) in Japan and
> unrelated to the locale settings.

If that's true then this code is presently broken for *every* locale
under Windows, not only Japanese.

To my mind the really correct thing to be doing here would be to call
bind_textdomain_codeset in all cases, rather than trusting gettext to
guess correctly about which encoding we want. As the comment notes,
we have not attempted that because the codeset names aren't well
standardized. But it seems to me that we could certainly find out what
codeset names are used on Windows, and apply bind_textdomain_codeset
all the time on Windows. That would make a lot more sense than ad-hoc
treatment of UTF-8 and EUC-JP if you ask me ...

regards, tom lane


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-25 03:55:11
Message-ID: 492B771F.5090008@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Tom Lane wrote:
> Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
>> Tom Lane wrote:
>>> I'm not following this either. If the patch is really necessary then it
>>> seems it must be working around a bug in the Windows version of gettext,
>>> ie failure to distinguish CP932 from CP20932. Is that correct?
>
>> I'm afraid I don't understand what you mean exactly.
>> AFAIK the output encoding of Windows gettext is detemined by the
>> ANSI system code page which is usualy CP932(Shift_JIS) in Japan and
>> unrelated to the locale settings.
>
> If that's true then this code is presently broken for *every* locale
> under Windows, not only Japanese.

Maybe there are a few languages/countires where 2 encodings are
widely used.

> To my mind the really correct thing to be doing here would be to call
> bind_textdomain_codeset in all cases, rather than trusting gettext to
> guess correctly about which encoding we want. As the comment notes,
> we have not attempted that because the codeset names aren't well
> standardized. But it seems to me that we could certainly find out what
> codeset names are used on Windows, and apply bind_textdomain_codeset
> all the time on Windows. That would make a lot more sense than ad-hoc
> treatment of UTF-8 and EUC-JP if you ask me ...

I fundamentally agree with you.
What we hope is to enable the use of Japanese message catalog which
we gave up in 8.3 Windows-version release.

regards,
Hiroshi Inoue


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-25 04:00:18
Message-ID: 10005.1227585618@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
> Tom Lane wrote:
>> If that's true then this code is presently broken for *every* locale
>> under Windows, not only Japanese.

> Maybe there are a few languages/countires where 2 encodings are
> widely used.

UTF8 vs Latin-N? In any case I think the problem is that gettext is
looking at a setting that is not what we are looking at. Particularly
with the 8.4 changes to allow per-database locale settings, this has
got to be fixed in a bulletproof way.

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-25 06:00:02
Message-ID: 6CDA7879-9656-44F8-A8CC-D1B2B4240145@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On 25 nov 2008, at 05.00, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
>> Tom Lane wrote:
>>> If that's true then this code is presently broken for *every* locale
>>> under Windows, not only Japanese.
>
>> Maybe there are a few languages/countires where 2 encodings are
>> widely used.
>
> UTF8 vs Latin-N?

We already special-cases utf8...

I think the thing us that as long as the encodings are compatible
(latin1 with different names for example) it worked fine.

> In any case I think the problem is that gettext is
> looking at a setting that is not what we are looking at. Particularly
> with the 8.4 changes to allow per-database locale settings, this has
> got to be fixed in a bulletproof way.
>

Agreed.

/Magnus


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-11-27 14:26:01
Message-ID: 492EADF9.20804@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Magnus Hagander wrote:
> On 25 nov 2008, at 05.00, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
>>> Tom Lane wrote:
>>>> If that's true then this code is presently broken for *every* locale
>>>> under Windows, not only Japanese.
>>
>>> Maybe there are a few languages/countires where 2 encodings are
>>> widely used.
>>
>> UTF8 vs Latin-N?
>
> We already special-cases utf8...
>
> I think the thing us that as long as the encodings are compatible
> (latin1 with different names for example) it worked fine.
>
>> In any case I think the problem is that gettext is
>> looking at a setting that is not what we are looking at. Particularly
>> with the 8.4 changes to allow per-database locale settings, this has
>> got to be fixed in a bulletproof way.

Attached is a new patch to apply bind_textdomain_codeset() to most
server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
and PG_EUC_JIS_2004. "EUC-JP" may be OK for EUC_JIS_2004.

Unfortunately it's hard for Saito-san and me to check encodings
other than EUC-JP.

regards,
Hiroshi Inoue

Attachment Content-Type Size
mbutils.patch text/plain 2.1 KB

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-12-03 10:05:21
Message-ID: 493659E1.2080207@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue wrote:
>> I think the thing us that as long as the encodings are compatible
>> (latin1 with different names for example) it worked fine.
>>
>>> In any case I think the problem is that gettext is
>>> looking at a setting that is not what we are looking at. Particularly
>>> with the 8.4 changes to allow per-database locale settings, this has
>>> got to be fixed in a bulletproof way.
>
> Attached is a new patch to apply bind_textdomain_codeset() to most
> server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
> and PG_EUC_JIS_2004. "EUC-JP" may be OK for EUC_JIS_2004.
>
> Unfortunately it's hard for Saito-san and me to check encodings
> other than EUC-JP.

In principle this looks good, I think, but I'm a bit worried around the
lack of testing. I can do some testing under LATIN1 which is what we use
in Sweden (just need to get gettext working *at all* in my dev
environment again - I've somehow managed to break it), and perhaps we
can find someone to do a test in an eastern-european locale to get some
more datapoints?

Can you outline the steps one needs to go through to show the problem,
so we can confirm it's fixed in these locales?

//Magnus


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-12-04 01:35:42
Message-ID: 493733EE.7000503@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Magnus Hagander wrote:
> Hiroshi Inoue wrote:
>>> I think the thing us that as long as the encodings are compatible
>>> (latin1 with different names for example) it worked fine.
>>>
>>>> In any case I think the problem is that gettext is
>>>> looking at a setting that is not what we are looking at. Particularly
>>>> with the 8.4 changes to allow per-database locale settings, this has
>>>> got to be fixed in a bulletproof way.
>> Attached is a new patch to apply bind_textdomain_codeset() to most
>> server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
>> and PG_EUC_JIS_2004. "EUC-JP" may be OK for EUC_JIS_2004.
>>
>> Unfortunately it's hard for Saito-san and me to check encodings
>> other than EUC-JP.
>
> In principle this looks good, I think, but I'm a bit worried around the
> lack of testing.

Thanks and I agree with you.

> I can do some testing under LATIN1 which is what we use
> in Sweden (just need to get gettext working *at all* in my dev
> environment again - I've somehow managed to break it), and perhaps we
> can find someone to do a test in an eastern-european locale to get some
> more datapoints?
>
> Can you outline the steps one needs to go through to show the problem,
> so we can confirm it's fixed in these locales?

Saito-san and I have been working on another related problem about
changing LC_MESSAGES locale properly under Windows and would be able
to provide a patch in a few days. It seems preferable for us to apply
the patch also so as to change the message catalog easily.

Attached is an example in which LC_MESSAGES is cht_twn(zh_TW) and
the server encoding is EUC-TW. You can see it as a UTF-8 text
because the client_encoding is set to UTF-8 first.

BTW you can see another problem at line 4 in the text.
At the point the LC_MESSAGES is still japanese and postgres fails
to convert a Japanese error message to EUC_TW encoding. There's
no wonder but it doesn't seem preferable.

regards,
Hiroshi Inoue

Attachment Content-Type Size
euctw.txt text/plain 704 bytes

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-09 20:03:55
Message-ID: 200901092003.n09K3tR20529@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers


Hiroshi, is this patch still needed?

---------------------------------------------------------------------------

Hiroshi Inoue wrote:
> Magnus Hagander wrote:
> > On 25 nov 2008, at 05.00, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> >> Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
> >>> Tom Lane wrote:
> >>>> If that's true then this code is presently broken for *every* locale
> >>>> under Windows, not only Japanese.
> >>
> >>> Maybe there are a few languages/countires where 2 encodings are
> >>> widely used.
> >>
> >> UTF8 vs Latin-N?
> >
> > We already special-cases utf8...
> >
> > I think the thing us that as long as the encodings are compatible
> > (latin1 with different names for example) it worked fine.
> >
> >> In any case I think the problem is that gettext is
> >> looking at a setting that is not what we are looking at. Particularly
> >> with the 8.4 changes to allow per-database locale settings, this has
> >> got to be fixed in a bulletproof way.
>
> Attached is a new patch to apply bind_textdomain_codeset() to most
> server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
> and PG_EUC_JIS_2004. "EUC-JP" may be OK for EUC_JIS_2004.
>
> Unfortunately it's hard for Saito-san and me to check encodings
> other than EUC-JP.
>
> regards,
> Hiroshi Inoue

> *** mbutils.c.orig Sun Nov 23 08:42:57 2008
> --- mbutils.c Wed Nov 26 12:17:12 2008
> ***************
> *** 822,830 ****
> --- 822,870 ----
> return clen;
> }
>
> + #ifdef WIN32
> + static const struct codeset_map {
> + int encoding;
> + const char *codeset;
> + } codeset_map_array[] = {
> + {PG_UTF8, "UTF-8"},
> + {PG_LATIN1, "LATIN1"},
> + {PG_LATIN2, "LATIN2"},
> + {PG_LATIN3, "LATIN3"},
> + {PG_LATIN4, "LATIN4"},
> + {PG_ISO_8859_5, "ISO-8859-5"},
> + {PG_ISO_8859_6, "ISO_8859-6"},
> + {PG_ISO_8859_7, "ISO-8859-7"},
> + {PG_ISO_8859_8, "ISO-8859-8"},
> + {PG_LATIN5, "LATIN5"},
> + {PG_LATIN6, "LATIN6"},
> + {PG_LATIN7, "LATIN7"},
> + {PG_LATIN8, "LATIN8"},
> + {PG_LATIN9, "LATIN-9"},
> + {PG_LATIN10, "LATIN10"},
> + {PG_KOI8R, "KOI8-R"},
> + {PG_WIN1250, "CP1250"},
> + {PG_WIN1251, "CP1251"},
> + {PG_WIN1252, "CP1252"},
> + {PG_WIN1253, "CP1253"},
> + {PG_WIN1254, "CP1254"},
> + {PG_WIN1255, "CP1255"},
> + {PG_WIN1256, "CP1256"},
> + {PG_WIN1257, "CP1257"},
> + {PG_WIN1258, "CP1258"},
> + {PG_WIN866, "CP866"},
> + {PG_WIN874, "CP874"},
> + {PG_EUC_CN, "EUC-CN"},
> + {PG_EUC_JP, "EUC-JP"},
> + {PG_EUC_KR, "EUC-KR"},
> + {PG_EUC_TW, "EUC-TW"}};
> + #endif /* WIN32 */
> +
> void
> SetDatabaseEncoding(int encoding)
> {
> + const char *target_codeset = NULL;
> +
> if (!PG_VALID_BE_ENCODING(encoding))
> elog(ERROR, "invalid database encoding: %d", encoding);
>
> ***************
> *** 846,852 ****
> */
> #ifdef ENABLE_NLS
> if (encoding == PG_UTF8)
> ! if (bind_textdomain_codeset("postgres", "UTF-8") == NULL)
> elog(LOG, "bind_textdomain_codeset failed");
> #endif
> }
> --- 886,907 ----
> */
> #ifdef ENABLE_NLS
> if (encoding == PG_UTF8)
> ! target_codeset = "UTF-8";
> ! #ifdef WIN32
> ! else
> ! {
> ! int i;
> !
> ! for (i = 0; i < sizeof(codeset_map_array) / sizeof(struct codeset_map); i++)
> ! if (codeset_map_array[i].encoding == encoding)
> ! {
> ! target_codeset = codeset_map_array[i].codeset;
> ! break;
> ! }
> ! }
> ! #endif /* WIN32 */
> ! if (target_codeset != NULL)
> ! if (bind_textdomain_codeset("postgres", target_codeset) == NULL)
> elog(LOG, "bind_textdomain_codeset failed");
> #endif
> }

>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Magnus Hagander <magnus(at)hagander(dot)net>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-09 21:30:49
Message-ID: 1662.1231536649@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Hiroshi, is this patch still needed?

This patch is for a problem that's entirely separate from the LC_TIME
issue, if that's what you were wondering.

regards, tom lane


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-09 21:45:46
Message-ID: 4967C58A.209@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Bruce Momjian wrote:
> Hiroshi, is this patch still needed?

Yes though it should be slightly changed now.
*set lc_messages does not work* issue isn't directly related to this
issue.

regards,
Hiroshi Inoue


From: "Hiroshi Saito" <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
To: "Hiroshi Inoue" <inoue(at)tpf(dot)co(dot)jp>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "PG Hackers" <pgsql-hackers(at)postgresql(dot)org>, "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-10 16:04:18
Message-ID: EE5CC16053E74B31AB19DE1C26A5397A@HIRO57887DE653
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hi.

My swift attack test was the MinGW environment.
But, Inoue-san suggestion.

1. MinGW+gcc build
HIROSHI=# set LC_TIME=Ja;
SET
HIROSHI=# select to_char(now(),'TMDay');
to_char
---------
日曜日
(1 行)
HIROSHI=# set LC_TIME='Japan';
SET
HIROSHI=# select to_char(Now(),'TMDay');
to_char
---------
日曜日
(1 行)
HIROSHI=# set LC_TIME='Japanese';
SET
HIROSHI=# select to_char(Now(),'TMDay');
to_char
---------
日曜日
(1 行)

However, A setup of 'Ja' was strange.?_?
http://msdn.microsoft.com/en-us/library/aa246450(VS.60).aspx

2. MSVC build
HIROSHI=# set LC_TIME='Ja';
ERROR: invalid value for parameter "lc_time": "Ja"
STATEMENT: set LC_TIME='Ja';
ERROR: invalid value for parameter "lc_time": "Ja"
HIROSHI=# set LC_TIME='Japan';
ERROR: invalid value for parameter "lc_time": "Japan"
STATEMENT: set LC_TIME='Japan';
ERROR: invalid value for parameter "lc_time": "Japan"
HIROSHI=# set LC_TIME=Japanese;
SET
HIROSHI=# select to_char(Now(),'TMDay');
to_char
---------
日曜日
(1 行)

Umm, Re-investigation is required for this. :-(
However, If reasonable clear, it will be good for a document at suggestion.

Regards,
Hiroshi Saito


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-11 14:50:18
Message-ID: 496A072A.7020309@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue wrote:
> Bruce Momjian wrote:
>> Hiroshi, is this patch still needed?
>
> Yes though it should be slightly changed now.
> *set lc_messages does not work* issue isn't directly related to this
> issue.

Though I'm not sure how we can test it, I can provide test results
like the attached one.

The attached is a test result in case LC_MESSAGES=fr causing
the following 3 errors in databases with various encodings.

select a; ==> column "a" does not exist
1; ==> syntax error at or near "1"
select * from a; ==> relation "a" does not exist

Comments?
Please note the encoding of the attached file is utf-8.

regards,
Hiroshi Inoue

Attachment Content-Type Size
fr.txt text/plain 12.4 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: mha(at)postgresql(dot)org, pgsql-committers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-14 22:19:59
Message-ID: 200901142219.n0EMJxH16527@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers


Is this patch still needed?

---------------------------------------------------------------------------

Hiroshi Inoue wrote:
> Hi Magnus and all,
>
> Magnus Hagander wrote:
> > Log Message:
> > -----------
> > Explicitly bind gettext() to the UTF8 locale when in use.
> > This is required on Windows due to the special locale
> > handling for UTF8 that doesn't change the full environment.
>
> Thanks to this change UTF-8 case was solved but Japanese users
> are still unhappy with Windows databases with EUC_JP encoding.
> Shift_JIS which is a Japanese encoding under Windows doesn't
> match any server encoding and causes a crash with the use of
> gettext. So Saito-san removed ja message catalog just before
> the 8.3 release.
>
> Attached is a simple patch to avoid the crash and enable the
> use of Japanese message catalog.
> Please apply the patch if there's no problem.
>
> regards,
> Hiroshi Inoue
>
> > Fixes crash with translated error messages per bugs 4180
> > and 4196.
> >
> > Tom Lane
> >
> > Modified Files:
> > --------------
> > pgsql/src/backend/utils/mb:
> > mbutils.c (r1.70 -> r1.71)
> > (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/mb/mbutils.c?r1=1.70&r2=1.71)
>
>

> *** mbutils.c.orig Sun Nov 23 08:42:57 2008
> --- mbutils.c Sun Nov 23 13:52:41 2008
> ***************
> *** 825,830 ****
> --- 825,832 ----
> void
> SetDatabaseEncoding(int encoding)
> {
> + const char *target_codeset = NULL;
> +
> if (!PG_VALID_BE_ENCODING(encoding))
> elog(ERROR, "invalid database encoding: %d", encoding);
>
> ***************
> *** 846,852 ****
> */
> #ifdef ENABLE_NLS
> if (encoding == PG_UTF8)
> ! if (bind_textdomain_codeset("postgres", "UTF-8") == NULL)
> elog(LOG, "bind_textdomain_codeset failed");
> #endif
> }
> --- 848,860 ----
> */
> #ifdef ENABLE_NLS
> if (encoding == PG_UTF8)
> ! target_codeset = "UTF-8";
> ! #ifdef WIN32
> ! else if (encoding == PG_EUC_JP)
> ! target_codeset = "EUC_JP";
> ! #endif /* WIN32 */
> ! if (NULL != target_codeset)
> ! if (bind_textdomain_codeset("postgres", target_codeset) == NULL)
> elog(LOG, "bind_textdomain_codeset failed");
> #endif
> }
>
>
> --
> Sent via pgsql-committers mailing list (pgsql-committers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-committers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, mha(at)postgresql(dot)org, pgsql-committers(at)postgresql(dot)org, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-14 22:23:37
Message-ID: 23008.1231971817@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Is this patch still needed?

No; this is the beginning of the still-continuing thread about
lc_messages and Windows. I'm not sure what we'll end up applying,
but not this.

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-19 15:18:54
Message-ID: 497499DE.5080703@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue wrote:
> Hiroshi Inoue wrote:
>> Bruce Momjian wrote:
>>> Hiroshi, is this patch still needed?
>>
>> Yes though it should be slightly changed now.

In what way should it be changed?

//Magnus


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-20 03:31:29
Message-ID: 49754591.9020105@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Magnus Hagander wrote:
> Hiroshi Inoue wrote:
>> Hiroshi Inoue wrote:
>>> Bruce Momjian wrote:
>>>> Hiroshi, is this patch still needed?
>>> Yes though it should be slightly changed now.
>
> In what way should it be changed?

One is already committed by you.
[COMMITTERS] pgsql: Use the new text domain names

Another is to bind the codeset "EUC-JP" for
PG_EUC_JIS_2004 server encoding.
Though EUC_JP and EUC_JIS_2004 aren't completely
compatible, it seems OK in most cases.

regards,
Hiroshi Inoue


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-20 18:08:38
Message-ID: 49761326.5080609@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue wrote:
> Magnus Hagander wrote:
>> Hiroshi Inoue wrote:
>>> Hiroshi Inoue wrote:
>>>> Bruce Momjian wrote:
>>>>> Hiroshi, is this patch still needed?
>>>> Yes though it should be slightly changed now.
>>
>> In what way should it be changed?
>
> One is already committed by you.
> [COMMITTERS] pgsql: Use the new text domain names
>
> Another is to bind the codeset "EUC-JP" for
> PG_EUC_JIS_2004 server encoding.

The attached is an updated patch.

regards,
Hiroshi Inoue

Attachment Content-Type Size
mbutils.patch text/plain 2.5 KB

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-21 12:35:53
Message-ID: 497716A9.7050700@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue wrote:
> Hiroshi Inoue wrote:
>> Magnus Hagander wrote:
>>> Hiroshi Inoue wrote:
>>>> Hiroshi Inoue wrote:
>>>>> Bruce Momjian wrote:
>>>>>> Hiroshi, is this patch still needed?
>>>>> Yes though it should be slightly changed now.
>>>
>>> In what way should it be changed?
>>
>> One is already committed by you.
>> [COMMITTERS] pgsql: Use the new text domain names
>>
>> Another is to bind the codeset "EUC-JP" for
>> PG_EUC_JIS_2004 server encoding.
>
> The attached is an updated patch.

Thanks.

Looking at it, the comment clearly needs updating - I'll do that.

However, one question: The comment currently says it's harmless to do
this on non-windows platforms. Does this still hold true? In that case,
this whole thing shouldn't be #ifdef:ed to WIN32 and can be simplified.
Or does the "middle part" of the comment come into play, in that the
codeset names can be different on different platforms?

Peter, can you comment on that?

If we do keep the thing win32 only, I think we should just wrap the
whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
Unix - that'll make for cleaner code.

//Magnus


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-21 12:54:54
Message-ID: 49771B1E.6040109@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Magnus Hagander wrote:
> However, one question: The comment currently says it's harmless to do
> this on non-windows platforms. Does this still hold true?

Yes, the non-WIN32 code path appears to be the same, still. But the
ifdef WIN32 part we don't want, because that presumes something about
the spelling of encoding names in the local iconv library.

> If we do keep the thing win32 only, I think we should just wrap the
> whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
> Unix - that'll make for cleaner code.

Yes, that would be much better.


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-21 13:34:30
Message-ID: 49772466.4090107@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Peter Eisentraut wrote:
> Magnus Hagander wrote:
>> However, one question: The comment currently says it's harmless to do
>> this on non-windows platforms. Does this still hold true?
>
> Yes, the non-WIN32 code path appears to be the same, still. But the
> ifdef WIN32 part we don't want, because that presumes something about
> the spelling of encoding names in the local iconv library.
>
>> If we do keep the thing win32 only, I think we should just wrap the
>> whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
>> Unix - that'll make for cleaner code.
>
> Yes, that would be much better.

Something like this then?

//Magnus

Attachment Content-Type Size
gettext_win32.patch text/x-diff 2.7 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-21 15:54:41
Message-ID: 49774541.5070104@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Magnus Hagander wrote:
> Peter Eisentraut wrote:
>> Magnus Hagander wrote:
>>> However, one question: The comment currently says it's harmless to do
>>> this on non-windows platforms. Does this still hold true?
>> Yes, the non-WIN32 code path appears to be the same, still. But the
>> ifdef WIN32 part we don't want, because that presumes something about
>> the spelling of encoding names in the local iconv library.
>>
>>> If we do keep the thing win32 only, I think we should just wrap the
>>> whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
>>> Unix - that'll make for cleaner code.
>> Yes, that would be much better.
>
> Something like this then?

Looks OK to me.


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-22 09:00:00
Message-ID: 49783590.70506@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Magnus Hagander wrote:
> Peter Eisentraut wrote:
>> Magnus Hagander wrote:
>>> However, one question: The comment currently says it's harmless to do
>>> this on non-windows platforms. Does this still hold true?
>> Yes, the non-WIN32 code path appears to be the same, still. But the
>> ifdef WIN32 part we don't want, because that presumes something about
>> the spelling of encoding names in the local iconv library.
>>
>>> If we do keep the thing win32 only, I think we should just wrap the
>>> whole thing in #ifdef WIN32 and no longer do the codeset stuff at all on
>>> Unix - that'll make for cleaner code.
>> Yes, that would be much better.
>
> Something like this then?

It seems OK to me.

regards,
Hiroshi Inoue


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2009-01-22 10:10:05
Message-ID: 497845FD.3060608@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hiroshi Inoue wrote:
> Magnus Hagander wrote:
>> Peter Eisentraut wrote:
>>> Magnus Hagander wrote:
>>>> However, one question: The comment currently says it's harmless to do
>>>> this on non-windows platforms. Does this still hold true?
>>> Yes, the non-WIN32 code path appears to be the same, still. But the
>>> ifdef WIN32 part we don't want, because that presumes something about
>>> the spelling of encoding names in the local iconv library.
>>>
>>>> If we do keep the thing win32 only, I think we should just wrap the
>>>> whole thing in #ifdef WIN32 and no longer do the codeset stuff at
>>>> all on
>>>> Unix - that'll make for cleaner code.
>>> Yes, that would be much better.
>>
>> Something like this then?
>
> It seems OK to me.

Applied.

//Magnus