Re: UNICODE/UTF-8 on win32

From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "Tatsuo Ishii" <t-ishii(at)sra(dot)co(dot)jp>
Cc: <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers-win32(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNICODE/UTF-8 on win32
Date: 2005-01-02 13:27:54
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCE4764A9@algol.sollentuna.se
Lists: pgsql-hackers pgsql-hackers-win32

>I do understand the problem, but don't understand the decision you
>guys made. The fact that UPPER/LOWER and some other functions do not
>work in win32 is surely a problem for some languages, but not a
>problem for others. For example, Japanese (and probably Chinese and
>Korean) does not have a concept of upper/lower. So the fact that UPPER/LOWER
>does not work with UTF-8/win32 is not a problem for Japanese (and for
>some other languages). Just using C locale with UTF-8 is enough in
>this case.

The main issue is not with upper/lower, it's with ORDER BY (and doesn't
that affect indexes as well?). This affects Japanese as well, no?

I didn't consider the C locale. Do you know for a fact that it works
there on win32 as well, or is that an assumption? (I don't know either
way)

>In summary, I think you guys are going to overkill the multibyte
>support functionality on UTF-8/win32 because of the fact that some
>languages do not work.

I was under the impression that *no* languages worked. If some do work,
then we definitely should not kill it.

It would be good to have some way of detecting if it worked or not at
the time of creation of the database. But I have no idea on how to do
that in a reasonable way.

//Magnus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Magnus Hagander" <mha(at)sollentuna(dot)net>
Cc: "Tatsuo Ishii" <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers-win32(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNICODE/UTF-8 on win32
Date: 2005-01-02 17:45:33
Message-ID: 11072.1104687933@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-hackers-win32

"Magnus Hagander" <mha(at)sollentuna(dot)net> writes:
> I didn't consider the C locale. Do you know for a fact that it works
> there on win32 as well, or is that an assumption?

It should work. The only use of strcoll() in the backend is in
varstr_cmp which uses strncmp() instead for C locale. Lack of
working upper/lower is hardly a fatal objection, considering that
we never had that for UTF8 before 8.0 anyway. But you do have to
have working varstr_cmp.
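
As a rough illustration of that split, here is a minimal sketch, not the
actual backend code. The function name sketch_varstr_cmp and the
collate_is_c flag are hypothetical stand-ins; the point is only that the
C-locale branch never touches strcoll(), so it should not depend on the
platform's locale support:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch of the comparison strategy described above; this is
 * NOT the PostgreSQL source. collate_is_c stands in for whatever check the
 * backend uses to detect a C/POSIX collation.
 */
int
sketch_varstr_cmp(const char *a, size_t alen,
                  const char *b, size_t blen,
                  int collate_is_c)
{
    int     result;

    if (collate_is_c)
    {
        /* C locale: plain byte comparison, no locale machinery involved. */
        size_t  n = (alen < blen) ? alen : blen;

        result = strncmp(a, b, n);
        if (result == 0 && alen != blen)
            result = (alen < blen) ? -1 : 1;    /* shorter string sorts first */
        return result;
    }

    /* Any other locale: strcoll() needs NUL-terminated copies. */
    char   *a0 = malloc(alen + 1);
    char   *b0 = malloc(blen + 1);

    if (a0 == NULL || b0 == NULL)
        abort();            /* keep the sketch simple on out-of-memory */

    memcpy(a0, a, alen);
    a0[alen] = '\0';
    memcpy(b0, b, blen);
    b0[blen] = '\0';

    result = strcoll(a0, b0);   /* result depends on the platform's locale */

    free(a0);
    free(b0);
    return result;
}
```

Only the second branch relies on the operating system's collation for the
database encoding, which is the part in question on Windows with UTF8.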

> It would be good to have some way of detecting if it worked or not at
> the time of creation of the database. But I have no idea on how to do
> that in a reasonable way.

At this point I'd say that any combination of UTF8 encoding with a non
C/POSIX locale probably isn't going to work on Windows. Tatsuo, do you
know of other cases that will work?

regards, tom lane


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: mha(at)sollentuna(dot)net, pgsql-hackers-win32(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNICODE/UTF-8 on win32
Date: 2005-01-03 00:48:02
Message-ID: 20050103.094802.48401513.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-hackers-win32

> "Magnus Hagander" <mha(at)sollentuna(dot)net> writes:
> > I didn't consider the C locale. Do you know for a fact that it works
> > there on win32 as well, or is that an assumption?
>
> It should work. The only use of strcoll() in the backend is in
> varstr_cmp which uses strncmp() instead for C locale. Lack of
> working upper/lower is hardly a fatal objection, considering that
> we never had that for UTF8 before 8.0 anyway. But you do have to
> have working varstr_cmp.
>
> > It would be good to have some way of detecting if it worked or not at
> > the time of creation of the database. But I have no idea on how to do
> > that in a reasonable way.
>
> At this point I'd say that any combination of UTF8 encoding with a non
> C/POSIX locale probably isn't going to work on Windows. Tatsuo, do you
> know of other cases that will work?

No. I think C is the only working locale.
--
Tatsuo Ishii


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: mha(at)sollentuna(dot)net
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers-win32(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNICODE/UTF-8 on win32
Date: 2005-01-03 00:48:51
Message-ID: 20050103.094851.123968739.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-hackers-win32

> >I do understand the problem, but don't understand the decision you
> >guys made. The fact that UPPER/LOWER and some other functions do not
> >work in win32 is surely a problem for some languages, but not a
> >problem for others. For example, Japanese (and probably Chinese and
> >Korean) does not have a concept of upper/lower. So the fact that UPPER/LOWER
> >does not work with UTF-8/win32 is not a problem for Japanese (and for
> >some other languages). Just using C locale with UTF-8 is enough in
> >this case.
>
> The main issue is not with upper/lower, it's with ORDER BY (and doesn't
> that affect indexes as well?). This affects Japanese as well, no?

As long as it is used with the C locale, indexes should be ok. ORDER BY
is not perfect but we can live with it. Since Japanese is written with
ideograms, we cannot rely on ORDER BY character codes to sort Japanese
characters anyway. I believe the same can be said of Chinese.
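
To make the ordering concrete, a small sketch, assuming the C locale
compares strings byte by byte. The helper names bytewise_cmp and
sort_c_locale are hypothetical. Because UTF-8 byte order matches Unicode
code point order, the result is stable and index-safe, but it follows
character codes rather than readings:

```c
#include <stdlib.h>
#include <string.h>

/* qsort() callback: compare two C strings byte by byte, as the C locale does. */
int
bytewise_cmp(const void *pa, const void *pb)
{
    const char *a = *(const char *const *) pa;
    const char *b = *(const char *const *) pb;

    return strcmp(a, b);    /* strcmp compares bytes as unsigned char */
}

/* Hypothetical helper: sort an array of UTF-8 strings in C-locale order. */
void
sort_c_locale(const char **items, size_t n)
{
    qsort(items, n, sizeof(items[0]), bytewise_cmp);
}
```

For example, given U+65E5, U+30A2, U+4E9C and U+3042 encoded as UTF-8,
this sorts them as U+3042, U+30A2, U+4E9C, U+65E5, that is, in code point
order.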

> I didn't consider the C locale. Do you know for a fact that it works
> there on win32 as well, or is that an assumption? (I don't know either
> way)

I have not tested 8.0 on win32, but I think it should work with C
locale since I know PowerGres, which is based on 7.4, works.

> >In summary, I think you guys are going to overkill the multibyte
> >support functionality on UTF-8/win32 because of the fact that some
> >languages do not work.
>
> I was under the impression that *no* languages worked. If some do work,
> then we definitely should not kill it.
>
> It would be good to have some way of detecting if it worked or not at
> the time of creation of the database. But I have no idea on how to do
> that in a reasonable way.
--
Tatsuo Ishii


From: Jonathan Barnhart <jdbarnhart(at)yahoo(dot)com>
To: pgsql-hackers-win32(at)postgresql(dot)org, mha(at)sollentuna(dot)net
Subject: Any chance of a merge module?
Date: 2005-01-03 12:16:39
Message-ID: 20050103121639.36748.qmail@web53706.mail.yahoo.com
Lists: pgsql-hackers pgsql-hackers-win32

What would it take to make the PG installer into a merge module? I
don't have the stuff to build PG so I can't build the PG install,
though I do have WiX. It would make my life (and anyone else using PG
for a specific app) a lot easier if you guys would allow us to embed
the PG install in our own install. This would let us just pass in the
setup info for the app and let PG install mostly silently. For my app,
the only thing the user needs to see from PG is the license which is
different from the commercial license on the rest of the product. The
rest I can configure from the main install. Right now the end user has
to configure things right and follow directions, and that leads to tech
support issues when they screw up. I tried using the silent install
option on the main MSI and got all sorts of problems. (Besides, many
Win2k setups with their old msiexec don't even support a silent
install.)

=====
"We'll do the undoable, work the unworkable, scrute the inscrutable and have a long, hard look at the ineffable to see whether it might not be effed after all"