Re: UTF8 with BOM support in psql

From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UTF8 with BOM support in psql
Date: 2009-11-18 04:03:39
Message-ID: 20091118130339.A493.52131E4D@oss.ntt.co.jp
Lists: pgsql-hackers


Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> Itagaki Takahiro wrote:
> > Multi-byte scripts
> > without encoding are always dangerous whether BOM is present or not.
> > I'd say we can always throw an error when we find queries that contain
> > multi-byte characters if there is no prior encoding declaration.
>
> You will break a gazillion scripts that today work quite happily if you do.

Sure. That's why I didn't send a patch for it :)
If we ever did that, we would need a boolean option to disable the check.

> Maybe there is a case for an extra command line switch to set the initial
> client encoding for psql, which would make that a little easier and less
> obscure to do. Would that make things simpler for you?

No. The situation on Windows in Japan is complicated. The client encoding is
always SJIS because of a Windows restriction, but the database is initialized
with UTF8. Simple interactive work in psql is done under the SJIS encoding,
but some scripts are written in UTF8 because it matches the server encoding.
(Of course, such a script is executed as "psql -f utf8.sql > out.txt".)

I don't want users to have to check the encoding of scripts before executing
them -- that is far from fail-safe.
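
To illustrate the kind of automatic check meant here, the following is a
minimal sketch in Python, not psql's actual code. It looks for the UTF-8 BOM
(bytes EF BB BF) at the start of a script and, if it is present, treats the
script as UTF8 and strips the BOM; otherwise it keeps the session's client
encoding. The function name detect_script_encoding and the "SJIS" fallback
are assumptions made for this example only.

```python
import codecs
from typing import Tuple


def detect_script_encoding(data: bytes, fallback: str = "SJIS") -> Tuple[str, bytes]:
    """Return (encoding, payload) for the raw bytes of a script.

    Hypothetical helper: if the data starts with the UTF-8 BOM, report
    "UTF8" and drop the BOM; otherwise report the fallback client
    encoding and return the data unchanged.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "UTF8", data[len(codecs.BOM_UTF8):]
    return fallback, data


# A script saved as UTF-8 with BOM is recognized without user intervention.
encoding, payload = detect_script_encoding(b"\xef\xbb\xbfSELECT 1;")
```

A script without a BOM is left alone, so existing scripts that rely on the
default client encoding would keep working under this approach.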

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
