Re: COPY encoding

Lists: pgsql-hackers
From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: COPY encoding
Date: 2008-01-16 00:35:50
Message-ID: 478D5166.70000@dunslane.net


In helping someone on IRC it has become apparent that unless I am
mistaken "COPY foo from 'filename'" is reading the file according to the
client encoding.

Is that the expected behaviour? The client might have no influence at
all on the contents of the file. Offhand, I would have said that a
server-resident file should be interpreted in the database encoding. Of
course, the client could change its encoding to influence how the file
is interpreted, but that seems rather kludgy. Maybe we really need an
encoding parameter to the COPY command.

At the very least we should document the encoding behaviour.
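
To make the concern concrete, here is a minimal Python sketch of what goes wrong when file bytes are interpreted under an encoding other than the one they were written in. It does not touch PostgreSQL at all; the sample row, the constant names and the helper interpret() are hypothetical and exist only for illustration. It assumes a file written in LATIN1 being read by a session whose client encoding is UTF8, and then the reverse case.

```python
# Illustration only: no database is involved, and all names here are hypothetical.
FILE_ENCODING = "latin-1"    # assumed encoding the server-resident file was written in
CLIENT_ENCODING = "utf-8"    # assumed client encoding of the session running COPY


def interpret(raw: bytes, declared_encoding: str) -> str:
    """Decode raw file bytes the way a reader trusting declared_encoding would."""
    return raw.decode(declared_encoding)


# One tab-separated row as it would sit on disk in the file's real encoding.
raw_line = "café\t1\n".encode(FILE_ENCODING)

# Case 1: LATIN1 bytes read as UTF8. The byte 0xE9 is not valid UTF-8 here,
# so the reader rejects the input outright.
try:
    interpret(raw_line, CLIENT_ENCODING)
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True

# Case 2: declared encoding matches the file, so the row round-trips.
matched = interpret(raw_line, FILE_ENCODING)

# Case 3: UTF8 bytes read as LATIN1. Every byte is valid LATIN1, so nothing
# fails and the text is silently garbled instead.
garbled = interpret("café".encode("utf-8"), "latin-1")
```

The third case is the more worrying one: the mismatch raises no error, so the wrong characters would simply be stored.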

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY encoding
Date: 2008-01-16 00:47:26
Message-ID: 15750.1200444446@sss.pgh.pa.us

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> In helping someone on IRC it has become apparent that unless I am
> mistaken "COPY foo from 'filename'" is reading the file according to the
> client encoding.

> Is that the expected behaviour?

Yes, it is. Not sure if it's adequately documented.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY encoding
Date: 2008-01-16 20:46:25
Message-ID: 478E6D21.20702@dunslane.net

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> In helping someone on IRC it has become apparent that unless I am
>> mistaken "COPY foo from 'filename'" is reading the file according to the
>> client encoding.
>>
>> Is that the expected behaviour?
>
> Yes, it is. Not sure if it's adequately documented.

Will this cover the case?

diff -c -r1.80 copy.sgml
*** copy.sgml 18 Apr 2007 02:28:22 -0000 1.80
--- copy.sgml 16 Jan 2008 20:44:02 -0000
***************
*** 363,368 ****
--- 363,376 ----
happened well into a large copy operation. You might wish to invoke
<command>VACUUM</command> to recover the wasted space.
</para>
+
+ <para>
+ Input data is interpreted according to the current client encoding,
+ and output data is encoded in the current client encoding, even
+ if the data does not pass through the client but is read from or
+ written to a file.
+ </para>
+
</refsect1>

<refsect1>

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY encoding
Date: 2008-01-16 21:48:41
Message-ID: 28734.1200520121@sss.pgh.pa.us

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> Yes, it is. Not sure if it's adequately documented.

> Will this cover the case?

Text looks OK. I think it might fit better a bit further up, adjacent
to the para about DateStyle, which is a somewhat comparable
consideration.

regards, tom lane