Re: [COMMITTERS] pgsql: Cause pg_proc.probin to be declared as text, not bytea.

Lists: pgsql-committerspgsql-hackers
From: tgl(at)postgresql(dot)org (Tom Lane)
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Cause pg_proc.probin to be declared as text, not bytea.
Date: 2009-08-04 04:04:12
Message-ID: 20090804040412.5A3C975331E@cvs.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Log Message:
-----------
Cause pg_proc.probin to be declared as text, not bytea. Everything was
already treating it as text anyway, to the point that I couldn't find anything
to change except the datatype markings in catalog/*.h. The only effect that
the bytea declaration had was to cause byteaout() to be invoked when pg_dump
(or another client program) inspected the column value. Since pg_dump wasn't
expecting that, but just treating what it got as text, the net result is that
dump and reload would mangle any backslashes or non-ASCII characters in the
filename string for a C-language function. That is a very long-standing bug,
but given the lack of field complaints it doesn't seem worth trying to find
a back-patchable fix. We'll just make this change to fix it going forward.

This change will also forestall problems after the planned change to let bytea
emit hex output instead of escaped characters.

Modified Files:
--------------
pgsql/doc/src/sgml:
catalogs.sgml (r2.204 -> r2.205)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/catalogs.sgml?r1=2.204&r2=2.205)
pgsql/src/include/catalog:
catversion.h (r1.536 -> r1.537)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/catversion.h?r1=1.536&r2=1.537)
pg_attribute.h (r1.150 -> r1.151)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/pg_attribute.h?r1=1.150&r2=1.151)
pg_proc.h (r1.548 -> r1.549)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/pg_proc.h?r1=1.548&r2=1.549)


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)postgresql(dot)org>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Cause pg_proc.probin to be declared as text, not bytea.
Date: 2009-08-04 11:33:23
Message-ID: 407d949e0908040433u33993824q1d8825e3ebc4ced3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Aug 4, 2009 at 5:04 AM, Tom Lane<tgl(at)postgresql(dot)org> wrote:
> Cause pg_proc.probin to be declared as text, not bytea.

Doesn't this relate to the earlier discussion of whether to re-encode
filenames and paths?

What's going to happen if I have filenames which aren't valid encoded
strings in the server encoding -- say UTF8 filenames but I'm using
latin1 in the server or vice versa. Will my CREATE FUNCTION command
end up storing an invalid encoded string? Or re-encode the filename
and then fail to find the file?

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Cause pg_proc.probin to be declared as text, not bytea.
Date: 2009-08-04 13:46:45
Message-ID: 21889.1249393605@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Tue, Aug 4, 2009 at 5:04 AM, Tom Lane<tgl(at)postgresql(dot)org> wrote:
>> Cause pg_proc.probin to be declared as text, not bytea.

> Doesn't this relate to the earlier discussion of whether to re-encode
> filenames and paths?

> What's going to happen if I have filenames which aren't valid encoded
> strings in the server encoding -- say UTF8 filenames but I'm using
> latin1 in the server or vice versa. Will my CREATE FUNCTION command
> end up storing an invalid encoded string? Or re-encode the filename
> and then fail to find the file?

Right at the moment we simply aren't considering any of those cases.
If you'd like to propose and implement a solution, feel free. I think
the last proposal foundered on the fact that it had no idea what
encoding the filesystem was expecting anyway.

I'll point out though that having probin declared bytea would surely
be antithetical to any attempt to treat shlib filenames in an
encoding-aware fashion. Declaring it that way implies that it is
*not* storing a character string that has any particular encoding.

regards, tom lane


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Cause pg_proc.probin to be declared as text, not bytea.
Date: 2009-08-04 14:42:34
Message-ID: 407d949e0908040742i7d8b84f7x6fefbe200eed63b2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Aug 4, 2009 at 2:46 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I'll point out though that having probin declared bytea would surely
> be antithetical to any attempt to treat shlib filenames in an
> encoding-aware fashion.  Declaring it that way implies that it is
> *not* storing a character string that has any particular encoding.

Well that's kind of the point. Unix filesystems traditionally prohibit
'/' and '\0' but otherwise allowing any series of bytes without
requiring any particular encoding. If we used bytea to store
filesystem paths then you could specify any arbitrary series of bytes
without worrying that the server will re-encode it differently.

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Cause pg_proc.probin to be declared as text, not bytea.
Date: 2009-08-04 15:00:49
Message-ID: 4A784D21.7000806@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Greg Stark wrote:
> On Tue, Aug 4, 2009 at 2:46 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> I'll point out though that having probin declared bytea would surely
>> be antithetical to any attempt to treat shlib filenames in an
>> encoding-aware fashion. Declaring it that way implies that it is
>> *not* storing a character string that has any particular encoding.
>>
>
> Well that's kind of the point. Unix filesystems traditionally prohibit
> '/' and '\0' but otherwise allowing any series of bytes without
> requiring any particular encoding. If we used bytea to store
> filesystem paths then you could specify any arbitrary series of bytes
> without worrying that the server will re-encode it differently.
>
>

Is this any different from the path in "COPY foo to '/path/to/file'"?

I suspect the probin stuff is a solution in search of a problem.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Cause pg_proc.probin to be declared as text, not bytea.
Date: 2009-08-04 15:37:02
Message-ID: 24868.1249400222@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Is this any different from the path in "COPY foo to '/path/to/file'"?
> I suspect the probin stuff is a solution in search of a problem.

Well, the previous probin behavior is demonstrably broken. Make a shlib
with backslash or non-ASCII in the name, create a function referencing
it, dump and reload. Whatever your opinions are about encodings, you
won't think pg_dump did the right thing.

I'm not sure whether the more general pathname encoding issue is worth
working on or not. In general it's a non-problem if the paths in the
server filesystem are written in the database encoding. If they are
not, then you have to figure out what they *are* written in, and that
seems a bit tough. But anyway that problem is hardly restricted to
probin, and a solution that works only for probin doesn't seem terribly
interesting.

regards, tom lane