newline conversion in SQL command strings

Lists: pgsql-hackers
From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: newline conversion in SQL command strings
Date: 2012-09-20 02:56:03
Message-ID: 1348109763.10816.13.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have received a number of bug reports about plsh choking on
Windows-style line endings. The problem is that the user uses some
Windows-based tool or other to execute an SQL command line this:

CREATE FUNCTION foo() RETURNS something<CR><LF>
LANGUAGE plsh<CR><LF>
AS $$<CR><LF>
#!/bin/sh<CR><LF>
<CR><LF>
do something<CR><LF>
do something<CR><LF>
$$;<CR><LF>

which (apparently, I don't have Windows handy) creates a function with
the prosrc body of

'<CR><LF>
#!/bin/sh<CR><LF>
<CR><LF>
do something<CR><LF>
do something<CR><LF>
'

But executing this fails because Unix shells reject <CR> characters in
inappropriate places as syntax errors.

I don't know how to handle that. It would be unfortunate to have the
behavior of a function depend on the kind of client used to create or
modify it.

I suppose the other more mainstream PLs don't expose that problem so
much because the interpreters can handle either kind of line ending.
But there could still be functional differences if for example a line
ending is embedded into a string constant.

Ideas?


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: newline conversion in SQL command strings
Date: 2012-09-20 06:01:28
Message-ID: 505AB138.9090307@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 20.09.2012 05:56, Peter Eisentraut wrote:
> I have received a number of bug reports about plsh choking on
> Windows-style line endings. The problem is that the user uses some
> Windows-based tool or other to execute an SQL command line this:
>
> CREATE FUNCTION foo() RETURNS something<CR><LF>
> LANGUAGE plsh<CR><LF>
> AS $$<CR><LF>
> #!/bin/sh<CR><LF>
> <CR><LF>
> do something<CR><LF>
> do something<CR><LF>
> $$;<CR><LF>
>
> which (apparently, I don't have Windows handy) creates a function with
> the prosrc body of
>
> '<CR><LF>
> #!/bin/sh<CR><LF>
> <CR><LF>
> do something<CR><LF>
> do something<CR><LF>
> '
>
> But executing this fails because Unix shells reject<CR> characters in
> inappropriate places as syntax errors.
>
> I don't know how to handle that. It would be unfortunate to have the
> behavior of a function depend on the kind of client used to create or
> modify it.

Could you strip the CRs? Either at CREATE FUNCTION time, or when the
function is executed.

- Heikki


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: newline conversion in SQL command strings
Date: 2012-09-20 13:12:25
Message-ID: 505B1639.5020702@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 9/20/12 2:01 AM, Heikki Linnakangas wrote:
> Could you strip the CRs? Either at CREATE FUNCTION time, or when the
> function is executed.

It has been proposed that the plsh handler should strip the CRs before
execution. But I don't think that is a correct solution, because that
is user data which could be relevant. It could be the case, for
example, that plsh is run on a Windows host with a particular shell that
requires CRLF line endings.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: newline conversion in SQL command strings
Date: 2012-09-20 19:23:36
Message-ID: 505B6D38.5080203@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 09/20/2012 09:12 AM, Peter Eisentraut wrote:
> On 9/20/12 2:01 AM, Heikki Linnakangas wrote:
>> Could you strip the CRs? Either at CREATE FUNCTION time, or when the
>> function is executed.
> It has been proposed that the plsh handler should strip the CRs before
> execution. But I don't think that is a correct solution, because that
> is user data which could be relevant. It could be the case, for
> example, that plsh is run on a Windows host with a particular shell that
> requires CRLF line endings.
>
>

I confess I find it hard to take plsh terribly seriously, it just seems
like a bad solution for practically any problem, when compared with,
say, plperlu, and it seems so Unixish that the very idea of using it at
all on Windows strikes me as crazy. Having said that, and devoted all of
5 minutes to looking at the code, I suggest that you make two changes:

if (sourcecode[0] == '\n')
sourcecode++;

would become:

if (sourcecode[0] == '\n' || (sourcecode[0] == '\r' && sourcecode[1] == '\n'))
sourcecode++;

and

len = strcspn(rest, "\n");

would become:

len = strcspn(rest, "\r\n");

Also, there is probably a pretty good case for opening the temp file using PG_BINARY_W rather than "w" so that Windows doesn't produce extra CRs for you (see recent discussion of this w.r.t. pg_upgrade).

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: newline conversion in SQL command strings
Date: 2012-09-20 19:34:19
Message-ID: 12733.1348169659@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 09/20/2012 09:12 AM, Peter Eisentraut wrote:
>> It has been proposed that the plsh handler should strip the CRs before
>> execution. But I don't think that is a correct solution, because that
>> is user data which could be relevant. It could be the case, for
>> example, that plsh is run on a Windows host with a particular shell that
>> requires CRLF line endings.

> I confess I find it hard to take plsh terribly seriously,

Perhaps, but the general problem remains, for any language whose
interpreter is unforgiving about line endings.

However, I think the problem is insoluble as Peter has stated it.
His original complaint was

> ... It would be unfortunate to have the
> behavior of a function depend on the kind of client used to create or
> modify it.

but I don't see a way to reconcile that with the notion that CRs might
be user data. Either you think they're significant, in which case you
should retain the function body as presented, or you think they aren't,
in which case you might as well strip them.

I lean to the position that you're best off stripping them, given that
the lines are to be fed to a Unix shell. The argument about some other
shell possibly requiring them seems pretty tenuous. (IOW, I think that
the PL-specific code is exactly the place to be making such decisions.
I don't want Postgres' generic function support making any decisions
about whether CRs in function bodies are significant, because that
decision could be language-specific.)

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: newline conversion in SQL command strings
Date: 2012-09-20 19:55:59
Message-ID: 505B74CF.2050208@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 09/20/2012 03:34 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 09/20/2012 09:12 AM, Peter Eisentraut wrote:
>>> It has been proposed that the plsh handler should strip the CRs before
>>> execution. But I don't think that is a correct solution, because that
>>> is user data which could be relevant. It could be the case, for
>>> example, that plsh is run on a Windows host with a particular shell that
>>> requires CRLF line endings.
>> I confess I find it hard to take plsh terribly seriously,
> Perhaps, but the general problem remains, for any language whose
> interpreter is unforgiving about line endings.

plsh is the only one I know of that writes the script out to a temp
file, and it opens that file on Windows in text mode (if I'm looking at
the right source), which will convert LF into CRLF. Hence my suggestion
that he change that. Of course, if the function body actually has
embedded CRs then there is a different set of problems.

cheers

andrew