Re: plpython function problem workaround

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Marco Colombo <pgsql(at)esiway(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: plpython function problem workaround
Date: 2005-03-18 04:24:52
Message-ID: 20050318042452.GA13676@winnie.fuhr.org
Lists: pgsql-general

On Thu, Mar 17, 2005 at 09:48:51PM -0500, Tom Lane wrote:
> Michael Fuhr <mike(at)fuhr(dot)org> writes:
> > Line-ending CRs stripped, even inside quotes; mid-line CRs converted
> > to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder
> > what Python on Windows would do.
>
> Unfortunately, I don't think that proves anything, because according
> to earlier discussion Python will do newline-munging when it reads
> a file (including a script file). The question that we have to deal
> with is what are the rules for a string fed to PyRun_String ... and it
> seems those rules are not the same.

I was curious about how Python's munging works with quotes that
span lines, i.e., when the CRs and LFs might be considered part of
a quoted string. Apparently any CR or LF is considered a line
ending in an ordinary Python script, with CR and CRLF normalized
to LF before being passed to the interpreter, so I'm thinking that
a Python programmer wouldn't expect to be able to embed CRs in a
string literal and have them remain unchanged. If that's the case,
then concerns about CR conversions potentially messing up a user's
strings might be unfounded.
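
A rough way to see this kind of munging is to read a file in text mode.
The sketch below is only an analogy: it uses present-day Python 3's
universal-newlines file reading, not the Python 2.4 script loader I
tested, and the helper name read_as_text is made up for illustration.
It writes a one-line assignment containing a mid-string CR, a CRLF line
ending, and a trailing lone CR, then reads the bytes back as text.

```python
import os
import tempfile

def read_as_text(raw_bytes):
    """Write raw bytes to a temporary file and read them back in text mode.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw_bytes)
        # newline=None (the default) enables universal newlines on read:
        # "\r\n" and lone "\r" are both translated to "\n".
        with open(path, "r", newline=None, encoding="ascii") as f:
            return f.read()
    finally:
        os.remove(path)

# A CR inside the quotes, a CRLF line ending, and a lone CR at the end.
raw = b"s = 'a\rb'\r\nt = 1\r"
text = read_as_text(raw)
print(repr(text))
```

No CR survives the read, including the one inside the quotes, which is
the point: text that arrives through a file has already lost its CRs
before anything gets a chance to interpret the string literal.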

PL/Python currently treats the function source as a string that's
passed unchanged (except for the added "def" and indentation) to
PyRun_String. But that's an implementation detail that the user
shouldn't have to care about: I'm wondering if, instead, PL/Python
should treat the function source as Python would treat a file and
do the same conversions that Python would, namely CRLF => LF and
lone CR => LF. That should solve the complaints, and it should be
justifiable as more than just a hack: PL/Python would simply be
doing the same thing that Python would do if it had read the source
from a file. That might even be less surprising than the current
behavior.
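
To make the proposed conversion concrete, here is a minimal sketch in
Python. It is not the PL/Python handler code (that would be C); the
function name normalize_newlines is an assumption for illustration,
and it only shows the two rules named above, CRLF => LF and lone
CR => LF, applied to a function body before it would be handed on.

```python
def normalize_newlines(source):
    """Convert CRLF and lone CR to LF.

    Hypothetical helper: mirrors what reading the source from a file
    with newline translation would do.
    """
    # Handle CRLF first so the CR of a CRLF pair does not become an
    # extra LF when lone CRs are converted afterwards.
    return source.replace("\r\n", "\n").replace("\r", "\n")

# A function body as a Windows or old Mac client might send it.
body = "x = 1\r\nif x:\r    y = 'a\rb'\r\n"
print(repr(normalize_newlines(body)))
```

The order of the two replacements matters: converting lone CRs first
would turn every CRLF into two LFs and introduce blank lines.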

Marco, you've stated that you're against munging the code because
"it's not our job to 'fix' data coming from the client." But I'm
suggesting that we think about the code in a different way than the
current implementation does: not as a literal that we pass untouched
to the Python interpreter, but rather as code that Python would
munge anyway if it had read that code from a file. We could still
store the code exactly as received and have the language handler
munge it on the fly, as we've discovered it's already doing.

Comments? Have I overlooked anything? Could munging CRs have
effects that a Python programmer wouldn't expect if the same code
had been read from a file? Since it mimics Python's own behavior
with code read from a file, can anybody justify not doing it?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
