Re: quoting psql varible as identifier

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: quoting psql varible as identifier
Date: 2009-12-29 16:57:15
Message-ID: 162867790912290857m340ab1f1s76ef48c6e7241398@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I am working on new patch. I see a problem with copy quote_ident on
client side. This function call ScanKeywordLookup function.

const ScanKeyword *keyword = ScanKeywordLookup(ident,

ScanKeywords,

NumScanKeywords);

so we cannot simply implement quote_ident on client side :(. So we
have to use a real query.

It is acceptable for you?

Pavel


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 18:36:21
Message-ID: 27426.1262111781@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> so we cannot simply implement quote_ident on client side :(. So we
> have to use a real query.

> It is acceptable for you?

No, certainly not --- that adds a boatload of failure conditions.
Just quote it unconditionally.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 18:49:13
Message-ID: 51835CD8-6274-458E-96B2-25438F7382D3@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec 29, 2009, at 8:57 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:

> Hello
>
> I am working on new patch. I see a problem with copy quote_ident on
> client side. This function call ScanKeywordLookup function.
>
> const ScanKeyword *keyword = ScanKeywordLookup(ident,
>
> ScanKeywords,
>
> NumScanKeywords);
>
> so we cannot simply implement quote_ident on client side :(. So we
> have to use a real query.
>
> It is acceptable for you?

No. It has to be client-side.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 18:53:32
Message-ID: 162867790912291053w7b10ba1fu52bed7733c0e07e7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> so we cannot simply implement quote_ident on client side :(. So we
>> have to use a real query.
>
>> It is acceptable for you?
>
> No, certainly not --- that adds a boatload of failure conditions.
> Just quote it unconditionally.

ok

this function have to live in libpq - it needs some not exported
functionality. But, it would not be a problem.

Pavel

>
>                        regards, tom lane
>


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 19:46:29
Message-ID: 20091229194629.GG4569@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule escribió:
> 2009/12/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> >> so we cannot simply implement quote_ident on client side :(. So we
> >> have to use a real query.
> >
> >> It is acceptable for you?
> >
> > No, certainly not --- that adds a boatload of failure conditions.
> > Just quote it unconditionally.
>
> ok
>
> this function have to live in libpq - it needs some not exported
> functionality. But, it would not be a problem.

Can we use a trick similar to pg_dump's?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 19:47:20
Message-ID: 162867790912291147j3f8deb31v977bc4f547a5f13@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/29 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
> Pavel Stehule escribió:
>> 2009/12/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> >> so we cannot simply implement quote_ident on client side :(. So we
>> >> have to use a real query.
>> >
>> >> It is acceptable for you?
>> >
>> > No, certainly not --- that adds a boatload of failure conditions.
>> > Just quote it unconditionally.
>>
>> ok
>>
>> this function have to live in libpq - it needs some not exported
>> functionality. But, it would not be a problem.
>
> Can we use a trick similar to pg_dump's?

??

Pavel
>
> --
> Alvaro Herrera                                http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 19:50:40
Message-ID: 162867790912291150o724a0b81p6f07b64602fc7b96@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/29 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
> Pavel Stehule escribió:
>> 2009/12/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> >> so we cannot simply implement quote_ident on client side :(. So we
>> >> have to use a real query.
>> >
>> >> It is acceptable for you?
>> >
>> > No, certainly not --- that adds a boatload of failure conditions.
>> > Just quote it unconditionally.
>>
>> ok
>>
>> this function have to live in libpq - it needs some not exported
>> functionality. But, it would not be a problem.
>
> Can we use a trick similar to pg_dump's?
>

I see it - we can move function (content) fmtId from dumputils.c to libpq.

Pavel

> --
> Alvaro Herrera                                http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 19:52:18
Message-ID: 20091229195218.GH4569@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule escribió:
> 2009/12/29 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
> > Pavel Stehule escribió:
> >> 2009/12/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> >> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> >> >> so we cannot simply implement quote_ident on client side :(. So we
> >> >> have to use a real query.
> >> >
> >> >> It is acceptable for you?
> >> >
> >> > No, certainly not --- that adds a boatload of failure conditions.
> >> > Just quote it unconditionally.
> >>
> >> ok
> >>
> >> this function have to live in libpq - it needs some not exported
> >> functionality. But, it would not be a problem.
> >
> > Can we use a trick similar to pg_dump's?
>
> ??

See src/bin/pg_dump/keywords.c and fmtId in dumputils.c

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 19:55:34
Message-ID: 162867790912291155y340ac8cfg7523081217947e72@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/29 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
> Pavel Stehule escribió:
>> 2009/12/29 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
>> > Pavel Stehule escribió:
>> >> 2009/12/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> >> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> >> >> so we cannot simply implement quote_ident on client side :(. So we
>> >> >> have to use a real query.
>> >> >
>> >> >> It is acceptable for you?
>> >> >
>> >> > No, certainly not --- that adds a boatload of failure conditions.
>> >> > Just quote it unconditionally.
>> >>
>> >> ok
>> >>
>> >> this function have to live in libpq - it needs some not exported
>> >> functionality. But, it would not be a problem.
>> >
>> > Can we use a trick similar to pg_dump's?
>>
>> ??
>
> See src/bin/pg_dump/keywords.c and fmtId in dumputils.c

I found it. It is maybe too complex for using in psql.

Pavel

>
> --
> Alvaro Herrera                                http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 19:57:23
Message-ID: 2855.1262116643@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> 2009/12/29 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
>> Can we use a trick similar to pg_dump's?

> I see it - we can move function (content) fmtId from dumputils.c to libpq.

This is not a good idea. pg_dump can be expected to be up-to-date with
the backend's keyword list, but libpq cannot.

Just quote the thing unconditionally. It's not worth working harder
than that anyway.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 20:05:53
Message-ID: 162867790912291205l2e5b2cabm412756da49464e6e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> 2009/12/29 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
>>> Can we use a trick similar to pg_dump's?
>
>> I see it - we can move function (content) fmtId from dumputils.c to libpq.
>
> This is not a good idea.  pg_dump can be expected to be up-to-date with
> the backend's keyword list, but libpq cannot.
>
> Just quote the thing unconditionally.  It's not worth working harder
> than that anyway.

I see it.

Pavel

>
>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-29 20:19:57
Message-ID: 162867790912291219u79be0f6eh8399a9251ac855b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

hello

here is patch

pavel(at)postgres:5432=# \set foo 'hello world'
pavel(at)postgres:5432=# SELECT :'foo' AS :"foo";
hello world
-------------
hello world
(1 row)

Regards
Pavel

2009/12/29 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> 2009/12/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>>> 2009/12/29 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
>>>> Can we use a trick similar to pg_dump's?
>>
>>> I see it - we can move function (content) fmtId from dumputils.c to libpq.
>>
>> This is not a good idea.  pg_dump can be expected to be up-to-date with
>> the backend's keyword list, but libpq cannot.
>>
>> Just quote the thing unconditionally.  It's not worth working harder
>> than that anyway.
>
> I see it.
>
> Pavel
>
>>
>>                        regards, tom lane
>>
>

Attachment Content-Type Size
variable_escaping.diff text/x-patch 6.8 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2009-12-30 21:37:55
Message-ID: 603c8f070912301337o35e7bf9eh1e94b8e870f7ea57@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 29, 2009 at 3:19 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> here is patch

The error handling in quote_literal() doesn't look right to me. The
documentation for PQescapeStringConn says that it stores an error
message in the conn object, but your code ignores that and prints out
a generic message instead. That doesn't seem right: but then it
further goes on to call exit(1), which seems like a considerable
overreaction to an encoding violation, which is apparently the only
class of error PQescapeStringConn() is documented to throw.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-02 18:10:23
Message-ID: 162867791001021010l58b3d7ebjc7dc5dd0fb5f310e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/30 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, Dec 29, 2009 at 3:19 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> here is patch
>
> The error handling in quote_literal() doesn't look right to me.  The
> documentation for PQescapeStringConn says that it stores an error
> message in the conn object, but your code ignores that and prints out
> a generic message instead.  That doesn't seem right: but then it
> further goes on to call exit(1), which seems like a considerable
> overreaction to an encoding violation, which is apparently the only
> class of error PQescapeStringConn() is documented to throw.

Actually I am not sure about the adequate solution. Now, the lexer
doesn't return any error. Any handled errors are fatal, and lexer (and
psql) is finished with exit(1) - so there are not checking status
after any lexer call. It is question if we have to do it. Because
error will be raised in next stage:

"Presently the only possible error conditions involve invalid
multibyte encoding in the source string. The output string is still
generated on error, but it can be expected that the server will reject
it as malformed. On error, a suitable message is stored in the conn
object, whether or not error is NULL."

http://www.postgresql.org/docs/8.4/static/libpq-exec.html#LIBPQ-EXEC-ESCAPE-STRING

so probably - we can quietly ignore this error without security or
functionality issues.

I see two solution:

a) print correct message and exit(1)
b) quietly ignore this error - it's more warning than error, because
error will be raised immediately

Third variant, checking lexer status over every call is maybe non
adequate to frequency of this error and an importance of this patch. I
am for a.

Regards, and happy new year
Pavel

>
> ...Robert
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-02 18:41:27
Message-ID: 9334.1262457687@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> a) print correct message and exit(1)

Which part of "no, you're not doing that" wasn't clear to you?

The error handling in this function should be no different from that
in the existing escaping functions. exit(1) is completely unacceptable.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-02 18:53:22
Message-ID: 9473.1262458402@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> here is patch

I looked at this patch a bit, and I think the real problem with it is
that it's not multibyte safe. You've copied backend code that is
allowed to assume it's in a safe encoding (ie, one where multibyte
characters can't contain non-high-bit-set bytes). This is not okay
on the client side, see SJIS and similar encodings.

Where you need to start out is by cloning PQescapeStringConn, which does
a similar type of transformation correctly even in unsafe encodings.
I think we'd agreed upthread that libpq should provide
PQescapeIdentifier functionality anyhow.

Once you've actually read that code, you'll realize that it's okay to
treat the error result as a warning, which resolves the other point
of concern. Just print the message and use the result anyway.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-02 18:53:44
Message-ID: 162867791001021053y4c456926y7375d5fd03b11369@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/2 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> a) print correct message and exit(1)
>
> Which part of "no, you're not doing that" wasn't clear to you?
>

> The error handling in this function should be no different from that
> in the existing escaping functions.  exit(1) is completely unacceptable.
>

Are we talking we about error handling in psql lexer?

What I know, in psql there are no any escaping function now. Yes,
exit(1) is maybe too hard - but it is safe. I didn't write
PQescapeStringConn function, and I am not able to speak, we can ignore
this error result. Full handling of this possible error, means to add
some CHECK over any lexer call.

Pavel

>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-02 18:59:28
Message-ID: 162867791001021059u611e4487mc790ed7ddc1605@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/2 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> here is patch
>
> I looked at this patch a bit, and I think the real problem with it is
> that it's not multibyte safe.  You've copied backend code that is
> allowed to assume it's in a safe encoding (ie, one where multibyte
> characters can't contain non-high-bit-set bytes).  This is not okay
> on the client side, see SJIS and similar encodings.
>

this code is taken from pg_dump, so if I understand it well, this is
littl bit different case.

> Where you need to start out is by cloning PQescapeStringConn, which does
> a similar type of transformation correctly even in unsafe encodings.
> I think we'd agreed upthread that libpq should provide
> PQescapeIdentifier functionality anyhow.
>

ok

> Once you've actually read that code, you'll realize that it's okay to
> treat the error result as a warning, which resolves the other point
> of concern.  Just print the message and use the result anyway.

ok.

Pavel

>
>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-04 10:44:53
Message-ID: 162867791001040244x7b2a729fm6b8fdb0c490d2c6b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

hello

2010/1/2 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> here is patch
>
> I looked at this patch a bit, and I think the real problem with it is
> that it's not multibyte safe.  You've copied backend code that is
> allowed to assume it's in a safe encoding (ie, one where multibyte
> characters can't contain non-high-bit-set bytes).  This is not okay
> on the client side, see SJIS and similar encodings.
>
> Where you need to start out is by cloning PQescapeStringConn, which does
> a similar type of transformation correctly even in unsafe encodings.
> I think we'd agreed upthread that libpq should provide
> PQescapeIdentifier functionality anyhow.
>

I am looking on psql directory. Now I found, so in this directory is
linked dumputil.c - It could little bit to help us.

I have one question. If I understand well, the function fmtId isn't
multibyte safe? So why is possible to use it in pg_dump?

Pavel

> Once you've actually read that code, you'll realize that it's okay to
> treat the error result as a warning, which resolves the other point
> of concern.  Just print the message and use the result anyway.
>
>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-04 12:56:41
Message-ID: 162867791001040456w4ba9a0afg32f5a89c81942c70@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I talked with Hitoshi Harada, and fmtId function is safe (minimally
for Japanese case). This function working without any errors, so we
must not duplicate a code.

Pavel

2010/1/4 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> hello
>
> 2010/1/2 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>>> here is patch
>>
>> I looked at this patch a bit, and I think the real problem with it is
>> that it's not multibyte safe.  You've copied backend code that is
>> allowed to assume it's in a safe encoding (ie, one where multibyte
>> characters can't contain non-high-bit-set bytes).  This is not okay
>> on the client side, see SJIS and similar encodings.
>>
>> Where you need to start out is by cloning PQescapeStringConn, which does
>> a similar type of transformation correctly even in unsafe encodings.
>> I think we'd agreed upthread that libpq should provide
>> PQescapeIdentifier functionality anyhow.
>>
>
> I am looking on psql directory. Now I found, so in this directory is
> linked dumputil.c - It could little bit to help us.
>
> I have one question. If I understand well, the function fmtId isn't
> multibyte safe? So why is possible to use it in pg_dump?
>
> Pavel
>
>> Once you've actually read that code, you'll realize that it's okay to
>> treat the error result as a warning, which resolves the other point
>> of concern.  Just print the message and use the result anyway.
>>
>>                        regards, tom lane
>>
>

Attachment Content-Type Size
variable_escaping.diff application/octet-stream 5.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-04 14:59:48
Message-ID: 10143.1262617188@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> I have one question. If I understand well, the function fmtId isn't
> multibyte safe? So why is possible to use it in pg_dump?

pg_dump is only guaranteed to work correctly in the server encoding.
If you force it to use a client_encoding different from the server's,
it might or might not work, for reasons far beyond that one --- the
big problem usually is data containing characters that have no
equivalent in the client encoding. So I'm not particularly excited
about whether fmtId is multibyte safe within pg_dump. If we were to try
to use it in more general contexts, it would probably need more work.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-04 19:51:03
Message-ID: 162867791001041151w62505b29gfce580ec114220af@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/4 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> I have one question. If I understand well, the function fmtId isn't
>> multibyte safe? So why is possible to use it in pg_dump?
>
> pg_dump is only guaranteed to work correctly in the server encoding.
> If you force it to use a client_encoding different from the server's,
> it might or might not work, for reasons far beyond that one --- the
> big problem usually is data containing characters that have no
> equivalent in the client encoding.  So I'm not particularly excited
> about whether fmtId is multibyte safe within pg_dump.  If we were to try
> to use it in more general contexts, it would probably need more work.

I could agree with this explanation for quote_identifier function, but
not in 100% for fmtId function. We can change encoding for pg_dump
(option -E). I don't have a problem to write second and safe fmtId
function (with technique used in dumputils don't need to modify
libpq), although fmtId do exactly what I need. I would to understand
to behave.

Pavel

>
>                        regards, tom lane
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-05 04:12:55
Message-ID: 603c8f071001042012o6b692919od83c45e1a5c6eb80@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 4, 2010 at 2:51 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/4 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>>> I have one question. If I understand well, the function fmtId isn't
>>> multibyte safe? So why is possible to use it in pg_dump?
>>
>> pg_dump is only guaranteed to work correctly in the server encoding.
>> If you force it to use a client_encoding different from the server's,
>> it might or might not work, for reasons far beyond that one --- the
>> big problem usually is data containing characters that have no
>> equivalent in the client encoding.  So I'm not particularly excited
>> about whether fmtId is multibyte safe within pg_dump.  If we were to try
>> to use it in more general contexts, it would probably need more work.
>
> I could agree with this explanation for quote_identifier function, but
> not in 100% for fmtId function. We can change encoding for pg_dump
> (option -E).

But if it's not guaranteed to work, the fact that you can do it is irrelevant.

> I don't have a problem to write second and safe fmtId
> function (with technique used in dumputils don't need to modify
> libpq), although fmtId do exactly what I need. I would to understand
> to behave.

I think you mean that you would need to understand how it should
behave - in which case I agree, but I think Tom spelled that out
pretty clearly upthread: close PQescapeStringConn and adapt it to be
PQescapeIdentifier.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-05 04:34:36
Message-ID: 15934.1262666076@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Jan 4, 2010 at 2:51 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> I don't have a problem to write second and safe fmtId
>> function (with technique used in dumputils don't need to modify
>> libpq), although fmtId do exactly what I need. I would to understand
>> to behave.

> I think you mean that you would need to understand how it should
> behave - in which case I agree, but I think Tom spelled that out
> pretty clearly upthread: close PQescapeStringConn and adapt it to be
> PQescapeIdentifier.

The more important point here is that fmtId doesn't do "exactly what you
need" in any case. fmtId is safe to use in pg_dump because pg_dump is
only expected to work with the same or older version of the backend.
It would not be safe to use it in libpq, which is expected to still work
with newer backends that might have more reserved words.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-05 13:23:08
Message-ID: 162867791001050523vefb2a25oe26631bef7350a3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/5 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, Jan 4, 2010 at 2:51 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> I don't have a problem to write second and safe fmtId
>>> function (with technique used in dumputils don't need to modify
>>> libpq), although fmtId do exactly what I need. I would to understand
>>> to behave.
>
>> I think you mean that you would need to understand how it should
>> behave - in which case I agree, but I think  Tom spelled that out
>> pretty clearly upthread: close PQescapeStringConn and adapt it to be
>> PQescapeIdentifier.
>
> The more important point here is that fmtId doesn't do "exactly what you
> need" in any case.  fmtId is safe to use in pg_dump because pg_dump is
> only expected to work with the same or older version of the backend.
> It would not be safe to use it in libpq, which is expected to still work
> with newer backends that might have more reserved words.

So I finnaly moved to libpq PQescapeIdentConn function

patch is attached.

regards

Pavel

>
>                        regards, tom lane
>

Attachment Content-Type Size
variable_escaping.diff application/octet-stream 22.3 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-11 03:55:50
Message-ID: 603c8f071001101955s44beed76lcbb69c10c606c9e9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 5, 2010 at 8:23 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/5 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> On Mon, Jan 4, 2010 at 2:51 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>> I don't have a problem to write second and safe fmtId
>>>> function (with technique used in dumputils don't need to modify
>>>> libpq), although fmtId do exactly what I need. I would to understand
>>>> to behave.
>>
>>> I think you mean that you would need to understand how it should
>>> behave - in which case I agree, but I think  Tom spelled that out
>>> pretty clearly upthread: close PQescapeStringConn and adapt it to be
>>> PQescapeIdentifier.
>>
>> The more important point here is that fmtId doesn't do "exactly what you
>> need" in any case.  fmtId is safe to use in pg_dump because pg_dump is
>> only expected to work with the same or older version of the backend.
>> It would not be safe to use it in libpq, which is expected to still work
>> with newer backends that might have more reserved words.
>
> So I finnaly moved to libpq PQescapeIdentConn function
>
> patch is attached.

No longer applies, please rebase.

Thanks,

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-11 04:22:25
Message-ID: 162867791001102022p52aef4c1y9674dfa3c43dab2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/11 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, Jan 5, 2010 at 8:23 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> 2010/1/5 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>> On Mon, Jan 4, 2010 at 2:51 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>>> I don't have a problem to write second and safe fmtId
>>>>> function (with technique used in dumputils don't need to modify
>>>>> libpq), although fmtId do exactly what I need. I would to understand
>>>>> to behave.
>>>
>>>> I think you mean that you would need to understand how it should
>>>> behave - in which case I agree, but I think  Tom spelled that out
>>>> pretty clearly upthread: close PQescapeStringConn and adapt it to be
>>>> PQescapeIdentifier.
>>>
>>> The more important point here is that fmtId doesn't do "exactly what you
>>> need" in any case.  fmtId is safe to use in pg_dump because pg_dump is
>>> only expected to work with the same or older version of the backend.
>>> It would not be safe to use it in libpq, which is expected to still work
>>> with newer backends that might have more reserved words.
>>
>> So I finnaly moved to libpq PQescapeIdentConn function
>>
>> patch is attached.
>
> No longer applies, please rebase.

ok I'll do it today

Pavel

>
> Thanks,
>
> ...Robert
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-11 11:54:46
Message-ID: 162867791001110354h61d9ff50w1513796fb3e07590@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

>
> No longer applies, please rebase.
>

fixed, sorry

Pavel

> Thanks,
>
> ...Robert
>

Attachment Content-Type Size
variable_escaping-fix.diff application/octet-stream 20.4 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-15 01:46:47
Message-ID: 603c8f071001141746l414b519fg6e22c32dae2e0c1a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 11, 2010 at 6:54 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> No longer applies, please rebase.
>
> fixed, sorry

Hmm. I think that pqEscapeIdentConn should be in a separate section
of the documentation, entitled "Escaping Identifiers for Inclusion in
SQL Commands". Or else we should merge the existing sections
"Escaping Strings for Inclusion in SQL Commands" and "Escaping Binary
Strings for Inclusion in SQL Commands" and then put this in there too.

On a perhaps-related note, does anyone understand why "Escaping
Strings for Inclusion in SQL Commands" is formatted in a way that is
needlessly inconsistent with the preceding and following sections? I
was surprised by the magnitude of the doc diff hunk in this patch, but
when I looked at it it seems to read more clearly with these changes.

I have yet to fully review the code but on a quick glance it looks reasonable.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-15 05:47:52
Message-ID: 162867791001142147i5518d987i640d8ad6481007cd@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/15 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Jan 11, 2010 at 6:54 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> No longer applies, please rebase.
>>
>> fixed, sorry
>

my idea was:

* string
* escape_string
* escape_ident
* bytea
* escape_bytea

But I am not strong in it. Maybe this part of doc needs more love -
long time there are doc to deprecated functions. It could be moved.

Pavel
> Hmm.  I think that pqEscapeIdentConn should be in a separate section
> of the documentation, entitled "Escaping Identifiers for Inclusion in
> SQL Commands".  Or else we should merge the existing sections
> "Escaping Strings for Inclusion in SQL Commands" and "Escaping Binary
> Strings for Inclusion in SQL Commands" and then put this in there too.
>
> On a perhaps-related note, does anyone understand why "Escaping
> Strings for Inclusion in SQL Commands" is formatted in a way that is
> needlessly inconsistent with the preceding and following sections?  I
> was surprised by the magnitude of the doc diff hunk in this patch, but
> when I looked at it it seems to read more clearly with these changes.
>
> I have yet to fully review the code but on a quick glance it looks reasonable.
>
> ...Robert
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-16 18:47:45
Message-ID: 603c8f071001161047m72891cb3p7cc5cd02e1647a77@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 14, 2010 at 8:46 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I have yet to fully review the code but on a quick glance it looks reasonable.

On further review, it looks less reasonable. :-(

The new PQescapeIdentConn function is basically a cut-up version of
PQescapeStringInternal, which seems like a reasonable foundation, but
it rips out a little too much - specifically:

1. the length argument,
2. the size_t return value,
3. the portion of the handling for incomplete multibyte characters
that prevents us from overrunning the output buffer on a maliciously
constructed (or unlucky) input string, and
4. some relevant comments.

I'm inclined to think we should put all of that stuff back, but
certainly #3 at a minimum.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-17 19:04:09
Message-ID: 162867791001171104y6b5f626et7c0aeae0db01bd54@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/16 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Jan 14, 2010 at 8:46 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I have yet to fully review the code but on a quick glance it looks reasonable.
>
> On further review, it looks less reasonable.  :-(
>
> The new PQescapeIdentConn function is basically a cut-up version of
> PQescapeStringInternal, which seems like a reasonable foundation, but
> it rips out a little too much - specifically:
>
> 1. the length argument,
> 2. the size_t return value,
> 3. the portion of the handling for incomplete multibyte characters
> that prevents us from overrunning the output buffer on a maliciously
> constructed (or unlucky) input string, and
> 4. some relevant comments.
>

Yes, I didn't repeat parameter's pattern from PQescapeStringConn, I
would to simplify interface but I hasn't problem to modify it to same
interface.

I rewrote patch so now interface for PQescapeIdentConn is same as
PQescapeStringConn

@3. I though so the protection under incomplete multibyte chars are
enought - missing bytes are replaced by space - like
PQescapeStringConn does. But now - mechanism is exactly same, so this
problem should be solved.

Regards
Pavel Stehule

> I'm inclined to think we should put all of that stuff back, but
> certainly #3 at a minimum.
>
> ...Robert
>

Attachment Content-Type Size
variable_escaping-fix-101701.diff text/x-patch 21.8 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-18 18:31:43
Message-ID: 603c8f071001181031m1fd03baeoeab08effbfc0fb40@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jan 17, 2010 at 2:04 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> I rewrote patch so now interface for PQescapeIdentConn is same as
> PQescapeStringConn
>
> @3. I though so the protection under incomplete multibyte chars are
> enought - missing bytes are replaced by space - like
> PQescapeStringConn does.

That much is fine, but the output buffer is only guaranteed to be of
size 2n+1. Imagine the input is two double-quotes followed by a byte
for which pg_encoding_mblen() returns 4. The input is 3 characters
long so the user was responsible to provide 7 bytes of output space,
but you'll try to write 9 bytes to it (including the terminating NUL).

> But now - mechanism is exactly same, so this
> problem should be solved.

This is no better. What the function does no longer matches either
its comments or the documentation (which also contradict each other).

Let me take a crack at this and post a patch. We're making this
harder than it needs to be.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-18 18:52:38
Message-ID: 162867791001181052l27977bdagf6e906667c8bf093@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Sun, Jan 17, 2010 at 2:04 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> I rewrote patch so now interface for PQescapeIdentConn is same as
>> PQescapeStringConn
>>
>> @3. I though so the protection under incomplete multibyte chars are
>> enought - missing bytes are replaced by space - like
>> PQescapeStringConn does.
>
> That much is fine, but the output buffer is only guaranteed to be of
> size 2n+1.  Imagine the input is two double-quotes followed by a byte
> for which pg_encoding_mblen() returns 4.  The input is 3 characters
> long so the user was responsible to provide 7 bytes of output space,
> but you'll try to write 9 bytes to it (including the terminating NUL).
>

I don't understand. The "length" is number of bytes, not number of
chars. It is maybe bad documented only. If your input string has 6
bytes, then buffer have to allocated to 13 bytes. Nobody knows how
much is chars there.

>> But now - mechanism is exactly same, so this
>> problem should be solved.
>
> This is no better.  What the function does no longer matches either
> its comments or the documentation (which also contradict each other).
>
> Let me take a crack at this and post a patch.  We're making this
> harder than it needs to be.
>

sure, please.

Pavel

> ...Robert
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-18 19:13:20
Message-ID: 603c8f071001181113i45e95db4u5356fa9d85bf0dc7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 18, 2010 at 1:52 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Sun, Jan 17, 2010 at 2:04 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> I rewrote patch so now interface for PQescapeIdentConn is same as
>>> PQescapeStringConn
>>>
>>> @3. I though so the protection under incomplete multibyte chars are
>>> enought - missing bytes are replaced by space - like
>>> PQescapeStringConn does.
>>
>> That much is fine, but the output buffer is only guaranteed to be of
>> size 2n+1.  Imagine the input is two double-quotes followed by a byte
>> for which pg_encoding_mblen() returns 4.  The input is 3 characters
>> long so the user was responsible to provide 7 bytes of output space,
>> but you'll try to write 9 bytes to it (including the terminating NUL).
>>
> I don't understand. The "length" is number of bytes, not number of
> chars. It is maybe bad documented only. If your input string has 6
> bytes, then buffer have to allocated to 13 bytes. Nobody knows how
> much is chars there.

Right, but the point is we can't assume that the input is validly
encoded. If the input ends with a garbage character that looks like
the start of a multi-byte character, we can't assume that there's
enough space in the output buffer to store the required number of
padding spaces.

To take an extreme example, suppose there were an encoding where any
time the first byte of a multi-byte character has the high-bit set,
the character is 100 bytes long. Then suppose someone call
PQescapeStringConn(), or this new function we're adding, with a length
argument of 1, and the first byte of the input buffer has the high-bit
set. The caller is only required to provide a 3-byte output buffer,
and the third byte is needed for the terminating NUL. That means that
after we copy that first character we only have room to insert one
padding space. The way you had it coded, since we were expecting a
character 100 bytes long, we'd always try to insert 99 padding spaces.

>> Let me take a crack at this and post a patch.  We're making this
>> harder than it needs to be.
>
> sure, please.

Working on it...

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-18 19:20:58
Message-ID: 162867791001181120n38668c91y5fbfd95f7e8219b4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Jan 18, 2010 at 1:52 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> 2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Sun, Jan 17, 2010 at 2:04 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>> I rewrote patch so now interface for PQescapeIdentConn is same as
>>>> PQescapeStringConn
>>>>
>>>> @3. I though so the protection under incomplete multibyte chars are
>>>> enought - missing bytes are replaced by space - like
>>>> PQescapeStringConn does.
>>>
>>> That much is fine, but the output buffer is only guaranteed to be of
>>> size 2n+1.  Imagine the input is two double-quotes followed by a byte
>>> for which pg_encoding_mblen() returns 4.  The input is 3 characters
>>> long so the user was responsible to provide 7 bytes of output space,
>>> but you'll try to write 9 bytes to it (including the terminating NUL).
>>>
>> I don't understand. The "length" is number of bytes, not number of
>> chars. It is maybe bad documented only. If your input string has 6
>> bytes, then buffer have to allocated to 13 bytes. Nobody knows how
>> much is chars there.
>
> Right, but the point is we can't assume that the input is validly
> encoded.  If the input ends with a garbage character that looks like
> the start of a multi-byte character, we can't assume that there's
> enough space in the output buffer to store the required number of
> padding spaces.
>
> To take an extreme example, suppose there were an encoding where any
> time the first byte of a multi-byte character has the high-bit set,
> the character is 100 bytes long.  Then suppose someone call
> PQescapeStringConn(), or this new function we're adding, with a length
> argument of 1, and the first byte of the input buffer has the high-bit
> set.  The caller is only required to provide a 3-byte output buffer,
> and the third byte is needed for the terminating NUL.  That means that
> after we copy that first character we only have room to insert one
> padding space.  The way you had it coded, since we were expecting a
> character 100 bytes long, we'd always try to insert 99 padding spaces.
>

do you speak about previous version?

in current version is garanted new length is <= 2x original length

Pavel

>
>> Let me take a crack at this and post a patch.  We're making this
>>> harder than it needs to be.
>>
>> sure, please.
>
> Working on it...
>
> ...Robert
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-18 20:19:06
Message-ID: 603c8f071001181219q4261532ere5e07c67b8fca066@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 18, 2010 at 2:20 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Mon, Jan 18, 2010 at 1:52 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> 2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>>> On Sun, Jan 17, 2010 at 2:04 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>>> I rewrote patch so now interface for PQescapeIdentConn is same as
>>>>> PQescapeStringConn
>>>>>
>>>>> @3. I though so the protection under incomplete multibyte chars are
>>>>> enought - missing bytes are replaced by space - like
>>>>> PQescapeStringConn does.
>>>>
>>>> That much is fine, but the output buffer is only guaranteed to be of
>>>> size 2n+1.  Imagine the input is two double-quotes followed by a byte
>>>> for which pg_encoding_mblen() returns 4.  The input is 3 characters
>>>> long so the user was responsible to provide 7 bytes of output space,
>>>> but you'll try to write 9 bytes to it (including the terminating NUL).
>>>>
>>> I don't understand. The "length" is number of bytes, not number of
>>> chars. It is maybe bad documented only. If your input string has 6
>>> bytes, then buffer have to allocated to 13 bytes. Nobody knows how
>>> much is chars there.
>>
>> Right, but the point is we can't assume that the input is validly
>> encoded.  If the input ends with a garbage character that looks like
>> the start of a multi-byte character, we can't assume that there's
>> enough space in the output buffer to store the required number of
>> padding spaces.
>>
>> To take an extreme example, suppose there were an encoding where any
>> time the first byte of a multi-byte character has the high-bit set,
>> the character is 100 bytes long.  Then suppose someone call
>> PQescapeStringConn(), or this new function we're adding, with a length
>> argument of 1, and the first byte of the input buffer has the high-bit
>> set.  The caller is only required to provide a 3-byte output buffer,
>> and the third byte is needed for the terminating NUL.  That means that
>> after we copy that first character we only have room to insert one
>> padding space.  The way you had it coded, since we were expecting a
>> character 100 bytes long, we'd always try to insert 99 padding spaces.
>>
>
> do you speak about previous version?

Yes.

> in current version is garanted new length is <= 2x original length

Actually, strictly less than, but the code gets it correct. However,
your latest version has some other problems. For example, you didn't
update the docs to match your source-code changes. Also, I prefer an
API where the escaping function does include the quotes, so I've done
it that way in the attached patch. This is just the libpq changes, I
figure if we can agree on this, then we can move onto the psql stuff.

Comments?

...Robert

Attachment Content-Type Size
PQescapeIdentifierConn.patch text/x-patch 17.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-18 20:26:53
Message-ID: 13148.1263846413@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> ... Also, I prefer an
> API where the escaping function does include the quotes, so I've done
> it that way in the attached patch.

IMO this function should act as much like PQescapeStringConn as possible.
Random differences like including or not including outer quotes don't
make the user's life better. Random differences like a slightly
different rule for the amount of space required are outright dangerous.

Also, why is this patch changing the documentation of PQescapeStringConn?
It might be only whitespace changes, but I don't particularly wish to
have to determine that.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-18 21:01:44
Message-ID: 603c8f071001181301q25573ed9m9e86294c52a99aec@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 18, 2010 at 3:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ...  Also, I prefer an
>> API where the escaping function does include the quotes, so I've done
>> it that way in the attached patch.
>
> IMO this function should act as much like PQescapeStringConn as possible.

Generally speaking, I agree...

> Random differences like including or not including outer quotes don't
> make the user's life better.  Random differences like a slightly
> different rule for the amount of space required are outright dangerous.

I'm not sure that not including the quotes is any better. If someone
escapes foo and gets back foo, are they going to realize that escaping
fo"o is going to give them back fo""o rather than "fo""o"? One
difference vs. PQescapeStringConn() is that if you fail to include the
surrounding quotes in that case, something will almost certainly break
in a noisy and highly visible fashion. Here that might not happen, or
someone might call one of PQescapeStringConn() and
PQescapeIdentifierConn() and then use the wrong sort of outer quotes.

IMO, it's actually pretty weird that PQescapeStringConn() and
quote_literal() are named differently and do incompatible things. I
think it would be a plus if this new function were a little more
similar to quote_ident(), but that's just MHO, of course.

> Also, why is this patch changing the documentation of PQescapeStringConn?
> It might be only whitespace changes, but I don't particularly wish to
> have to determine that.

See previous discussion upthread.

http://archives.postgresql.org/pgsql-hackers/2010-01/msg01516.php

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 09:13:38
Message-ID: 162867791001190113x1d5f93c0m264909be0c112f15@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Jan 18, 2010 at 3:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> ...  Also, I prefer an
>>> API where the escaping function does include the quotes, so I've done
>>> it that way in the attached patch.
>>
>> IMO this function should act as much like PQescapeStringConn as possible.
>
> Generally speaking, I agree...
>
>> Random differences like including or not including outer quotes don't
>> make the user's life better.  Random differences like a slightly
>> different rule for the amount of space required are outright dangerous.
>
> I'm not sure that not including the quotes is any better.  If someone
> escapes foo and gets back foo, are they going to realize that escaping
> fo"o is going to give them back fo""o rather than "fo""o"?  One
> difference vs. PQescapeStringConn() is that if you fail to include the
> surrounding quotes in that case, something will almost certainly break
> in a noisy and highly visible fashion.  Here that might not happen, or
> someone might call one of PQescapeStringConn() and
> PQescapeIdentifierConn() and then use the wrong sort of outer quotes.
>
> IMO, it's actually pretty weird that PQescapeStringConn() and
> quote_literal() are named differently and do incompatible things.  I
> think it would be a plus if this new function were a little more
> similar to quote_ident(), but that's just MHO, of course.
>

I am afraid so we can do nothing now with this. There are two
arguments - consistency versus robustness. If you use
PQescapeStringConn() without outer quotes, then you have a SQL
injection problem (there could not be error) :(. When there are no
escape function that add outer quotes, then can be strange for
developers working with one different.

I see three solution:

a) use a PQescapeIdentifConn as PQescapeStringConn,
b) move this functionality to psql without change of API,
c) change semantic and name - maybe PQquoteIdentifierConn()

Personally I am for a) and later for b). What I know - php coders
needs some secure function for identifier escaping - but I dislike PHP
because every function is designed different.

regards
Pavel

>> Also, why is this patch changing the documentation of PQescapeStringConn?
>> It might be only whitespace changes, but I don't particularly wish to
>> have to determine that.
>
> See previous discussion upthread.
>
> http://archives.postgresql.org/pgsql-hackers/2010-01/msg01516.php
>
> ...Robert
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 17:48:14
Message-ID: 603c8f071001190948s38d1213ahfa81b30ab0e15450@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 19, 2010 at 4:13 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Mon, Jan 18, 2010 at 3:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>> ...  Also, I prefer an
>>>> API where the escaping function does include the quotes, so I've done
>>>> it that way in the attached patch.
>>>
>>> IMO this function should act as much like PQescapeStringConn as possible.
>>
>> Generally speaking, I agree...
>>
>>> Random differences like including or not including outer quotes don't
>>> make the user's life better.  Random differences like a slightly
>>> different rule for the amount of space required are outright dangerous.
>>
>> I'm not sure that not including the quotes is any better.  If someone
>> escapes foo and gets back foo, are they going to realize that escaping
>> fo"o is going to give them back fo""o rather than "fo""o"?  One
>> difference vs. PQescapeStringConn() is that if you fail to include the
>> surrounding quotes in that case, something will almost certainly break
>> in a noisy and highly visible fashion.  Here that might not happen, or
>> someone might call one of PQescapeStringConn() and
>> PQescapeIdentifierConn() and then use the wrong sort of outer quotes.
>>
>> IMO, it's actually pretty weird that PQescapeStringConn() and
>> quote_literal() are named differently and do incompatible things.  I
>> think it would be a plus if this new function were a little more
>> similar to quote_ident(), but that's just MHO, of course.
>>
>
> I am afraid so we can do nothing now with this. There are two
> arguments - consistency versus robustness. If you use
> PQescapeStringConn() without outer quotes, then you have a SQL
> injection problem (there could not be error) :(. When there are no
> escape function that add outer quotes, then can be strange for
> developers working with one different.
>
> I see three solution:
>
> a) use a PQescapeIdentifConn as PQescapeStringConn,
> b) move this functionality to psql without change of API,
> c) change semantic and name - maybe PQquoteIdentifierConn()
>
> Personally I am for a) and later for b). What I know - php coders
> needs some secure function for identifier escaping - but I dislike PHP
> because every function is designed different.

I think what you're saying is that you agree with Tom's position that
the new escaping function should not add the necessary surrounding
quotes, instead leaving that to the user. Is that correct?

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 17:54:45
Message-ID: 162867791001190954p418b5b9fl2f94cef279d17707@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/19 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, Jan 19, 2010 at 4:13 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> 2010/1/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Mon, Jan 18, 2010 at 3:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>>> ...  Also, I prefer an
>>>>> API where the escaping function does include the quotes, so I've done
>>>>> it that way in the attached patch.
>>>>
>>>> IMO this function should act as much like PQescapeStringConn as possible.
>>>
>>> Generally speaking, I agree...
>>>
>>>> Random differences like including or not including outer quotes don't
>>>> make the user's life better.  Random differences like a slightly
>>>> different rule for the amount of space required are outright dangerous.
>>>
>>> I'm not sure that not including the quotes is any better.  If someone
>>> escapes foo and gets back foo, are they going to realize that escaping
>>> fo"o is going to give them back fo""o rather than "fo""o"?  One
>>> difference vs. PQescapeStringConn() is that if you fail to include the
>>> surrounding quotes in that case, something will almost certainly break
>>> in a noisy and highly visible fashion.  Here that might not happen, or
>>> someone might call one of PQescapeStringConn() and
>>> PQescapeIdentifierConn() and then use the wrong sort of outer quotes.
>>>
>>> IMO, it's actually pretty weird that PQescapeStringConn() and
>>> quote_literal() are named differently and do incompatible things.  I
>>> think it would be a plus if this new function were a little more
>>> similar to quote_ident(), but that's just MHO, of course.
>>>
>>
>> I am afraid so we can do nothing now with this. There are two
>> arguments - consistency versus robustness. If you use
>> PQescapeStringConn() without outer quotes, then you have a SQL
>> injection problem (there could not be error) :(. When there are no
>> escape function that add outer quotes, then can be strange for
>> developers working with one different.
>>
>> I see three solution:
>>
>> a) use a PQescapeIdentifConn as PQescapeStringConn,
>> b) move this functionality to psql without change of API,
>> c) change semantic and name - maybe PQquoteIdentifierConn()
>>
>> Personally I am for a) and later for b). What I know - php coders
>> needs some secure function for identifier escaping - but I dislike PHP
>> because every function is designed different.
>
> I think what you're saying is that you agree with Tom's position that
> the new escaping function should not add the necessary surrounding
> quotes, instead leaving that to the user.  Is that correct?

yes

Pavel
>
> ...Robert
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 18:32:02
Message-ID: 603c8f071001191032k34b05d85nca5c4001eebe8983@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 19, 2010 at 12:54 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> I think what you're saying is that you agree with Tom's position that
>> the new escaping function should not add the necessary surrounding
>> quotes, instead leaving that to the user.  Is that correct?
>
> yes

Updated patch attached. I still think this is a bizarre API.

...Robert

Attachment Content-Type Size
PQescapeIdentifierConn-v2.patch text/x-patch 17.4 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 18:43:07
Message-ID: 16707.1263926587@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Updated patch attached. I still think this is a bizarre API.

Well, if we had it to do over I'm sure we'd have done it differently,
but the functions are there now and we aren't going to change them ...

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 18:46:39
Message-ID: 603c8f071001191046o6e63189cv9ec08941e0c8a5cd@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 19, 2010 at 1:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Updated patch attached.  I still think this is a bizarre API.
>
> Well, if we had it to do over I'm sure we'd have done it differently,
> but the functions are there now and we aren't going to change them ...

I agree, but I don't feel the need to repeat the mistakes we've
already made by slavishly copying them into new places. I think as
long as we're adding a new function, we should make it behave sanely.
We could even take the opportunity to go back and add a saner version
of PQescapeStringConn.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 19:01:44
Message-ID: 17185.1263927704@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> ... I think as
> long as we're adding a new function, we should make it behave sanely.
> We could even take the opportunity to go back and add a saner version
> of PQescapeStringConn.

Well, it's a bit late in the devel cycle to be inventing from scratch,
but if we did want to do something "saner" what would it look like?
I can see the following possibilities:

* include boundary quotes (and E too in the literal case). This would
imply telling people they should leave whitespace around the value in
the constructed query ... or should we insert leading/trailing spaces
to prevent that mistake too?

* malloc our own result value instead of expecting caller to provide
space. This adds another failure case (out of memory) but it's not
really much different from OOM on the caller side.

* anything else you want to change?

I suggest we could use PQescapeLiteral + PQescapeIdentifier as names
if we provide a new pair of functions defined with a different API.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 19:03:14
Message-ID: 162867791001191103w63c7900csf30e6c2844a88438@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/19 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, Jan 19, 2010 at 1:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Updated patch attached.  I still think this is a bizarre API.
>>
>> Well, if we had it to do over I'm sure we'd have done it differently,
>> but the functions are there now and we aren't going to change them ...
>
> I agree, but I don't feel the need to repeat the mistakes we've
> already made by slavishly copying them into new places.  I think as
> long as we're adding a new function, we should make it behave sanely.
> We could even take the opportunity to go back and add a saner version
> of PQescapeStringConn.

Sometimes is better do nothing :(. I understand you, my first code was
same as your, but any change of basic API is the terrible thing. This
function is used in PHP driver, and bilions web pages. Theoretically
we can work on new libpq with richer API - there could be better
support of binary form, smarter quoting functions. But actually there
are no main request.

Regards
Pavel

>
> ...Robert
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 19:24:39
Message-ID: 603c8f071001191124r4cb2d76fs823a42b07f8ba04b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 19, 2010 at 2:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ... I think as
>> long as we're adding a new function, we should make it behave sanely.
>> We could even take the opportunity to go back and add a saner version
>> of PQescapeStringConn.
>
> Well, it's a bit late in the devel cycle to be inventing from scratch,
> but if we did want to do something "saner" what would it look like?
> I can see the following possibilities:
>
> * include boundary quotes (and E too in the literal case).  This would
> imply telling people they should leave whitespace around the value in
> the constructed query ... or should we insert leading/trailing spaces
> to prevent that mistake too?

Does the existing PQescapeStringConn() require E'' rather than just
''? It isn't documented so.

I think inserting the whitespace would be overkill, but that's another
thing that's not mentioned in the present documentation and should be,
because it would not be difficult to screw it up.

> * malloc our own result value instead of expecting caller to provide
> space.  This adds another failure case (out of memory) but it's not
> really much different from OOM on the caller side.

I think that would be an anti-improvement - caller might have their
own malloc substitute, might want to stack-allocate, etc. Plus we'd
either have to malloc the worst-case buffer size or else read the
string twice. But see below...

> * anything else you want to change?

Instead of relying on the output buffer to be exactly 2n+1 or 2n+3 or
whatever, I suggest that we add an explicit argument for the size of
the output buffer. If the buffer the caller provides is too short,
ideally we should let the caller know about the error and also
indicate how much space would have been necessary (like sprintf).

> I suggest we could use PQescapeLiteral + PQescapeIdentifier as names
> if we provide a new pair of functions defined with a different API.

I like those names. Maybe something like:

size_t PQescapeLiteral(PGconn *conn, char *to, size_t tolen, char
*from, size_t fromlen, int *error);
size_t PQescapeIdentifier(PGconn *conn, char *to, size_t tolen, char
*from, size_t fromlen, int *error);

On success, *error is 0 and size_t is the number of bytes actually
written. On failure, *error is non-zero, conn->errorMessage is an
appropriate error, and the return value is the shortest value of tolen
adequate to hold the result, or 0 if we bombed out for some reason
other than lack of output space. That way, if people are willing to
incur the overhead of an extra call, they can malloc() (or whatever)
exactly the right amount of space. Or they can just size the buffer
according to the documentation and then they're safe.

Another option would be to swap the return value and *error:

int PQescapeLiteral(PGconn *conn, char *to, int tolen, char *from, int
fromlen, size_t *needed);
int PQescapeIdentifier(PGconn *conn, char *to, int tolen, char *from,
int fromlen, size_t *needed);

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 19:38:51
Message-ID: 17963.1263929931@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Jan 19, 2010 at 2:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> * include boundary quotes (and E too in the literal case). This would
>> imply telling people they should leave whitespace around the value in
>> the constructed query ... or should we insert leading/trailing spaces
>> to prevent that mistake too?

> Does the existing PQescapeStringConn() require E'' rather than just
> ''? It isn't documented so.

It does not expect E'', and would get it wrong if you supplied E'',
which is an anti-feature because it makes it impossible to avoid
escape_string_warning messages. If we were to supply quotes I think
we should also supply E whenever we need to use a backslash.

> I think inserting the whitespace would be overkill, but that's another
> thing that's not mentioned in the present documentation and should be,
> because it would not be difficult to screw it up.

As long as it's documented it's okay ... probably ... note that
conditionally inserting E would get us right back to the place where
an unsafe usage might appear to work fine in light testing. Maybe
prepend a space only if prepending E?

>> * malloc our own result value instead of expecting caller to provide
>> space. This adds another failure case (out of memory) but it's not
>> really much different from OOM on the caller side.

> I think that would be an anti-improvement - caller might have their
> own malloc substitute, might want to stack-allocate, etc. Plus we'd
> either have to malloc the worst-case buffer size or else read the
> string twice. But see below...

We already have multiple functions in the API that do malloc(). I think
that requiring the caller to estimate the size of the space is a bigger
anti-feature than using malloc.

> I like those names. Maybe something like:

> size_t PQescapeLiteral(PGconn *conn, char *to, size_t tolen, char
> *from, size_t fromlen, int *error);
> size_t PQescapeIdentifier(PGconn *conn, char *to, size_t tolen, char
> *from, size_t fromlen, int *error);

This would get a *lot* simpler if we malloced the result space.

char *PQescapeXXX(PQconn *conn, const char *str, size_t len)

with NULL result used to signal any failure (and a message left in
the conn).

> ... Or they can just size the buffer
> according to the documentation and then they're safe.

Wrong, one thing we would absolutely not do is document any hard
guarantee about the maximum space needed. That's isomorphic to
documenting the exact escaping algorithm, and doing that has already
bit us more times than you know. If we're going to alter the API at
all I think we absolutely *must* get rid of that design error.

Keep in mind that a large part of wanting to have these functions at all
is for simplicity in the calling code. Complicating their API for
marginal gains doesn't seem to me to be the right design direction.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 20:01:40
Message-ID: 603c8f071001191201n15a5d236s3f0b372de6592d7b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 19, 2010 at 2:38 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> As long as it's documented it's okay ... probably ... note that
> conditionally inserting E would get us right back to the place where
> an unsafe usage might appear to work fine in light testing.  Maybe
> prepend a space only if prepending E?

That's a bit of a kludge, but maybe it's liveable.

>>> * malloc our own result value instead of expecting caller to provide
>>> space.  This adds another failure case (out of memory) but it's not
>>> really much different from OOM on the caller side.
>
>> I think that would be an anti-improvement - caller might have their
>> own malloc substitute, might want to stack-allocate, etc.  Plus we'd
>> either have to malloc the worst-case buffer size or else read the
>> string twice.  But see below...
>
> We already have multiple functions in the API that do malloc().  I think
> that requiring the caller to estimate the size of the space is a bigger
> anti-feature than using malloc.
>
>> I like those names.  Maybe something like:
>
>> size_t PQescapeLiteral(PGconn *conn, char *to, size_t tolen, char
>> *from, size_t fromlen, int *error);
>> size_t PQescapeIdentifier(PGconn *conn, char *to, size_t tolen, char
>> *from, size_t fromlen, int *error);
>
> This would get a *lot* simpler if we malloced the result space.
>
>        char *PQescapeXXX(PQconn *conn, const char *str, size_t len)
>
> with NULL result used to signal any failure (and a message left in
> the conn).

Well, I have to admit that the simplicity of that API is very
appealing. Someone will probably eventually request
PQescapeXXXButUseMyMalloc(PQConn *conn, const char *str, size_t len,
void (*MyMalloc)(size_t)); ... but at least until we change the
escaping algorithm again, we can always tell that person to use
PQescapeStringConn() and roll their own. So, I'm sold.

Do you think we should malloc 2n+X bytes all the time, or do you want
to scan the string once to determine how much space is needed and then
allocate exactly that much space? It seems to me that it might be
sensible to do it this way:

1. Do a locale-aware scan of the input string and count the number of
characters we need to escape (num_to_escape).
2. If num_to_escape is 0, fast path: allocate len+3 bytes and use
memcpy to copy the input data to the output buffer.
3. Otherwise, allocate len+num_to_escape+5 bytes
(space-E-quote-quote-NUL) and do a second locale-aware scan of the
input string, copying and escaping as we go (or do you only want the
space-E if the escapable character that we find is \ rather than '?).

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 20:13:23
Message-ID: 18628.1263932003@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Do you think we should malloc 2n+X bytes all the time, or do you want
> to scan the string once to determine how much space is needed and then
> allocate exactly that much space?

I'd vote for two scans, as I think we'll soon decide 2n doesn't cut it
anyway. One of the issues that needs to be looked at is embedded null
characters. We might fail on that for the moment, but I can foresee
wanting to send \000 instead. You don't want to pay for 4x do you?

> It seems to me that it might be
> sensible to do it this way:

> 1. Do a locale-aware scan of the input string and count the number of
> characters we need to escape (num_to_escape).
> 2. If num_to_escape is 0, fast path: allocate len+3 bytes and use
> memcpy to copy the input data to the output buffer.
> 3. Otherwise, allocate len+num_to_escape+5 bytes
> (space-E-quote-quote-NUL) and do a second locale-aware scan of the
> input string, copying and escaping as we go (or do you only want the
> space-E if the escapable character that we find is \ rather than '?).

Yeah, the fast path if no escaping is needed seems attractive.

I think that E should only be inserted if we have a backslash to deal
with; no reason to generate nonstandard syntax if we don't have to.
That would mean keeping two bits of state (escaping-needed and
malloc-size) from the initial pass, but that's pretty cheap.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 21:25:52
Message-ID: 603c8f071001191325m29d8ac08k3fd38e3a5540d3d0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 19, 2010 at 3:13 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It seems to me that it might be
>> sensible to do it this way:
>
>> 1. Do a locale-aware scan of the input string and count the number of
>> characters we need to escape (num_to_escape).
>> 2. If num_to_escape is 0, fast path: allocate len+3 bytes and use
>> memcpy to copy the input data to the output buffer.
>> 3. Otherwise, allocate len+num_to_escape+5 bytes
>> (space-E-quote-quote-NUL) and do a second locale-aware scan of the
>> input string, copying and escaping as we go (or do you only want the
>> space-E if the escapable character that we find is \ rather than '?).
>
> Yeah, the fast path if no escaping is needed seems attractive.

Cool.

> I think that E should only be inserted if we have a backslash to deal
> with; no reason to generate nonstandard syntax if we don't have to.
> That would mean keeping two bits of state (escaping-needed and
> malloc-size) from the initial pass, but that's pretty cheap.

That's fine.

I'd like to proceed by committing an initial patch which changes the
"Escaping Strings for Inclusion in SQL Commands" to use a
<variablelist> with one <varlistentry> per function (as we do in
surrounding functions) and consolidates it with the following section,
"Escaping Binary Strings for Inclusion in SQL Commands". Then I'll
submit a patch implementing pqEscapeLiteral() and pqEscapeIdentifier()
as discussed here, and the doc diff hunks will actually be readable.

Objections? If so, please state how you'd like to see the docs
organized because I don't see another way to do it that makes any
sense.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-19 21:40:54
Message-ID: 20267.1263937254@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I'd like to proceed by committing an initial patch which changes the
> "Escaping Strings for Inclusion in SQL Commands" to use a
> <variablelist> with one <varlistentry> per function (as we do in
> surrounding functions) and consolidates it with the following section,
> "Escaping Binary Strings for Inclusion in SQL Commands". Then I'll
> submit a patch implementing pqEscapeLiteral() and pqEscapeIdentifier()
> as discussed here, and the doc diff hunks will actually be readable.

Sounds like a plan.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-20 04:19:18
Message-ID: 603c8f071001192019r5b1e7c78rf55dfc3842ec5c66@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 19, 2010 at 4:40 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I'd like to proceed by committing an initial patch which changes the
>> "Escaping Strings for Inclusion in SQL Commands" to use a
>> <variablelist> with one <varlistentry> per function (as we do in
>> surrounding functions) and consolidates it with the following section,
>>  "Escaping Binary Strings for Inclusion in SQL Commands".  Then I'll
>> submit a patch implementing pqEscapeLiteral() and pqEscapeIdentifier()
>> as discussed here, and the doc diff hunks will actually be readable.
>
> Sounds like a plan.

Initial commit done, and follow-on patch attached. The docs took
longer to write than the code. I spent a fair amount of time trying
to make it all make sense, but suggestions are welcome.

...Robert

Attachment Content-Type Size
PQescape.patch text/x-patch 13.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-21 15:00:36
Message-ID: 603c8f071001210700g20f66641oc4e19087ac1c061f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 19, 2010 at 11:19 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jan 19, 2010 at 4:40 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> I'd like to proceed by committing an initial patch which changes the
>>> "Escaping Strings for Inclusion in SQL Commands" to use a
>>> <variablelist> with one <varlistentry> per function (as we do in
>>> surrounding functions) and consolidates it with the following section,
>>>  "Escaping Binary Strings for Inclusion in SQL Commands".  Then I'll
>>> submit a patch implementing pqEscapeLiteral() and pqEscapeIdentifier()
>>> as discussed here, and the doc diff hunks will actually be readable.
>>
>> Sounds like a plan.
>
> Initial commit done, and follow-on patch attached.  The docs took
> longer to write than the code.  I spent a fair amount of time trying
> to make it all make sense, but suggestions are welcome.

Committed after fixing a couple of oversights in my doc changes.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-21 16:57:18
Message-ID: 162867791001210857i52b3de0ei34756c8dc181d51c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/21 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, Jan 19, 2010 at 11:19 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Tue, Jan 19, 2010 at 4:40 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>> I'd like to proceed by committing an initial patch which changes the
>>>> "Escaping Strings for Inclusion in SQL Commands" to use a
>>>> <variablelist> with one <varlistentry> per function (as we do in
>>>> surrounding functions) and consolidates it with the following section,
>>>>  "Escaping Binary Strings for Inclusion in SQL Commands".  Then I'll
>>>> submit a patch implementing pqEscapeLiteral() and pqEscapeIdentifier()
>>>> as discussed here, and the doc diff hunks will actually be readable.
>>>
>>> Sounds like a plan.
>>
>> Initial commit done, and follow-on patch attached.  The docs took
>> longer to write than the code.  I spent a fair amount of time trying
>> to make it all make sense, but suggestions are welcome.
>
> Committed after fixing a couple of oversights in my doc changes.

thank you.

I actualised patch

I thing, we need one libpq change more.

+ static void
+ appendLiteral(PGconn *conn, PQExpBuffer buf, const char *str)
+ {
+ char *escaped_str;
+ size_t len;
+
+ len = strlen(str);
+ escaped_str = PQescapeLiteral(conn, str, len);
+
+ if (escaped_str == NULL)
+ {
+ const char *error_message = PQerrorMessage(pset.db);
+
+ if (strlen(error_message))
+ psql_error("%s", error_message);
+ }
+ else
+ {
+ appendPQExpBufferStr(buf, escaped_str);
+ free(escaped_str);
+ }
+ }

the correct result of this function (when is some error) is broken
buffer. But function markPQExpBufferBroken is static.

Regards
Pavel Stehule

>
> ...Robert
>

Attachment Content-Type Size
patch.diff application/octet-stream 4.8 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-21 17:33:35
Message-ID: 603c8f071001210933t69537d32mb81244494bb8ac82@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 21, 2010 at 11:57 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/21 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Tue, Jan 19, 2010 at 11:19 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Tue, Jan 19, 2010 at 4:40 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>>> I'd like to proceed by committing an initial patch which changes the
>>>>> "Escaping Strings for Inclusion in SQL Commands" to use a
>>>>> <variablelist> with one <varlistentry> per function (as we do in
>>>>> surrounding functions) and consolidates it with the following section,
>>>>>  "Escaping Binary Strings for Inclusion in SQL Commands".  Then I'll
>>>>> submit a patch implementing pqEscapeLiteral() and pqEscapeIdentifier()
>>>>> as discussed here, and the doc diff hunks will actually be readable.
>>>>
>>>> Sounds like a plan.
>>>
>>> Initial commit done, and follow-on patch attached.  The docs took
>>> longer to write than the code.  I spent a fair amount of time trying
>>> to make it all make sense, but suggestions are welcome.
>>
>> Committed after fixing a couple of oversights in my doc changes.
>
> thank you.
>
> I actualised patch
>
> I thing, we need one libpq change more.
>
> + static void
> + appendLiteral(PGconn *conn, PQExpBuffer buf, const char *str)
> + {
> +       char    *escaped_str;
> +       size_t          len;
> +
> +       len = strlen(str);
> +       escaped_str = PQescapeLiteral(conn, str, len);
> +
> +       if (escaped_str == NULL)
> +       {
> +               const char *error_message = PQerrorMessage(pset.db);
> +
> +               if (strlen(error_message))
> +                       psql_error("%s", error_message);
> +       }
> +       else
> +       {
> +               appendPQExpBufferStr(buf, escaped_str);
> +               free(escaped_str);
> +       }
> + }
>
> the correct result of this function (when is some error) is broken
> buffer. But function markPQExpBufferBroken is static.

markPQExpBufferBroken is specifically designed to handle out of memory
errors. I don't think we should try to generalize that to handle
encoding violations or other things, at least not without some very
careful thought about the consequences of so doing. I think we need
to find some other way of signalling an error back to the caller,
although it's not exactly obvious to me what the best way to do that
is.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-21 17:53:19
Message-ID: 162867791001210953y1c38715bq7969013145965320@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/21 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Jan 21, 2010 at 11:57 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> 2010/1/21 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Tue, Jan 19, 2010 at 11:19 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>> On Tue, Jan 19, 2010 at 4:40 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>>>> I'd like to proceed by committing an initial patch which changes the
>>>>>> "Escaping Strings for Inclusion in SQL Commands" to use a
>>>>>> <variablelist> with one <varlistentry> per function (as we do in
>>>>>> surrounding functions) and consolidates it with the following section,
>>>>>>  "Escaping Binary Strings for Inclusion in SQL Commands".  Then I'll
>>>>>> submit a patch implementing pqEscapeLiteral() and pqEscapeIdentifier()
>>>>>> as discussed here, and the doc diff hunks will actually be readable.
>>>>>
>>>>> Sounds like a plan.
>>>>
>>>> Initial commit done, and follow-on patch attached.  The docs took
>>>> longer to write than the code.  I spent a fair amount of time trying
>>>> to make it all make sense, but suggestions are welcome.
>>>
>>> Committed after fixing a couple of oversights in my doc changes.
>>
>> thank you.
>>
>> I actualised patch
>>
>> I thing, we need one libpq change more.
>>
>> + static void
>> + appendLiteral(PGconn *conn, PQExpBuffer buf, const char *str)
>> + {
>> +       char    *escaped_str;
>> +       size_t          len;
>> +
>> +       len = strlen(str);
>> +       escaped_str = PQescapeLiteral(conn, str, len);
>> +
>> +       if (escaped_str == NULL)
>> +       {
>> +               const char *error_message = PQerrorMessage(pset.db);
>> +
>> +               if (strlen(error_message))
>> +                       psql_error("%s", error_message);
>> +       }
>> +       else
>> +       {
>> +               appendPQExpBufferStr(buf, escaped_str);
>> +               free(escaped_str);
>> +       }
>> + }
>>
>> the correct result of this function (when is some error) is broken
>> buffer. But function markPQExpBufferBroken is static.
>
> markPQExpBufferBroken is specifically designed to handle out of memory
> errors.  I don't think we should try to generalize that to handle
> encoding violations or other things, at least not without some very
> careful thought about the consequences of so doing.  I think we need
> to find some other way of signalling an error back to the caller,
> although it's not exactly obvious to me what the best way to do that
> is.

BufferBroken flag is used as signal for out of memory now. But
everybody can use it for his purposes. There isn't any limit (what I
know). More - the behave is similar like NULL. If you have a broken
buffer, then everything with this is broken. Theoretically we could to
add some parser state flag and check it over every scanner call - but
it is duplicate. minimally when escape function returns null because
is out of memory, then breaking the buffer is semantically correct.

Currently there isn't consistency in memory handling :( - from good
reasons. Bad allocation from libpq is finished with raising a error
message. Out of memory from psql is raising fatal error. I am not
sure, if we can better join these two worlds. Maybe we can add malloc
handler to libpq (maybe in future)?

Still there are two kind of bugs - memory and encoding. These bugs
should be handled differently. In both cases we cannot stop lexers -
because we have to find end of statement, we cannot to execute broken
command.

my plan

add to state structure field like lexer_error. This field will be
checked before execution
it could be ugly for metacommands, there will be lot of new checks :(

Pavel

>
> ...Robert
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-21 18:50:25
Message-ID: 603c8f071001211050h6daab7b1qfc03353ba60b4b32@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 21, 2010 at 12:53 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> add to state structure field like lexer_error. This field will be
> checked before execution
> it could be ugly for metacommands, there will be lot of new checks :(

Eh? The only places where we should need new tests are the places
that check PQExpBufferBroken() now - there are only 6 calls to that
function in src/bin/psql and not all of them need to be changed. The
places that do need to be changed will need to be modified to check
PQExpBufferBroken() || lexer_coughed_up_a_lung.

It should be possible to do this pretty simply, I think.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-21 19:25:50
Message-ID: 162867791001211125i4dfe57f3h6d87bcc2c6d45bde@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/21 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Jan 21, 2010 at 12:53 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> add to state structure field like lexer_error. This field will be
>> checked before execution
>> it could be ugly for metacommands, there will be lot of new checks :(
>
> Eh?  The only places where we should need new tests are the places
> that check PQExpBufferBroken() now - there are only 6 calls to that
> function in src/bin/psql and not all of them need to be changed.  The
> places that do need to be changed will need to be modified to check
> PQExpBufferBroken() || lexer_coughed_up_a_lung.

no, it is only 6 calls because we don't check psql_scan_slash_option result.

When we don't allow escaping syntax in slash statement, then it could be simple.

we have to do some like:

if buffer_is_broken(..)
report "out of memory"
else if global_error != true
do ....
end if

Pavel

>
> It should be possible to do this pretty simply, I think.
>
> ...Robert
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-22 04:34:41
Message-ID: 603c8f071001212034m48c58dbah78896c28f97d8337@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 21, 2010 at 2:25 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/21 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Thu, Jan 21, 2010 at 12:53 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> add to state structure field like lexer_error. This field will be
>>> checked before execution
>>> it could be ugly for metacommands, there will be lot of new checks :(
>>
>> Eh?  The only places where we should need new tests are the places
>> that check PQExpBufferBroken() now - there are only 6 calls to that
>> function in src/bin/psql and not all of them need to be changed.  The
>> places that do need to be changed will need to be modified to check
>> PQExpBufferBroken() || lexer_coughed_up_a_lung.
>
> no, it is only 6 calls because we don't check psql_scan_slash_option result.

psql_scan_slash_option() already has a way to signal errors - it can
return NULL. Type any backslash command followed by a single quote...

I'm not saying I love the way those errors are handled, but if we make
this patch about revising the way psql does error handling, this is
not going to get committed for this release... what we need to do is
fit what we're trying to do into the existing model.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-22 07:16:29
Message-ID: 162867791001212316hd080c97k976bdf8bfda39630@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/22 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Jan 21, 2010 at 2:25 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> 2010/1/21 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Thu, Jan 21, 2010 at 12:53 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>> add to state structure field like lexer_error. This field will be
>>>> checked before execution
>>>> it could be ugly for metacommands, there will be lot of new checks :(
>>>
>>> Eh?  The only places where we should need new tests are the places
>>> that check PQExpBufferBroken() now - there are only 6 calls to that
>>> function in src/bin/psql and not all of them need to be changed.  The
>>> places that do need to be changed will need to be modified to check
>>> PQExpBufferBroken() || lexer_coughed_up_a_lung.
>>
>> no, it is only 6 calls because we don't check psql_scan_slash_option result.
>
> psql_scan_slash_option() already has a way to signal errors - it can
> return NULL.  Type any backslash command followed by a single quote...

NULL means "no value". But I thing, so there is only one important -
\set. For other can be enough some error message and empty value.

>
> I'm not saying I love the way those errors are handled, but if we make
> this patch about revising the way psql does error handling, this is
> not going to get committed for this release...  what we need to do is
> fit what we're trying to do into the existing model.
>

I try to find some simple and I'll send a patch.

Pavel

> ...Robert
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-22 12:19:51
Message-ID: 162867791001220419hf3fcf30j8d7a57ad9f918ba7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

here is new variant. Add scan_state flag "valid" and enhance
protection against execution broken statement.

Regards
Pavel Stehule

2010/1/22 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> 2010/1/22 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Thu, Jan 21, 2010 at 2:25 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> 2010/1/21 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>>> On Thu, Jan 21, 2010 at 12:53 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>>> add to state structure field like lexer_error. This field will be
>>>>> checked before execution
>>>>> it could be ugly for metacommands, there will be lot of new checks :(
>>>>
>>>> Eh?  The only places where we should need new tests are the places
>>>> that check PQExpBufferBroken() now - there are only 6 calls to that
>>>> function in src/bin/psql and not all of them need to be changed.  The
>>>> places that do need to be changed will need to be modified to check
>>>> PQExpBufferBroken() || lexer_coughed_up_a_lung.
>>>
>>> no, it is only 6 calls because we don't check psql_scan_slash_option result.
>>
>> psql_scan_slash_option() already has a way to signal errors - it can
>> return NULL.  Type any backslash command followed by a single quote...
>
> NULL means "no value". But I thing, so there is only one important -
> \set. For other can be enough some error message and empty value.
>
>>
>> I'm not saying I love the way those errors are handled, but if we make
>> this patch about revising the way psql does error handling, this is
>> not going to get committed for this release...  what we need to do is
>> fit what we're trying to do into the existing model.
>>
>
> I try to find some simple and I'll send a patch.
>
> Pavel
>
>> ...Robert
>>
>

Attachment Content-Type Size
variables-escape.diff application/octet-stream 10.8 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-22 19:20:01
Message-ID: 603c8f071001221120ra2b6c7cne5105e8882960f18@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 22, 2010 at 7:19 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> here is new variant. Add scan_state flag "valid" and enhance
> protection against execution broken statement.

This doesn't make sense to me. You've changed the way \set is handled
- in a way that doesn't seem particularly appropriate to me - but most
of the other backslash commands are unmodified - but then there's this
hack at the bottom that overrides the return value if
psql_scan_is_valid() fails. And then there's this:

- /* we need not do psql_scan_reset() here */
+ psql_scan_reset(scan_state);

It's extremely unclear what the rationale for this change is.

Basically, you need to either improve the comments in here so that
someone can understand what is going on, or you need to submit it with
some detailed documentation explaining the rationale behind each
change and why it is right, or more than likely both. But I think the
whole approach is likely off-base and you need to go back and think
about a cleaner way to get this done.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-24 08:57:53
Message-ID: 162867791001240057k447835fbyf8b7ae5daa04e38b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/22 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Fri, Jan 22, 2010 at 7:19 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> here is new variant. Add scan_state flag "valid" and enhance
>> protection against execution broken statement.
>
> This doesn't make sense to me.  You've changed the way \set is handled
> - in a way that doesn't seem particularly appropriate to me - but most
> of the other backslash commands are unmodified - but then there's this
> hack at the bottom that overrides the return value if
> psql_scan_is_valid() fails.  And then there's this:
>
> -                               /* we need not do psql_scan_reset() here */
> +                               psql_scan_reset(scan_state);
>
> It's extremely unclear what the rationale for this change is.

Sure, I'll enhance comments. All \command is verified on broken
scanning. So when command's processing time the scanning is broken,
then we can detect it on the end - flag valid is changed only to false
direction. \set is different - it only one statement, that store
values. And it is reason why I did more checks there. I am not sure -
it is necessary now - simply we can set ERROR on the end and finish.

psql_scan_reset is called on the end of any statement - so we can
there reset info about scanning.

I'll look on it tomorrow.

Regards
Pavel Stehule

>
> Basically, you need to either improve the comments in here so that
> someone can understand what is going on, or you need to submit it with
> some detailed documentation explaining the rationale behind each
> change and why it is right, or more than likely both.  But I think the
> whole approach is likely off-base and you need to go back and think
> about a cleaner way to get this done.
>
> ...Robert
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-25 12:36:49
Message-ID: 162867791001250436u129501e5y84c45ea1c892a106@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I hope, so this version is more readable and more clean. I removed
some not necessary checks.

regards

Pavel

2010/1/22 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Fri, Jan 22, 2010 at 7:19 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> here is new variant. Add scan_state flag "valid" and enhance
>> protection against execution broken statement.
>
> This doesn't make sense to me.  You've changed the way \set is handled
> - in a way that doesn't seem particularly appropriate to me - but most
> of the other backslash commands are unmodified - but then there's this
> hack at the bottom that overrides the return value if
> psql_scan_is_valid() fails.  And then there's this:
>
> -                               /* we need not do psql_scan_reset() here */
> +                               psql_scan_reset(scan_state);
>
> It's extremely unclear what the rationale for this change is.
>
> Basically, you need to either improve the comments in here so that
> someone can understand what is going on, or you need to submit it with
> some detailed documentation explaining the rationale behind each
> change and why it is right, or more than likely both.  But I think the
> whole approach is likely off-base and you need to go back and think
> about a cleaner way to get this done.
>
> ...Robert
>

Attachment Content-Type Size
variable-escaping.diff application/octet-stream 12.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-27 20:08:06
Message-ID: 603c8f071001271208jf9c22d8td9a73f44dbc22c6d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 25, 2010 at 7:36 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> I hope, so this version is more readable and more clean. I removed
> some not necessary checks.

This still seems overly complicated to me. I spent a few hours today
working up the attached patch. Let me know your thoughts.

...Robert

Attachment Content-Type Size
variable-escaping-rmh.patch text/x-patch 9.2 KB

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-28 09:53:35
Message-ID: 162867791001280153w28d74279hf86410bb09e7e83f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/27 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Jan 25, 2010 at 7:36 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> I hope, so this version is more readable and more clean. I removed
>> some not necessary checks.
>
> This still seems overly complicated to me.  I spent a few hours today
> working up the attached patch.  Let me know your thoughts.
>

There is serious issue. The "psql_error" only shows some message on
output, but do nothing more - you don't set a result status for
commands and for statements. So some potential error from parsing is
pseudo quietly ignored - without respect to your setting
ON_ERROR_STOP. This could be a problem for commands. Execution of
broken SQL statements will raise syntax error. But for \set some
variable will be a broken and the content can be used. I don't thing
so it is good. It is limited.

Your version is acceptable only when we don't enable escape syntax for
commands. Then we don't need check it. On your version - I am not sure
if it is fully compatible, and using a global variables isn't nice.

I little bit modify my original code - it is more verbose (- useless
using pqexpbuffer) - and more consistent with previous behave.

Pavel

> ...Robert
>

Attachment Content-Type Size
variable-escaping.diff application/octet-stream 12.9 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-28 14:38:53
Message-ID: 603c8f071001280638r7f971969gc30b412378cb73ec@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 28, 2010 at 4:53 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/27 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Mon, Jan 25, 2010 at 7:36 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> I hope, so this version is more readable and more clean. I removed
>>> some not necessary checks.
>>
>> This still seems overly complicated to me.  I spent a few hours today
>> working up the attached patch.  Let me know your thoughts.
>
> There is serious issue. The "psql_error" only shows some message on
> output, but do nothing more - you don't set a result status for
> commands and for statements. So some potential error from parsing is
> pseudo quietly ignored - without respect to your setting
> ON_ERROR_STOP. This could be a problem for commands. Execution of
> broken SQL statements will raise syntax error. But for \set some
> variable will be a broken and the content can be used. I don't thing
> so it is good. It is limited.

Well, what you seem to be proposing to do is allow the command to
execute (on the screwed-up data) and then afterwards pretend that it
failed by overriding the return status. I think that's unacceptable.
The root of the problem here is that the parsing and execution stages
for backslash commands are not cleanly separated. There's no clean
way for the lexer to return an error that allows the command to finish
parsing normally but then doesn't execute it. Fixing that is going to
require an extensive refactoring of commands.c which I don't think it
makes sense to undertake at this point in the release cycle. Even if
it did, it seems like material for a separate patch rather than
something which has to be done before this goes in.

> Your version is acceptable only when we don't enable escape syntax for
> commands. Then we don't need check it. On your version - I am not sure
> if it is fully compatible, and using a global variables isn't nice.

I'm not adding any new global variables - I'm just using the ones that
are already there to avoid duplicating the same code four times.
Referencing them from within the bodies of the lexer rules doesn't
make the variables not global.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-28 16:59:32
Message-ID: 162867791001280859u71c8540fu948bc4aea06249b9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/28 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Jan 28, 2010 at 4:53 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> 2010/1/27 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Mon, Jan 25, 2010 at 7:36 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>> I hope, so this version is more readable and more clean. I removed
>>>> some not necessary checks.
>>>
>>> This still seems overly complicated to me.  I spent a few hours today
>>> working up the attached patch.  Let me know your thoughts.
>>
>> There is serious issue. The "psql_error" only shows some message on
>> output, but do nothing more - you don't set a result status for
>> commands and for statements. So some potential error from parsing is
>> pseudo quietly ignored - without respect to your setting
>> ON_ERROR_STOP. This could be a problem for commands. Execution of
>> broken SQL statements will raise syntax error. But for \set some
>> variable will be a broken and the content can be used. I don't thing
>> so it is good. It is limited.
>
> Well, what you seem to be proposing to do is allow the command to
> execute (on the screwed-up data) and then afterwards pretend that it
> failed by overriding the return status.  I think that's unacceptable.
> The root of the problem here is that the parsing and execution stages
> for backslash commands are not cleanly separated.  There's no clean
> way for the lexer to return an error that allows the command to finish
> parsing normally but then doesn't execute it.  Fixing that is going to
> require an extensive refactoring of commands.c which I don't think it
> makes sense to undertake at this point in the release cycle.  Even if
> it did, it seems like material for a separate patch rather than
> something which has to be done before this goes in.

so I removed support for escaping from backslah commands and refactorised code.

I hope so this code is more verbose and clean than previous versions.

Regards
Pavel

>
>> Your version is acceptable only when we don't enable escape syntax for
>> commands. Then we don't need check it. On your version - I am not sure
>> if it is fully compatible, and using a global variables isn't nice.
>
> I'm not adding any new global variables - I'm just using the ones that
> are already there to avoid duplicating the same code four times.
> Referencing them from within the bodies of the lexer rules doesn't
> make the variables not global.

>
> ...Robert
>

Attachment Content-Type Size
variable-escaping-simple.patch text/x-patch 10.0 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-28 18:06:54
Message-ID: 603c8f071001281006x6319f0dby75511d164b1bba16@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 28, 2010 at 11:59 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/1/28 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Thu, Jan 28, 2010 at 4:53 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> 2010/1/27 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>>> On Mon, Jan 25, 2010 at 7:36 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>>> I hope, so this version is more readable and more clean. I removed
>>>>> some not necessary checks.
>>>>
>>>> This still seems overly complicated to me.  I spent a few hours today
>>>> working up the attached patch.  Let me know your thoughts.
>>>
>>> There is serious issue. The "psql_error" only shows some message on
>>> output, but do nothing more - you don't set a result status for
>>> commands and for statements. So some potential error from parsing is
>>> pseudo quietly ignored - without respect to your setting
>>> ON_ERROR_STOP. This could be a problem for commands. Execution of
>>> broken SQL statements will raise syntax error. But for \set some
>>> variable will be a broken and the content can be used. I don't thing
>>> so it is good. It is limited.
>>
>> Well, what you seem to be proposing to do is allow the command to
>> execute (on the screwed-up data) and then afterwards pretend that it
>> failed by overriding the return status.  I think that's unacceptable.
>> The root of the problem here is that the parsing and execution stages
>> for backslash commands are not cleanly separated.  There's no clean
>> way for the lexer to return an error that allows the command to finish
>> parsing normally but then doesn't execute it.  Fixing that is going to
>> require an extensive refactoring of commands.c which I don't think it
>> makes sense to undertake at this point in the release cycle.  Even if
>> it did, it seems like material for a separate patch rather than
>> something which has to be done before this goes in.
>
> so I removed support for escaping from backslah commands and refactorised code.
>
> I hope so this code is more verbose and clean than previous versions.

I don't see any advantage of this over the code that I wrote.

First, you can't just remove support for the escape syntax from \d
commands without some discussion of whether or not that's the right
thing to do, and I don't think it is. The cases where this will
potentially cause a problem are limited to those where the input is
invalidly encoded, and I don't think that's important enough to
justify the surprise factor of having backslash commands behave
differently from everything else.

Second, even if it were OK to remove support for the escape syntax
from \d commands, you failed to update the documentation you cribbed
from my patch to match the behavior you implemented.

Third, you've reintroduced all of the code duplication that I
eliminated in my version of this patch, as well as at least one bug -
you've used free() where I believe you need PQfreemem(). I am also
thinking that it doesn't make sense to push the result of
PQescapeLiteral() or PQescapeIdentifier() as a new buffer, because we
don't want to process variable expansions recursively. At first I
thought this was a security hole, but on further reflection I don't
think it is: it'll rescan as a quoted string anyway, so any
colon-escapes will be ignored. But I believe it's unnecessary at any
rate.

I would like to go ahead and commit my version of this patch and will
do so later today if no one else objects.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-29 07:08:25
Message-ID: 162867791001282308y110a4c53yb53a36c81e2c1352@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> First, you can't just remove support for the escape syntax from \d
> commands without some discussion of whether or not that's the right
> thing to do, and I don't think it is.  The cases where this will
> potentially cause a problem are limited to those where the input is
> invalidly encoded, and I don't think that's important enough to
> justify the surprise factor of having backslash commands behave
> differently from everything else.
>
> Second, even if it were OK to remove support for the escape syntax
> from \d commands, you failed to update the documentation you cribbed
> from my patch to match the behavior you implemented.

we can discus about programming style, but in this case I am sure. The
problem is \set command. We cannot ignore error in this case. In other
cases invalid escaping raises error, not in this case. So there is two
ways again:

a) remove escaped expansion from \command
b) implement \set command differently

>
> Third, you've reintroduced all of the code duplication that I
> eliminated in my version of this patch, as well as at least one bug -
> you've used free() where I believe you need PQfreemem().

you have a true.

I am also
> thinking that it doesn't make sense to push the result of
> PQescapeLiteral() or PQescapeIdentifier() as a new buffer, because we
> don't want to process variable expansions recursively.  At first I
> thought this was a security hole, but on further reflection I don't
> think it is: it'll rescan as a quoted string anyway, so any
> colon-escapes will be ignored.  But I believe it's unnecessary at any
> rate.
>

I think so it was a back door for scripting support in psql. It can
break backward compatibility!

> I would like to go ahead and commit my version of this patch and will
> do so later today if no one else objects.

yes, I have.

* your patch remove some feature without any warning and documentation
* your patch doesn't solve issue with \set command

Regards
Pavel

>
> ...Robert
>

Attachment Content-Type Size
variable-escaping-simple.patch text/x-patch 10.0 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quoting psql varible as identifier
Date: 2010-01-29 16:19:57
Message-ID: 603c8f071001290819w2f72d7dam9a37b2907ce3f9ed@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 29, 2010 at 2:08 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>
>> First, you can't just remove support for the escape syntax from \d
>> commands without some discussion of whether or not that's the right
>> thing to do, and I don't think it is.  The cases where this will
>> potentially cause a problem are limited to those where the input is
>> invalidly encoded, and I don't think that's important enough to
>> justify the surprise factor of having backslash commands behave
>> differently from everything else.
>>
>> Second, even if it were OK to remove support for the escape syntax
>> from \d commands, you failed to update the documentation you cribbed
>> from my patch to match the behavior you implemented.
>
> we can discus about programming style, but in this case I am sure. The
> problem is \set command. We cannot ignore error in this case. In other
> cases invalid escaping raises error, not in this case. So there is two
> ways again:
>
> a) remove escaped expansion from \command
> b) implement \set command differently

I don't view either of those solutions as appropriate or acceptable.

> I am also
>> thinking that it doesn't make sense to push the result of
>> PQescapeLiteral() or PQescapeIdentifier() as a new buffer, because we
>> don't want to process variable expansions recursively.  At first I
>> thought this was a security hole, but on further reflection I don't
>> think it is: it'll rescan as a quoted string anyway, so any
>> colon-escapes will be ignored.  But I believe it's unnecessary at any
>> rate.
>>
> I think so it was a back door for scripting support in psql. It can
> break backward compatibility!

I don't understand your point here.

>> I would like to go ahead and commit my version of this patch and will
>> do so later today if no one else objects.
>
> yes, I have.
>
> * your patch remove some feature without any warning and documentation
> * your patch doesn't solve issue with \set command

It does not remove any feature at all that I can see. The fact that
backslash commands don't properly report errors is a problem, but I
don't see why I have to solve that problem as part of this patch, and
even less why I should fix \set and leave all the others alone.

Another problem I've found looking at this code is that \copy does not
support variable substitutions at all, which seems rather unfortunate.
Still another is that the regular expression used for variable
substitutions in SQL commands is subtly different than the one used
for backslash commands: the latter matches a single colon with no
following characters, while the former does not.

All of this stuff may be worth fixing but I'm sticking with my
position that it's not material for this patch. If you want to submit
a separate patch to change the error handling, go ahead. But I don't
think fixing it for \set only is going to fly, and I don't think that
it should be specific to handling only errors resulting from encoding
violations or out of memory conditions in variable substitutions. It
needs to be a general mechanism that can be applied to existing types
of error conditions where it's appropriate and to new error conditions
that may arise in the future.

...Robert