Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen

From: petere(at)postgresql(dot)org (Peter Eisentraut)
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-22 23:52:53
Message-ID: 20090922235253.4C1A9753FB7@cvs.postgresql.org
Lists: pgsql-committers pgsql-hackers

Log Message:
-----------
Unicode escapes in E'...' strings

Author: Marko Kreen <markokr(at)gmail(dot)com>

Modified Files:
--------------
pgsql/doc/src/sgml:
syntax.sgml (r1.135 -> r1.136)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/syntax.sgml?r1=1.135&r2=1.136)
pgsql/src/backend/parser:
scan.l (r1.158 -> r1.159)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/parser/scan.l?r1=1.158&r2=1.159)
pgsql/src/include/parser:
gramparse.h (r1.47 -> r1.48)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/parser/gramparse.h?r1=1.47&r2=1.48)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, Marko Kreen <markokr(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-25 19:39:50
Message-ID: 2695.1253907590@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

petere(at)postgresql(dot)org (Peter Eisentraut) writes:
> Log Message:
> -----------
> Unicode escapes in E'...' strings

> Author: Marko Kreen <markokr(at)gmail(dot)com>

This patch has broken the no-backup property of the scanner, which
is an absolutely unacceptable penalty for such a second-order feature.
Please fix or revert.

Also, it failed to update psql's scanner to match.

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-25 20:23:17
Message-ID: e51f66da0909251323m6a320518s58c3ebc5b755b027@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 9/25/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> petere(at)postgresql(dot)org (Peter Eisentraut) writes:
> > Log Message:
> > -----------
> > Unicode escapes in E'...' strings
>
> > Author: Marko Kreen <markokr(at)gmail(dot)com>
>
> This patch has broken the no-backup property of the scanner, which
> is an absolutely unacceptable penalty for such a second-order feature.
> Please fix or revert.

How do I find out the state of said property?

Currently I assume it's related to the xeunicodebad pattern?

Will this fix it:

-xeunicodebad [\\]([uU])
+xeunicodebad [\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})

?

--
marko
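
To see why widening xeunicodebad could help, here is a minimal Python
sketch of the idea behind the no-backup property: every prefix of a
token that the scanner may read should itself be accepted by some rule.
This is a simplified model, not flex's actual DFA analysis. The pattern
for a complete escape (XEUNICODE) is an assumption, since the thread
does not quote it; the two "bad" patterns are taken from the diff above.

```python
import re

HEX = "[0-9A-Fa-f]"

# Assumed shape of the rule for a complete escape (not quoted in the thread).
XEUNICODE = re.compile(r"\\(u" + HEX + "{4}|U" + HEX + "{8})")
# The two versions of xeunicodebad from the diff above.
OLD_BAD = re.compile(r"\\[uU]")
NEW_BAD = re.compile(r"\\(u" + HEX + "{0,3}|U" + HEX + "{0,7})")


def stranded_prefixes(text, patterns):
    """Return prefixes (length >= 2) that no pattern matches in full.

    A stranded prefix models input the scanner has consumed without
    being in an accepting state, so it might have to back up.
    """
    return [
        text[:n]
        for n in range(2, len(text) + 1)
        if not any(p.fullmatch(text[:n]) for p in patterns)
    ]


escape = r"\u00e9"
print(stranded_prefixes(escape, [XEUNICODE, OLD_BAD]))
print(stranded_prefixes(escape, [XEUNICODE, NEW_BAD]))
```

With the old pattern the prefixes holding one to three hex digits are
stranded; with the proposed pattern the list is empty. Whether flex
agrees still has to be checked against the real scanner.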


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-25 20:39:36
Message-ID: 3432.1253911176@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On 9/25/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> This patch has broken the no-backup property of the scanner, which
>> is an absolutely unacceptable penalty for such a second-order feature.
>> Please fix or revert.

> How do I find out the state of said property?

Per the comment at the head of scan.l, add the -b switch to the flex
call and see what flex says about it.

> Currently I assume it's related to the xeunicodebad pattern?

Probably, but I didn't check.

regards, tom lane
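
The check described above can be scripted. The sketch below assumes
that flex, when run with -b, writes its report to a file named
lex.backup, and that a scanner needing no backup produces the single
line "No backing up." in it. The function name and the sample report
text are illustrative only.

```python
def scanner_needs_backup(report: str) -> bool:
    """Interpret the text of a lex.backup report written by `flex -b`.

    Assumption: a backup-free scanner yields exactly one non-empty
    line, "No backing up."; anything else is treated as a failure.
    """
    lines = [ln.strip() for ln in report.splitlines() if ln.strip()]
    return lines != ["No backing up."]


# Illustrative report fragments, not output captured from scan.l.
clean_report = "No backing up.\n"
dirty_report = (
    "State #6 is non-accepting -\n"
    " associated rule line numbers:\n"
    "       2       3\n"
)

print(scanner_needs_backup(clean_report))
print(scanner_needs_backup(dirty_report))
```

In practice one would run flex with -b on scan.l first and pass the
contents of lex.backup to the function.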


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-25 20:53:38
Message-ID: 1253912018.29645.1.camel@vanquo.pezone.net
Lists: pgsql-committers pgsql-hackers

On Fri, 2009-09-25 at 15:39 -0400, Tom Lane wrote:
> petere(at)postgresql(dot)org (Peter Eisentraut) writes:
> > Log Message:
> > -----------
> > Unicode escapes in E'...' strings
>
> > Author: Marko Kreen <markokr(at)gmail(dot)com>
>
> This patch has broken the no-backup property of the scanner,

Fixed.

> Also, it failed to update psql's scanner to match.

Why does the psql scanner need to know about this? Doesn't it just need
to know the difference between backslash-quote and backslash-something
else?


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-25 21:03:56
Message-ID: 3759.1253912636@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On Fri, 2009-09-25 at 15:39 -0400, Tom Lane wrote:
>> Also, it failed to update psql's scanner to match.

> Why does the psql scanner need to know about this? Doesn't it just need
> to know the difference between backslash-quote and backslash-something
> else?

Maybe it doesn't "need" to know, but I think it would be disastrous from
a maintenance standpoint to not keep the two sets of flex rules in
strict correspondence. It would soon become unclear whether or how to
apply changes in the backend lexer to psql.

regards, tom lane
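
One way to make "strict correspondence" checkable is to compare the
named pattern definitions of the two lexer files mechanically. The
following Python sketch is a hypothetical helper, not part of the
PostgreSQL tree. It assumes definitions are unindented
"name pattern" lines before the first %% and skips %{ ... %} code
blocks, directives, and comments.

```python
import re

# An unindented definition line: a name, whitespace, then the pattern.
DEF_LINE = re.compile(r"^([A-Za-z_][A-Za-z0-9_-]*)\s+(\S.*?)\s*$")


def flex_definitions(source: str) -> dict:
    """Collect name -> pattern from the definitions section of a flex file."""
    defs = {}
    in_c_block = False
    for line in source.splitlines():
        if line.strip() == "%%":      # end of the definitions section
            break
        if line.startswith("%{"):
            in_c_block = True
        elif line.startswith("%}"):
            in_c_block = False
        elif not in_c_block:
            m = DEF_LINE.match(line)
            if m:
                defs[m.group(1)] = m.group(2)
    return defs


def out_of_sync(backend: str, psql: str) -> dict:
    """Map each differing name to its (backend, psql) patterns."""
    a, b = flex_definitions(backend), flex_definitions(psql)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

A definition present in only one file shows up with None on the other
side. The sketch compares definitions only; rule actions and start
condition changes would need a separate check.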


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-25 21:18:48
Message-ID: e51f66da0909251418v63e9f6bdx203937109d06fb95@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 9/26/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Maybe it doesn't "need" to know, but I think it would be disastrous from
> a maintenance standpoint to not keep the two sets of flex rules in
> strict correspondence. It would soon become unclear whether or how to
> apply changes in the backend lexer to psql.

Patch attached.

--
marko

Attachment Content-Type Size
psql-unicode.diff text/x-diff 1.1 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-26 08:01:18
Message-ID: 1253952078.2880.0.camel@vanquo.pezone.net
Lists: pgsql-committers pgsql-hackers

On Sat, 2009-09-26 at 00:18 +0300, Marko Kreen wrote:
> On 9/26/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Maybe it doesn't "need" to know, but I think it would be disastrous from
> > a maintenance standpoint to not keep the two sets of flex rules in
> > strict correspondence. It would soon become unclear whether or how to
> > apply changes in the backend lexer to psql.
>
> Patch attached.

That patch results in the following message from flex:

psqlscan.l:1039: warning, -s option given but default rule can be matched


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-26 13:33:29
Message-ID: e51f66da0909260633y5fb71896yac09f537fac369e6@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 9/26/09, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On Sat, 2009-09-26 at 00:18 +0300, Marko Kreen wrote:
> > On 9/26/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > Maybe it doesn't "need" to know, but I think it would be disastrous from
> > > a maintenance standpoint to not keep the two sets of flex rules in
> > > strict correspondence. It would soon become unclear whether or how to
> > > apply changes in the backend lexer to psql.
> >
> > Patch attached.
>
>
> That patch results in the following message from flex:
>
> psqlscan.l:1039: warning, -s option given but default rule can be
> matched

Agh. Well, that just means the <xeu> state must be commented out:

-%x xeu
+/* %x xeu */

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-26 16:02:38
Message-ID: 21929.1253980958@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> On 9/26/09, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> That patch results in the following message from flex:
>>
>> psqlscan.l:1039: warning, -s option given but default rule can be
>> matched

> Agh. Well, that just means the <xeu> state must be commented out:

> -%x xeu
> +/* %x xeu */

Ick --- that breaks the whole concept of keeping the two sets of
flex rules in sync. And it's quite unclear why it fixes the problem,
too. At the very least, if you do it that way, it needs a comment
explaining exactly why it's different from the backend.

regards, tom lane


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-26 19:17:39
Message-ID: e51f66da0909261217j4ee21169q2aa33d40fda93bfc@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

Resend...

On 9/26/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > On 9/26/09, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
> >> That patch results in the following message from flex:
> >>
> >> psqlscan.l:1039: warning, -s option given but default rule can be
> >> matched
>
> > Agh. Well, that just means the <xeu> state must be commented out:
>
> > -%x xeu
> > +/* %x xeu */
>
>
> Ick --- that breaks the whole concept of keeping the two sets of
> flex rules in sync. And it's quite unclear why it fixes the problem,
> too. At the very least, if you do it that way, it needs a comment
> explaining exactly why it's different from the backend.

Commenting it out fixes the problem because I had copy-pasted the state
declaration without any rules in it.

Anyway, I have now attached a patch in which I filled in the section, but
without referring to it from anywhere. The rules themselves are now
identical. Is that OK?

--
marko

Attachment Content-Type Size
psql-unicode2.diff text/x-diff 1.2 KB
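
The situation described in this message, an exclusive start condition
declared with %x but given no rules, can be spotted with a small
script. In such a state only flex's default rule can match, which is
presumably what the -s warning is reporting. The Python sketch below is
a rough illustration with a hypothetical function name; it looks only
at <state> rule prefixes, not at BEGIN actions.

```python
import re


def ruleless_exclusive_states(source: str) -> list:
    """List %x start conditions that no <state> rule prefix mentions."""
    definitions, _, rules = source.partition("\n%%")

    declared = []
    for line in definitions.splitlines():
        if line.startswith("%x"):
            declared.extend(line.split()[1:])   # names after "%x"

    used = set()
    for line in rules.splitlines():
        # Matches prefixes such as <xe> or <xq,xe>; <<EOF>> is ignored.
        m = re.match(r"<([A-Za-z_][\w,]*)>", line)
        if m:
            used.update(m.group(1).split(","))

    return [state for state in declared if state not in used]


sample = "%x xe\n%x xeu\n%%\n<xe>{xeinside}  { }\n"
print(ruleless_exclusive_states(sample))
```

For the sample input only xeu is reported; adding any rule prefixed
with <xeu> makes the list empty.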

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Unicode escapes in E'...' strings Author: Marko Kreen
Date: 2009-09-27 03:29:06
Message-ID: 14352.1254022146@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> Anyway, I have now attached a patch in which I filled in the section, but
> without referring to it from anywhere. The rules themselves are now
> identical. Is that OK?

Well, you also have to track the state changes (BEGIN).

In comparing the scanners I realized I'd forgotten to sync psql myself
when I was fooling around with the plpgsql scanner :-(. So mea culpa
as well ...

Fixed and applied.

regards, tom lane