Re: Unicode string literals versus the world

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unicode string literals versus the world
Date: 2009-04-11 18:47:39
Message-ID: e51f66da0904111147xd206355h49bc143eb853bb65@mail.gmail.com
Lists: pgsql-hackers

On 4/11/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> It gets worse though: I have seldom seen such a badly designed piece of
> syntax as the Unicode string syntax --- see
> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
>
> You scan the string, and then after that they tell you what the escape
> character is!? Not to mention the obvious ambiguity with & as an
> operator.
>
> If we let this go into 8.4, our previous rounds with security holes
> caused by careless string parsing will look like a day at the beach.
> No frontend that isn't fully cognizant of the Unicode string syntax is
> going to parse such things correctly --- it's going to be trivial for
> a bad guy to confuse a quoting mechanism as to what's an escape and what
> isn't.
>
> I think we need to give very serious consideration to ripping out that
> "feature".

Ugh, it's rather dubious indeed, especially when we are already in
the middle of a seriously confusing conversion from
standard_conforming_strings = off to on. Is it really OK to add even
more complexity to the mix?
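
To make the quoted concern concrete, here is a minimal Python sketch of why the trailing UESCAPE clause is awkward for any frontend that scans literals. The function name decode_unicode_body and its simplified rules are assumptions made for illustration, not PostgreSQL's actual lexer. It decodes the body of a U&'...' literal under a given escape character and leaves malformed escapes untouched, where a real server would raise an error.

```python
import re


def decode_unicode_body(body, escape="\\"):
    """Decode the body of a U&'...' literal under one escape character.

    Handles <esc>XXXX, <esc>+XXXXXX and a doubled escape character.
    Hypothetical helper: malformed escapes are left as-is here.
    """
    esc = re.escape(escape)
    pattern = re.compile(
        esc + r"(?:(" + esc + r")|\+([0-9A-Fa-f]{6})|([0-9A-Fa-f]{4}))"
    )

    def replace(match):
        if match.group(1) is not None:
            # Doubled escape character stands for itself.
            return escape
        digits = match.group(2) or match.group(3)
        return chr(int(digits, 16))

    return pattern.sub(replace, body)


# The same literal body, read before and after the UESCAPE clause is known.
body = "d!0061t!+000061"
default_reading = decode_unicode_body(body)               # escape is backslash
uescape_reading = decode_unicode_body(body, escape="!")   # UESCAPE '!'
```

The two readings differ, so a tool that stops at the closing quote and never sees UESCAPE '!' ends up with a different string than the server does.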

Alternative proposal: maybe it would be saner to add a \uXXXX
escape to E'' strings as a non-standard way of quoting Unicode.

Later, when standard quoting is our only quoting method, we can play
with standard extensions?
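
A rough sketch of what the \uXXXX proposal could look like to a scanner, written in Python for illustration only. The name decode_e_body, the small set of simple escapes, and the choice to pass unknown escapes through literally are all assumptions, not a description of the server's behaviour. The point it tries to show is that the escape character is fixed and visible at the moment each character is scanned, so one left-to-right pass is enough.

```python
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", "'": "'"}
HEX_DIGITS = set("0123456789abcdefABCDEF")


def decode_e_body(body):
    """Decode the body of an E'...' literal, including a \\uXXXX escape."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash at end of literal")
        kind = body[i + 1]
        if kind == "u":
            # Proposed escape: exactly four hex digits follow.
            digits = body[i + 2 : i + 6]
            if len(digits) != 4 or not set(digits) <= HEX_DIGITS:
                raise ValueError("\\u must be followed by four hex digits")
            out.append(chr(int(digits, 16)))
            i += 6
        elif kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        else:
            # Assumption: any other escaped character is taken literally.
            out.append(kind)
            i += 2
    return "".join(out)


decoded = decode_e_body(r"d\u0061t\u0061")
```

Unlike the UESCAPE case, nothing after the closing quote can change how the body is read.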

--
marko
