Re: Bug in UTF8-Validation Code?

From:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Jeff Davis <pgsql(at)j-davis(dot)com>, Michael Fuhr <mike(at)fuhr(dot)org>, Mario Weilguni <mweilguni(at)sime(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Bug in UTF8-Validation Code?
Date:	2007-03-18 11:36:22
Message-ID:	20070318113622.GA5722@svana.org
Lists:	pgsql-hackers

On Sat, Mar 17, 2007 at 11:46:01AM -0400, Andrew Dunstan wrote:
> How can we fix this? Frankly, the statement in the docs warning about
> making sure that escaped sequences are valid in the server encoding is a
> cop-out. We don't accept invalid data elsewhere, and this should be no
> different IMNSHO. I don't see why this should be any different from,
> say, date or numeric data. For years people have sneered at MySQL
> because it accepted dates like Feb 31st, and rightly so. But this seems
> to me to be like our own version of the same problem.

It seems to me that the easiest solution would be to forbid \x?? escape
sequences with values greater than \x7F for UTF-8 server encodings, and
instead introduce a \u escape for specifying the Unicode character
directly. The basic principle is that any escape sequence still has to
represent a single character. The result can be multiple bytes, but
you don't have to check for consistency anymore.
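
To make the idea concrete, here is a rough sketch in Python rather than
anything resembling the actual backend code. The function name
unescape_utf8, the fixed two-digit \xHH form and the fixed four-digit
\uXXXX form are all assumptions made for illustration; the proposal
above does not specify a syntax.

```python
import string


def unescape_utf8(literal: str) -> bytes:
    """Expand \\xHH and \\uXXXX escapes into UTF-8 bytes.

    Hypothetical helper: the escape widths (2 and 4 hex digits) are
    assumptions for this sketch, not PostgreSQL's actual rules.
    """
    out = bytearray()
    i = 0
    while i < len(literal):
        ch = literal[i]
        if ch != "\\":
            # Ordinary character: emit its UTF-8 encoding unchanged.
            out += ch.encode("utf-8")
            i += 1
            continue

        kind = literal[i + 1 : i + 2]
        if kind == "x":
            width = 2
        elif kind == "u":
            width = 4
        else:
            raise ValueError(f"unsupported escape at position {i}")

        digits = literal[i + 2 : i + 2 + width]
        if len(digits) != width or any(c not in string.hexdigits for c in digits):
            raise ValueError(f"malformed \\{kind} escape at position {i}")
        value = int(digits, 16)

        # \x may only name a single ASCII character, so it can never
        # produce a partial multi-byte sequence.
        if kind == "x" and value > 0x7F:
            raise ValueError("\\x escape above \\x7F rejected; use \\u instead")
        # Surrogate code points are not characters and have no UTF-8 form.
        if 0xD800 <= value <= 0xDFFF:
            raise ValueError("surrogate code point is not a valid character")

        # One escape, one character; the encoder produces the bytes,
        # so the output is well-formed UTF-8 by construction.
        out += chr(value).encode("utf-8")
        i += 2 + width
    return bytes(out)
```

With this, unescape_utf8(r"caf\u00e9") yields the two-byte sequence for
U+00E9, while unescape_utf8(r"\xe9") raises an error instead of letting a
lone 0xE9 byte into the string.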

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

In response to

Re: Bug in UTF8-Validation Code? at 2007-03-17 15:46:01 from Andrew Dunstan

Responses

Re: Bug in UTF8-Validation Code? at 2007-03-18 12:25:56 from Andrew Dunstan
Re: Bug in UTF8-Validation Code? at 2007-03-19 12:42:35 from Mario Weilguni
