Re: Bug in UTF8-Validation Code?

From: Mario Weilguni <mweilguni(at)sime(dot)com>
To: Michael Paesold <mpaesold(at)gmx(dot)at>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bug in UTF8-Validation Code?
Date: 2007-03-16 11:17:14
Message-ID: 200703161217.15110.mweilguni@sime.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Am Mittwoch, 14. März 2007 08:01 schrieb Michael Paesold:
> Andrew Dunstan wrote:
> >
> > This strikes me as essential. If the db has a certain encoding ISTM we
> > are promising that all the text data is valid for that encoding.
> >
> > The question in my mind is how we help people to recover from the fact
> > that we haven't done that.
>
> I would also say that it's a bug that escape sequences can get characters
> into the database that are not valid in the specified encoding. If you
> compare the encoding to table constraints, there is no way to simply
> "escape" a constraint check.
>
> This seems to violate the principle of consistency in ACID. Additionally,
> if you include pg_dump into ACID, it also violates durability, since it
> cannot restore what it wrote itself.
> Is there anything in the SQL spec that asks for such a behaviour? I guess
> not.
>
> A DBA will usually not even learn about this issue until they are presented
> with a failing restore.

Is there anything I can do to help with this problem? Maybe implementing a new
GUC variable that turns off accepting wrong encoded sequences (so DBAs still
can turn it on if they really depend on it)?

For me,

Best regards,
Mario Weilguni

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2007-03-16 11:50:05 Question: pg_class attributes and race conditions ?
Previous Message Grzegorz Jaskiewicz 2007-03-16 09:14:29 Re: [RFC] CLUSTER VERBOSE