Re: invalidly encoded strings

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, andrew(at)dunslane(dot)net, laurenz(dot)albe(at)wien(dot)gv(dot)at, pgsql-hackers(at)postgresql(dot)org
Subject: Re: invalidly encoded strings
Date: 2007-09-11 06:35:33
Message-ID: 1189492533.5924.84.camel@jdavis
Lists: pgsql-hackers pgsql-patches

On Tue, 2007-09-11 at 14:50 +0900, Tatsuo Ishii wrote:
>
> > On Tue, 2007-09-11 at 12:29 +0900, Tatsuo Ishii wrote:
> > > Please show me concrete examples how I could introduce a
> > > vulnerability using this kind of convert() usage.
> >
> > Try the sequence below. Then, try to dump and then reload the
> > database. When you try to reload it, you will get an error:
> >
> > ERROR: invalid byte sequence for encoding "UTF8": 0xbd
>
> I know this could be a problem (like chr() with invalid byte pattern).
> What I really want to know is, read query something like this:
>
> SELECT * FROM japanese_table
>   ORDER BY convert(japanese_text using utf8_to_euc_jp);

I guess I don't quite understand the question.

I agree that ORDER BY convert() must be safe in the C locale, because it
just passes the strings to strcmp().
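
To make the C locale point concrete, here is a rough Python sketch (not
PostgreSQL code; the helper name c_locale_sort is made up for
illustration). Byte-wise comparison never has to interpret the bytes as
characters, so it gives a well-defined order even when the strings are
not valid in the database encoding:

```python
def c_locale_sort(values):
    """Sort byte strings by raw byte value, similar in spirit to strcmp()."""
    # Python compares bytes objects lexicographically by unsigned byte value.
    return sorted(values)


# EUC-JP encoded strings; these byte sequences are not valid UTF-8.
samples = ["日本".encode("euc_jp"), "あ".encode("euc_jp"), b"abc"]
ordered = c_locale_sort(samples)

# Decoding the EUC-JP bytes as UTF-8 fails, yet the sort above still worked.
try:
    samples[0].decode("utf-8")
    euc_is_valid_utf8 = True
except UnicodeDecodeError:
    euc_is_valid_utf8 = False
```

The sort itself is harmless. The trouble in the earlier example only
starts when such bytes are stored as text and later reloaded.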

Are you saying that we should not remove convert() until we can support
multiple locales in one database?

If we make convert() operate on bytea and return bytea, as Tom
suggested, would that solve your use case?
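
As a sketch of what I mean by bytea in, bytea out (Python only for
illustration; convert_bytes and its parameters are hypothetical, not the
actual convert() signature): the result is just bytes, so nothing ever
claims it is valid in the database encoding, and it can still be used as
a sort key.

```python
def convert_bytes(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode raw bytes from encoding src to encoding dst.

    Raises UnicodeDecodeError if data is not valid in src.
    """
    return data.decode(src).encode(dst)


# Rows hold UTF-8 bytes; order them by their EUC-JP byte representation.
rows = [s.encode("utf-8") for s in ["日本", "あ", "abc"]]
rows_sorted = sorted(rows, key=lambda r: convert_bytes(r, "utf-8", "euc_jp"))
```

The stored rows stay UTF-8 throughout; only the sort key is in EUC-JP.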

Regards,
Jeff Davis
