Re: strange encoding behavior


  • From: "Albe Laurenz" <all(at)adv(dot)magwien(dot)gv(dot)at>
  • To: "Jeff Davis *EXTERN*" <jdavis(at)laika(dot)com>, <pgsql-general(at)postgresql(dot)org>
  • Subject: Re: strange encoding behavior
  • Date: Mon, 23 Oct 2006 10:26:34 +0200
  • Message-id: <52EF20B2E3209443BC37736D00C3C1380B0918F4(at)EXADV1(dot)host(dot)magwien(dot)gv(dot)at>

Jeff Davis wrote:
> I have a UTF8 encoded database. I can do
> 
> => SELECT '\xb9'::text;
> 
> But that seems to be the only way to get an invalid utf8 byte sequence
> into a text type.
[...]
> So, if I were to sum this up in a single question, why does cstring not
> accept invalid utf8 sequences? And if it doesn't, why are they allowed
> in any text type?

I would say that it should be impossible to get invalid UTF-8 bytes
into a text value in a UTF-8 database, and in my opinion it is a bug or
oversight if a typecast allows you to do so.

The program you mention, which needs to store arbitrary bytes in a
text column, should be changed. It may be enough to change the data
type of the database column from 'text' to 'bytea'.
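
As a rough illustration of why the lone byte 0xb9 is a problem, here is a
small Python sketch that checks on the client side whether a byte string
is valid UTF-8. The function names is_valid_utf8 and
suggested_column_type are made up for this example and are not part of
PostgreSQL or any driver; the check uses Python's strict UTF-8 decoder,
which may differ in detail from the server's own validation.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def suggested_column_type(data: bytes) -> str:
    """Hypothetical helper: valid UTF-8 fits 'text', anything else 'bytea'."""
    return "text" if is_valid_utf8(data) else "bytea"


# 0xb9 on its own is a UTF-8 continuation byte with no lead byte,
# so it is not a valid sequence.
lone_byte = b"\xb9"

# The character that 0xb9 stands for in LATIN1 (superscript one, U+00B9)
# is two bytes in UTF-8: 0xc2 0xb9.
superscript_one = "\u00b9".encode("utf-8")

for sample in (lone_byte, superscript_one):
    print(sample, is_valid_utf8(sample), suggested_column_type(sample))
```

A check of this kind only decides which column type fits the data; it
does not replace validation on the server.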

Yours,
Laurenz Albe


