Upcoming PG re-releases

From: Gregory Maxwell <gmaxwell(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Upcoming PG re-releases
Date: 2005-12-04 17:19:32
Message-ID: e692861c0512040919x56c7b18fva497a198e4195707@mail.gmail.com
Lists: pgsql-hackers pgsql-www

On 12/4/05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Paul Lindner <lindner(at)inuus(dot)com> writes:
> > On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
> >> Paul Lindner <lindner(at)inuus(dot)com> writes:
> >>> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> >>
> >> Is that really a one-size-fits-all solution? Especially with -c?
>
> > I'd say yes, and the -c flag is needed so iconv strips out the
> > invalid characters.
>
> That's exactly what's bothering me about it. If we recommend that
> we had better put a large THIS WILL DESTROY YOUR DATA warning first.
> The problem is that the data is not "invalid" from the user's point
> of view --- more likely, it's in some non-UTF8 encoding --- and so
> just throwing away some of the characters is unlikely to make people
> happy.

Nor is it even guaranteed to make the data load: if the column has a
unique constraint and removing the non-UTF8 characters makes two
rows identical where they weren't before, the load fails on that
constraint.
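
To illustrate the collision, here is a minimal Python sketch. It only
approximates what iconv -c does (it drops bytes that are not valid
UTF-8), and the sample values and the Latin-1 source encoding are
assumptions made up for the example, not data from any real dump.

```python
def strip_invalid_utf8(raw: bytes) -> str:
    """Decode as UTF-8, silently dropping bytes that are not valid UTF-8."""
    return raw.decode("utf-8", errors="ignore")


# Two distinct values as they might sit in a dump written in Latin-1
# (assumed encoding; hypothetical sample data).
rows = [b"Ren\xe9", b"Ren\xe8"]

cleaned = [strip_invalid_utf8(r) for r in rows]

# The rows were distinct before cleaning but are identical afterwards,
# so a unique constraint on this column would reject the second one.
has_collision = len(set(cleaned)) < len(set(rows))
print(cleaned, has_collision)
```

Both values lose their final byte and collapse to the same string, which
is exactly the case where the "fixed" dump no longer loads.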

The way to preserve the data is to switch the column to bytea.
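
As a rough sketch of why keeping the raw bytes helps, the Python below
treats each value as an opaque byte string, which is how a bytea column
would hold it. The sample values and the choice of Latin-1 as the true
encoding are assumptions for illustration only.

```python
# Hypothetical values in an unknown, non-UTF8 encoding, kept as raw bytes.
rows = [b"Ren\xe9", b"Ren\xe8"]

# Nothing is dropped, so the values stay distinct and a unique
# constraint over the raw bytes is still satisfied.
still_unique = len(set(rows)) == len(rows)

# Once the real encoding is known (assumed here to be Latin-1), the
# bytes can be converted to proper text without any loss.
decoded = [r.decode("latin-1") for r in rows]
print(still_unique, decoded)
```

The conversion to text can then happen later, per column, once someone
has worked out which encoding the data was really in.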
