Re: [Bacula-users] Catastrophic changes to PostgreSQL 8.4

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Jerome Alet <jerome(dot)alet(at)univ-nc(dot)nc>, Kern Sibbald <kern(at)sibbald(dot)com>, bacula-devel <bacula-devel(at)lists(dot)sourceforge(dot)net>, pgsql-general(at)postgresql(dot)org, bacula-users <bacula-users(at)lists(dot)sourceforge(dot)net>
Subject: Re: [Bacula-users] Catastrophic changes to PostgreSQL 8.4
Date: 2009-12-03 05:35:23
Message-ID: 4B174E1B.2060202@postnewspapers.com.au
Lists: pgsql-general pgsql-hackers

Stephen Frost wrote:
> * Craig Ringer (craig(at)postnewspapers(dot)com(dot)au) wrote:
>> ... so it's defaulting to SQL_ASCII, but actually supports utf-8 if your
>> systems are all in a utf-8 locale. Assuming there's some way for the
>> filed to find out the encoding of the director's database, it probably
>> wouldn't be too tricky to convert non-matching file names to the
>> director's encoding in the fd (when the director's encoding isn't
>> SQL_ASCII, of course).
>
> I'm not sure which piece of bacula connects to PostgreSQL, but whatever
> it is, it could just send a 'set client_encoding' to the PG backend and
> all the conversion will be done by PG.

The director is responsible for managing all the metadata, and it's the
component that connects to Pg.

If the fd sent the system charset along with the bundle of filenames etc
that it sends to the director, then I don't see why the director
couldn't `SET client_encoding' appropriately before inserting data from
that fd, then `RESET client_encoding' once the batch insert was done.
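A minimal sketch of that statement sequence, written in Python only for
brevity (Bacula itself is not written in Python). The function name
wrap_batch and the table name file_batch are hypothetical, and the sketch
assumes the fd has reported a charset name that PostgreSQL recognises:

```python
import re

# Encoding names are spliced into the SQL text, so accept only a
# conservative set of characters (letters, digits, '_' and '-').
_ENCODING_NAME = re.compile(r"[A-Za-z0-9_-]+")


def wrap_batch(fd_encoding, insert_statements):
    """Return the ordered SQL the director would send for one fd's batch.

    fd_encoding is the charset name reported by the fd (an assumption:
    the fd would need to send it along with the file names).
    """
    if not _ENCODING_NAME.fullmatch(fd_encoding):
        raise ValueError("suspicious encoding name: %r" % (fd_encoding,))
    return (
        ["SET client_encoding TO '%s'" % fd_encoding]
        + list(insert_statements)
        + ["RESET client_encoding"]
    )


# Hypothetical usage: one insert into a made-up table.
for statement in wrap_batch(
    "LATIN1", ["INSERT INTO file_batch (name) VALUES ('report.txt')"]
):
    print(statement)
```

If the batch runs inside a transaction that is later rolled back, the SET
should be undone along with it, so the RESET matters mainly on the
success path.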

The only downside is that if even one file has invalidly encoded data,
the whole batch insert fails and is rolled back. For that reason, I'd
personally prefer that the fd handle conversion so that it can exclude
such files (with a loud complaint in the error log) or munge the file
name into something that _can_ be stored.
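As a rough illustration of fd-side conversion with exclusion, here is a
sketch in Python (again only for brevity). The function name
convert_names is hypothetical, and it assumes the fd somehow knows both
its own charset and the director's database encoding:

```python
import logging


def convert_names(raw_names, fd_charset, director_encoding):
    """Convert raw file names (bytes) from the fd's charset to the
    director's encoding. Names that cannot be converted are excluded
    and logged, so they never reach the batch insert.

    Returns (converted, rejected), both lists of bytes.
    """
    converted, rejected = [], []
    for raw in raw_names:
        try:
            converted.append(raw.decode(fd_charset).encode(director_encoding))
        except UnicodeError:
            # Covers both invalid input bytes and characters that the
            # target encoding cannot represent.
            logging.warning("skipping unconvertible file name: %r", raw)
            rejected.append(raw)
    return converted, rejected


ok, bad = convert_names(
    [b"caf\xc3\xa9", b"\xe2\x82\xac.txt", b"bad\xff"], "utf-8", "latin-1"
)
print(ok)   # names safe to hand to the director
print(bad)  # names excluded, with a warning logged for each
```

Excluding is the simpler policy; munging the name instead would keep the
file in the catalog at the cost of storing a name that differs from the
one on disk.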

Come to think of it, even if the fd and the database both use a utf-8
encoding, the fd should *still* validate the utf-8 filenames it reads.
Just because the system thinks a filename should be utf-8 doesn't
guarantee that it's actually valid utf-8. It'd be good to catch this at
the fd rather than let it mess up the director's batch insert. That
would make it much safer than it presently is to use Bacula with a
utf-8 database.
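The validation step could look something like this sketch (Python for
brevity; safe_utf8_name is a hypothetical helper, and escaping the bad
bytes is just one possible way to munge a name into something storable):

```python
def safe_utf8_name(raw):
    """Validate a raw file name (bytes) as utf-8.

    Returns (name, munged). If the bytes are valid utf-8, name is the
    decoded string and munged is False. Otherwise each invalid byte is
    replaced by a visible \\xNN escape and munged is True, so the caller
    can log a complaint.
    """
    try:
        return raw.decode("utf-8"), False
    except UnicodeDecodeError:
        return raw.decode("utf-8", errors="backslashreplace"), True


for raw in (b"ok.txt", b"caf\xc3\xa9", b"caf\xe9"):
    name, munged = safe_utf8_name(raw)
    print(repr(raw), "->", repr(name), "(munged)" if munged else "")
```

One caveat with this kind of munging: the escaped name can collide with
a real file whose name literally contains the text \xNN, so a production
version would need some way to tell the two apart.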

--
Craig Ringer
