printQuery API change proposal (was Re: psql \dFp's behavior)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: printQuery API change proposal (was Re: psql \dFp's behavior)
Date: 2007-12-11 22:42:35
Message-ID: 7920.1197412955@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> describe.c's whole approach to this has always been pretty thoroughly
> broken in my mind, because it makes untenable assumptions about the
> client-side gettext() producing strings that are in the current
> client_encoding. If they are not, the server will probably reject
> the SQL query as failing encoding verification.

> We should be fixing it so that the translated strings never go to the
> server and back at all. This doesn't seem amazingly hard for column
> headings --- it'd take some API additions in print.c, I think.
> If we are actually embedding translated words in the data
> then it'd be a bigger problem.

I looked at the code a bit closer, and my vague memory was correct:
describe.c mostly uses translated strings for column headers, e.g.

    printfPQExpBuffer(&buf,
                      "SELECT spcname AS \"%s\",\n"
                      " pg_catalog.pg_get_userbyid(spcowner) AS \"%s\",\n"
                      " spclocation AS \"%s\"",
                      _("Name"), _("Owner"), _("Location"));

but there are also a few places where it wants a column to contain
translated values, for example

    " CAST(\n"
    " CASE c.relkind WHEN 'r' THEN '%s' WHEN 'v' THEN '%s' WHEN 'i' THEN '%s' WHEN 'S' THEN '%s' END"
    " AS pg_catalog.text) as object\n"
    ...
    _("table"), _("view"), _("index"), _("sequence")

It would be reasonably straightforward to get rid of sending the column
headers to the server, since the underlying printTable function already
accepts column headers as a separate array argument; we could ignore
the column headers coming back from the server and just inject correctly
translated strings instead. However, the data values are a bit harder.

What I'm tempted to do is add a couple of optional fields to struct
printQueryOpt that specify translatable strings in column headers and
column contents, respectively:

    bool translate_headers;
    bool *translate_columns; /* translate_columns[i-1] applies to column i */

If these are set then printQuery would run through the headers and/or
contents of specific columns and apply gettext() on the indicated
strings, after it had finished disassembling the PGresult into arrays.
(Since we don't want to be doing gettext() on random strings, we need
to indicate exactly which columns should be processed.) To ensure that
the strings are available for translation, all the _("x") instances in
describe.c would change to gettext_noop("x"), but otherwise that code
would only need to change to the extent of setting the new option fields
in printQueryOpt. This means the server sees only untranslated
plain-ASCII strings and shouldn't get upset about encoding issues.

We'd still have a problem if we wanted to put a single-quote mark in an
untranslated string (or a double-quote, in the case of a column header),
but so far there's been no need for that. If it did come up, we could
handle it in the style Guillaume suggested, that is
appendStringLiteral(gettext_noop("foo's a problem")). So I think it's
not necessary to contort the general solution to make that case easier.
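
For illustration only, the quoting such a helper would have to do looks
roughly like the following. demo_quote_literal() is a hypothetical
function, not the real appendStringLiteral; it only doubles embedded
single quotes and deliberately ignores backslash and encoding handling,
which real code would need to consider.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write src into dst as a single-quoted SQL literal,
 * doubling any embedded single quotes.  Returns the number of bytes
 * written (excluding the terminator), or 0 if dst might be too small.
 */
static size_t
demo_quote_literal(char *dst, size_t dstsize, const char *src)
{
    size_t      n = 0;
    const char *p;

    assert(dst != NULL && src != NULL);

    /* worst case: every byte doubled, plus two quotes and the terminator */
    if (dstsize < 2 * strlen(src) + 3)
        return 0;

    dst[n++] = '\'';
    for (p = src; *p != '\0'; p++)
    {
        if (*p == '\'')
            dst[n++] = '\'';    /* emit the extra quote first */
        dst[n++] = *p;
    }
    dst[n++] = '\'';
    dst[n] = '\0';
    return n;
}
```

The untranslated string is what gets quoted and sent to the server; the
translation would still happen client-side afterwards.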

printQueryOpt isn't exported anywhere but bin/psql and bin/scripts,
so changing it doesn't create an ABI break.

Objections, better ideas?

regards, tom lane
