Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Greg Stark" <stark(at)mit(dot)edu>
Cc: "Andres Freund" <andres(at)2ndquadrant(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] strerror() returns ??? in a UTF-8/C database with LC_MESSAGES=non-ASCII
Date: 2013-09-07 11:06:16
Message-ID: 76E87134576A4731A8AAFE70DD4D3910@maumau
Lists: pgsql-hackers

Thank you for your opinions and ideas.

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Greg Stark <stark(at)mit(dot)edu> writes:
>> What would be nicer would be to display the C define, EINVAL, EPERM, etc.
>> Afaik there's no portable way to do that though. I suppose we could just
>> have a small array or hash table of all the errors we know about and look
>> it up.
>
> Yeah, I was just thinking the same thing. We could do
>
> switch (errno)
> {
> case EINVAL: str = "EINVAL"; break;
> case ENOENT: str = "ENOENT"; break;
> ...
> #ifdef EFOOBAR
> case EFOOBAR: str = "EFOOBAR"; break;
> #endif
> ...
>
> for all the common or even less-common names, and only fall back on
> printing a numeric value if it's something really unusual.
>
> But I still maintain that we should only do this if we can't get a useful
> string out of strerror().

OK, I'll take this approach. That is:

str = strerror(errnum);
if (str == NULL || *str == '\0' || *str == '?')
{
    switch (errnum)
    {
        case EINVAL: str = "errno=EINVAL"; break;
        case ENOENT: str = "errno=ENOENT"; break;
        ...
#ifdef EFOOBAR
        case EFOOBAR: str = "errno=EFOOBAR"; break;
#endif
        default:
            snprintf(errorstr_buf, sizeof(errorstr_buf),
                     _("operating system error %d"), errnum);
            str = errorstr_buf;
            break;
    }
}

The number of question marks probably depends on the original message, so I
check only the first character instead of doing strcmp() against "???".
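
For illustration, the fallback above can be written as a small self-contained sketch. The helper names (errno_symbol, message_is_useful, choose_error_text, describe_errno) and the particular set of errno values are assumptions made for this example and are not taken from the patch. The _() translation macro is omitted so that the block compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: symbolic name for a few well-known errno values,
 * or NULL if the value is not in this (deliberately short) list. */
static const char *
errno_symbol(int errnum)
{
    switch (errnum)
    {
        case EINVAL: return "errno=EINVAL";
        case ENOENT: return "errno=ENOENT";
        case EPERM:  return "errno=EPERM";
        case EACCES: return "errno=EACCES";
        default:     return NULL;
    }
}

/* A message is treated as useless if it is NULL, empty, or starts with '?'.
 * Only the first character is checked, so any number of '?' is caught. */
static int
message_is_useful(const char *msg)
{
    return msg != NULL && *msg != '\0' && *msg != '?';
}

/* Pick the text to report: the strerror() result if useful, else the
 * symbolic name, else a numeric fallback written into buf. */
static const char *
choose_error_text(const char *msg, int errnum, char *buf, size_t buflen)
{
    const char *sym;

    assert(buf != NULL && buflen > 0);

    if (message_is_useful(msg))
        return msg;

    sym = errno_symbol(errnum);
    if (sym != NULL)
        return sym;

    snprintf(buf, buflen, "operating system error %d", errnum);
    return buf;
}

/* Convenience wrapper that feeds the real strerror() result through
 * the selection logic above. */
static const char *
describe_errno(int errnum, char *buf, size_t buflen)
{
    return choose_error_text(strerror(errnum), errnum, buf, buflen);
}
```

Separating choose_error_text() from the strerror() call keeps the decision logic independent of the current locale, which makes it possible to exercise the "???" case without a misconfigured LC_MESSAGES.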

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> There is certainly no way we'd risk back-patching something with as
> many potential side-effects as fooling with libc's textdomain.

Agreed. It would be better to avoid relying on undocumented behavior
(i.e., that strerror() uses libc.mo) if we can take another approach.

> BTW: personally, I would say that what you're looking at is a glibc bug.
> I always thought the contract of gettext was to return the ASCII version
> if it fails to produce a translated version. That might not be what the
> end user really wants to see, but surely returning something like "???"
> is completely useless to anybody.

I think so, too. Under the same conditions, PostgreSQL built with Oracle
Studio on Solaris outputs correct Japanese for strerror(), and Windows
outputs English. I'll contact the glibc team to ask for an improvement.

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> I dislike that on grounds of readability and translatability; and
> I'm also of the opinion that errno codes aren't really consistent
> enough across platforms to be all that trustworthy for remote diagnostic
> purposes. I'm fine with printing the code if strerror fails to
> produce anything useful --- but not if it succeeds.

I don't think this is a concern, because we should in any case ask people
who report trouble which operating system their database server runs on.

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> There isn't any way to cram this information
> into the current usage of %m without doing damage to the readability and
> translatability of the string. Our style & translatability guidelines
> specifically recommend against assembling messages out of fragments,
> and also against sticking in parenthetical additions.

From: "Andres Freund" <andres(at)2ndquadrant(dot)com>
> If we'd add the errno inside %m processing, I don't see how it's
> a problem for translation?

I agree with Andres. I don't see any problem if we don't translate "errno=%d".

I'll submit a revised patch next week. However, I believe my original
approach is better, because it outputs a user-friendly Japanese message
instead of "errno=ENOENT". Also, outputting both the errno value and its
descriptive text is more useful, because the former is convenient for
OS/library experts and the latter is convenient for PostgreSQL users. Any
better ideas would be much appreciated.
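
As a rough sketch of what "both the errno value and its descriptive text" could look like, the helper below appends the untranslated number to whatever message is available. The function name append_errno and the "(errno=%d)" format are assumptions made for this example; the exact format used by the original patch is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: combine a (possibly translated) message with the
 * numeric errno. The "errno=%d" part is intentionally left untranslated.
 * Output is truncated to fit buf and is always NUL-terminated. */
static const char *
append_errno(const char *msg, int errnum, char *buf, size_t buflen)
{
    assert(msg != NULL && buf != NULL && buflen > 0);

    snprintf(buf, buflen, "%s (errno=%d)", msg, errnum);
    return buf;
}
```

With this shape, the descriptive text stays readable for PostgreSQL users while the number remains available for OS/library experts, even when the text itself degrades to question marks.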

Regards
MauMau
