Re: Getting the red out (of the buildfarm)

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Getting the red out (of the buildfarm)
Date: 2009-10-04 07:18:01
Message-ID: 1254640681.13996.10.camel@fsopti579.F-Secure.com
Lists: pgsql-hackers

On Sat, 2009-10-03 at 13:40 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > OK, the reason I couldn't reproduce this for the life of me is that I
> > had PGCLIENTENCODING=UTF8 in the environment of the server(!). Once I
> > unset that, I could reproduce the problem. This could be made a bit
> > more well-defined if we ran pg_regress with --multibyte=something,
> > although that is then liable to fail in encodings that don't have an
> > equivalent of \u0080. Same with your suggestion above: It will only
> > work for some encodings.
>
> I'm back to wondering why we need a regression test for this at all.
> Wouldn't it be just as useful to be testing a character code that
> is well-defined everywhere? Or just drop this test altogether?
> It's already got way too many expected files for my taste.

Note that I didn't write this test; it has been there for ages. It used
to prove that you couldn't process non-ASCII Unicode characters in
PL/Python at all (for some value of "at all" ...), and after I
implemented Unicode support it now shows that you can. So it served
a real purpose, and changing it to use an ASCII character code (which
is presumably the only thing that is "well-defined everywhere") wouldn't
have done the same thing. (In that case I probably would have had to
write the test case myself.)

I understand the annoyance, but I think we do need to have an organized
way to do testing of non-ASCII data and in particular UTF8 data, because
there are an increasing number of special code paths for those. Perhaps
we could have a naming convention for test files like testname.utf8.sql,
so they only get run in the appropriate environment. Any scheme like
that has the disadvantage, however, that the proper rejection of
non-ASCII data in ASCII environments isn't tested. (That's what all
these alternative result files for the plpython_unicode test are for,
btw.)
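
To make the naming idea concrete, here is a rough sketch in Python of how
a test driver might pick files by encoding suffix. This is not how
pg_regress works today; the function names select_tests and normalize,
the suffix parsing, and the encoding-name normalization are all
hypothetical and only illustrate the convention.

```python
import re

# Hypothetical convention: "name.<encoding>.sql" is an encoding-specific test,
# while a plain "name.sql" is assumed to be runnable in any encoding.
_TAGGED = re.compile(r"^(?P<name>.+)\.(?P<enc>[A-Za-z0-9_-]+)\.sql$")


def normalize(encoding):
    """Fold spellings such as 'UTF-8', 'utf8' and 'UTF_8' to one form."""
    return encoding.lower().replace("-", "").replace("_", "")


def select_tests(filenames, server_encoding):
    """Return the test files that should run under server_encoding."""
    wanted = normalize(server_encoding)
    selected = []
    for filename in filenames:
        if not filename.endswith(".sql"):
            continue  # not a test script at all
        match = _TAGGED.match(filename)
        if match is None:
            selected.append(filename)  # untagged: runs everywhere
        elif normalize(match.group("enc")) == wanted:
            selected.append(filename)  # tagged for this encoding
    return selected


if __name__ == "__main__":
    files = ["plpython_test.sql", "plpython_unicode.utf8.sql"]
    print(select_tests(files, "UTF8"))
    print(select_tests(files, "SQL_ASCII"))
```

Note that under SQL_ASCII the tagged file is simply skipped, which is
exactly the disadvantage mentioned above: the rejection path never gets
exercised.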
