From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Getting the red out (of the buildfarm) |
Date: | 2009-10-04 07:18:01 |
Message-ID: | 1254640681.13996.10.camel@fsopti579.F-Secure.com |
Lists: | pgsql-hackers |
On Sat, 2009-10-03 at 13:40 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > OK, the reason I couldn't reproduce this for the life of me is that I
> > had PGCLIENTENCODING=UTF8 in the environment of the server(!). Once I
> > unset that, I could reproduce the problem. This could be made a bit
> > more well-defined if we ran pg_regress with --multibyte=something,
> > although that is then liable to fail in encodings that don't have an
> > equivalent of \u0080. Same with your suggestion above: It will only
> > work for some encodings.
>
> I'm back to wondering why we need a regression test for this at all.
> Wouldn't it be just as useful to be testing a character code that
> is well-defined everywhere? Or just drop this test altogether?
> It's already got way too many expected files for my taste.
Note that I didn't write this test; it has been there for ages. It used
to prove that you couldn't process non-ASCII Unicode characters in
PL/Python at all (for some value of "at all" ...), and since I
implemented Unicode support it now shows that you can. So it served
a real purpose, and changing it to use an ASCII character code (which
is presumably the only thing that is "well-defined everywhere") would
not have served that purpose. (In that case I probably would have had to
write the test case myself.)
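To make the "well-defined everywhere" point concrete, here is a rough
Python sketch (not taken from the actual test; the helper name
can_encode is made up, and the names used are Python codec names, not
PostgreSQL encoding names) that checks whether \u0080 can be represented
in a given encoding:

```python
def can_encode(ch, encoding):
    """Return True if the character can be represented in the encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample of encodings; any Python codec name would do here.
for enc in ("ascii", "latin-1", "utf-8"):
    print(enc, can_encode("\u0080", enc))
```

An ASCII character passes this check for all three encodings, which is
why it would no longer exercise the non-ASCII code paths.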
I understand the annoyance, but I think we do need to have an organized
way to do testing of non-ASCII data and in particular UTF8 data, because
there is an increasing number of special code paths for those. Perhaps
we could have a naming convention for test files like testname.utf8.sql,
so they only get run in the appropriate environment. Any scheme like
that has the disadvantage, however, that the proper rejection of
non-ASCII data in ASCII environments isn't tested. (That's what all
these alternative result files for the plpython_unicode test are for,
btw.)
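As a rough sketch of what such a naming convention could look like (this
is only an illustration in Python, not how pg_regress works; the
function select_tests and the sample file names are hypothetical), a
test driver might pick files like this:

```python
import os


def select_tests(filenames, server_encoding):
    """Pick the test files to run for the given server encoding.

    Assumed convention: testname.sql always runs, while
    testname.<encoding>.sql runs only when <encoding> matches.
    """
    selected = []
    for name in filenames:
        base, ext = os.path.splitext(name)
        if ext != ".sql":
            continue  # not a test script
        # A second extension, if present, is taken as the encoding tag.
        _, tag = os.path.splitext(base)
        if tag == "" or tag[1:].lower() == server_encoding.lower():
            selected.append(name)
    return selected


tests = ["plpython_test.sql", "plpython_unicode.utf8.sql"]
print(select_tests(tests, "utf8"))
print(select_tests(tests, "sql_ascii"))
```

With a non-UTF8 encoding the tagged file is simply skipped, which is
exactly the drawback mentioned above: nothing checks that the non-ASCII
data is rejected properly there.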