Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.

From: Alex Hunsaker <badalex(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
Date: 2011-02-12 09:18:36
Message-ID: AANLkTinzzzJCJE=Ac_kZOOv4Hirogd9M2Wjjcecx_Si1@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Sun, Feb 6, 2011 at 15:31, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Force strings passed to and from plperl to be in UTF8 encoding.
>
> Strings are converted to UTF8 on the way into perl and to the
> database encoding on the way back. This avoids a number of
> observed anomalies, and ensures Perl a consistent view of the
> world.

So I noticed a problem while playing with this in my discussion with
David Wheeler. pg_do_encoding() does nothing when the src encoding ==
the dest encoding. That means on a UTF-8 database we fail to make sure
our strings are valid utf8.

An easy way to see this is to embed a null in the middle of a string:
=> create or replace function zerob() returns text as $$
     return "abcd\0efg";
   $$ language plperl;
=> SELECT zerob();
 abcd
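
To make the failure mode concrete, here is a minimal standalone C sketch. It is
not the code from the attached patch, and utf8_is_valid is a hypothetical
helper name, not a PostgreSQL function. It assumes the truncation above happens
because the Perl string carries an explicit length while the result ends up
being treated as a NUL-terminated C string, and it assumes that an embedded NUL
should count as invalid (which is my understanding of how the backend's own
multibyte verification behaves).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a PostgreSQL function): returns 1 if the
 * len-byte buffer is well-formed UTF-8 with no embedded NUL, else 0.
 */
static int
utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL);

    while (i < len)
    {
        unsigned char c = s[i];
        size_t        extra;    /* continuation bytes expected */
        unsigned long cp;       /* decoded code point */

        if (c == 0x00)
            return 0;           /* embedded NUL would truncate a C string */
        else if (c < 0x80)
        {
            extra = 0;
            cp = c;
        }
        else if ((c & 0xE0) == 0xC0)
        {
            extra = 1;
            cp = c & 0x1F;
        }
        else if ((c & 0xF0) == 0xE0)
        {
            extra = 2;
            cp = c & 0x0F;
        }
        else if ((c & 0xF8) == 0xF0)
        {
            extra = 3;
            cp = c & 0x07;
        }
        else
            return 0;           /* stray continuation or invalid lead byte */

        if (extra > len - i - 1)
            return 0;           /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= extra; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;       /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        /* reject overlong forms, surrogates, and values above U+10FFFF */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) ||
            cp > 0x10FFFF)
            return 0;

        i += extra + 1;
    }
    return 1;
}
```

With the 8-byte buffer "abcd\0efg", strlen() stops after "abcd", while a
length-aware check like this one sees the NUL and rejects the string instead of
silently dropping "efg".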

Also, it seems bogus to do any encoding conversion when we are
SQL_ASCII, and it's really trivial to fix.

With the attached:
- when we are on a utf8 database make sure to verify our output string
in sv2cstr (we assume database strings coming in are already valid)

- Do no string conversion when we are SQL_ASCII in or out

- add plperl_helpers.h as a dep to plperl.o in our makefile

- remove some redundant calls to pg_verify_mbstr()

- as utf_e2u only has one caller, don't pstrdup(); instead have the
caller check (saves some cycles and memory)
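
The per-encoding behavior described in the first two items can be summarized
with a small decision function. This is only a sketch of the intended logic,
not the patch itself: the enum names and both functions are hypothetical, and
the real code works with the backend's encoding identifiers rather than these
stand-ins.

```c
#include <assert.h>

/* Hypothetical stand-ins for the database encoding and the chosen action. */
typedef enum { DB_SQL_ASCII, DB_UTF8, DB_OTHER } db_encoding;
typedef enum { ACT_PASS_THROUGH, ACT_VERIFY_ONLY, ACT_CONVERT } str_action;

/* What to do with a string coming out of Perl (the sv2cstr direction). */
static str_action
action_for_perl_output(db_encoding enc)
{
    switch (enc)
    {
        case DB_SQL_ASCII:
            return ACT_PASS_THROUGH;    /* no conversion at all */
        case DB_UTF8:
            return ACT_VERIFY_ONLY;     /* conversion is a no-op, so verify */
        case DB_OTHER:
            return ACT_CONVERT;         /* UTF8 -> database encoding */
    }
    assert(!"unknown encoding");
    return ACT_CONVERT;
}

/* What to do with a database string going into Perl. */
static str_action
action_for_perl_input(db_encoding enc)
{
    switch (enc)
    {
        case DB_SQL_ASCII:
        case DB_UTF8:
            return ACT_PASS_THROUGH;    /* database strings assumed valid */
        case DB_OTHER:
            return ACT_CONVERT;         /* database encoding -> UTF8 */
    }
    assert(!"unknown encoding");
    return ACT_CONVERT;
}
```

The asymmetry is deliberate: only strings produced by Perl are verified on a
UTF-8 database, since strings coming from the database are assumed to be valid
already.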

Attachment Content-Type Size
plperl_utf8_mbverify.patch text/x-patch 4.2 KB
