Re: XML Encoding problem

Lists: pgsql-general
From: rsmogura <rsmogura(at)softperience(dot)eu>
To: Pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: XML Encoding problem
Date: 2011-02-07 11:44:36
Message-ID: d39e5c6a534a64b30c54ab0d0b0f9218@mail.softperience.eu

Hi,

I have a test database with UTF-8 encoding. I stored the XML
<a>ЁĄ¡</a> (U+0401, U+0104, U+00A1) there. After changing the client
encoding to iso8859-2, a select returned:
ERROR: character 0xd081 of encoding "UTF8" has no equivalent in
"LATIN2"
SQL state: 22P05.

I expected the result to contain character entities (&#...;) for the
characters that cannot be converted.
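
For illustration only (not part of the original message): a minimal Python sketch of why the conversion fails. It assumes Python's built-in "iso8859_2" codec covers the same repertoire as PostgreSQL's LATIN2; the function and variable names are hypothetical.

```python
# Sample document from the report: U+0401, U+0104, U+00A1 inside <a>.
xml_text = "<a>\u0401\u0104\u00a1</a>"

# UTF-8 bytes of the first character; these are the bytes named in the error.
first_char_utf8 = "\u0401".encode("utf-8").hex()


def unmappable(text, encoding):
    """Return the code points of `text` that `encoding` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.append(f"U+{ord(ch):04X}")
    return missing


missing_in_latin2 = unmappable(xml_text, "iso8859_2")
print(first_char_utf8)    # d081
print(missing_in_latin2)  # ['U+0401', 'U+00A1']
```

Under that assumption, only U+0104 has a Latin-2 equivalent; the conversion stops at the first character that does not, which is U+0401 (UTF-8 bytes 0xd081).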

Kind regards,
Radosław Smogura


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: rsmogura <rsmogura(at)softperience(dot)eu>
Cc: Pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: Re: XML Encoding problem
Date: 2011-02-09 22:29:29
Message-ID: 1297290569.23596.12.camel@vanquo.pezone.net

On Mon, 2011-02-07 at 12:44 +0100, rsmogura wrote:
> I have a test database with UTF-8 encoding. I stored the XML
> <a>ЁĄ¡</a> (U+0401, U+0104, U+00A1) there. After changing the client
> encoding to iso8859-2, a select returned:
> ERROR: character 0xd081 of encoding "UTF8" has no equivalent in
> "LATIN2"
> SQL state: 22P05.
>
> I expected the result to contain character entities (&#...;) for the
> characters that cannot be converted.

Hehe, interesting idea, but it's not implemented that way. We don't
alter the XML data, except for the XML declaration.


From: Radosław Smogura <rsmogura(at)softperience(dot)eu>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: Re: XML Encoding problem
Date: 2011-02-10 07:51:36
Message-ID: 201102100851.36287.rsmogura@softperience.eu

I may write a patch. Text mode itself would not be affected, because it's
text mode, and the patch will fail if the client encoding is "richer" than
the server's (one possibility in that situation is to XML-encode to the
client encoding, then re-encode the text to the server encoding).

But looking at the code, the same thing could occur with binary recv. I saw
a text-based XML conversion there (it alters the XML in some way). According
to the docs, I can store XML in any encoding using binary mode.

I think that if text conversion fails, an XML rewrite should occur, and all
characters that cannot be converted should be turned into XML entities...

After all, it's XML, not varchar with parsing :)
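
For illustration only: a rough Python sketch of the fallback proposed above, assuming the XML is available as already-decoded text. It is not PostgreSQL code; `to_client_encoding` is a hypothetical helper, and Python's "iso8859_2" codec stands in for LATIN2.

```python
def to_client_encoding(xml_text, client_encoding):
    """Encode XML text, falling back to numeric character references."""
    try:
        # Normal path: every character exists in the client encoding.
        return xml_text.encode(client_encoding)
    except UnicodeEncodeError:
        # Fallback: replace each unmappable character with &#NNNN;.
        return xml_text.encode(client_encoding, errors="xmlcharrefreplace")


result = to_client_encoding("<a>\u0401\u0104\u00a1</a>", "iso8859_2")
print(result)  # b'<a>&#1025;\xa1&#161;</a>'
```

This blind replacement is only safe for character data and attribute values. Character references are not recognized in element names, comments, or CDATA sections, so a real implementation would have to parse the document, and it would also have to keep the encoding in the XML declaration consistent.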

Peter Eisentraut <peter_e(at)gmx(dot)net> Wednesday 09 February 2011 23:29:29
> On Mon, 2011-02-07 at 12:44 +0100, rsmogura wrote:
> > I have a test database with UTF-8 encoding. I stored the XML
> > <a>ЁĄ¡</a> (U+0401, U+0104, U+00A1) there. After changing the client
> > encoding to iso8859-2, a select returned:
> > ERROR: character 0xd081 of encoding "UTF8" has no equivalent in
> > "LATIN2"
> > SQL state: 22P05.
> >
> > I expected the result to contain character entities (&#...;) for the
> > characters that cannot be converted.
>
> Hehe, interesting idea, but it's not implemented that way. We don't
> alter the XML data, except for the XML declaration.