Re: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected

From: Radosław Smogura <rsmogura(at)softperience(dot)eu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Florian Pflug <fgp(at)phlo(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
Date: 2011-07-12 09:00:29
Message-ID: cf964c14afa1bf7c60466b22fd45a71d@mail.softperience.eu
Lists: pgsql-hackers

On Sun, 10 Jul 2011 17:06:22 -0500, Robert Haas wrote:
> On Jul 10, 2011, at 1:40 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>> Hackers,
>>
>>>> B. 6. Current behaviour _is intended_ (there is "if" to check
>>>> node type) and _"natural"_. In this particular case user ask for
>>>> text content of some node, and this content is actually "<".
>>>
>>> I don't buy that. The check for the node type is there because
>>> two different libxml functions are used to convert nodes to
>>> strings. The if has absolutely *zero* to do with escaping, except
>>> for that missing escape_xml() call in the "else" case.
>>>
>>> Secondly, there is little point in having a type XML if we
>>> don't actually ensure that values of that type can only contain
>>> well-formed XML.
>>
>> Can anyone else weigh in on this? Peter?
>
> Unless I am missing something, Florian is clearly correct here.
>
> ...Robert
I disagree, because this should be fixed internally by making the xml
type safe. Currently the xml type can hold well-formed XML as well as
any other kind of data.

If I ask, by whatever means, select xpath(/text(...))....., I want to
get text back.
1) How should I unescape the node in a client application? If it is
only part of an XML document I don't have the header. Bear in mind that
XML must support streaming processing too.
2) Why should I treat text() differently from a select on a varchar? In
both cases I ask for xml, and the driver can't do this for me because
it doesn't know whether it gets a scalar, text, comment, element, or
maybe a document.
3) What about current applications? People probably use this and are
happy that they get text, and they will not notice that the next
release of PostgreSQL will break their applications.
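
To make point 1 concrete, here is a minimal Python sketch of what a
client would have to do if xpath() returned escaped text nodes. The
helper name client_text is hypothetical, and the sketch assumes the
caller already knows the value is a text node, which is exactly the
information a driver does not have.

```python
from xml.sax.saxutils import unescape


def client_text(value_from_xpath: str) -> str:
    """Undo XML escaping of a value returned by the server.

    Hypothetical helper: only correct when the value really is an
    escaped text node, not an element or a whole document.
    """
    return unescape(value_from_xpath)


# An escaped text node comes back as the original character data.
print(client_text("&lt;"))         # <
# Applied blindly to an element, the same call yields malformed markup.
print(client_text("<a>&lt;</a>"))  # <a><</a>
```

The second call shows why the client cannot unescape unconditionally:
without the node type it cannot tell the two cases apart.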

There is of course a disadvantage to the current behaviour, as it may
lead to inserting malformed XML (in one case), but I created an example
where automatic escaping produces double-escaped XML and may lead to
inserting improper data (this concerns the 2nd patch, where Florian
adds escaping too).

SELECT XMLELEMENT(name root, XMLATTRIBUTES(foo.namespace AS sth))
FROM (SELECT (XPATH('namespace-uri(/*)', x))[1] AS namespace
      FROM (VALUES (XMLELEMENT(name "root",
                               XMLATTRIBUTES('<n' AS xmlns, '<v' AS value),
                               '<t'))) v(x)) AS foo;

xmlelement
-------------------------
<root sth="&amp;lt;n"/>
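
The double escaping in the result above can be reproduced outside the
database. This is a minimal Python sketch, assuming xpath() escapes the
string result once and the attribute value is then escaped a second
time. xml.sax.saxutils.escape only stands in for the server-side
escaping; it is not what PostgreSQL actually calls.

```python
from xml.sax.saxutils import escape

namespace = "<n"                          # string value of namespace-uri(/*)
escaped_by_xpath = escape(namespace)      # assumed behaviour of the patched xpath()
escaped_again = escape(escaped_by_xpath)  # attribute value escaped a second time

print(escaped_by_xpath)  # &lt;n
print(escaped_again)     # &amp;lt;n
```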

It can't be resolved without storing the node type in the xml value,
adding an xmltext type, or adding a pseudo xmlany element, which would
be returned by xpath.
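
As a rough illustration of the "store the type" idea, here is a Python
sketch of a result value that remembers what kind of node it came from.
The class XPathValue and its methods are hypothetical and do not
correspond to anything in PostgreSQL or in either patch.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class XPathValue:
    """Hypothetical xpath() result that carries its node kind."""
    kind: str  # "text" for text/attribute/string results, "element" for markup
    raw: str   # unescaped character data, or serialized markup for elements

    def as_text(self) -> str:
        # Text is handed to the client exactly as stored, never escaped.
        if self.kind != "text":
            raise ValueError("not a text value")
        return self.raw

    def as_xml(self) -> str:
        # Escape character data exactly once; markup is passed through.
        return escape(self.raw) if self.kind == "text" else self.raw


ns = XPathValue("text", "<n")
print(ns.as_text())  # <n
print(ns.as_xml())   # &lt;n
```

Because escaping happens only when the value is serialized as XML, the
same value can be returned as plain text or embedded in a document
without being escaped twice.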

Regards,
Radek
