Lists: | pgsql-hackers |
---|
From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | PG Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Bug in XPATH() produces invalid XML values and probably un-restorable dumps |
Date: | 2011-05-31 18:50:23 |
Message-ID: | 6F4E39D6-98D7-448E-ABC1-3A52CF884904@phlo.org |
Hi
While trying to figure out sensible semantics for XPATH() and scalar-value returning XPath expressions, I've stumbled upon a bug in XPATH() that allows invalid XML values to be produced. This is a serious problem because should such invalid values get inserted into an XML column, an un-restorable dump ensues.
Here's an example (REL9_0_STABLE as of a few days ago):
template1=# SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1];
xpath
-------
<
Since XPATH() returns XML[], this value has type XML, but clearly isn't well-formed. And behold, casting to TEXT and back to XML complains loudly.
template1=# SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1]::TEXT::XML;
ERROR: invalid XML content
DETAIL: Entity: line 1: parser error : StartTag: invalid element name
<
^
The culprit is xml_xmlnodetoxmltype() in backend/utils/adt/xml.c. For non-element nodes, it returns the result of xmlXPathCastNodeToString() verbatim, even though that function doesn't reverse the entity replacement that was done during parsing. Adding a call to escape_xml() for non-element nodes fixes the problem:
template1=# SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1];
xpath
-------
&lt;
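To illustrate the kind of escaping involved, here is a minimal standalone sketch. It is not the attached patch: sketch_escape_xml() is a hypothetical stand-in for the backend's escape_xml(), which may handle more characters than the three shown here, and it uses plain malloc() rather than backend memory management.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (NOT PostgreSQL's escape_xml()): returns a newly
 * malloc'ed copy of src in which '&', '<' and '>' are replaced by entity
 * references. The caller must free() the result. Returns NULL on OOM.
 */
char *
sketch_escape_xml(const char *src)
{
    size_t      len = 0;
    const char *p;
    char       *out;
    char       *q;

    assert(src != NULL);

    /* First pass: compute the length of the escaped string. */
    for (p = src; *p; p++)
    {
        switch (*p)
        {
            case '&': len += 5; break;      /* &amp; */
            case '<':
            case '>': len += 4; break;      /* &lt; or &gt; */
            default:  len += 1; break;
        }
    }

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy, substituting entity references. */
    q = out;
    for (p = src; *p; p++)
    {
        switch (*p)
        {
            case '&': memcpy(q, "&amp;", 5); q += 5; break;
            case '<': memcpy(q, "&lt;", 4);  q += 4; break;
            case '>': memcpy(q, "&gt;", 4);  q += 4; break;
            default:  *q++ = *p; break;
        }
    }
    *q = '\0';
    return out;
}
```

Applied to the string that xmlXPathCastNodeToString() returns for a text node, this turns the bare < from the example back into an entity reference, so the value is well-formed XML content again.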
Patch is attached.
best regards,
Florian Pflug
Attachment | Content-Type | Size |
---|---|---|
pg_xpath_invalidxml.v1.patch | application/octet-stream | 564 bytes |
From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | Florian Pflug <fgp(at)phlo(dot)org> |
Cc: | PG Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bug in XPATH() produces invalid XML values and probably un-restorable dumps |
Date: | 2011-06-09 18:00:31 |
Message-ID: | FC60A4F5-5560-438C-97E8-B6D1FDE1CC2B@phlo.org |
On May 31, 2011, at 20:50, Florian Pflug wrote:
> While trying to figure out sensible semantics for XPATH() and scalar-value returning XPath expressions, I've stumbled upon a bug in XPATH() that allows invalid XML values to be produced. This is a serious problem because should such invalid values get inserted into an XML column, an un-restorable dump ensues.
>
> Here's an example (REL9_0_STABLE as of a few days ago)
>
> template1=# SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1];
> xpath
> -------
> <
>
> ...
>
> Patch is attached.
I've added two tests to the xml regression test which highlight the issue.
Updated patch attached.
best regards,
Florian Pflug
Attachment | Content-Type | Size |
---|---|---|
pg_xpath_invalidxml.v2.patch | application/octet-stream | 3.5 KB |