XML import with DTD

Lists: pgsql-general
From: Roy Walter <walt(at)brookhouse(dot)co(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: XML import with DTD
Date: 2009-07-10 14:49:00
Message-ID: 4A5754DC.1090502@brookhouse.co.uk

Hi

I'm trying to use the XPath functionality of Postgres.

I can populate a text field (unparsed) with XML data but as far as I can
see the xpath() function [now] only works on the xml data type.

When I try to populate an xml field with XML data containing a DTD,
however, the parser chokes. If I strip the DTD the parser chokes on
undefined entities which are defined in the DTD.

(I switched the app from MySQL to Postgres because, while MySQL works,
it returns matches in undelimited form, which is next to useless if, for
example, you return multiple attributes from a node.)

Does anyone know of a solution to this problem?

Windows 2000 Server
Postgres 8.4

Regards
Roy Walter


From: artacus(at)comcast(dot)net
To: walt(at)brookhouse(dot)co(dot)uk
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: XML import with DTD
Date: 2009-07-10 17:48:38
Message-ID: 1075040353.110111247248118213.JavaMail.root@sz0018a.emeryville.ca.mail.comcast.net
Lists: pgsql-general

Post a snippet of the xml and xpath you are trying to use.

Scott



From: Roy Walter <walt(at)brookhouse(dot)co(dot)uk>
To: artacus(at)comcast(dot)net
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: XML import with DTD
Date: 2009-07-11 08:16:31
Message-ID: 4A584A5F.5020104@brookhouse.co.uk
Lists: pgsql-general

It's not an xpath problem; it's an XML import problem. Sorry if I wasn't
clear.

Consider the following example queries. This one works fine:

INSERT INTO wms_collection (docxml) VALUES (XMLPARSE(content(
'<?xml version="1.0" encoding="ISO-8859-1"?>
<shop>
<product>Shoes</product>
</shop>')))

This one does not:

INSERT INTO wms_collection (docxml) VALUES (XMLPARSE(content(
'<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE publicwhip
[
<!ENTITY ndash "&#8211;">
<!ENTITY mdash "&#8212;">
]>
<shop>
<product>Shoes</product>
</shop>')))

Both are valid XML but the second query fails as follows:

ERROR: invalid XML content
DETAIL: Entity: line 2: parser error : StartTag: invalid element name
<!DOCTYPE publicwhip
^
Entity: line 4: parser error : StartTag: invalid element name
<!ENTITY ndash "&#8211;">
^
Entity: line 5: parser error : StartTag: invalid element name
<!ENTITY mdash "&#8212;">

-- Roy



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: walt(at)brookhouse(dot)co(dot)uk
Cc: artacus(at)comcast(dot)net, pgsql-general(at)postgresql(dot)org
Subject: Re: XML import with DTD
Date: 2009-07-11 17:41:18
Message-ID: 21536.1247334078@sss.pgh.pa.us
Lists: pgsql-general

Roy Walter <walt(at)brookhouse(dot)co(dot)uk> writes:
> This one does not:

> INSERT INTO wms_collection (docxml) VALUES (XMLPARSE(content(
> '<?xml version="1.0" encoding="ISO-8859-1"?>
> <!DOCTYPE publicwhip
> [
> <!ENTITY ndash "&#8211;">
> <!ENTITY mdash "&#8212;">
> ]>
> <shop>
> <product>Shoes</product>
> </shop>')))

What I know about XML wouldn't fill a thimble, but shouldn't you say
DOCUMENT not CONTENT if you are trying to provide a complete document?
Doing that seems to make this work without error.

The fine manual states near the bottom of 8.13.1
http://www.postgresql.org/docs/8.4/static/datatype-xml.html
that CONTENT is less restrictive than DOCUMENT, but at least for
this specific point that seems not to be true.

regards, tom lane
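
For reference, here is a minimal sketch of the statement with DOCUMENT in
place of CONTENT, as suggested above. It reuses the wms_collection table and
docxml column from the thread and assumes docxml is of type xml. The xpath()
query at the end is an illustrative addition and was not posted in the thread.

```sql
-- DOCUMENT tells XMLPARSE to expect one complete XML document,
-- which is the form that may carry a DOCTYPE with an internal DTD subset.
INSERT INTO wms_collection (docxml) VALUES (XMLPARSE(DOCUMENT
'<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE publicwhip
[
<!ENTITY ndash "&#8211;">
<!ENTITY mdash "&#8212;">
]>
<shop>
<product>Shoes</product>
</shop>'));

-- Assumed follow-up: xpath() takes the xml column directly and
-- returns an array of matching nodes (type xml[]).
SELECT xpath('/shop/product/text()', docxml) FROM wms_collection;
```

If the XML arrives as plain text rather than as a literal, the same
XMLPARSE(DOCUMENT ...) call should accept a text expression in place of the
quoted string.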


From: Roy Walter <walt(at)brookhouse(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: artacus(at)comcast(dot)net, pgsql-general(at)postgresql(dot)org
Subject: Re: XML import with DTD
Date: 2009-07-11 18:59:43
Message-ID: 4A58E11F.9050500@brookhouse.co.uk
Lists: pgsql-general

Doh! That's it. Thanks a million.

-- Roy
