Re: returning PGresult as xml

Lists: pgsql-hackers
From: Brian Moore <brianmooreca(at)yahoo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: returning PGresult as xml
Date: 2004-01-25 09:07:20
Message-ID: 20040125090720.86968.qmail@web13509.mail.yahoo.com

hello,

this note is intended to describe my work on beginning to further
integrate xml into postgresql. first, i'd like to thank the
contributors of contrib/xml as their work was instrumental in helping
me understand what support exists and where i wanted to go. thanks.

my first requirement is to export data from the database into a format
which can be read not only by existing (postgresql) clients but by
people and programs that don't know how to use a PGresult. xml is very
verbose, but its popularity makes it closer to universal than anything
else of which i could think. in addition, ideas like XSL/XSLT make an
export of xml very attractive to me.

it's been said that converting a PGresult into xml is "trivial" and
that's why it hasn't been done in the codebase yet. i have seen
much code that writes xml, and many mistakes are made: most often
improper escaping, or writing to a schema/DTD that has not been
well thought out. the transformation into xml is not difficult, but it
does require attention to detail.
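getting the escaping right is exactly the detail that bites; a minimal
sketch of leaning on a proper API instead of hand-rolled string pasting
(Python's stdlib here, purely illustrative, not the actual libpq code):

```python
from xml.sax.saxutils import escape, quoteattr

def xml_element(name, text, **attrs):
    """Serialize one element, escaping attribute values and character
    data with the library helpers rather than by hand."""
    attr_str = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
    return f"<{name}{attr_str}>{escape(text)}</{name}>"

# '<', '&' and quotes come out escaped correctly:
print(xml_element("col", "a < b & c", num="0"))
# <col num="0">a &lt; b &amp; c</col>
```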

i feel badly that i have not been able to use any existing
standards. xmlrpc, i found, was not type-rich enough, and that made
it difficult or impossible to use. in particular, the only way to
represent a matrix is as a struct of structs, which makes encoding
a PGresult very verbose. i found SOAP too difficult to comply with.
so i ended up creating my own schema, expressed as a DTD.

an example of what my code generates can be found below. the following
xml is the result of the query "SELECT 1 as foo, 2 as bar":

<?xml version='1.0' encoding='ISO-8859-1'?>
<!DOCTYPE PGresult [
<!ELEMENT PGresult (col_desc*, row*)>
<!ATTLIST PGresult
num_rows CDATA #REQUIRED
num_cols CDATA #REQUIRED>
<!ELEMENT col_desc EMPTY>
<!ATTLIST col_desc
num CDATA #REQUIRED
format (text | binary) #REQUIRED
type CDATA #REQUIRED
name CDATA #REQUIRED>
<!ELEMENT row (col*)>
<!ATTLIST row
num CDATA #REQUIRED>
<!ELEMENT col (#PCDATA)>
<!ATTLIST col
num CDATA #REQUIRED>
<!ENTITY NULL ''>
]>
<PGresult num_rows='1' num_cols='2'>
<col_desc num='0' type='int4' format='text' name='foo' />
<col_desc num='1' type='int4' format='text' name='bar' />
<row num='0'>
<col num='0'>1</col>
<col num='1'>2</col>
</row>
</PGresult>

a slightly more complicated example:
template1=# select oid,typname,typlen,typtype from pg_type where
oid<20;
oid | typname | typlen | typtype
-----+---------+--------+---------
16 | bool | 1 | b
17 | bytea | -1 | b
18 | char | 1 | b
19 | name | 32 | b
(4 rows)

<!DOCTYPE PGresult [
<!ELEMENT PGresult (col_desc*, row*)>
<!ATTLIST PGresult
num_rows CDATA #REQUIRED
num_cols CDATA #REQUIRED>
<!ELEMENT col_desc EMPTY>
<!ATTLIST col_desc
num CDATA #REQUIRED
format (text | binary) #REQUIRED
type CDATA #REQUIRED
name CDATA #REQUIRED>
<!ELEMENT row (col*)>
<!ATTLIST row
num CDATA #REQUIRED>
<!ELEMENT col (#PCDATA)>
<!ATTLIST col
num CDATA #REQUIRED>
<!ENTITY NULL ''>
]>
<PGresult num_rows='4' num_cols='4'>
<col_desc num='0' type='oid' format='text' name='oid' />
<col_desc num='1' type='name' format='text' name='typname' />
<col_desc num='2' type='int2' format='text' name='typlen' />
<col_desc num='3' type='char' format='text' name='typtype' />
<row num='0'>
<col num='0'>16</col>
<col num='1'>bool</col>
<col num='2'>1</col>
<col num='3'>b</col>
</row>
<row num='1'>
<col num='0'>17</col>
<col num='1'>bytea</col>
<col num='2'>-1</col>
<col num='3'>b</col>
</row>
<row num='2'>
<col num='0'>18</col>
<col num='1'>char</col>
<col num='2'>1</col>
<col num='3'>b</col>
</row>
<row num='3'>
<col num='0'>19</col>
<col num='1'>name</col>
<col num='2'>32</col>
<col num='3'>b</col>
</row>
</PGresult>

i have done this work for myself and my own needs, so i fully
understand if this work is not interesting to the postgresql group in
general. however, if there is some chance that the changes could be
incorporated into the tree, i would be interested in contributing, as
integration into a proper version of postgresql will make my build
easier. ;)

i would expect that integration would look something like exposing
from libpq a function that looks something like:
const char *PGresult_as_xml(PGresult *result, int include_dtd);
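to make the shape of that function concrete, here is a rough sketch of
the serialization it would perform, in Python for brevity (the tuple
structures are hypothetical stand-ins for the PGresult accessors; the
real thing would be C inside libpq):

```python
from xml.sax.saxutils import escape, quoteattr

def pgresult_as_xml(col_descs, rows):
    """Sketch of the proposed serialization, mirroring the DTD above.
    col_descs is a list of (type, format, name) tuples; rows is a list
    of value tuples. Numeric attributes are generated ints, so simple
    single-quoting is safe; quoteattr handles the user-supplied names."""
    out = [f"<PGresult num_rows='{len(rows)}' num_cols='{len(col_descs)}'>"]
    for i, (typ, fmt, name) in enumerate(col_descs):
        out.append(f"<col_desc num='{i}' type={quoteattr(typ)} "
                   f"format={quoteattr(fmt)} name={quoteattr(name)} />")
    for r, row in enumerate(rows):
        out.append(f"<row num='{r}'>")
        out.extend(f"<col num='{c}'>{escape(str(v))}</col>"
                   for c, v in enumerate(row))
        out.append("</row>")
    out.append("</PGresult>")
    return "\n".join(out)
```

called with the descriptors and rows of the "SELECT 1 as foo, 2 as bar"
example, this reproduces the document shown earlier.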

i would also expect that psql would be modified to take a \X
and to call the above function. there is some strangeness now,
as psql doesn't call methods defined in libpq to print; it has
its own printer. i, of course, would do this work. :) i just
need to know that people are interested.

also, if integration is going to happen, i will need to replace
calls to my hashtables with calls to postgresql's hashtables.
i saw dynamic hashtables in the backend, but not in the interfaces.
i wasn't exactly sure how i should go about introducing another
module to the frontend; there could be problems of which i
remain blissfully unaware.

i look forward to feedback, and i hope this note finds you well,

b



From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Brian Moore <brianmooreca(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-25 16:24:03
Message-ID: 1075047843.17558.5.camel@fuji.krosing.net

Brian Moore wrote on Sun, 25.01.2004 at 11:07:
> hello,
>
> this note is intended to describe my work on beginning to further
> integrate xml into postgresql. first, i'd like to thank the
> contributors of contrib/xml as their work was instrumental in helping
> me understand what support exists and where i wanted to go. thanks.

First, IMHO having unified XML support is a good thing for Postgres.

I still have some questions and suggestions:

At what point do you intend to add your converter?

I remember someone started abstracting out the FE/BE protocol calls in
server code with an aim of supporting multiple protocols, but stopped
without making too much progress (?)

Also, I would suggest that XML Schema datatypes should be used,
preferably together with either a RelaxNG schema or something from the
SQL/XML spec.

-------------
Hannu


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Brian Moore <brianmooreca(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-25 17:23:33
Message-ID: 200401251823.33826.peter_e@gmx.net

Brian Moore wrote:
> i feel badly that i have not been able to use any existing
> standards. xmlrpc, i found, was not type-rich enough, and that made
> it difficult or impossible to use. in particular, the only way to
> represent a matrix is as a struct of structs, which makes encoding
> a PGresult very verbose. i found SOAP too difficult to comply with.
> so i ended up creating my own schema, expressed as a DTD.

Let me point out an implementation I made last time this subject was
discussed:

http://developer.postgresql.org/~petere/xmltable.tar.bz2

This package contains server-side functions that convert a table (more
generally, a query result) to an XML document and/or an XML Schema, both
mimicking the SQL/XML standard.

Additionally, it contains a function to convert such an XML document
back to a table source. I also threw in an XSLT stylesheet to convert
an SQL/XML table to an HTML table, so you can more easily view the
results.

I also have some code in development that adds cursor interfaces, an XML
data type, and some integration with the existing XPath functionality.
I think that for processing XML in the database and as far as following
the existing standards, this is the direction to take.

Also last time this subject was discussed, I believe it was Mike Mascari
who proposed and implemented another solution which is more client-side
oriented. He wrote a piece of code that took a normal libpq result set
and shipped it off as SQL/XML wrapped in SOAP. And it had streaming
capabilities for large result sets.

These are, more or less, two complementary approaches that already
exist. Of course this only covers the C API and would need sensible
extensions for other programming languages. But I invite you to look at
them and see whether they fit your needs (the ideas, not necessarily the
state of the code).


From: Mike Mascari <mascarm(at)mascari(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Brian Moore <brianmooreca(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-26 02:06:53
Message-ID: 4014763D.8030508@mascari.com

Peter Eisentraut wrote:

>Brian Moore wrote:
>
>>i feel badly that i have not been able to use any existing
>>standards. xmlrpc, i found, was not type-rich enough, and that made
>>it difficult or impossible to use. in particular, the only way to
>>represent a matrix is as a struct of structs, which makes encoding
>>a PGresult very verbose. i found SOAP too difficult to comply with.
>>so i ended up creating my own schema, expressed as a DTD.
>
>Let me point out an implementation I made last time this subject was
>discussed:
>
>http://developer.postgresql.org/~petere/xmltable.tar.bz2
>
>Also last time this subject was discussed, I believe it was Mike Mascari
>who proposed and implemented another solution which is more client-side
>oriented.
>
I humbly confess it wasn't me. We use CORBA....

Mike Mascari


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-26 16:51:21
Message-ID: 40154589.8030705@dunslane.net

Peter Eisentraut wrote:

>Let me point out an implementation I made last time this subject was
>discussed:
>
>http://developer.postgresql.org/~petere/xmltable.tar.bz2
>
>This package contains server-side functions that convert a table (more
>generally, a query result) to an XML document and/or an XML Schema, both
>mimicking the SQL/XML standard.
>
>Additionally, it contains a function to convert such an XML document
>back to a table source. I also threw in an XSLT stylesheet to convert
>an SQL/XML table to an HTML table, so you can more easily view the
>results.
>
>I also have some code in development that adds cursor interfaces, an XML
>data type, and some integration with the existing XPath functionality.
>I think that for processing XML in the database and as far as following
>the existing standards, this is the direction to take.
>

Peter: this looks very nice. What are your intentions with this code?
Put it in contrib? Also, do you intend to implement the SQL/XML
functions XMLElement, XMLForest, XMLAttributes, XMLConcat and XMLAgg?

cheers

andrew


From: Scott Lamb <slamb(at)slamb(dot)org>
To: Brian Moore <brianmooreca(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-27 12:58:23
Message-ID: 7D909592-50C8-11D8-B400-000A95891440@slamb.org

On Jan 25, 2004, at 3:07 AM, Brian Moore wrote:
> it's been said that converting a PGresult into xml is "trivial" and
> that's why it hasn't been done in the codebase yet. i have seen
> much code that writes xml, and many mistakes are made: most often
> improper escaping, or writing to a schema/DTD that has not been
> well thought out. the transformation into xml is not difficult, but it
> does require attention to detail.

The escaping, at any rate, is trivial if you use a proper API. It
sounds like your code is not using any XML API, given that you have not
mentioned adding dependencies to libpq and that you've mentioned your
own hashtable algorithm. It would be much easier if you did so, though
I imagine the additional dependency would mean it would not be accepted
into libpq.

> <PGresult num_rows='1' num_cols='2'>
> <col_desc num='0' type='int4' format='text' name='foo' />
> <col_desc num='1' type='int4' format='text' name='bar' />
> <row num='0'>
> <col num='0'>1</col>
> <col num='1'>2</col>
> </row>
> </PGresult>

How would you filter for a column in XSLT based on column name with
this schema? It's certainly not trivial. I have similar code, and I
included the column name as an attribute in each column element for
this reason.

I also used the java.sql type names rather than PostgreSQL ones, as my
code is not specific to PostgreSQL.

> i would expect that integration would look something like exposing
> from libpq a function that looks something like:
> const char *PGresult_as_xml(PGresult *result, int include_dtd);

Ugh. So it returns the whole thing as one big string? That won't hold
up well if your resultset is large.

A better way would be to pump out SAX events. This is what I did for
three reasons:

1) The escaping becomes trivial, as mentioned above. In fact, not only
does SAX escape things correctly, but it makes you explicitly specify
that the string you're giving it is character data, an element name, an
attribute name, an attribute value, etc, and handles everything
properly based on that. So you'd really have to work to screw it up,
unlike code that just does something like

printf("<elem foo='%s' bar='%s'>%s</elem>",
xml_attr_escape(foo_val), xml_attr_escape(bar_val),
xml_char_escape(elem_val));

where it would be quite easy to lose track of what needs to be escaped
how, what variables are already escaped, etc.

2) It can stream large result sets, provided that the next stage
supports doing so. Certainly a raw SAX serializer would; so would some
XSLT stylesheets with Xalan; and STX/Joost is designed for streaming
transformations.

3) If the next stage is a transformation, this makes it unnecessary to
serialize and parse the data between. So the SAX way is faster.
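The streaming approach can be sketched with any SAX-style serializer.
For instance, with the XMLGenerator in Python's stdlib (illustrative
only; not the Java API used in my code), every string is handed over
with its role declared, and the escaping follows from that:

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

buf = io.StringIO()
gen = XMLGenerator(buf, encoding="utf-8")

gen.startDocument()
gen.startElement("row", AttributesImpl({"num": "0"}))
gen.startElement("col", AttributesImpl({"num": "0"}))
gen.characters("a < b & 'quoted'")   # escaped as character data, automatically
gen.endElement("col")
gen.endElement("row")
gen.endDocument()

print(buf.getvalue())
```

each event could just as well be written straight to a socket, which is
what makes the large-result-set streaming work.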

You're welcome to take a look at my code. I imagine it will not be
directly useful to you, as it is written in Java, but I have a live
example which puts this stuff to use. Designing an acceptable API and
schema is always much easier when you see how it is put to use.

<http://www.slamb.org/projects/xmldb/> - my (so far poorly-named) xmldb
project, which includes the org.slamb.xmldb.ResultSetProducer class to
transform a java.sql.ResultSet to SAX events in my resultset schema.

<http://www.slamb.org/svn/repos/projects/xmldb/src/java/org/slamb/xmldb/ResultSetProducer.java>
- source code for said class

<http://www.slamb.org/projects/mb/> - a message board which uses this
code and some XSLT

<https://www.slamb.org/mb/> - a live example of said message board

<http://www.slamb.org/svn/repos/projects/mb/src/WEB-INF/xsl/resultset.xsl>
- simple XSLT to take an arbitrary resultset and convert it to an HTML
table

<http://www.slamb.org/svn/repos/projects/mb/src/WEB-INF/xsl/main.xsl> -
an example XSLT file that inherits this and then provides exceptions
for a couple columns (not displaying the id column, instead including
it as a hyperlink in the name column).

Good luck.

Scott Lamb


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-27 13:51:30
Message-ID: 40166CE2.2060703@dunslane.net

Scott Lamb wrote:

> On Jan 25, 2004, at 3:07 AM, Brian Moore wrote:
>
>> <PGresult num_rows='1' num_cols='2'>
>> <col_desc num='0' type='int4' format='text' name='foo' />
>> <col_desc num='1' type='int4' format='text' name='bar' />
>> <row num='0'>
>> <col num='0'>1</col>
>> <col num='1'>2</col>
>> </row>
>> </PGresult>
>
>
> How would you filter for a column in XSLT based on column name with
> this schema? It's certainly not trivial. I have similar code, and I
> included the column name as an attribute in each column element for
> this reason.

Close to trivial if you set up a key on the col_desc elements, I should
think. Maybe something like:

<xsl:key name="coldesc" match="col_desc" use="@num" />

...

<xsl:for-each select="col[key('coldesc', @num)/@name = 'colname']">
...

Alternatively you can get there using the parent and preceding-sibling
axes, but it's less clear.
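The same indirection can be sketched outside XSLT. Assuming the
PGresult document shown above, the num-to-name lookup the key provides
is just a dictionary (Python ElementTree, purely illustrative):

```python
import xml.etree.ElementTree as ET

doc = """<PGresult num_rows='1' num_cols='2'>
<col_desc num='0' type='int4' format='text' name='foo' />
<col_desc num='1' type='int4' format='text' name='bar' />
<row num='0'><col num='0'>1</col><col num='1'>2</col></row>
</PGresult>"""

root = ET.fromstring(doc)
# Build the same num -> name lookup that the xsl:key provides.
names = {cd.get("num"): cd.get("name") for cd in root.iter("col_desc")}
# Filter col elements by the column *name*, going through the lookup.
bar_values = [col.text for col in root.iter("col")
              if names[col.get("num")] == "bar"]
print(bar_values)  # ['2']
```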

cheers

andrew


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: building plperl on 7.4.1
Date: 2004-01-27 14:21:55
Message-ID: 1075213315.1612.280.camel@localhost.localdomain

I tried to build plperl on 7.4.1,

On my system

perl -MConfig -e 'print $Config{ccdlflags}'

returns

-rdynamic -Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE

however the build ends up using

-rpath,$prefix/lib

Dave
--
Dave Cramer
519 939 0336
ICQ # 1467551


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: index scan with functional indexes
Date: 2004-01-27 14:26:50
Message-ID: 1075213610.1611.287.camel@localhost.localdomain

I was curious what effect a reverse index would have on a table with
URL-like data, so I did the following:

create function fn_strrev(text) returns text as 'return reverse($_[0])'
language 'plperl' with (iscachable);

create index r_url_idx on url( fn_strrev(url));

vacuum analyze;

explain select * from url where url like fn_strrev('%beta12.html');
QUERY PLAN
---------------------------------------------------------
Seq Scan on url (cost=0.00..13281.70 rows=1 width=454)
Filter: ((url)::text ~~ 'lmth.21ateb%'::text)

Is it possible to get the planner to use an index scan? How?

the db is using locale 'C'
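For reference, the trick being tested is that a suffix match on url is
a prefix match on the reversed string, which a btree index on
fn_strrev(url) can serve in C locale. A quick sketch of the equivalence
(Python, illustrative; the URLs are made up):

```python
def strrev(s):
    """Same idea as the plperl fn_strrev above."""
    return s[::-1]

urls = ["http://example.com/beta12.html", "http://example.com/index.html"]
pattern = "%beta12.html"          # anchored at the right-hand end

suffix = pattern.lstrip("%")
# url ends with suffix  <=>  strrev(url) starts with strrev(suffix),
# and a left-anchored prefix match is what a btree index can satisfy.
matches = [u for u in urls if strrev(u).startswith(strrev(suffix))]
print(matches)  # ['http://example.com/beta12.html']
```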

--
Dave Cramer
519 939 0336
ICQ # 1467551


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pg(at)fastcrypt(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: index scan with functional indexes
Date: 2004-01-27 17:33:01
Message-ID: 24060.1075224781@sss.pgh.pa.us

Dave Cramer <pg(at)fastcrypt(dot)com> writes:
> create index r_url_idx on url( fn_strrev(url));

> explain select * from url where url like fn_strrev('%beta12.html');
> QUERY PLAN
> ---------------------------------------------------------
> Seq Scan on url (cost=0.00..13281.70 rows=1 width=454)
> Filter: ((url)::text ~~ 'lmth.21ateb%'::text)

> Is it possible to get the planner to use an index scan ?

Sure, but not that way. Try "fn_strrev(url) like something".
You have to compare the indexed value to something...

regards, tom lane


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes
Date: 2004-01-27 17:41:41
Message-ID: 1075225301.1611.294.camel@localhost.localdomain

same answer

davec=# show enable_seqscan;
enable_seqscan
----------------
off
(1 row)

davec=# explain analyze select * from url where fn_strrev(url) like
'%beta12.html';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on url (cost=100000000.00..100013533.04 rows=503 width=454)
(actual time=3851.636..3851.636 rows=0 loops=1)
Filter: (fn_strrev((url)::text) ~~ '%beta12.html'::text)
Total runtime: 3851.712 ms
(3 rows)

On Tue, 2004-01-27 at 12:33, Tom Lane wrote:
> Dave Cramer <pg(at)fastcrypt(dot)com> writes:
> > create index r_url_idx on url( fn_strrev(url));
>
> > explain select * from url where url like fn_strrev('%beta12.html');
> > QUERY PLAN
> > ---------------------------------------------------------
> > Seq Scan on url (cost=0.00..13281.70 rows=1 width=454)
> > Filter: ((url)::text ~~ 'lmth.21ateb%'::text)
>
> > Is it possible to get the planner to use an index scan ?
>
> Sure, but not that way. Try "fn_strrev(url) like something".
> You have to compare the indexed value to something...
>
> regards, tom lane
>
--
Dave Cramer
519 939 0336
ICQ # 1467551


From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Dave Cramer <pg(at)fastcrypt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes
Date: 2004-01-27 18:01:07
Message-ID: 20040127095921.X65927@megazone.bigpanda.com


On Tue, 27 Jan 2004, Dave Cramer wrote:

> same answer
>
> davec=# show enable_seqscan;
> enable_seqscan
> ----------------
> off
> (1 row)
>
> davec=# explain analyze select * from url where fn_strrev(url) like
> '%beta12.html';

That's still an unanchored LIKE clause; besides, I think that would
match URLs that begin with lmth.21ateb.

I think the condition you want would be:
fn_strrev(url) like 'lmth.21ateb%'


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Dave Cramer <pg(at)fastcrypt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes
Date: 2004-01-27 18:02:03
Message-ID: 20040127180203.GB22417@dcc.uchile.cl

On Tue, Jan 27, 2004 at 12:41:41PM -0500, Dave Cramer wrote:

> davec=# explain analyze select * from url where fn_strrev(url) like
> '%beta12.html';

Reverse the constant too:

davec=# explain analyze select * from url where fn_strrev(url) like
fn_strrev('%beta12.html');

You won't get an indexscan if you have a % in front of the string.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Hay quien adquiere la mala costumbre de ser infeliz" (M. A. Evans)
("Some people acquire the bad habit of being unhappy")


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pg(at)fastcrypt(dot)com
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes
Date: 2004-01-27 18:02:51
Message-ID: 24803.1075226571@sss.pgh.pa.us

Dave Cramer <pg(at)fastcrypt(dot)com> writes:
> davec=# explain analyze select * from url where fn_strrev(url) like
> '%beta12.html';

Don't you need the % at the right end to have an indexable plan?
I suspect that both of your tries so far are actually semantically
wrong, and that what you intend is

select * from url where fn_strrev(url) like fn_strrev('%beta12.html');

regards, tom lane


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes
Date: 2004-01-27 18:09:13
Message-ID: 1075226952.1610.296.camel@localhost.localdomain

Tried all the suggestions.

--dc--
davec=# explain analyze select * from url where fn_strrev(url) like
fn_strrev('%beta12.html');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on url (cost=100000000.00..100013533.04 rows=503 width=454)
(actual time=1416.448..3817.221 rows=12 loops=1)
Filter: (fn_strrev((url)::text) ~~ 'lmth.21ateb%'::text)
Total runtime: 3817.315 ms
(3 rows)

davec=# explain analyze select * from url where fn_strrev(url) like
'lmth.21ateb%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Seq Scan on url (cost=100000000.00..100013533.04 rows=503 width=454)
(actual time=1412.181..3843.998 rows=12 loops=1)
Filter: (fn_strrev((url)::text) ~~ 'lmth.21ateb%'::text)
Total runtime: 3844.106 ms
(3 rows)

davec=# explain analyze select * from url where fn_strrev(url) like
'%lmth.21ateb';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on url (cost=100000000.00..100013533.04 rows=503 width=454)
(actual time=3853.501..3853.501 rows=0 loops=1)
Filter: (fn_strrev((url)::text) ~~ '%lmth.21ateb'::text)
Total runtime: 3853.583 ms
(3 rows)

On Tue, 2004-01-27 at 13:02, Tom Lane wrote:
> Dave Cramer <pg(at)fastcrypt(dot)com> writes:
> > davec=# explain analyze select * from url where fn_strrev(url) like
> > '%beta12.html';
>
> Don't you need the % at the right end to have an indexable plan?
> I suspect that both of your tries so far are actually semantically
> wrong, and that what you intend is
>
> select * from url where fn_strrev(url) like fn_strrev('%beta12.html');
>
> regards, tom lane
>
--
Dave Cramer
519 939 0336
ICQ # 1467551


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pg(at)fastcrypt(dot)com
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes
Date: 2004-01-27 18:28:05
Message-ID: 25342.1075228085@sss.pgh.pa.us

Dave Cramer <pg(at)fastcrypt(dot)com> writes:
> Tried all the suggestions.

Mph. It works for me... what PG version are you using exactly,
and are you certain you've selected C locale? (Do you get LIKE
optimization on plain indexes?)

regards, tom lane


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes
Date: 2004-01-27 18:38:57
Message-ID: 1075228737.1611.303.camel@localhost.localdomain

I'm using 7.4.1, the db was initdb --locale='C'

and no I don't get them on plain indexes ????

Dave
On Tue, 2004-01-27 at 13:28, Tom Lane wrote:
> Dave Cramer <pg(at)fastcrypt(dot)com> writes:
> > Tried all the suggestions.
>
> Mph. It works for me... what PG version are you using exactly,
> and are you certain you've selected C locale? (Do you get LIKE
> optimization on plain indexes?)
>
> regards, tom lane
>
--
Dave Cramer
519 939 0336
ICQ # 1467551


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pg(at)fastcrypt(dot)com
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes
Date: 2004-01-27 18:48:15
Message-ID: 25856.1075229295@sss.pgh.pa.us

Dave Cramer <pg(at)fastcrypt(dot)com> writes:
> I'm using 7.4.1, the db was initdb --locale='C'
> and no I don't get them on plain indexes ????

Oh? If it's 7.4 then you can confirm the locale selection with
"show lc_collate" and "show lc_ctype" (I think the first of these
is what the LIKE optimization checks).

regards, tom lane


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes -- solved
Date: 2004-01-27 19:09:13
Message-ID: 1075230553.21166.311.camel@localhost.localdomain

Interesting, it works now, and the good news is it is *WAY* faster. This
might be able to speed up Marc's doc search by orders of magnitude.

this is searching 100536 rows

select * from url where fn_strrev(url) like fn_strrev('%beta12.html');
1.57ms

explain select * from url where url like '%beta12.html';
3310.38 ms

Dave

On Tue, 2004-01-27 at 13:48, Tom Lane wrote:
> Dave Cramer <pg(at)fastcrypt(dot)com> writes:
> > I'm using 7.4.1, the db was initdb --locale='C'
> > and no I don't get them on plain indexes ????
>
> Oh? If it's 7.4 then you can confirm the locale selection with
> "show lc_collate" and "show lc_ctype" (I think the first of these
> is what the LIKE optimization checks).
>
> regards, tom lane
>
--
Dave Cramer
519 939 0336
ICQ # 1467551


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-29 17:31:59
Message-ID: 200401291831.59937.peter_e@gmx.net

Andrew Dunstan wrote:
> Peter: this looks very nice. What are your intentions with this code?

Once we figure out how to handle the on-the-wire character set recoding
when faced with XML documents (see separate thread a few weeks ago), I
would like to finish it.

> Put it in contrib? Also, do you intend to implement the SQL/XML
> functions XMLElement, XMLForest, XMLAttributes, XMLConcat and XMLAgg?

You have to implement these directly in the parser, which I'm not yet
excited about.


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: returning PGresult as xml
Date: 2004-01-30 22:48:03
Message-ID: 1075502882.4007.35.camel@fuji.krosing.net

Peter Eisentraut wrote on Thu, 29.01.2004 at 19:31:
> Andrew Dunstan wrote:
> > Peter: this looks very nice. What are your intentions with this code?
>
> Once we figure out how to handle the on-the-wire character set recoding
> when faced with XML documents (see separate thread a few weeks ago), I
> would like to finish it.
>
> > Put it in contrib? Also, do you intend to implement the SQL/XML
> > functions XMLElement, XMLForest, XMLAttributes, XMLConcat and XMLAgg?
>
> You have to implement these directly in the parser, which I'm not yet
> excited about.

Why not use a standard parser?

libxml2 (www.xmlsoft.org) seems nice and is either available as a
separate dynamic library or can also be (IIRC) configured to build with
just the minimal needed functionality.

---------------
Hannu


From: Kevin Brown <kevin(at)sysexperts(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: index scan with functional indexes -- solved
Date: 2004-01-31 03:35:52
Message-ID: 20040131033551.GB2608@filer

Dave Cramer wrote:
> Interesting, it works now, and the good news is it is *WAY* faster. This
> might be able to speed up Marc's doc search by orders of magnitude.
>
> this is searching 100536 rows
>
> select * from url where fn_strrev(url) like fn_strrev('%beta12.html');
> 1.57ms
>
>
> explain select * from url where url like '%beta12.html';
> 3310.38 ms

The nice thing about this is that you can create your query thusly:

SELECT * FROM table WHERE column LIKE 'string'
  AND fn_strrev(column) LIKE fn_strrev('string')

and, if you have both a standard index on column and a functional index
on fn_strrev(column), the query will be fast (well, as fast as the
pattern in question allows) as long as 'string' is anchored on either end.

I've implemented the 'locate' utility in Perl using a PG backend instead
of the standard locate database. I internally convert globs given as
arguments into LIKE strings, and with a functional index like that the
searches are now blazingly fast -- faster than the original 'locate'
utility. It has the added advantage that you can specify a file type
to further narrow the search (thus 'locate --type file "core"' will find
all regular files named 'core' in the database).
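The glob-to-LIKE conversion might look roughly like this (a
hypothetical sketch, not Kevin's actual Perl code; note that literal %
and _ in the glob must be escaped so they are not treated as LIKE
wildcards):

```python
def glob_to_like(glob):
    """Translate a shell glob into a SQL LIKE pattern:
    * -> %, ? -> _, and backslash-escape literal % and _."""
    out = []
    for ch in glob:
        if ch == "*":
            out.append("%")
        elif ch == "?":
            out.append("_")
        elif ch in "%_":
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)

print(glob_to_like("*.html"))   # %.html
print(glob_to_like("core?"))    # core_
```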

I'll be happy to share my code with anyone who's interested.

--
Kevin Brown kevin(at)sysexperts(dot)com