Re: xmlconcat performance

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Davide Berra <d(dot)berra(at)esitelsrl(dot)it>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: xmlconcat performance
Date: 2013-02-28 17:48:05
Message-ID: CAHyXU0y-ZfDSjzJhz7tFMdf-WYVAWOP7Lm2oryxevJexGqX7SA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Feb 22, 2013 at 3:21 AM, Davide Berra <d(dot)berra(at)esitelsrl(dot)it> wrote:
> I got a problem with the performance of a PL/PGsql stored procedure
> outputting an xml.
>
> Server version: PostgreSQL 8.3.6 on i686-pc-linux-gnu, compiled by GCC gcc
> (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)
> CPU: Intel(R) Core(TM) i3 CPU 540 @ 3.07GHz
> RAM installed: 4GB
> Hard Disk: Seagate 500Gb SATA 2
>
> This is a simplified content of the function showing the xmlconcat
> behaviour.
>
> CREATE OR REPLACE FUNCTION test_function (v_limit int)
> RETURNS xml AS
> $BODY$
> DECLARE
> v_xml xml;
> BEGIN
>
> FOR i IN 1..v_limit LOOP
> v_xml := xmlconcat(v_xml, xmlelement(name content, 'aaaaaaa'));
> END LOOP;
>
> RETURN v_xml ;
> END
> $BODY$
> LANGUAGE 'plpgsql' SECURITY DEFINER ;
>
>
> As long as the v_limit parameter grows (and then the size of the output xml,
> the time needed increase exponentially.
> Look at this examples:
>
> pang=# explain analyze select test_function(1000);
> QUERY PLAN
> --------------------------------------------------------------------------------------
> Result (cost=0.00..0.26 rows=1 width=0) (actual time=65.430..65.431 rows=1
> loops=1)
> Total runtime: 65.457 ms
> (2 rows)
>
> pang=# explain analyze select test_function(5000);
> QUERY PLAN
> ----------------------------------------------------------------------------------------
> Result (cost=0.00..0.26 rows=1 width=0) (actual time=473.318..473.318
> rows=1 loops=1)
> Total runtime: 473.340 ms
> (2 rows)
>
> pang=# explain analyze select test_function(15000);
> QUERY PLAN
> ------------------------------------------------------------------------------------------
> Result (cost=0.00..0.26 rows=1 width=0) (actual time=4044.903..4044.904
> rows=1 loops=1)
> Total runtime: 4044.928 ms
> (2 rows)
>
> pang=# explain analyze select test_function(50000);
> QUERY PLAN
> --------------------------------------------------------------------------------------------
> Result (cost=0.00..0.26 rows=1 width=0) (actual time=94994.337..94994.369
> rows=1 loops=1)
> Total runtime: 94994.396 ms
> (2 rows)
>
> I already tried to update to 8.3.23 service version but i didn't see any
> improvement.
>
> Do you have any suggestion about how to increase the performance of
> xmlconcat?
>
> My need is to use stored procedures that calls xmlconcat more than 50000
> times, but it is unacceptable 94 seconds to complete the job.
>
> Thanks in advance

typically for high performance string manipulation you have to do
things on more purely textual level and manipulate through arrays to
get really good performance. iterative string concatenation is
typically wrong approach -- you have to think in set terms.

also your database version is obsolete -- time to start thinking about upgrade.

merlin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2013-02-28 17:58:49 Re: Wrong actual number of rows in the Query Plan
Previous Message Pavan Deolasee 2013-02-28 17:20:33 Re: pgbench intriguing results: better tps figures for larger scale factor