Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL
Date: 2013-10-07 14:00:54
Message-ID: CAFj8pRB8=cGxi1XAtjiTD0Sn3ZqYkTomEd96OdzGPUVs7tX9Cg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

I fixed patch - there was a missing cast to domain when it was used (and
all regress tests are ok now).

a some performance tests

set array_fast_update to off;
postgres=# select fill_2d_array(300,300);
fill_2d_array
───────────────
90000
(1 row)

Time: 33570.087 ms
postgres=# set array_fast_update to on;
SET
Time: 0.279 ms
postgres=# select fill_2d_array(300,300);
fill_2d_array
───────────────
90000
(1 row)

Time: 505.202 ms

CREATE OR REPLACE FUNCTION public.quicksort(l integer, r integer, a
integer[])
RETURNS integer[]
LANGUAGE plpgsql
IMMUTABLE STRICT
AS $function$
DECLARE akt int[] = a;
i integer := l; j integer := r; x integer = akt[(l+r) / 2];
w integer;
BEGIN
LOOP
WHILE akt[i] < x LOOP i := i + 1; END LOOP;
WHILE x < akt[j] loop j := j - 1; END LOOP;
IF i <= j THEN
w := akt[i];
akt[i] := akt[j]; akt[j] := w;
i := i + 1; j := j - 1;
END IF;
EXIT WHEN i > j;
END LOOP;
IF l < j THEN akt := quicksort(l,j,akt); END IF;
IF i < r then akt := quicksort(i,r,akt); END IF;
RETURN akt;
END;
$function$

postgres=# set array_fast_update to off;
SET
Time: 0.282 ms
postgres=# SELECT array_upper(quicksort(1,21000,array_agg(a)),1) FROM test;
array_upper
─────────────
21000
(1 row)

Time: 5086.781 ms
postgres=# set array_fast_update to on;
SET
Time: 0.702 ms
postgres=# SELECT array_upper(quicksort(1,21000,array_agg(a)),1) FROM test;
array_upper
─────────────
21000
(1 row)

Time: 1751.920 ms

So in first test - fast update is 66x faster, second test - 3x faster

I found so for small arrays (about 1000 fields) a difference is not
significant.

This code can be cleaned (a domains can be better optimized generally - a
IO cast can be expensive for large arrays and domain_check can be used
there instead), but it is good enough for discussion if we would this
optimization or not.

Regards

Pavel

2013/10/3 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>

> Hello
>
> a very ugly test shows a possibility about 100% speedup on reported
> example (on small arrays, a patch is buggy and doesn't work for larger
> arrays).
>
> I updated a code to be read only
>
> CREATE OR REPLACE FUNCTION public.fill_2d_array(rows integer, cols integer)
> RETURNS integer
> LANGUAGE plpgsql
> AS $function$
> DECLARE
> img double precision[][];
> i integer; j integer;
> cont integer; r double precision;
> BEGIN
> img := ARRAY( SELECT 0 FROM generate_series(1, rows * cols) ) ;
> cont:= 0;
> For i IN 1..rows LOOP
> For j IN 1..cols LOOP r := img[i * cols + j];
> r := (i * cols + j)::double precision;
> cont := cont + 1; --raise notice '%', img;
> END LOOP;
> END LOOP;
> return cont;
> END;
> $function$
>
> It exec all expressions
>
> -- original
> postgres=# select fill_2d_array(200,200);
> fill_2d_array
> ---------------
> 40000
> (1 row)
>
> Time: 12726.117 ms
>
> -- read only version
> postgres=# select fill_2d_array(200,200); fill_2d_array
> ---------------
> 40000
> (1 row)
>
> Time: 245.894 ms
>
> so there is about 50x slowdown
>
>
> 2013/10/3 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>
>>
>>
>>
>> 2013/10/3 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
>>
>>> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>>> > If you can do a update of some array in plpgsql now, then you have to
>>> work
>>> > with local copy only. It is a necessary precondition, and I am think
>>> it is
>>> > valid.
>>>
>>> If the proposal only relates to assignments to elements of plpgsql local
>>> variables, it's probably safe, but it's also probably not of much value.
>>> plpgsql has enough overhead that I'm doubting you'd get much real-world
>>> speedup. I'm also not very excited about putting even more low-level
>>> knowledge about array representation into plpgsql.
>>>
>>
>> I looked to code, and I am thinking so this can be done inside array
>> related routines. We just have to signalize request for inplace update (if
>> we have a local copy).
>>
>> I have not idea, how significant speedup can be (if any), but current
>> behave is not friendly (and for multidimensional arrays there are no
>> workaround), so it is interesting way - and long time I though about some
>> similar optimization.
>>
>> Regards
>>
>> Pavel
>>
>>
>>>
>>> regards, tom lane
>>>
>>
>>
>

Attachment Content-Type Size
fast-array-update.patch application/octet-stream 8.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-10-07 14:05:51 Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL
Previous Message Andres Freund 2013-10-07 14:00:30 Re: logical changeset generation v6.2