Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL
Date: 2013-11-25 21:28:41
Message-ID: 5293C109.90206@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(sorry for possible duplicates, used wrong account on first try)

On 07.10.2013 17:00, Pavel Stehule wrote:
> Hello
>
> I fixed patch - there was a missing cast to domain when it was used (and
> all regress tests are ok now).

This doesn't work correctly for varlen datatypes. I modified your
quicksort example to work with text[], and got this:

postgres=# SELECT
array_upper(quicksort(1,20000,array_agg((g::text))),1) FROM
generate_series(1,20000) g;
ERROR: invalid input syntax for integer: ""
CONTEXT: PL/pgSQL function quicksort(integer,integer,text[]) line 10 at
assignment
PL/pgSQL function quicksort(integer,integer,text[]) line 16 at assignment

The handling of array domains is also wrong. You replace the new value
to the array and only check the domain constraint after that. If the
constraint is violated, it's too late to roll that back. For example:

postgres=# create domain posarray as int[] check (value[1] > 0);
CREATE DOMAIN
postgres=# do $$
declare
iarr posarray;
begin
begin
iarr[1] := 1;
iarr[1] := -1;
exception when others then raise notice 'got error: %', sqlerrm; end;
raise notice 'value: %', iarr[1];
end;
$$;
NOTICE: got error: value for domain posarray violates check constraint
"posarray_check"
NOTICE: value: -1
DO

Without the patch, it prints 1, but with the patch, -1.

In general, I'm not convinced this patch is worth the trouble. The
speedup isn't all that great; manipulating large arrays in PL/pgSQL is
still so slow that if you care about performance you'll want to rewrite
your function in some other language or use temporary tables. And you
only get a gain with arrays of fixed-length element type with no NULLs.

So I think we should drop this patch in its current form. If we want to
make array manipulation in PL/pgSQL, I think we'll have to do something
similar to how we handle "row" variables, or something else entirely.
But that's a much bigger patch, and I'm not sure it's worth the trouble
either.

Looking at perf profile and the functions involved, though, I see some
low-hanging fruit: in array_set, the new array is allocated with
palloc0'd, but we only need to zero out a few alignment bytes where the
new value is copied. Replacing it with palloc() will save some cycles,
per the attached patch. Nowhere near as much as your patch, but this is
pretty uncontroversial.

- Heikki

Attachment Content-Type Size
quicksort-example.sql text/x-sql 630 bytes
avoid-unnecessary-zeroing-in-array_set-1.patch text/x-diff 1.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2013-11-25 21:29:59 Re: Cube extension split algorithm fix
Previous Message Kevin Grittner 2013-11-25 21:25:42 Re: why semicolon after begin is not allowed in postgresql?