Re: Function array_agg(array)

From: Ali Akbar <the(dot)apaan(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Function array_agg(array)
Date: 2014-10-22 14:58:10
Message-ID: CACQjQLrMsvc=xDv7Rn4QB6exQVOTQ5xHmdFS7dXAD5NocEdbUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review

2014-10-22 20:51 GMT+07:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:

> I agree with your proposal. I have a few comments to design:
>

> 1. patch doesn't hold documentation and regress tests, please append it.
>
OK, i'll add the documentation and regression test

> 2. this functionality (multidimensional aggregation) can be interesting
> more times, so maybe some interface like array builder should be preferred.
>
We already have accumArrayResult and makeArrayResult/makeMdArrayResult,
haven't we?

Actually array_agg(anyarray) can be implemented by using accumArrayResult
and makeMdArrayResult, but in the process we will deconstruct the array
data and null bit-flags into ArrayBuildState->dvalues and dnulls. And we
will reconstruct it through makeMdArrayResult. I think this way it's not
efficient, so i decided to reimplement it and memcpy the array data and
null flags as-is.

In other places, i think it's clearer if we just use accumArrayResult and
makeArrayResult/makeMdArrayResult as API for building (multidimensional)
arrays.

> 3. array_agg was consistent with array(subselect), so it should be fixed
> too
>
> postgres=# select array_agg(a) from test;
> array_agg
> -----------------------
> {{1,2,3,4},{1,2,3,4}}
> (1 row)
>
> postgres=# select array(select a from test);
> ERROR: could not find array type for data type integer[]
>

I'm pretty new in postgresql development. Can you point out where is
array(subselect) implemented?

> 4. why you use a magic constant (64) there?
>
> + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
> + astate->aitems = 64 * nitems;
>
> + astate->nullbitmap = (bits8 *)
> + repalloc(astate->nullbitmap, (astate->aitems + 7) / 8);
>

just follow the arbitrary size choosen in accumArrayResult (arrayfuncs.c):
astate->alen = 64; /* arbitrary starting array size */

it can be any number not too small and too big. Too small, and we will
realloc shortly. Too big, we will end up wasting memory.

Regards,
--
Ali Akbar

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2014-10-22 15:05:29 Re: Support UPDATE table SET(*)=...
Previous Message Antonin Houska 2014-10-22 14:57:42 Re: from_collapse_limit considerations