Re: Parallel Aggregates for string_agg and array_agg

From: Andres Freund <andres(at)anarazel(dot)de>
To: Mark Dilger <hornschnorter(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregates for string_agg and array_agg
Date: 2018-05-01 21:38:32
Message-ID: 20180501213832.h6dp5zjophkqdz4h@alap3.anarazel.de
Lists: pgsql-hackers

On 2018-05-01 14:35:46 -0700, Mark Dilger wrote:
>
> > On May 1, 2018, at 2:11 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > Hi,
> >
> > On 2018-05-01 14:09:39 -0700, Mark Dilger wrote:
> >> I don't care which order the data is in, as long as x[i] and y[i] are
> >> matched correctly. It sounds like this patch would force me to write
> >> that as, for example:
> >>
> >> select array_agg(a order by a, b) AS x, array_agg(b order by a, b) AS y
> >> from generate_a_b_func(foo);
> >>
> >> which I did not need to do before.
> >
> > Why would it require that? Rows are still processed row-by-row even if
> > there's parallelism, no?
>
> I was responding in part to Tom's upthread statement:
>
>     Your own example of assuming that separate aggregates are computed
>     in the same order reinforces my point, I think. In principle, anybody
>     who's doing that should write
>
>         array_agg(e order by x),
>         array_agg(f order by x),
>         string_agg(g order by x)
>
>     because otherwise they shouldn't assume that;
>
> It seems Tom is saying that you can't assume separate aggregates will be
> computed in the same order. Hence my response. What am I missing here?

Afaict Tom was just making a theoretical argument, and one that seems
largely independent of the form of parallelism we're discussing here.

Greetings,

Andres Freund
