Re: array_agg versus repeated execution of aggregate final functions

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: array_agg versus repeated execution of aggregate final functions
Date: 2009-06-20 17:38:05
Message-ID: b42b73150906201038mc433678mc5cfeb1a29c8d6ff@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jun 20, 2009 at 1:31 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I've identified the cause of Merlin Moncure's crash report here:
> http://archives.postgresql.org/pgsql-hackers/2009-06/msg01171.php
> (Thanks to Merlin for making some proprietary stuff available to me
> for testing it.)
>
> The problem query generates a plan in which a HashAggregate node is on
> the inside of a NestLoop, where it can be (and is) executed more than
> once.  ExecReScanAgg decides that it can rescan the existing hash table
> instead of forcing it to be rebuilt.  What that means is that the
> aggregate final-functions will be re-executed again on the latest
> transition values.  And array_agg_finalfn has got a couple of issues
> with that.  The obvious one is that it thinks it can release the
> ArrayBuildState when called by an Agg node.  The less obvious one
> is that if any of the original input values were toasted,
> construct_md_array does this:
>        if (elmlen == -1)
>                elems[i] = PointerGetDatum(PG_DETOAST_DATUM(elems[i]));
> which means it's clobbering an element of the ArrayBuildState's elems
> array with a pointer to short-lived data.  So on the re-execution it
> may find elems[] pointing to bogus data.
>
> There are basically two ways that we could respond to this:
>
> (1) Decide that the bug is array_agg_finalfn's, and make the necessary
> fixes so that it doesn't modify or free its argument ArrayBuildState.
> This would amount to legislating that aggregate final functions cannot
> scribble on their inputs *ever*, rather than the compromise position
> we thought we had adopted last fall, namely that they can do so when
> called by an Agg node but not when called by WindowAgg.

This definitely sounds like the right answer.

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-06-20 18:02:31 Re: array_agg versus repeated execution of aggregate final functions
Previous Message Tom Lane 2009-06-20 17:31:49 array_agg versus repeated execution of aggregate final functions