Aggregates, group, and order by

From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Aggregates, group, and order by
Date: 2005-11-07 08:12:05
Message-ID: CB6144E8-C207-4644-9145-B18B56465101@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'm trying to concatenate strings in variable orders using a custom
aggregate. However, I'm having a difficult time figuring out the SQL
I need to use to accomplish this. Here's a test case that shows the
error I'm getting.

select version();

version
------------------------------------------------------------------------
----------------------------------------------------------------------
PostgreSQL 8.1.0 on powerpc-apple-darwin8.3.0, compiled by GCC
powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 (Apple Computer, Inc.
build 5026)
(1 row)

create table ordered_foo (
foo_id integer not null
, bar_id integer not null
, foo_value text not null
, foo_pos integer not null
, unique (foo_id, bar_id)
) without oids;

copy ordered_foo (foo_id, bar_id, foo_value, foo_pos) from stdin;
1 1 delta 4
2 1 alpha 1
3 1 charlie 3
4 1 bravo 2
5 2 C 3
6 2 B 2
7 2 A 1
8 2 D 4
\.

CREATE AGGREGATE array_accum (
sfunc = array_append,
basetype = anyelement,
stype = anyarray,
initcond = '{}'
);

select bar_id, array_accum(foo_value)
from ordered_foo
group by bar_id
order by bar_id;
bar_id | array_accum
--------+-----------------------------
1 | {delta,alpha,charlie,bravo}
2 | {C,B,A,D}

The result I'd like to see is
bar_id | array_accum
--------+-----------------------------
1 | {alpha,bravo,charlie,delta}
2 | {A,B,C,D}

select bar_id, array_accum(foo_value)
from ordered_foo
group by bar_id
order by bar_id, foo_pos;
ERROR: column "ordered_foo.foo_pos" must appear in the GROUP BY
clause or be used in an aggregate function

So, I thought I'd just use a meaningless aggregate and throw away the
result.

select bar_id, array_accum(foo_value), sum(foo_pos)
from ordered_foo
group by bar_id
order by bar_id, foo_pos;
ERROR: column "ordered_foo.foo_pos" must appear in the GROUP BY
clause or be used in an aggregate function

select version();

version
------------------------------------------------------------------------
----------------------------------------------------------------------
PostgreSQL 8.1.0 on powerpc-apple-darwin8.3.0, compiled by GCC
powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 (Apple Computer, Inc.
build 5026)
(1 row)

I get the same error if I use built-in aggregates, so I know it's not
the custom aggregate:

select bar_id, sum(foo_id), sum(foo_pos)
from ordered_foo
group by bar_id
order by bar_id, foo_pos;
ERROR: column "ordered_foo.foo_pos" must appear in the GROUP BY
clause or be used in an aggregate function

Taking foo_pos out of the ORDER BY clause gets rid of the error, but
of course I'm no longer guaranteed the ordering of the result.

test=# select bar_id, sum(foo_id), sum(foo_pos)
test-# from ordered_foo
test-# group by bar_id
test-# order by bar_id;
bar_id | sum | sum
--------+-----+-----
1 | 10 | 10
2 | 26 | 10

I'm guessing this is happening because the ORDER BY clause is
evaluated relatively late, and on the result set (rather than on an
intermediate stage) and the result set doesn't include the foo_ps
column.

I can't think of a way to push the aggregate with the order by into a
subquery that still produces the proper grouping (by bar_id). Any
hints or suggestions?

Michael Glaesemann
grzm myrealbox com

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard van den Berg 2005-11-07 08:33:59 Re: Using native win32 psql.exe using alternative cygwin
Previous Message Tino Wildenhain 2005-11-07 07:38:22 Re: PygreSQL ByteA