Re: Returning array of IDs as a sub-query with group-by

Lists: pgsql-sql
From: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
To: pgsql-sql(at)postgresql(dot)org
Subject: Returning array of IDs as a sub-query with group-by
Date: 2007-08-25 15:10:57
Message-ID: 200708251710.57557.andreak@officenet.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hi all.
I have the following schema:

create table item(
id serial primary key
);

create table item_log(
id serial primary key,
item_id integer not null references item(id),
price numeric NOT NULL
);

insert into item(id) values(1);
insert into item(id) values(2);
insert into item(id) values(3);
insert into item(id) values(4);

insert into item_log(item_id, price) values(1, 100);
insert into item_log(item_id, price) values(1, 100);
insert into item_log(item_id, price) values(1, 100);
insert into item_log(item_id, price) values(1, 200);
insert into item_log(item_id, price) values(1, 200);
insert into item_log(item_id, price) values(1, 200);
insert into item_log(item_id, price) values(1, 200);
insert into item_log(item_id, price) values(1, 200);
insert into item_log(item_id, price) values(1, 200);
insert into item_log(item_id, price) values(2, 200);
insert into item_log(item_id, price) values(2, 200);

Now, to get out all log-entries grouped on price with count the following
query gives me what I want

SELECT COUNT(il.price), i.id AS item_id, il.price FROM item i, item_log il
WHERE i.id = il.item_id GROUP BY il.price, i.id;

count | item_id | price
-------+---------+-------
3 | 1 | 100
6 | 1 | 200
2 | 2 | 200
(3 rows)

Now - I would like to return an ARRAY of item_log.id for each of the two rows.
The result I'm looking for would look like this:

count | item_id | price | item_id_array
-------+---------+-------+---------------
3 | 1 | 100 | {1,2,3}
6 | 1 | 200 | {4,5,6,7,8,9}
2 | 2 | 200 | {10,11}

I tried this query which complains about an ungruoped column:

SELECT COUNT(il.price), i.id AS item_id, il.price,
ARRAY(SELECT a.id FROM item_log a WHERE a.id = il.id) AS item_id_array
FROM item i, item_log il WHERE i.id = il.item_id GROUP BY il.price, i.id;

ERROR: subquery uses ungrouped column "il.id" from outer query

Any hints?

--
Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Senior Software Developer / Manager
------------------------+---------------------------------------------+
OfficeNet AS | The most difficult thing in the world is to |
Karenslyst Allé 11 | know how to do a thing and to watch |
PO. Box 529 Skøyen | somebody else doing it wrong, without |
0214 Oslo | comment. |
NORWAY | |
Tlf: +47 24 15 38 90 | |
Fax: +47 24 15 38 91 | |
Mobile: +47 909 56 963 | |
------------------------+---------------------------------------------+


From: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Returning array of IDs as a sub-query with group-by
Date: 2007-08-25 15:55:30
Message-ID: 200708251755.30291.andreak@officenet.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

On Saturday 25 August 2007 17:10:57 Andreas Joseph Krogh wrote:
> Hi all.
> I have the following schema:
>
> create table item(
> id serial primary key
> );
>
>
> create table item_log(
> id serial primary key,
> item_id integer not null references item(id),
> price numeric NOT NULL
> );
>
>
> insert into item(id) values(1);
> insert into item(id) values(2);
> insert into item(id) values(3);
> insert into item(id) values(4);
>
> insert into item_log(item_id, price) values(1, 100);
> insert into item_log(item_id, price) values(1, 100);
> insert into item_log(item_id, price) values(1, 100);
> insert into item_log(item_id, price) values(1, 200);
> insert into item_log(item_id, price) values(1, 200);
> insert into item_log(item_id, price) values(1, 200);
> insert into item_log(item_id, price) values(1, 200);
> insert into item_log(item_id, price) values(1, 200);
> insert into item_log(item_id, price) values(1, 200);
> insert into item_log(item_id, price) values(2, 200);
> insert into item_log(item_id, price) values(2, 200);
>
> Now, to get out all log-entries grouped on price with count the following
> query gives me what I want
>
> SELECT COUNT(il.price), i.id AS item_id, il.price FROM item i, item_log il
> WHERE i.id = il.item_id GROUP BY il.price, i.id;
>
> count | item_id | price
> -------+---------+-------
> 3 | 1 | 100
> 6 | 1 | 200
> 2 | 2 | 200
> (3 rows)
>
> Now - I would like to return an ARRAY of item_log.id for each of the two
> rows. The result I'm looking for would look like this:
>
> count | item_id | price | item_id_array
> -------+---------+-------+---------------
> 3 | 1 | 100 | {1,2,3}
> 6 | 1 | 200 | {4,5,6,7,8,9}
> 2 | 2 | 200 | {10,11}
>
> I tried this query which complains about an ungruoped column:
>
> SELECT COUNT(il.price), i.id AS item_id, il.price,
> ARRAY(SELECT a.id FROM item_log a WHERE a.id = il.id) AS item_id_array
> FROM item i, item_log il WHERE i.id = il.item_id GROUP BY il.price, i.id;
>
> ERROR: subquery uses ungrouped column "il.id" from outer query
>
> Any hints?

I found the following CREATE AGGREGATE suggestion in the PG-docs:

CREATE AGGREGATE array_accum (anyelement)
(
sfunc = array_append,
stype = anyarray,
initcond = '{}'
);

With this I can easily issue:
SELECT COUNT(il.price), i.id AS item_id, il.price,
array_accum(il.id) AS item_id_array
FROM item i, item_log il WHERE i.id = il.item_id GROUP BY il.price, i.id;

Which does what I want:
count | item_id | price | item_id_array
-------+---------+-------+---------------
3 | 1 | 100 | {1,2,3}
6 | 1 | 200 | {4,5,6,7,8,9}
2 | 2 | 200 | {10,11}
(3 rows)

If someone knows of a way without introducing a new AGGREGATE I'm still
interrested.

--
Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Senior Software Developer / Manager
------------------------+---------------------------------------------+
OfficeNet AS | The most difficult thing in the world is to |
Karenslyst Allé 11 | know how to do a thing and to watch |
PO. Box 529 Skøyen | somebody else doing it wrong, without |
0214 Oslo | comment. |
NORWAY | |
Tlf: +47 24 15 38 90 | |
Fax: +47 24 15 38 91 | |
Mobile: +47 909 56 963 | |
------------------------+---------------------------------------------+


From: Ragnar <gnari(at)hive(dot)is>
To: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Returning array of IDs as a sub-query with group-by
Date: 2007-08-25 21:02:19
Message-ID: 1188075739.12268.50.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

On lau, 2007-08-25 at 17:55 +0200, Andreas Joseph Krogh wrote:
> On Saturday 25 August 2007 17:10:57 Andreas Joseph Krogh wrote:

[snip]

> > count | item_id | price | item_id_array
> > -------+---------+-------+---------------
> > 3 | 1 | 100 | {1,2,3}
> > 6 | 1 | 200 | {4,5,6,7,8,9}
> > 2 | 2 | 200 | {10,11}
> >
> > I tried this query which complains about an ungruoped column:
> >
> > SELECT COUNT(il.price), i.id AS item_id, il.price,
> > ARRAY(SELECT a.id FROM item_log a WHERE a.id = il.id) AS item_id_array
> > FROM item i, item_log il WHERE i.id = il.item_id GROUP BY il.price, i.id;
> >
> > ERROR: subquery uses ungrouped column "il.id" from outer query
> >
> > Any hints?
>
> I found the following CREATE AGGREGATE suggestion in the PG-docs:

[aggregate solution snipped]

> If someone knows of a way without introducing a new AGGREGATE I'm still
> interrested.

you can allways do the ARRAY(SELECT...) outside the grouping:
# select *,(select ARRAY(
SELECT a.id
FROM item_log as a
WHERE foo.item_id=a.item_id
AND foo.price=a.price
)
) AS item_id_array
from (
select count(*),item_id, price
from item_log
group by item_id, price
) as foo;

count | item_id | price | item_id_array
-------+---------+-------+---------------
3 | 1 | 100 | {1,2,3}
6 | 1 | 200 | {4,5,6,7,8,9}
2 | 2 | 200 | {10,11}
(3 rows)

but i suspect the aggregate will perform better

gnari


From: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Returning array of IDs as a sub-query with group-by
Date: 2007-08-25 22:42:56
Message-ID: 200708260042.56251.andreak@officenet.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

On Saturday 25 August 2007 23:02:19 Ragnar wrote:
> On lau, 2007-08-25 at 17:55 +0200, Andreas Joseph Krogh wrote:
> > On Saturday 25 August 2007 17:10:57 Andreas Joseph Krogh wrote:
>
> [snip]
>
> > > count | item_id | price | item_id_array
> > > -------+---------+-------+---------------
> > > 3 | 1 | 100 | {1,2,3}
> > > 6 | 1 | 200 | {4,5,6,7,8,9}
> > > 2 | 2 | 200 | {10,11}
> > >
> > > I tried this query which complains about an ungruoped column:
> > >
> > > SELECT COUNT(il.price), i.id AS item_id, il.price,
> > > ARRAY(SELECT a.id FROM item_log a WHERE a.id = il.id) AS item_id_array
> > > FROM item i, item_log il WHERE i.id = il.item_id GROUP BY il.price,
> > > i.id;
> > >
> > > ERROR: subquery uses ungrouped column "il.id" from outer query
> > >
> > > Any hints?
> >
> > I found the following CREATE AGGREGATE suggestion in the PG-docs:
>
> [aggregate solution snipped]
>
> > If someone knows of a way without introducing a new AGGREGATE I'm still
> > interrested.
>
> you can allways do the ARRAY(SELECT...) outside the grouping:
> # select *,(select ARRAY(
> SELECT a.id
> FROM item_log as a
> WHERE foo.item_id=a.item_id
> AND foo.price=a.price
> )
> ) AS item_id_array
> from (
> select count(*),item_id, price
> from item_log
> group by item_id, price
> ) as foo;
>
> count | item_id | price | item_id_array
> -------+---------+-------+---------------
> 3 | 1 | 100 | {1,2,3}
> 6 | 1 | 200 | {4,5,6,7,8,9}
> 2 | 2 | 200 | {10,11}
> (3 rows)
>
>
> but i suspect the aggregate will perform better
>
> gnari

Ok, thanks.

--
Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Senior Software Developer / Manager
------------------------+---------------------------------------------+
OfficeNet AS | The most difficult thing in the world is to |
Karenslyst Allé 11 | know how to do a thing and to watch |
PO. Box 529 Skøyen | somebody else doing it wrong, without |
0214 Oslo | comment. |
NORWAY | |
Tlf: +47 24 15 38 90 | |
Fax: +47 24 15 38 91 | |
Mobile: +47 909 56 963 | |
------------------------+---------------------------------------------+