Re: hpw to Count without group by

From: Edmund Bacon <ebacon-xlii(at)onesystem(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: hpw to Count without group by
Date: 2005-06-01 22:49:56
Message-ID: m3zmu9sne3.fsf@elb-lx.onesystem.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

yudiepg(at)gmail(dot)com (Yudie Pg) writes:

> Hello,
> I have a table, structure like this:
> create table product(
> sku, int4 not null,
> category int4 null,
> display_name varchar(100) null,
> rank int4 null
> )
> let say example data:
> sku, category, display_name
> =======================
> 10001, 5, postgresql, 132
> 10002, 5, mysql, 243
> 10003, 5, oracle, 323
> 10006, 7, photoshop, 53
> 10007, 7, flash mx, 88
> 10008, 9, Windows XP, 44
> 10008, 9, Linux, 74
> Expected query result:
> sku, category, display_name, category_count
> ====================================
> 10001, 5, postgresql, 3
> 10006, 7, photoshop, 2
> 10008, 9, Windows XP, 2
> The idea is getting getting highest ranking each product category and COUNT
> how many products in the category with SINGLE query.
> the first 3 columns can be done with select distinct on (category) ...
> order by category, rank desc but it still missing the category_count. I wish
> no subquery needed for having simplest query plan.
> Thank you.
> Yudie G.

I do not believe you can do this without a subquery - you are trying
to get 2 separate pieces of information from your data
* some data about the record having MAX(rank) for each category
and
* the count of records in each category

Note, however that you can get MAX(rank) and COUNT(category) in one
sequential pass of the data: e.g
SELECT category, MAX(rank), COUNT(category) FROM product;

Joining this with the orignal table is not too dificult :

SELECT sku, category, display_name, category_count
FROM product
JOIN (SELECT category, MAX(rank) AS rank, COUNT(category) AS category_count
FROM product
GROUP BY category) subq
USING(category, rank)
ORDER BY sku;

Depending on what your data looks like, you might improve things by
having an index on category, and perhaps on (category, rank).

Note that there is may be a problem with this query: If you have more
than one product with the same rank in the same category, you may get
more than one record for that category. Apply distinct on as
neccessary.

--
Remove -42 for email

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ragnar Hafstað 2005-06-01 22:52:50 Re: hpw to Count without group by
Previous Message Simon Riggs 2005-06-01 22:22:07 Re: PITR restore hot standby