Re: people who buy A, also buy C, D, E

From: "Ramakrishnan Muralidharan" <ramakrishnanm(at)pervasive-postgres(dot)com>
To: "Dan Langille" <dan(at)langille(dot)org>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: people who buy A, also buy C, D, E
Date: 2005-04-26 04:28:05
Message-ID: 02767D4600E59A4487233B23AEF5C5992A407F@blrmail1.aus.pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi

I am bit confused.. If you want to display first 5 the following query will fetch top 5 book id's. I am not able to understand, why there is a sub-query.

SELECT ELEMENT_ID , COUNT( * ) FROM WATCH_LIST_ELEMENT
GROUP BY ELEMENT_ID
ORDER BY COUNT(*) DESC
LIMIT 5

Regards,
R.Muralidharan


-----Original Message-----
From: pgsql-sql-owner(at)postgresql(dot)org
[mailto:pgsql-sql-owner(at)postgresql(dot)org]On Behalf Of Dan Langille
Sent: Tuesday, April 26, 2005 7:52 AM
To: pgsql-sql(at)postgresql(dot)org
Cc: dan(at)langille(dot)org
Subject: [SQL] people who buy A, also buy C, D, E

The goal of my query is: given a book, what did other people who
bought this book also buy? I plan the list the 5 most popular such
books. In reality, this isn't about books, but that makes it easier
to understand I think.

We have a table of customer_id (watch_list_id) and book_id
(element_id).

freshports.org=# \d watch_list_element
Table "public.watch_list_element"
Column | Type | Modifiers
---------------+---------+-----------
watch_list_id | integer | not null
element_id | integer | not null
Indexes:
"watch_list_element_pkey" primary key, btree (watch_list_id,
element_id)
"watch_list_element_element_id" btree (element_id)
Foreign-key constraints:
"$2" FOREIGN KEY (watch_list_id) REFERENCES watch_list(id) ON
UPDATE CASCADE ON DELETE CASCADE
"$1" FOREIGN KEY (element_id) REFERENCES element(id) ON UPDATE
CASCADE ON DELETE CASCADE

freshports.org=#

I have a query which returns the needed results:

SELECT W.element_id
FROM watch_list_element W
WHERE w.watch_list_id in (select watch_list_id from
watch_list_element where element_id = 54968)
GROUP BY W.element_id
ORDER BY count(W.watch_list_id) DESC
LIMIT 5;

But performance is an issue here. So I'm planning to calculate all
the possible values and cache them. That is, given each element_id in
a watch_list, what are the top 5 element_id values on all the lists
on which the original element_id appears?

I'm having trouble constructing the query. I'm not even sure I can
do this in one select, but that would be nice. Examples and clues
are appreciated.

Any ideas?

Thank you.
--
Dan Langille : http://www.langille.org/
BSDCan - The Technical BSD Conference - http://www.bsdcan.org/
NEW brochure available at http://www.bsdcan.org/2005/advocacy/

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Browse pgsql-sql by date

  From Date Subject
Next Message Igor Kryltsov 2005-04-26 06:51:15 Re: Help to drop table
Previous Message Rodrigo Carvalhaes 2005-04-26 03:12:32 UPDATE WITH ORDER BY