Re: Bucketing Row Data in columns

From: "Oliveiros Cristina" <oliveiros(dot)cristina(at)marktest(dot)pt>
To: "Rob Sargent" <robjsargent(at)gmail(dot)com>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Bucketing Row Data in columns
Date: 2009-06-25 16:13:42
Message-ID: 02e701c9f5af$e8cb4f90$ec5a3d0a@marktestcr.marktest.pt
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I admit that must be a more elegant and faster solution with pl/psql (or
whatever other languages)

As I don't know nothing about pl/psql I tried with pure sql (if you don't
have a hunting dog, hunt with a cat)

But obviously this solution doesn't scale well if you have a giant table
with lots of columns

----- Original Message -----
From: "Rob Sargent" <robjsargent(at)gmail(dot)com>
To: <pgsql-sql(at)postgresql(dot)org>
Sent: Thursday, June 25, 2009 4:57 PM
Subject: Re: [SQL] Bucketing Row Data in columns

>I would be suspicious of this sort of solution of turning rows into columns
>by mean of a series of correlated sub-selects. Once the data set gets
>large and the number of columns goes over 2 or 3 this will in all
>likelihood not perform well. I had the pleasure of re-writing a "report"
>which was based on count() (similar to sum()) per user_id with the counts
>going into various columns per user. 18000 users, a dozen columns from
>table of 2 million rows, report took >1,000,000 seconds (yes almost 12
>days) to complete. Re-write runs in 5-10 minutes (now at 10M rows) by
>getting the counts as rows (user, item, count) into a temp table and making
>the columns from the temp table (pl/psql) Getting the counts takes half
>the time, making the flattened report takes half the time.
>
>
>
>
> Oliveiros Cristina wrote:
>> Hello, Sandeep,
>> I am not sure if this is what you want.
>> I came up with this query
>> SELECT *
>> FROM (SELECT user_id,SUM(amount) as day1 FROM t1 WHERE create_timestamp =
>> '2009-1-1' GROUP BY "user_id") a
>> NATURAL FULL JOIN (SELECT user_id,SUM(amount) as day2 FROM t1 WHERE
>> create_timestamp = '2009-1-2' GROUP BY "user_id") b
>> NATURAL FULL JOIN (SELECT user_id,SUM(amount) as day3 FROM t1 WHERE
>> create_timestamp = '2009-1-3' GROUP BY "user_id") c
>> The solution is not totally correct because it returns NULL in the
>> places you return 0.
>> It seems the SUM() returns NULL when it gets an all NULL column...
>> Is it problematic for your application ?
>> Also, i 'm not sure if I fully understand your last sentence
>> /lets assume the buckets are fixed i.e 3 only. but I wish to get them
>> unlimited i.e day 1 to day 20./
>> You say that the buckets are fixed at 3. So, you mean the table output
>> will always have 4 columns? 3 days plus one for user_id ?
>> If you want 20 buckets it must be a different query...
>> Could you please clarify what you mean when you say that you want to get
>> a bucket unlimited ?
>> Best,
>> Oliveiros
>>
>> ----- Original Message -----
>> *From:* Sandeep <mailto:gibsosmat(at)gmail(dot)com>
>> *To:* pgsql-sql(at)postgresql(dot)org <mailto:pgsql-sql(at)postgresql(dot)org>
>> *Sent:* Wednesday, June 24, 2009 5:39 PM
>> *Subject:* [SQL] Bucketing Row Data in columns
>>
>> Hi all,
>> I need help on creating a sql, not a problem even if its pl/sql
>>
>> I have orders table schema is as follow
>>
>> orders(order_id,user_id, create_timestamp, amount)
>>
>> and I want to generate a report like
>> for the past 3 days bucketing purchases i.e SUM(amount) every day
>> in columns
>> i.e result will be having these columns.
>>
>> (user_id, amount_day1, amount_day2, amount_day3)
>>
>> ex:
>> am leaving order_id assume they are auto incrementing and unique,
>> date format dd/mm/yyyy
>> (user_id, create_timestamp, amount)
>> (user1, 01/01/2009,100)
>> (user1, 01/01/2009,100)
>> (user2, 01/01/2009,100)
>> (user2, 02/01/2009,100)
>> (user2, 02/01/2009,100)
>> (user1, 02/01/2009,100)
>> (user2, 03/01/2009,100)
>> (user2, 03/01/2009,100)
>> (user3, 03/01/2009,100)
>>
>>
>> result
>>
>> (user_id, amount_day1, amount_day2, amount_day3)
>> (user1, 200, 200, 0)
>> (user2, 100, 200, 200)
>> (user3, 0, 0, 100)
>>
>>
>> hope you guys got what I am trying to generate through sql.
>>
>> I could get this data in each row, but I want it in columns.
>> Can anyone help me on this? lets assume the buckets are fixed i.e
>> 3 only. but I wish to get them unlimited i.e day 1 to day 20.
>>
>> Regards
>> Sandeep Bandela
>>
>
>
> --
> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Osvaldo Kussama 2009-06-25 17:37:49 Re: Bucketing Row Data in columns
Previous Message Rob Sargent 2009-06-25 15:57:34 Re: Bucketing Row Data in columns