to partition or not to partition that is the question

Lists: pgsql-hackers
From: Timasmith <timasmith(at)hotmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: to partition or not to partition that is the question
Date: 2007-06-19 13:00:11
Message-ID: 1182258011.225394.80330@g4g2000hsf.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have two types of tables, for sake of argument lets call it these:

1) product 10,000,000 rows
2) product_activity 1,000,000,000 rows

90% of the type the product table is accessed by product_id, 80% of
the time that product id would be in the last 1,000,000 rows of the
table i.e. a recent product id.

90% of the time the product_activity table is accessed by product_id +
a date range - often the last 24 hours.

One option seems to be not to use partitions at all and have a
product_history table and a product_activity_history table. The work
falls on the application to use UNION queries to extract data when
needed across both tables, implement a criteria for moving the records
into the history table and dealing with the issues of related tables
referencing product ids that are in other tables.

Alternatively I could partition the two tables by date range. I am
not sure how effective that would be for product but I guess I could
hit that table first and if it hits, great - performance saved, and if
it doesnt oh well get it from the history table.

Certainly product_activity seems like a good partitioning contender.
Of course maintaining monthly partitions is a lot of work but I guess
you could create them in advance for several years.

Is partitioning the way to go in this case?


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Timasmith <timasmith(at)hotmail(dot)com>
Subject: Re: to partition or not to partition that is the question
Date: 2007-06-19 18:23:37
Message-ID: 200706191123.37776.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tim,

> I have two types of tables, for sake of argument lets call it these:
>
> 1) product                  10,000,000 rows
> 2) product_activity  1,000,000,000 rows

pgsql-performance is the correct list for your question. Please re-post it
there. -hackers is for PostgreSQL development.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Timasmith <timasmith(at)hotmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: to partition or not to partition that is the question
Date: 2007-06-21 15:08:06
Message-ID: 1182438486.844450.100440@w5g2000hsg.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 19, 2:23 pm, j(dot)(dot)(dot)(at)agliodbs(dot)com (Josh Berkus) wrote:
> Tim,
>
> > I have two types of tables, for sake of argument lets call it these:
>
> > 1) product 10,000,000 rows
> > 2) product_activity 1,000,000,000 rows
>
> pgsql-performance is the correct list for your question. Please re-post it
> there. -hackers is for PostgreSQL development.
>
> --
> Josh Berkus
> PostgreSQL @ Sun
> San Francisco
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

Well no reply in pgsql-performance so it was pointless re-post, I
posted here because there is more activity in this forum.