From: | Robert James <srobertjames(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | CLUSTER versus a dedicated table |
Date: | 2011-06-01 23:54:35 |
Message-ID: | BANLkTim93LKid3efencyeC-ASObCJuwzNA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi. I'm interested in understanding the differences between
CLUSTERing a table and making a dedicated one.
We have a table with about 1 million records. On a given day, only
about 1% of them are of interest. That 1% changes every day (it's
WHERE active_date = today), and so we index and cluster on it.
Even so, the planner shows a very large cost for the Index Scan: about
3500. If I instead do a SELECT INTO temp_table FROM big_table WHERE
active_date = today, and then do SELECT * FROM temp_table, I get a
planned cost of 65. Yet, the actual time for both queries is almost
identical.
Questions:
1. Why is there such a discrepancy between the planner's estimate and
the actual cost?
2. In a case like this, will I in general see a performance gain by
doing a daily SELECT INTO and then querying from that table? My ad hoc
test doesn't indicate I would (despite the planner's prediction), and
I'd rather avoid this if it won't help.
3. In general, does CLUSTER provide all the performance benefits of a
dedicated table? If it doesn't, what does it lack?
Thank you.
From | Date | Subject | |
---|---|---|---|
Next Message | Robert James | 2011-06-02 00:10:32 | Understanding Hash Join performance |
Previous Message | panam | 2011-06-01 22:49:31 | Re: [PERFORM] Hash Anti Join performance degradation |