Re: Configuration Advice
Steve wrote:
SO ... our goal here is to make this load process take less time. It
seems the big part is building the big summary table; this big summary
table is currently 9 million rows big. Every night, we drop the table,
re-create it, build the 9 million rows of data (we use COPY to put hte
data in when it's prepared, not INSERT), and then build the indexes on
it -- of which there are many.
Would it be possible to just update the summary table, instead of
recreating it from scratch every night?
Unfortunately this table gets queried in
a lot of different ways and needs these indexes; also unfortunately, we
have operator class indexes to support both ASC and DESC sorting on
columns so these are for all intents and purposes duplicate but required
under Postgres 8.1 (we've recently upgraded to Postgres 8.2, is this
still a requirement?)
I don't think this has changed in 8.2.
Building these indexes takes forever! It's a long grind through inserts
and then building the indexes takes a hefty amount of time too. (about
9 hours). Now, the application is likely part at fault, and we're
working to make it more efficient, but it has nothing to do with the
index building time. I'm wondering what we can do to make this better
if anything; would it be better to leave the indexes on? It doesn't
seem to be. Would it be better to use INSERTs instead of copies?
Doesn't seem to be.
Would it help if you created multiple indexes simultaneously? You have
enough CPU to do it. Is the index creation CPU or I/O bound? 9 million
rows should fit in 16 GB of memory, right?
- The load process itself takes about 6 gigs of memory, the rest is free
for postgres because this is basically all the machine does.
Can you describe the load process in more detail? What's it doing with
the 6 gigs?
- If this was your machine and situation, how would you lay out the
emmory settings? What would you set the FSM to?
FSM seems irrelevant here..
Do wal_buffers/full_page_writes matter of FSYNC is off?
Better turn off full_page_writes, since you can kiss goodbye to data
integrity anyway with fsync=off.
Anyway... any advice would be appreciated :)
What's your maintenance_work_mem setting? It can make a big difference
in sorting the data for indexes.
If you could post the schema including the indexes, people might have
more ideas...
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Home |
Main Index |
Thread Index