Re: Performance Optimization for Dummies 2 - the SQL

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Scott Marlowe" <smarlowe(at)g2switchworks(dot)com>
Cc: "Carlo Stonebanks" <stonec(dot)register(at)sympatico(dot)ca>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance Optimization for Dummies 2 - the SQL
Date: 2006-10-06 18:53:35
Message-ID: b42b73150610061153i3d11d7a4r23e1d7c5317feb9f@mail.gmail.com
Lists: pgsql-performance

On 10/6/06, Scott Marlowe <smarlowe(at)g2switchworks(dot)com> wrote:
> On Fri, 2006-10-06 at 11:44, Carlo Stonebanks wrote:
> > This didn't work right away, but DID work after running a VACUUM FULL. In
> > other words, I was still stuck with a sequential scan until after the
> > vacuum.
> >
> > I turned autovacuum off in order to help with the import, but was performing
> > an ANALYZE after every 500 rows imported.

how did you determine that every 500 rows is the right interval? 500
happens to be the default autovacuum analyze threshold. if you followed
my earlier recommendations, you are aware that autovacuum (which also
analyzes) is not running during bulk inserts, right?
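fwiw, a load-time postgresql.conf might look something like this (the
analyze threshold shown is what i believe the 8.1-era default is;
double-check the names and values for your version):

    fsync = off                          # only safe on a rebuildable scratch db
    stats_start_collector = off          # 8.1-era stats collector switch
    autovacuum = off                     # no autovacuum mid-load
    #autovacuum_analyze_threshold = 500  # default analyze threshold, the "500"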

imo, the best way to do a big data import/conversion is to:
1. turn off all extra features, like stats, logging, etc.
2. use the copy interface to load the data into scratch tables,
   probably with all-text fields (steps 2-5 are sketched below)
3. analyze (just once)
4. use big set-based queries to transform, normalize, etc.
5. drop the scratch tables
6. set postgresql.conf back up for production use: fsync, stats, etc.
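something like this, where all table, column, and file names are made
up for illustration:

    -- step 2: bulk load the raw dump into an all-text scratch table
    CREATE TABLE scratch_people (
        first_name text,
        last_name  text,
        birth_date text    -- keep everything text; validate during transform
    );
    COPY scratch_people FROM '/tmp/people.csv' WITH CSV;

    -- step 3: analyze once so the planner knows the table's size
    ANALYZE scratch_people;

    -- step 4: one big set-based pass to transform and normalize
    INSERT INTO people (first_name, last_name, birth_date)
    SELECT initcap(first_name), initcap(last_name), birth_date::date
      FROM scratch_people
     WHERE birth_date ~ '^[0-9]{4}-[0-9]{2}-[0-9]{2}$';

    -- step 5: done with the staging data
    DROP TABLE scratch_people;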

an important function of analyze is to tell the planner approximately
how big the tables are.
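you can see what it stored by querying pg_class, e.g.:

    SELECT relname, relpages, reltuples
      FROM pg_class
     WHERE relname = 'scratch_people';

without up-to-date numbers there, the planner is guessing, which is
how you end up stuck on a sequential scan.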

merlin
