Re: Running update in chunks?

From: Tim Uckun <timuckun(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Running update in chunks?
Date: 2013-01-21 20:09:31
Message-ID: CAGuHJrMPKf-E6wb40F7UCWFxFA9=6mi4e=N8OHqBnZGpqS50qQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Just to close this up and give some guidance to future googlers...

There are no foreign keys. The table is empty when I start. I copy the
data into it from a text file.

Removing indexes didn't help much (made a very slight difference).

running a query CREATE TEMP TABLE tt AS SELECT .... using a massive
join takes about 8 seconds. I presume that's the baseline for the disk
and RAM given my current postgres configuration. Note that this is
not a satisfactory option for me because I can't do what I want in one
step (the update I specified is one of many).

running a very simple update "UPDATE imports set make_id = null"
takes over 50 seconds so that's the minimum amount of time any update
is going to take.

Running a complex update where I join all the tables together and
update all the fields takes about 106 seconds.

Just running a complex select with the joins takes no time at all.

I tried chunking the updates using chunks of 100 records and 1000
records (where ID between X and Y repeatedly) and it was even slower.

Conclusion. Updates on postgres are slow (given the default
postgresql.conf). I presume this is due to MVCC or the WAL or
something and there are probably some things I can do to tweak the
conf file to make them go faster but out of the box running an update
on a table with lots of rows is going to cost you a lot. Removing the
indexes doesn't help that much.

Suggestion for the PG team. Deliver a more realistic postgres.conf by
default. The default one seems to be aimed at ten year old PCs with
very little RAM and disk space. At least deliver additional conf files
for small, medium, large, huge setups.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Edson Richter 2013-01-21 20:12:06 Re: What is impact of "varchar_opts"?
Previous Message Tom Lane 2013-01-21 20:03:41 Re: What is impact of "varchar_opts"?