BUG #5779: Large INSERT transaction with FOREIGN KEY constraints exhausts available RAM

From: "Michel Alexandre Salim" <michel(dot)salim(at)cs(dot)fau(dot)de>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5779: Large INSERT transaction with FOREIGN KEY constraints exhausts available RAM
Date: 2010-12-01 08:09:03
Message-ID: 201012010809.oB18933X096032@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 5779
Logged by: Michel Alexandre Salim
Email address: michel(dot)salim(at)cs(dot)fau(dot)de
PostgreSQL version: 9.0.1
Operating system: Ubuntu 10.04
Description: Large INSERT transaction with FOREIGN KEY constraints
exhausts available RAM
Details:

Explanation:
I have a table with close to a billion rows that was initially imported in
csv form; as such, a lot of the columns were not normalized. For speed
reasons, I created one temporary table for each column that needs to be
normalized, and then did an INSERT with JOINs to populate the new table with
the normalized data.

If foreign key constraints are enabled, this runs out of memory (the machine
has 4GB RAM + 6 GB swap and nothing else is running that requires much RAM;
4 GB of the swap was actually added to try and debug this). With
vm.overcommit_memory set to the default 0 the OOM killer consistently kills
the process handling the query; with it set to the recommended 2 postgresql
itself aborts the query when it cannot allocate more RAM.

I tried changing the fkey constraints to DEFERRABLE hoping that that would
consume less RAM, but the same result occurs. Why should the memory usage
pattern be different when integrity checks are done as part of the
transaction (even when pushed back to the end), and in a separate
transaction?

The dataset I'm using is the Eclipse UDC dataset for 2009 --
http://archive.eclipse.org/technology/phoenix/usagedata/ -- and the problem
is independently described at this other site, with a simpler query:

http://www.jory.info/serendipity/archives/30-PostgreSQL-Out-of-Memory-with-l
arge-INSERT.html

My own invocations were
CREATE TABLE udc (id INTEGER PRIMARY KEY, eclipse_id INTEGER NOT NULL,
ac│
ion_type_id INTEGER NOT NULL, target_type_id INTEGER, source_id INTEGER NOT
NUL│
, version_id INTEGER NOT NULL, target_id INTEGER, tstamp TIMESTAMPTZ NOT
NULL,│
FOREIGN KEY (action_type_id) REFERENCES actions (id) DEFERRABLE, FOREIGN KEY
(t│
rget_type_id) REFERENCES class_types (id) DEFERRABLE, FOREIGN KEY
(source_id) R│
FERENCES classes (id) DEFERRABLE, FOREIGN KEY (version_id) REFERENCES
versions │
id) DEFERRABLE , FOREIGN KEY (target_id) REFERENCES classes (id)
DEFERRABLE); │
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "udc_pkey"
for t│
ble "udc"

CREATE TABLE

udc=# INSERT INTO udc (id, eclipse_id, action_type_id, target_type_id,
source_i│
, version_id, target_id, tstamp) SELECT old_udc.id, old_udc.eclipse_id,
old_udc│
action_type, newttype.type_id, old_udc.source, newv.v_id, newt.t,
old_udc.tstam│
FROM old_udc NATURAL JOIN newttype NATURAL JOIN newv NATURAL JOIN newt;

ERROR: out of memory

DETAIL: Failed on request of size 1048576.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Euler Taveira de Oliveira 2010-12-01 14:19:24 Re: BUG #5778: pgAdminIII 1.12.1 Japanese Translation File
Previous Message carriager 2010-12-01 05:12:02 BUG #5778: pgAdminIII 1.12.1 Japanese Translation File