Re: problem with large inserts

From: Lutz Fischer <lfischer(at)staffmail(dot)ed(dot)ac(dot)uk>
To: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: problem with large inserts
Date: 2012-12-13 16:33:28
Message-ID: 50CA0358.9040401@staffmail.ed.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks a lot you saved my day

create temp table foo AS SELECT DISTINCT ...
did take a mere 77464.744 ms
And an additional
Insert into LPP select * from foo;
Just 576.909 ms

I don't really understand why it's working via a temp table but not
directly (or in any reasonable amount of time) - but at least I have a
solution I can work with.

On 13/12/12 16:09, Filip Rembiałkowski wrote:
> Just an idea - how long does it take to run _only_
> CREATE TEMP TABLE foo AS <your SELECT here>
>
>
>
>
> On Thu, Dec 13, 2012 at 4:37 PM, Lutz Fischer
> <lfischer(at)staffmail(dot)ed(dot)ac(dot)uk> wrote:
>> Hi
>>
>> I have currently some trouble with inserts into a table
>>
>> INSERT INTO LPP (PPID, LID)
>> SELECT DISTINCT PPid, LID FROM
>> (SELECT * FROM PP WHERE s_id = sid) pp
>> INNER JOIN
>> has_protein hp1
>> ON pp.p1id = hp1.pid
>> INNER JOIN
>> has_protein hp2
>> ON pp.p2_id = hp2.pid
>> INNER JOIN
>> (SELECT * FROM L WHERE s_id = sid) l
>> ON (hp1.pid = l.p1id AND hp2.pid = l.p2id AND hp1.ppos +
>> pp.s1 = l.s1 AND hp2.ppos + pp.s2 = l.s2)
>> OR (hp1.pid = l.p2id AND hp2.pid = l.p1id AND hp1.ppos +
>> pp.s1 = l.s2 AND hp2.ppos + pp.s2 = l.s1)
>> ;
>>
>> If I run only
>>
>> SELECT DISTINCT PPid, LID FROM
>> (SELECT * FROM PP WHERE s_id = 708) pp
>> INNER JOIN
>> has_protein hp1
>> ON pp.p1id = hp1.pid
>> INNER JOIN
>> has_protein hp2
>> ON pp.p2_id = hp2.pid
>> INNER JOIN
>> (SELECT * FROM L WHERE s_id = 708) l
>> ON (hp1.pid = l.p1id AND hp2.pid = l.p2id AND hp1.ppos +
>> pp.s1 = l.s1 AND hp2.ppos + pp.s2 = l.s2)
>> OR (hp1.pid = l.p2id AND hp2.pid = l.p1id AND hp1.ppos +
>> pp.s1 = l.s2 AND hp2.ppos + pp.s2 = l.s1)
>> ;
>>
>> it returns 200620 rows in 170649 ms ( thats just under 3 minutes). I
>> stopped the actual insert after about 8h.
>>
>> The table that the insert happens to, is following:
>> CREATE TABLE LPP
>> (
>> ppid bigint NOT NULL,
>> lid bigint NOT NULL,
>> CONSTRAINT pk_lpp PRIMARY KEY (ppid,lid)
>> )
>>
>> I also tried without the primary key but that one is still running for
>> more that a day.
>>
>> Currently the table LPP holds 471139 rows. Its linking the PP and the L
>> table.
>>
>> There are no foreign keys referring to that table nor are there any
>> other constraints on it.
>> Previously I had foreign keys on lid and ppid refering to the L and PP
>> table. But in a desperate try to get some speed up I deleted these. -
>> But still...
>>
>> I am running postgresql 9.2 on a windows 2008 R2 server with 256 GB and
>> the database is on something like a raid 1+0 (actually a raid1e)
>> consisting of 3x4TB disks (limit of what could easily be fitted into the
>> server).
>>
>> At the given time there were no concurrent access to any of the
>> involved tables.
>>
>> Has anybody some idea why the insert takes so long and/or how to speed
>> things up a bit? I could live with something like half an hour - better
>> would be in minutes.
>>
>>
>> Thanks for any responds,
>>
>> Lutz Fischer
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance

--
Lutz Fischer
lfischer(at)staffmail(dot)ed(dot)ac(dot)uk
+44 131 6517057

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Filip Rembiałkowski 2012-12-13 16:44:01 Re: problem with large inserts
Previous Message Tom Lane 2012-12-13 16:20:48 Re: Limit & offset effect on query plans