Re: Bulk Inserts

From: Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>
To: "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bulk Inserts
Date: 2009-09-15 07:23:05
Message-ID: op.u0aesrf8cke6l8@soyouz
Lists: pgsql-hackers


> Yes, I did not consider that to be a problem because I did not think it
> would be used on indexed tables. I figured that the gain from doing bulk
> inserts into the table would be so diluted by the still-bottle-necked
> index maintenance that it was OK not to use this optimization for
> indexed tables.

I've tested with indexes, and the index update time is much larger than
the insert time. Bulk inserts still provide a small gain though, and
having a solution that works in all cases is better IMHO.

> My original thought was based on the idea of still using heap_insert, but
> with a modified form of bistate which would hold the exclusive lock and
> not just a pin. If heap_insert is being driven by the unmodified COPY
> code, then it can't guarantee that COPY won't stall on a pipe read or
> something, and so probably shouldn't hold an exclusive lock while filling
> the block.

Exactly, that's what I was thinking too, and reached the same conclusion.

> That is why I decided a local buffer would be better, as the exclusive
> lock is really a no-op and wouldn't block anyone. But if you are creating
> a new heap_bulk_insert and modifying the COPY to go with it, then you can
> guarantee it won't stall from the driving end, instead.

I think it's better, but you have to buffer tuples: at least a full
page's worth, or better, several pages' worth, in case inline compression
kicks in and shrinks them, since the purpose is to be able to fill a
complete page in one go.
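
A minimal sketch of that buffering idea, written as standalone C rather
than actual PostgreSQL code. Everything here is hypothetical and for
illustration only: BulkBuffer, bulk_init, bulk_add, bulk_flush and the
BLOCK_SIZE / BUFFER_PAGES constants are made-up names, page header and
line pointer overhead is ignored, and "taking the lock" is just a counter.
It only shows the shape of the approach: accumulate several pages' worth
of (already compressed) tuple sizes, then fill each page completely under
a single lock acquisition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE   8192   /* assumed page size (PostgreSQL's default is 8 kB) */
#define BUFFER_PAGES 4      /* assumption: buffer several pages' worth of tuples */
#define BUFFER_BYTES (BUFFER_PAGES * BLOCK_SIZE)

typedef struct {
    uint16_t lens[BUFFER_BYTES]; /* size of each buffered tuple, after compression */
    size_t   ntuples;            /* number of buffered tuples */
    size_t   bytes;              /* total size of buffered tuples */
    size_t   pages_written;      /* pages filled so far */
    size_t   lock_acquisitions;  /* stand-in for exclusive buffer locks taken */
} BulkBuffer;

static void bulk_init(BulkBuffer *b)
{
    memset(b, 0, sizeof *b);
}

/*
 * Pack buffered tuples into pages. Each page is filled in one go under a
 * single (simulated) lock. Unless 'final' is set, a trailing group that does
 * not fill a page stays buffered, waiting for more tuples.
 */
static void bulk_flush(BulkBuffer *b, int final)
{
    size_t i = 0;        /* first tuple not yet written */
    size_t written = 0;  /* bytes written in this call */

    while (i < b->ntuples) {
        size_t start = i, used = 0;

        /* take as many tuples as fit on one page */
        while (i < b->ntuples && used + b->lens[i] <= BLOCK_SIZE)
            used += b->lens[i++];

        /* ran out of tuples before the page was full: keep them for later */
        if (i == b->ntuples && !final && used < BLOCK_SIZE) {
            i = start;
            break;
        }
        b->lock_acquisitions++;  /* one lock per page, not one per tuple */
        b->pages_written++;
        written += used;
    }
    /* move the unwritten tail to the front of the buffer */
    memmove(b->lens, b->lens + i, (b->ntuples - i) * sizeof b->lens[0]);
    b->ntuples -= i;
    b->bytes -= written;
}

/* Buffer one tuple of 'len' bytes, flushing full pages first if needed. */
static void bulk_add(BulkBuffer *b, size_t len)
{
    assert(len > 0 && len <= BLOCK_SIZE);
    if (b->bytes + len > BUFFER_BYTES)
        bulk_flush(b, 0);
    b->lens[b->ntuples++] = (uint16_t) len;
    b->bytes += len;
}
```

The caller would feed tuples with bulk_add() and call bulk_flush(&b, 1)
once at end of input. Since no lock is held while tuples are being
collected, a stall on the input side cannot block anyone else.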

> Whether any of these approaches will be maintainable enough to be
> integrated into the code base is another matter. It seems like there is
> already a lot of discussion going on around various permutations of copy
> options.

It's not really a COPY mod, since it would also be good for big INSERT
INTO SELECT FROM statements, which are WAL-bound too (even more so than
COPY, since there is no parsing to do).
