Re: Inserting heap tuples in bulk in COPY

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inserting heap tuples in bulk in COPY
Date: 2011-09-25 08:43:13
Message-ID: CADyhKSVjY5_WJfH18uK-5-WD4VPbnpVhdP+CZo-5w0hZY02qUA@mail.gmail.com
Lists: pgsql-hackers

Hi Heikki,

I checked your patch, and I have one comment and two questions.

heap_prepare_insert() seems to duplicate code from the earlier half of
the existing heap_insert(). I think it would be better to consolidate
these portions of the code.
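
For illustration, here is a rough sketch of the kind of consolidation I
have in mind, based on my reading of the current heap_insert(); the
function name heap_prepare_insert() is yours, but the exact arguments
and body below are my guess, not necessarily what the patch does:

    /*
     * Sketch: pull the tuple-header setup and TOASTing out of
     * heap_insert(), so the single-tuple and multi-tuple paths can
     * share it instead of carrying two copies of the same logic.
     */
    static HeapTuple
    heap_prepare_insert(Relation relation, HeapTuple tup,
                        TransactionId xid, CommandId cid, int options)
    {
        tup->t_data->t_infomask &= ~(HEAP_XACT_MASK);
        tup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK);
        tup->t_data->t_infomask |= HEAP_XMAX_INVALID;
        HeapTupleHeaderSetXmin(tup->t_data, xid);
        HeapTupleHeaderSetCmin(tup->t_data, cid);
        HeapTupleHeaderSetXmax(tup->t_data, 0);
        tup->t_tableOid = RelationGetRelid(relation);

        /* invoke the toaster if needed, as heap_insert() does today */
        if (relation->rd_rel->relkind != RELKIND_RELATION)
        {
            /* toast table entries should never be recursively toasted */
            Assert(!HeapTupleHasExternal(tup));
            return tup;
        }
        else if (HeapTupleHasExternal(tup) ||
                 tup->t_len > TOAST_TUPLE_THRESHOLD)
            return toast_insert_or_update(relation, tup, NULL, options);
        else
            return tup;
    }

heap_insert() would then just call this near its top instead of keeping
a second copy of the same steps.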

I'm not clear on why the arguments to CheckForSerializableConflictIn()
were changed from the ones used in heap_insert(). Its source code
comment says:
:
* For a heap insert, we only need to check for table-level SSI locks.
* Our new tuple can't possibly conflict with existing tuple locks, and
* heap page locks are only consolidated versions of tuple locks; they do
* not lock "gaps" as index page locks do. So we don't need to identify
* a buffer before making the call.
*/
Is it possible for newly inserted tuples to conflict with existing
tuple locks once we extend the insert path to handle multiple tuples at
once? It is a bit confusing to me.
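
For reference, my understanding is that the existing heap_insert()
calls it without identifying a buffer, exactly as that comment
describes (this is from my reading of heapam.c, so please correct me if
the multi-tuple case needs something different):

    /* table-level SSI check only; no tuple, no buffer */
    CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);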

This patch disallows multi-insertion not only when a before-row trigger
is configured, but also when volatile functions are used to compute a
default value.
Is that a reasonable condition for avoiding multi-insertion?
All the datums to be delivered to heap_form_tuple() are computed in
NextCopyFrom(), and the default values are also constructed there,
sequentially for each row.
So it seems to me there is no risk that volatile functions fail to be
invoked once per tuple. Or am I missing something?
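
If I read the patch correctly, the eligibility check is roughly the
following (the variable names below are mine, not necessarily the
patch's):

    /*
     * Sketch of the condition as I understand it: fall back to plain
     * heap_insert() when a BEFORE ROW trigger exists, or when any
     * column default expression contains a volatile function.
     */
    if ((resultRelInfo->ri_TrigDesc != NULL &&
         resultRelInfo->ri_TrigDesc->trig_insert_before_row) ||
        volatile_defexprs)
        use_multi_insert = false;
    else
        use_multi_insert = true;

My question is about the second half of that condition: since
NextCopyFrom() evaluates the default expressions row by row anyway,
buffering the resulting tuples before insertion should not change how
often a volatile default is evaluated.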

Thanks,

2011/9/14 Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>:
> On 13.08.2011 17:33, Tom Lane wrote:
>>
>> Heikki Linnakangas<heikki(dot)linnakangas(at)enterprisedb(dot)com>  writes:
>>>
>>> The patch is WIP, mainly because I didn't write the WAL replay routines
>>> yet, but please let me know if you see any issues.
>>
>> Why do you need new WAL replay routines?  Can't you just use the existing
>> XLOG_HEAP_NEWPAGE support?
>>
>> By and large, I think we should be avoiding special-purpose WAL entries
>> as much as possible.
>
> I tried that, but most of the reduction in WAL-size melts away with that.
> And if the page you're copying to is not empty, logging the whole page is
> even more expensive. You'd need to fall back to retail inserts in that case
> which complicates the logic.
>
>> Also, I strongly object to turning regular heap_insert into a wrapper
>> around some other more complicated operation.  That will likely have
>> bad consequences for the performance of every other operation.
>
> Ok. I doubt it makes any noticeable difference for performance, but having
> done that, it's more readable, too, to duplicate the code.
>
> Attached is a new version of the patch. It is now complete, including WAL
> replay code.
>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>
>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
