Re: Inserting heap tuples in bulk in COPY

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inserting heap tuples in bulk in COPY
Date: 2011-10-06 14:24:07
Message-ID: 4E8DBA07.3010801@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06.10.2011 15:11, Robert Haas wrote:
> On Thu, Oct 6, 2011 at 7:33 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> A regular heap_insert record leaves out a lot of information that can be
>> deduced at replay time. It can leave out all the headers, including just the
>> null bitmap + data. In addition to that, there's just the location of the
>> tuple (RelFileNode+ItemPointer). At replay, xmin is taken from the WAL
>> record header.
>>
>> For a multi-insert record, you don't even need to store the RelFileNode and
>> the block number for every tuple, just the offsets.
>>
>> In comparison, a full-page image will include the full tuple header, and
>> also the line pointers. If I'm doing my math right, a full-page image takes
>> 25 bytes more data per tuple, than the special-purpose multi-insert record.
>
> Interesting. It's always seemed to me fairly inefficient in general
> to store the whole RelFileNode. For many people, the database and
> tablespace OID will be constants, and even if they aren't, there
> certainly aren't going to be 96 bits of entropy in the relfilenode. I
> thought about whether we could create some sort of mapping layer,
> where say once per checkpoint we'd allocate a 4-byte integer to denote
> a relfilenode, and WAL-log that mapping. Then after that everyone
> could just refer to the 4-byte integer instead of the whole
> relfilenode. But it seems like a lot of work for 8 bytes per record.
> Then again, if you're getting that much benefit from shaving off 25
> bytes per tuple, maybe it is, although I feel like FPW is the elephant
> in the room.

A very simple optimization would be to leave out tablespace OID
altogether if it's DEFAULTTABLESPACE_OID, and just set a flag somewhere.
Then again, we could also just compress the WAL wholesale.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2011-10-06 15:55:37 Re: checkpoints are duplicated even while the system is idle
Previous Message Bruce Momjian 2011-10-06 14:11:55 Re: pg_upgrade - add config directory setting