Re: HEAD seems to generate larger WAL regarding GIN index

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HEAD seems to generate larger WAL regarding GIN index
Date: 2014-03-17 13:20:00
Message-ID: CAHGQGwGU4_SVFt2EztmwDAkTQKRH3zStF-fbYRKny0Qtha9OVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 16, 2014 at 7:15 AM, Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
> On Sat, Mar 15, 2014 at 11:27 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>>
>> On 03/15/2014 08:40 PM, Fujii Masao wrote:
>>>
>>> Hi,
>>>
>>> I executed the following statements in HEAD and 9.3, and compared
>>> the size of WAL which were generated by data insertion in GIN index.
>>>
>>> ---------------------
>>> CREATE EXTENSION pg_trgm;
>>> CREATE TABLE hoge (col1 text);
>>> CREATE INDEX hogeidx ON hoge USING gin (col1 gin_trgm_ops) WITH
>>> (FASTUPDATE = off);
>>>
>>> CHECKPOINT;
>>> SELECT pg_switch_xlog();
>>> SELECT pg_switch_xlog();
>>>
>>> SELECT pg_current_xlog_location();
>>> INSERT INTO hoge SELECT 'POSTGRESQL' FROM generate_series(1, 1000000);
>>> SELECT pg_current_xlog_location();
>>> ---------------------
>>>
>>> The results of WAL size are
>>>
>>> 960 MB (9.3)
>>> 2113 MB (HEAD)
>>>
>>> The WAL size in HEAD was more than two times bigger than that in 9.3.
>>> Recently the source code of GIN index has been changed dramatically.
>>> Is the increase in GIN-related WAL intentional or a bug?
>>
>>
>> It was somewhat expected. Updating individual items on the new-format GIN
>> pages requires decompressing and recompressing the page, and the
>> recompressed posting lists need to be WAL-logged. Which generates much
>> larger WAL records.
>>
>> That said, I didn't expect the difference to be quite that big when you're
>> appending to the end of the table. When the new entries go to the end of the
>> posting lists, you only need to recompress and WAL-log the last posting
>> list, which is max 256 bytes long. But I guess that's still a lot more WAL
>> than in the old format.

I ran "pg_xlogdump | grep Gin" and checked the size of GIN-related WAL,
and then found its max seems more than 256B. Am I missing something?

What I observed is

[In HEAD]
At first, the size of GIN-related WAL is gradually increasing up to about 1400B.

rmgr: Gin len (rec/tot): 48/ 80, tx: 1813,
lsn: 0/020020D8, prev 0/02000070, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 1 isdata: F isleaf: T isdelete: F
rmgr: Gin len (rec/tot): 56/ 88, tx: 1813,
lsn: 0/02002440, prev 0/020023F8, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 1 isdata: F isleaf: T isdelete: T
rmgr: Gin len (rec/tot): 64/ 96, tx: 1813,
lsn: 0/020044D8, prev 0/02004490, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 1 isdata: F isleaf: T isdelete: T
...
rmgr: Gin len (rec/tot): 1376/ 1408, tx: 1813,
lsn: 0/02A7EE90, prev 0/02A7E910, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 2 isdata: F isleaf: T isdelete: T
rmgr: Gin len (rec/tot): 1392/ 1424, tx: 1813,
lsn: 0/02A7F458, prev 0/02A7F410, bkp: 0000, desc: Create posting
tree, node: 1663/12945/16441 blkno: 4

Then the size decreases to about 100B and is gradually increasing
again up to 320B.

rmgr: Gin len (rec/tot): 116/ 148, tx: 1813,
lsn: 0/02A7F9E8, prev 0/02A7F458, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 4 isdata: T isleaf: T unmodified: 1280 length:
1372 (compressed)
rmgr: Gin len (rec/tot): 40/ 72, tx: 1813,
lsn: 0/02A7FA80, prev 0/02A7F9E8, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 3 isdata: F isleaf: T isdelete: T
...
rmgr: Gin len (rec/tot): 118/ 150, tx: 1813,
lsn: 0/02A83BA0, prev 0/02A83B58, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 4 isdata: T isleaf: T unmodified: 1280 length:
1374 (compressed)
...
rmgr: Gin len (rec/tot): 288/ 320, tx: 1813,
lsn: 0/02AEDE28, prev 0/02AEDCE8, bkp: 0000, desc: Insert item, node:
1663/12945/16441 blkno: 14 isdata: T isleaf: T unmodified: 1280
length: 1544 (compressed)

Then the size decreases to 66B and is gradually increasing again up to 320B.
This increase and decrease of WAL size seems to continue.

[In 9.3]
At first, the size of GIN-related WAL is gradually increasing up to about 2700B.

rmgr: Gin len (rec/tot): 52/ 84, tx: 1812,
lsn: 0/02000430, prev 0/020003D8, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 1 offset: 11 nitem: 1 isdata: F isleaf T
isdelete F updateBlkno:4294967295
rmgr: Gin len (rec/tot): 60/ 92, tx: 1812,
lsn: 0/020004D0, prev 0/02000488, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 1 offset: 1 nitem: 1 isdata: F isleaf T
isdelete T updateBlkno:4294967295
...
rmgr: Gin len (rec/tot): 2740/ 2772, tx: 1812,
lsn: 0/026D1670, prev 0/026D0B98, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 5 offset: 2 nitem: 1 isdata: F isleaf T
isdelete T updateBlkno:4294967295
rmgr: Gin len (rec/tot): 2714/ 2746, tx: 1812,
lsn: 0/026D21A8, prev 0/026D2160, bkp: 0000, desc: Create posting
tree, node: 1663/12896/16441 blkno: 6

The size decreases to 66B and then is never changed.

rmgr: Gin len (rec/tot): 34/ 66, tx: 1812,
lsn: 0/026D9F00, prev 0/026D9EB8, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 6 offset: 451 nitem: 1 isdata: T isleaf T
isdelete F updateBlkno:4294967295
rmgr: Gin len (rec/tot): 34/ 66, tx: 1812,
lsn: 0/026D9F48, prev 0/026D9F00, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 7 offset: 451 nitem: 1 isdata: T isleaf T
isdelete F updateBlkno:4294967295
rmgr: Gin len (rec/tot): 34/ 66, tx: 1812,
lsn: 0/026D9F90, prev 0/026D9F48, bkp: 0000, desc: Insert item, node:
1663/12896/16441 blkno: 8 offset: 451 nitem: 1 isdata: T isleaf T
isdelete F updateBlkno:4294967295
...

This difference in GIN-related WAL size seems to cause HEAD to generate more
than two times bigger WAL. Unfortunately the gap of WAL size would be
continuously increasing :(

>>
>> That could be optimized, but I figured we can live with it, thanks to the
>> fastupdate feature. Fastupdate allows amortizing that cost over several
>> insertions. But of course, you explicitly disabled that...
>
>
> Let me know if you want me to write patch addressing this issue.

Yeah, I really want you to address this problem! That's definitely useful
for every users disabling FASTUPDATE option for some reasons.

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-03-17 13:20:03 Re: Changeset Extraction v7.9.1
Previous Message Andres Freund 2014-03-17 13:16:38 Re: Changeset Extraction v7.9.1