Re: GIN improvements part 1: additional information

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 1: additional information
Date: 2013-12-16 11:30:13
Message-ID: 52AEE445.1060206@vmware.com
Lists: pgsql-hackers

On 12/12/2013 06:44 PM, Alexander Korotkov wrote:
> I've thought about the different algorithms a little more. The general
> problem I see is online updates. We need them, yet they are typically not
> covered by the research at all. We already have to invent a small index at
> the end of the page. Different encoding methods add more challenges. In
> general, the methods can be classified into two groups:
> 1) Values aren't aligned to byte boundaries (gamma-codes, PFOR etc.)
> 2) Multiple values are packed together in small groups (simple-9, simple-18)

Ok.

> For the first group of methods, inserting in the middle of the page would
> require a non-byte-aligned shift of all the values to the right of the
> insertion point. I don't know how expensive this shift is, but I expect
> that it would be much slower than memmove.

Agreed.
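
To make the difference concrete, here is a rough sketch in Python (not
PostgreSQL code; both function names are made up for illustration). With
byte-aligned encodings an insert is a plain block move of the tail. With a
bit-packed stream, every byte to the right of the insertion point has to be
shifted and recombined with bits carried over from its neighbour. Python's
big integers hide that per-byte work here; in C it would be an explicit
shift-and-OR loop over the tail.

```python
def insert_byte_aligned(buf, byte_pos, encoded):
    """Insert already-encoded bytes at a byte offset.

    The tail moves as one block, which is what memmove does in C.
    """
    buf[byte_pos:byte_pos] = encoded


def insert_bit_packed(buf, nbits, bit_pos, value, width):
    """Insert `width` bits of `value` at bit offset `bit_pos`.

    `buf` holds `nbits` valid bits, most significant bit first, padded
    with zero bits up to a whole number of bytes. Returns the new buffer
    and the new number of valid bits.
    """
    # Drop the padding so that `bits` holds exactly the valid bits.
    bits = int.from_bytes(buf, "big") >> (8 * len(buf) - nbits)
    tail_len = nbits - bit_pos
    head = bits >> tail_len
    tail = bits & ((1 << tail_len) - 1)
    value &= (1 << width) - 1
    # Every tail bit ends up at a new position that is generally not a
    # multiple of 8 away from the old one: this is the unaligned shift.
    merged = (((head << width) | value) << tail_len) | tail
    new_nbits = nbits + width
    pad = (-new_nbits) % 8
    return (merged << pad).to_bytes((new_nbits + pad) // 8, "big"), new_nbits
```

For example, inserting the two bits 01 after the first two bits of the
four-bit stream 1011 gives 100111, and every bit of the old tail has moved
by two positions.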

> When values are packed into small groups, we have to either insert an
> inefficiently encoded value or re-encode the whole right part of the values.

It would probably be simplest to store newly inserted items
uncompressed, in a separate area in the page. For example, grow the list
of uncompressed items downwards from pd_upper, and the compressed items
upwards from pd_lower. When the page fills up, re-encode the whole page.
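
A minimal sketch of that layout, in Python rather than C and under several
assumptions that are not part of the proposal itself: a tiny 64-byte page,
8-byte uncompressed items, and varbyte-encoded deltas as the compressed
format. The `lower` and `upper` offsets play the role of the page's lower
and upper free-space pointers; the class and helper names are made up.

```python
import struct

PAGE_SIZE = 64              # tiny page so the example fills quickly (assumption)
ITEM = struct.Struct("<Q")  # uncompressed item: fixed 8 bytes (assumption)


def varbyte_encode(sorted_items):
    """Encode a sorted list of ints as varbyte-coded deltas."""
    out = bytearray()
    prev = 0
    for item in sorted_items:
        delta = item - prev
        prev = item
        while delta >= 0x80:
            out.append((delta & 0x7F) | 0x80)
            delta >>= 7
        out.append(delta)
    return bytes(out)


def varbyte_decode(data):
    """Inverse of varbyte_encode."""
    items, cur, shift, prev = [], 0, 0, 0
    for b in data:
        cur |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            prev += cur
            items.append(prev)
            cur, shift = 0, 0
    return items


class Page:
    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.lower = 0          # end of compressed area, grows upwards
        self.upper = PAGE_SIZE  # start of uncompressed area, grows downwards

    def uncompressed(self):
        return [ITEM.unpack_from(self.data, off)[0]
                for off in range(self.upper, PAGE_SIZE, ITEM.size)]

    def items(self):
        compressed = varbyte_decode(bytes(self.data[:self.lower]))
        return sorted(compressed + self.uncompressed())

    def reencode(self):
        """Fold the uncompressed items into the compressed area."""
        encoded = varbyte_encode(self.items())
        if len(encoded) > PAGE_SIZE:
            raise OverflowError("items do not fit; a real index would split")
        self.data[:len(encoded)] = encoded
        self.lower = len(encoded)
        self.upper = PAGE_SIZE

    def insert(self, item):
        # Cheap path: append uncompressed. Re-encode only when out of room.
        if self.upper - self.lower < ITEM.size:
            self.reencode()
        if self.upper - self.lower < ITEM.size:
            raise OverflowError("page full; a real index would split")
        self.upper -= ITEM.size
        ITEM.pack_into(self.data, self.upper, item)
```

An insert never touches the compressed bytes; the cost of re-encoding is
paid once per page fill instead of once per insert. The price is that a
reader has to merge the two areas, as `items()` does here.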

- Heikki
