Re: GIN improvements part 1: additional information

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 1: additional information
Date: 2013-12-18 14:50:39
Message-ID: 52B1B63F.6030104@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/18/2013 01:45 PM, Alexander Korotkov wrote:
> On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com
>> wrote:
>
>> On 12/17/2013 12:22 AM, Alexander Korotkov wrote:
>> 2) Storage would be easily extendable to hold additional information as
>>> well.
>>> Better compression shouldn't block more serious improvements.
>>>
>>
>> I'm not sure I agree with that. For all the cases where you don't care
>> about additional information - which covers all existing users for example
>> - reducing disk size is pretty important. How are you planning to store the
>> additional information, and how does using another encoding gets in the way
>> of that?
>
> I was planned to store additional information datums between
> varbyte-encoded tids. I was expected it would be hard to do with PFOR.
> However, I don't see significant problems in your implementation of Simple9
> encoding. I'm going to dig deeper in your version of patch.

Ok, thanks.

I had another idea about the page format this morning. Instead of having
the item-indexes at the end of the page, it would be more flexible to
store a bunch of self-contained posting list "segments" on the page. So
I propose that we get rid of the item-indexes, and instead store a bunch
of independent posting lists on the page:

typedef struct
{
ItemPointerData first; /* first item in this segment (unpacked) */
uint16 nwords; /* number of words that follow */
uint64 words[1]; /* var length */
} PostingListSegment;

Each segment can be encoded and decoded independently. When searching
for a particular item (like on insertion), you skip over segments where
'first' > the item you're searching for.

This format offers a lot more flexibility compared to the separate item
indexes. First, we don't need to have another fixed sized area on the
page, which simplifies the page format. Second, we can more easily
re-encode only one segment on the page, on insertion or vacuum. The
latter is particularly important with the Simple-9 encoding, which
operates one word at a time rather than one item at a time; removing or
inserting an item in the middle can require a complete re-encoding of
everything that follows. Third, when a page is being inserted into and
contains only uncompressed items, you don't waste any space for unused
item indexes.

While we're at it, I think we should use the above struct in the inline
posting lists stored directly in entry tuples. That wastes a few bytes
compared to the current approach in the patch (more alignment, and
'words' is redundant with the number of items stored on the tuple
header), but it simplifies the functions handling these lists.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2013-12-18 14:55:41 Re: ALTER SYSTEM SET command to change postgresql.conf parameters
Previous Message Stephen Frost 2013-12-18 14:45:24 Re: SQL objects UNITs (was: Extension Templates S03E11)