Re: Understanding GIN posting trees

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: Understanding GIN posting trees
Date: 2011-07-14 14:28:48
Message-ID: 4E1EFD20.4040006@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I have a couple of questions on GIN:
>
> The code seems to assume that it's possible for the same TID to appear
> twice for a single key (see addItemPointersToTuple()). I understand that
> it's possible for a single heap tuple to contain the same key twice. For
> example if you index an array of integers like [1,2,1]. But once you've
> inserted all the keys for a single heap item, you never try to insert
> the same TID again, so no duplicates should occur.
>
> Looking at the history, it looks like pre-8.4 we assumed that no such
> duplicates are possible. Duplicates of a single key for one column are
> eliminated in extractEntriesSU(), but apparently when the multi-column
> support was added, we didn't make the de-duplication to run across the
> keys extracted from all columns. Now that the posting tree/list
> insertion code has to deal with duplicates anyway, the de-duplication
> performed in extractEntriesSU() seems pointless. But I wonder if it
> would be better to make extractEntriesSU() remove duplicates across all
> columns, so that the insertion code wouldn't need to deal with duplicates.

During vacuuming of pending list we could get a powerloss and some data will be
in tree and pending list both, after restart pgsql will try to insert the same
data to tree.

> Dealing with the duplicates in the insertion code isn't particularly
> difficult. And in fact, now that we only support the getbitmap method,
> we wouldn't really need to eliminate duplicates anyway. But I have an
> ulterior motive:

> Why is the posting tree a tree? AFAICS, we never search it using the
> TID, it's always scanned in whole. It would be simpler to store the TIDs
> in a posting list in no particular order. This could potentially make
> insertions cheaper, as you could just append to the last posting list
> page for the key, instead of traversing the posting tree to a particular
> location. You could also pack the tids denser, as you wouldn't need to
> reserve free space for additions in the middle.
For consistentFn call we need to collect all data for current tid. We do that by
scanning over logical ordered arrays of tids and trees allows to do that by
scanning a leafs pages.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2011-07-14 14:59:14 Re: SAVEPOINTs and COMMIT performance
Previous Message Magnus Hagander 2011-07-14 14:23:47 Re: [BUG] SSPI authentication fails on Windows when server parameter is localhost or domain name