Re: Todo item: Support amgettuple() in GIN

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Antonin Houska <antonin(dot)houska(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Todo item: Support amgettuple() in GIN
Date: 2013-11-29 12:57:24
Message-ID: 52988F34.2020407@proxel.se
Lists: pgsql-hackers

On 11/29/2013 09:54 AM, Antonin Houska wrote:
> On 11/29/2013 01:13 AM, Andreas Karlsson wrote:
>
>> When doing partial matching the code needs to be able to return the union
>> of all TIDs in all the matching posting trees in TID order (to be able
>> to do AND and OR operations with multiple search keys later). It does
>> this by traversing them posting tree after posting tree and collecting
>> them all in a TIDBitmap which is later iterated over.
>
> I think it's not a plain union. My understanding is that - to evaluate a
> single key (typically array) - you first need to get all the TID streams
> for that key (i.e. one posting list/tree per element of the key array)
> and then iterate all these streams in parallel and 'merge' them using
> consistent() function. That's how I understand ginget.c:keyGetItem().

For partial matches the merging is done in two steps: first, a simple
union of all the TID streams for each key, and second, a merge of those
per-key union streams using the consistent() function.

It is the first step that can be lossy.
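
To make the two steps concrete, here is a rough sketch in Python rather
than the actual ginget.c code. The names union_streams() and merge_keys()
are made up for illustration, TIDs are modelled as (block, offset)
tuples, and the consistent callback is simplified to take one boolean
per key. Step 1 here uses an exact heap merge; the real code collects
the TIDs in a TIDBitmap instead, and that is where the lossiness comes
from, which this sketch does not model.

```python
import heapq
from typing import Callable, Iterable, Iterator, List, Optional, Sequence, Tuple

TID = Tuple[int, int]  # (block number, offset); tuples compare in TID order


def union_streams(streams: Sequence[Iterable[TID]]) -> Iterator[TID]:
    """Step 1: union the sorted TID streams of one key, dropping duplicates."""
    last: Optional[TID] = None
    for tid in heapq.merge(*streams):
        if tid != last:
            yield tid
            last = tid


def merge_keys(
    key_streams: Sequence[Iterable[TID]],
    consistent: Callable[[List[bool]], bool],
) -> Iterator[TID]:
    """Step 2: walk the per-key union streams in parallel, in TID order.

    For each TID, build a vector saying which keys matched it and let the
    (hypothetical, simplified) consistent callback decide whether to emit it.
    """
    iters = [iter(s) for s in key_streams]
    heads: List[Optional[TID]] = [next(it, None) for it in iters]
    while any(h is not None for h in heads):
        cur = min(h for h in heads if h is not None)
        check = [h == cur for h in heads]
        if consistent(check):
            yield cur
        # Advance every stream that was positioned on the current TID.
        for i, matched in enumerate(check):
            if matched:
                heads[i] = next(iters[i], None)


# Key A is a partial match that hit two posting trees; key B hit one.
key_a = list(union_streams([[(1, 1), (2, 3)], [(1, 2), (2, 3)]]))
key_b = list(union_streams([[(1, 2), (3, 1)]]))

print(list(merge_keys([key_a, key_b], all)))  # AND of the keys: [(1, 2)]
print(list(merge_keys([key_a, key_b], any)))  # OR of the keys
```

With many matching posting trees, step 1 is the part whose input grows,
which is why its result may have to be stored lossily.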

> So the problem of partial match is (IMO) that there can be too many TID
> streams to merge - much more than the number of elements of the key array.

Agreed.

--
Andreas Karlsson
