From: | Andreas Karlsson <andreas(at)proxel(dot)se> |
---|---|
To: | Antonin Houska <antonin(dot)houska(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Todo item: Support amgettuple() in GIN |
Date: | 2013-11-29 12:57:24 |
Message-ID: | 52988F34.2020407@proxel.se |
Lists: | pgsql-hackers |
On 11/29/2013 09:54 AM, Antonin Houska wrote:
> On 11/29/2013 01:13 AM, Andreas Karlsson wrote:
>
>> When doing partial matching the code needs to be able to return the union
>> of all TIDs in all the matching posting trees in TID order (to be able
>> to do AND and OR operations with multiple search keys later). It does
>> this by traversing them posting tree after posting tree and collecting
>> them all in a TIDBitmap which is later iterated over.
>
> I think it's not a plain union. My understanding is that - to evaluate a
> single key (typically array) - you first need to get all the TID streams
> for that key (i.e. one posting list/tree per element of the key array)
> and then iterate all these streams in parallel and 'merge' them using
> consistent() function. That's how I understand ginget.c:keyGetItem().
For partial matches the merging is done in two steps: first, a simple
union of all the TID streams belonging to each key, and second, a merge of
those per-key union streams using the consistent() function.
It is the first step that can be lossy.
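A rough sketch of those two steps, in Python rather than the C of ginget.c, purely to illustrate the shape of the merge. The names union_streams and merge_keys are hypothetical and do not exist in PostgreSQL, TIDs are modelled as (block, offset) tuples, and an exact set stands in for the TIDBitmap, so the lossy behaviour of the first step is not modelled here.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Assumption: a TID is modelled as (block number, offset); tuples sort in TID order.
TID = Tuple[int, int]


def union_streams(streams: Iterable[Iterable[TID]]) -> List[TID]:
    """Step 1 (hypothetical helper): plain union of all streams matched by one key."""
    seen: Set[TID] = set()
    for stream in streams:  # one posting list/tree after another
        seen.update(stream)
    return sorted(seen)  # hand the result back in TID order


def merge_keys(per_key: List[List[TID]],
               consistent: Callable[[List[bool]], bool]) -> List[TID]:
    """Step 2 (hypothetical helper): merge per-key unions via a consistent()-style callback."""
    key_sets = [set(tids) for tids in per_key]
    candidates = sorted(set().union(*key_sets))
    # For each candidate TID, tell the callback which keys matched it.
    return [tid for tid in candidates
            if consistent([tid in s for s in key_sets])]


# Two keys; the first one partially matches two entries, the second matches one.
key_a = union_streams([[(1, 1), (1, 3)], [(1, 3), (2, 1)]])
key_b = union_streams([[(1, 3), (3, 5)]])
and_result = merge_keys([key_a, key_b], all)  # AND of the two keys
or_result = merge_keys([key_a, key_b], any)   # OR of the two keys
```

Here all and any play the role of the consistent() function for an AND and an OR query respectively; a real opclass supplies its own logic.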
> So the problem of partial match is (IMO) that there can be too many TID
> streams to merge - much more than the number of elements of the key array.
Agreed.
--
Andreas Karlsson