Re: GIN improvements part2: fast scan

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part2: fast scan
Date: 2013-06-18 20:59:54
Message-ID: CAPpHfdsAz71ZK1mYvsOT6juCjo0vFp4r4b7nErzbEvSTo2eMoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 17, 2013 at 5:09 PM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com
> wrote:

> On 17.06.2013 15:55, Alexander Korotkov wrote:
>
>> On Sat, Jun 15, 2013 at 2:55 AM, Alexander Korotkov<aekorotkov(at)gmail(dot)com>
>> **wrote:
>>
>> attached patch implementing "fast scan" technique for GIN. This is second
>>> patch of GIN improvements, see the 1st one here:
>>>
>>> http://www.postgresql.org/**message-id/CAPpHfduxv-**
>>> iL7aedwPW0W5fXrWGAKfxijWM63_**hZujaCRxnmFQ(at)mail(dot)gmail(dot)com<http://www.postgresql.org/message-id/CAPpHfduxv-iL7aedwPW0W5fXrWGAKfxijWM63_hZujaCRxnmFQ@mail.gmail.com>
>>> This patch allow to skip parts of posting trees when their scan is not
>>> necessary. In particular, it solves "frequent_term& rare_term" problem
>>> of
>>>
>>> FTS.
>>> It introduces new interface method pre_consistent which behaves like
>>> consistent, but:
>>> 1) allows false positives on input (check[])
>>> 2) allowed to return false positives
>>>
>>> Some example: "frequent_term& rare_term" becomes pretty fast.
>>>
>>>
>>> create table test as (select to_tsvector('english', 'bbb') as v from
>>> generate_series(1,1000000));
>>> insert into test (select to_tsvector('english', 'ddd') from
>>> generate_series(1,10));
>>> create index test_idx on test using gin (v);
>>>
>>> postgres=# explain analyze select * from test where v @@
>>> to_tsquery('english', 'bbb& ddd');
>>>
>>> QUERY PLAN
>>>
>>> ------------------------------**------------------------------**
>>> ------------------------------**-----------------------------
>>> Bitmap Heap Scan on test (cost=942.75..7280.63 rows=5000 width=17)
>>> (actual time=0.458..0.461 rows=10 loops=1)
>>> Recheck Cond: (v @@ '''bbb''& ''ddd'''::tsquery)
>>> -> Bitmap Index Scan on test_idx (cost=0.00..941.50 rows=5000
>>> width=0) (actual time=0.449..0.449 rows=10 loops=1)
>>> Index Cond: (v @@ '''bbb''& ''ddd'''::tsquery)
>>> Total runtime: 0.516 ms
>>> (5 rows)
>>>
>>>
>> Attached version of patch has some refactoring and bug fixes.
>>
>
> Good timing, I just started looking at this.
>
> I think you'll need to explain how this works. There are no docs, and
> almost no comments.
>

Sorry for that. I'll post patch with docs and comments in couple days.

(and this shows how poorly I understand this, but) Why does this require
> the "additional information" patch?

In principle, it doesn't require "additional information" patch. Same
optimization could be done without "additional information". The reason why
it requires "additional information" patch is that otherwise I have to
implement and maintain 2 versions of "fast scan": with and without
"additional information".

> What extra information do you store on-disk, in the additional information?
>

It depends on opclass. In regex patch it is part of graph associated with
trigram. In fulltext search it is word positions. In array similarity
search it could be length of array for better estimation of similarity (not
implemented yet). So it's anything which is stored with each ItemPointer
and is useful for consistent or index driven sorting (see patch #3).

> The pre-consistent method is like the consistent method, but it allows
> false positives. I think that's because during the scan, before having
> scanned for all the keys, the gin AM doesn't yet know if the tuple contains
> all of the keys. So it passes the keys it doesn't yet know about as 'true'
> to pre-consistent.

Could that be generalized, to pass a tri-state instead of a boolean for
> each key to the pre-consistent method? For each key, you would pass "true",
> "false", or "don't know". I think you could then also speed up queries like
> "!english & bbb".

I would like to illustrate that on example. Imagine you have fulltext query
"rare_term & frequent_term". Frequent term has large posting tree while
rare term has only small posting list containing iptr1, iptr2 and iptr3. At
first we get iptr1 from posting list of rare term, then we would like to
check whether we have to scan part of frequent term posting tree where iptr
< iptr1. So we call pre_consistent([false, true]), because we know that
rare term is not present for iptr < iptr2. pre_consistent returns false. So
we can start scanning frequent term posting tree from iptr1. Similarly we
can skip lags between iptr1 and iptr2, iptr2 and iptr3, from iptr3 to
maximum possible pointer.

And yes, it could be generalized to tri-state instead of a boolean. I had
that idea first. But I found superiority of exact "true" quite narrow.
Let's see it on the example. If you have "!term1 & term2" there are few
cases:
1) term1 is rare, term2 is frequent => you can exclude only some entries
from posting tree of term2, performance benefit will be negligible
2) term1 is frequent, term2 is rare => you should actually scan term2
posting list and then check term1 posting tree like in example about
"rare_term & frequent_term", so there is no need of tri-state to handle
this situation
3) term1 is frequent, term2 is frequent => this case probably could be
optimized by tri-state. But in order to actully skip some parts of term2
posting tree you need item pointers in term1 to go sequential. It seems for
me to be very narrow case.
4) term1 is rare, term2 is rare => don't need optimization

BTW, version of patch with some bug fixes is attached.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
gin_fast_scan.3.patch.gz application/x-gzip 6.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2013-06-18 21:21:39 Re: GIN improvements part 3: ordering in index
Previous Message Alexander Korotkov 2013-06-18 20:44:39 Re: GIN improvements part 1: additional information