Re: GIN improvements part 3: ordering in index

Lists: pgsql-hackers
From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: GIN improvements part 3: ordering in index
Date: 2013-06-14 23:02:27
Message-ID: CAPpHfduWvqv5b0XZ1DZuqAW29erDCULZp2wotfJzDBs7BHpKXw@mail.gmail.com

Hackers,

Attached is a patch implementing ordering inside a GIN index. This is the
third patch of the GIN improvements series; see the previous two:
http://www.postgresql.org/message-id/CAPpHfduxv-iL7aedwPW0W5fXrWGAKfxijWM63_hZujaCRxnmFQ@mail.gmail.com
http://www.postgresql.org/message-id/CAPpHfdvftaJq7www381naLw1=4u0h+qpXgWvNhcEB9HMVywbGg@mail.gmail.com

This patch introduces a new GIN interface method which takes the same
arguments as consistent but returns float8:

float8 gin_ordering(bool check[], StrategyNumber n, Datum query, int32 nkeys,
                    Pointer extra_data[], bool *recheck, Datum queryKeys[],
                    bool nullFlags[], Datum addInfo[], bool addInfoIsNull[])

The patch implements the gingettuple method, which can return ordering data
using the KNN infrastructure. It also introduces a >< operator for FTS which
supports ordering in a GIN index. An example:

postgres=# explain analyze select * from dblp_titles2 where tsvector @@
to_tsquery('english', 'statistics') order by tsvector ><
to_tsquery('english', 'statistics') limit 10;
                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=12.00..48.22 rows=10 width=136) (actual time=6.999..7.120 rows=10 loops=1)
   ->  Index Scan using dblp_titles2_idx on dblp_titles2  (cost=12.00..43003.03 rows=11868 width=136) (actual time=6.996..7.115 rows=10 loops=1)
         Index Cond: (tsvector @@ '''statist'''::tsquery)
         Order By: (tsvector >< '''statist'''::tsquery)
 Total runtime: 7.556 ms
(5 rows)

------
With best regards,
Alexander Korotkov.


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-06-17 12:56:28
Message-ID: CAPpHfdsvHhWuKXkSDKWqu+8dqczr=TdHzo0ssghTEcQg9sa4VA@mail.gmail.com

On Sat, Jun 15, 2013 at 3:02 AM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>wrote:

> attached patch implementing ordering inside GIN index. This is third patch
> of GIN improvements, see previous two:
>
> http://www.postgresql.org/message-id/CAPpHfduxv-iL7aedwPW0W5fXrWGAKfxijWM63_hZujaCRxnmFQ@mail.gmail.com
>
> http://www.postgresql.org/message-id/CAPpHfdvftaJq7www381naLw1=4u0h+qpXgWvNhcEB9HMVywbGg@mail.gmail.com
>
> This patch introduces new interface method of GIN which takes same
> arguments as consistent but returns float8.
> float8 gin_ordering(bool check[], StrategyNumber n, Datum query, int32
> nkeys, Pointer extra_data[], bool *recheck, Datum queryKeys[], bool
> nullFlags[], Datum addInfo[], bool addInfoIsNull[])
> This patch implements gingettuple method which can return ordering data
> using KNN infrastructure. Also it introduces >< operator for fts which
> support ordering in GIN index. Some example:
>
> postgres=# explain analyze select * from dblp_titles2 where tsvector @@
> to_tsquery('english', 'statistics') order by tsvector ><
> to_tsquery('english', 'statistics') limit 10;
> QUERY
> PLAN
>
> -------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=12.00..48.22 rows=10 width=136) (actual time=6.999..7.120
> rows=10 loops=1)
> -> Index Scan using dblp_titles2_idx on dblp_titles2
> (cost=12.00..43003.03 rows=11868 width=136) (actual time=6.996..7.115
> rows=10 loops=1)
> Index Cond: (tsvector @@ '''statist'''::tsquery)
> Order By: (tsvector >< '''statist'''::tsquery)
> Total runtime: 7.556 ms
> (5 rows)
>

The attached version of the patch has some refactoring and bug fixes.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
gin_ordering.2.patch.gz application/x-gzip 12.5 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-06-17 18:27:51
Message-ID: 51BF5527.1090203@vmware.com

On 17.06.2013 15:56, Alexander Korotkov wrote:
> On Sat, Jun 15, 2013 at 3:02 AM, Alexander Korotkov<aekorotkov(at)gmail(dot)com>wrote:
>
>> This patch introduces new interface method of GIN which takes same
>> arguments as consistent but returns float8.
>> float8 gin_ordering(bool check[], StrategyNumber n, Datum query, int32
>> nkeys, Pointer extra_data[], bool *recheck, Datum queryKeys[], bool
>> nullFlags[], Datum addInfo[], bool addInfoIsNull[])
>> This patch implements gingettuple method which can return ordering data
>> using KNN infrastructure. Also it introduces>< operator for fts which
>> support ordering in GIN index. Some example:
>>
>> postgres=# explain analyze select * from dblp_titles2 where tsvector @@
>> to_tsquery('english', 'statistics') order by tsvector><
>> to_tsquery('english', 'statistics') limit 10;
>> QUERY
>> PLAN
>>
>> -------------------------------------------------------------------------------------------------------------------------------------------------
>> Limit (cost=12.00..48.22 rows=10 width=136) (actual time=6.999..7.120
>> rows=10 loops=1)
>> -> Index Scan using dblp_titles2_idx on dblp_titles2
>> (cost=12.00..43003.03 rows=11868 width=136) (actual time=6.996..7.115
>> rows=10 loops=1)
>> Index Cond: (tsvector @@ '''statist'''::tsquery)
>> Order By: (tsvector>< '''statist'''::tsquery)
>> Total runtime: 7.556 ms
>> (5 rows)
>>
>
> Attached version of patch has some refactoring and bug fixes.

Thanks. There are no doc changes and not many comments; that needs to
be fixed, but I think I understand how it works:

On the first call to gingettuple, the index is first scanned for all the
matches, which are collected in an array in memory. Then, the array is
sorted with qsort(), and the matches are returned one by one from the
sorted array.
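For readers following along, the flow described above can be sketched as a
simplified standalone model (this is not the patch's actual code; ScanItem
and the field names here are made up for illustration):

```c
#include <stdlib.h>
#include <assert.h>

/* Simplified model of the first gingettuple call: every match has been
 * collected with the value returned by the ordering method, and the whole
 * array is sorted once before items are handed back one by one. */
typedef struct
{
    unsigned int item_id;   /* stands in for an ItemPointer */
    double       distance;  /* value returned by the ordering method */
} ScanItem;

static int
scan_item_cmp(const void *a, const void *b)
{
    const ScanItem *x = (const ScanItem *) a;
    const ScanItem *y = (const ScanItem *) b;

    if (x->distance < y->distance)
        return -1;
    if (x->distance > y->distance)
        return 1;
    return 0;
}

/* First call: sort everything that was collected.  Subsequent calls just
 * advance an index into the sorted array, returning one item at a time. */
void
sort_scan_items(ScanItem *items, size_t nitems)
{
    qsort(items, nitems, sizeof(ScanItem), scan_item_cmp);
}
```

The whole array lives in memory before the first item is returned, which is
exactly the limitation discussed below.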

That has some obvious limitations. First of all, you can run out of
memory. Secondly, is that really any faster than the plan you get
without this patch, i.e. scanning all the matches with a bitmap index
scan and sorting the results on the rank function? If it is faster, why?

What parts of the previous two patches does this rely on?

- Heikki


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-06-18 21:21:39
Message-ID: CAPpHfdvJ5Z=wRGVGmeOkkmbkjRo9U0s8sKfkSLQAoWnC-gcxTw@mail.gmail.com

On Mon, Jun 17, 2013 at 10:27 PM, Heikki Linnakangas <
hlinnakangas(at)vmware(dot)com> wrote:

> On 17.06.2013 15:56, Alexander Korotkov wrote:
>
>> On Sat, Jun 15, 2013 at 3:02 AM, Alexander Korotkov<aekorotkov(at)gmail(dot)com>
>> wrote:
>>
>> This patch introduces new interface method of GIN which takes same
>>> arguments as consistent but returns float8.
>>> float8 gin_ordering(bool check[], StrategyNumber n, Datum query, int32
>>> nkeys, Pointer extra_data[], bool *recheck, Datum queryKeys[], bool
>>> nullFlags[], Datum addInfo[], bool addInfoIsNull[])
>>> This patch implements gingettuple method which can return ordering data
>>> using KNN infrastructure. Also it introduces>< operator for fts which
>>> support ordering in GIN index. Some example:
>>>
>>> postgres=# explain analyze select * from dblp_titles2 where tsvector @@
>>> to_tsquery('english', 'statistics') order by tsvector><
>>> to_tsquery('english', 'statistics') limit 10;
>>> QUERY
>>> PLAN
>>>
>>> -------------------------------------------------------------------------------------------------------------------------------------------------
>>> Limit (cost=12.00..48.22 rows=10 width=136) (actual time=6.999..7.120
>>> rows=10 loops=1)
>>> -> Index Scan using dblp_titles2_idx on dblp_titles2
>>> (cost=12.00..43003.03 rows=11868 width=136) (actual time=6.996..7.115
>>> rows=10 loops=1)
>>> Index Cond: (tsvector @@ '''statist'''::tsquery)
>>> Order By: (tsvector>< '''statist'''::tsquery)
>>> Total runtime: 7.556 ms
>>> (5 rows)
>>>
>>>
>> Attached version of patch has some refactoring and bug fixes.
>>
>
> Thanks. There are no docs changes and not many comments, that needs to be
> fixed, but I think I understand how it works:
>
> On the first call to gingettuple, the index is first scanned for all the
> matches, which are collected in an array in memory. Then, the array is
> sorted with qsort(), and the matches are returned one by one from the
> sorted array.
>

Right.

> That has some obvious limitations. First of all, you can run out of memory.

Yes, that's true; qsort should be replaced with tuplesort.
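For a LIMIT query like the examples in this thread, memory could also be
bounded by a top-N heapsort of the kind the executor already uses for the
Sort node ("Sort Method: top-N heapsort" in the plans below). A toy sketch
of that idea, with made-up names and a fixed N:

```c
#include <stddef.h>
#include <assert.h>

/* Toy bounded "top-N" selection: keep only the N smallest distances seen
 * so far in a max-heap, so memory stays O(N) however many matches the
 * scan produces.  TOPN and all names here are invented for illustration. */
#define TOPN 3

typedef struct
{
    double dist[TOPN];
    size_t n;
} TopN;

static void
swap_d(double *a, double *b)
{
    double t = *a;
    *a = *b;
    *b = t;
}

void
topn_add(TopN *h, double d)
{
    if (h->n < TOPN)
    {
        /* Heap not full yet: append and sift the new entry up. */
        size_t i = h->n++;
        h->dist[i] = d;
        while (i > 0 && h->dist[(i - 1) / 2] < h->dist[i])
        {
            swap_d(&h->dist[i], &h->dist[(i - 1) / 2]);
            i = (i - 1) / 2;
        }
    }
    else if (d < h->dist[0])
    {
        /* Better than the current worst: replace the root and sift down. */
        size_t i = 0;
        h->dist[0] = d;
        for (;;)
        {
            size_t l = 2 * i + 1, r = 2 * i + 2, m = i;
            if (l < h->n && h->dist[l] > h->dist[m]) m = l;
            if (r < h->n && h->dist[r] > h->dist[m]) m = r;
            if (m == i)
                break;
            swap_d(&h->dist[i], &h->dist[m]);
            i = m;
        }
    }
}
```

Tuplesort with a bound gives the same effect while also spilling to disk
for unbounded sorts, which is why it is the natural replacement here.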

> Secondly, is that really any faster than the plan you get without this
> patch? Ie. scan all the matches with a bitmap index scan, and sort the
> results on the rank function. If it is faster, why?
>

First, it's obviously much faster when the heap and index don't both fit
into cache, because of I/O. With the patch you need only the posting trees
and posting lists of the requested entries; without it, if the query matches
a significant part of the documents, you need almost all of the heap.
It's also faster when both heap and index are in the cache. Here are some
examples on DBLP paper titles (2.5M titles, 47 characters average length).

Without patch:
postgres=# explain analyze select id, s from dblp_titles2 where ts @@
to_tsquery('english', 'system') order by ts_rank(ts, to_tsquery('english',
'system')) desc limit 10;
                                                             QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=60634.15..60634.17 rows=10 width=134) (actual time=200.204..200.205 rows=10 loops=1)
   ->  Sort  (cost=60634.15..61041.28 rows=162854 width=134) (actual time=200.202..200.203 rows=10 loops=1)
         Sort Key: (ts_rank(ts, '''system'''::tsquery))
         Sort Method: top-N heapsort  Memory: 28kB
         ->  Bitmap Heap Scan on dblp_titles2  (cost=1758.12..57114.93 rows=162854 width=134) (actual time=33.592..158.006 rows=168308 loops=1)
               Recheck Cond: (ts @@ '''system'''::tsquery)
               ->  Bitmap Index Scan on dblp_titles2_idx  (cost=0.00..1717.41 rows=162854 width=0) (actual time=27.327..27.327 rows=168308 loops=1)
                     Index Cond: (ts @@ '''system'''::tsquery)
 Total runtime: 200.610 ms
(9 rows)

With patch:
postgres=# explain analyze select id, s from dblp_titles2 where ts @@
to_tsquery('english', 'system') order by ts >< to_tsquery('english',
'system') limit 10;
                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=12.00..43.05 rows=10 width=136) (actual time=39.585..39.597 rows=10 loops=1)
   ->  Index Scan using dblp_titles2_idx on dblp_titles2  (cost=12.00..493681.61 rows=159005 width=136) (actual time=39.584..39.593 rows=10 loops=1)
         Index Cond: (ts @@ '''system'''::tsquery)
         Order By: (ts >< '''system'''::tsquery)
 Total runtime: 40.115 ms
(5 rows)

Without patch:
postgres=# explain analyze select id, s, ts_rank(ts, to_tsquery('english',
'statistics')) from dblp_titles2 where ts @@ to_tsquery('english',
'statistics') order by ts_rank(ts, to_tsquery('english', 'statistics'))
desc limit 10;
                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=26029.65..26029.67 rows=10 width=134) (actual time=19.938..19.939 rows=10 loops=1)
   ->  Sort  (cost=26029.65..26055.17 rows=10210 width=134) (actual time=19.936..19.937 rows=10 loops=1)
         Sort Key: (ts_rank(ts, '''statist'''::tsquery))
         Sort Method: top-N heapsort  Memory: 27kB
         ->  Bitmap Heap Scan on dblp_titles2  (cost=127.13..25809.01 rows=10210 width=134) (actual time=4.262..16.854 rows=10298 loops=1)
               Recheck Cond: (ts @@ '''statist'''::tsquery)
               ->  Bitmap Index Scan on dblp_titles2_idx  (cost=0.00..124.58 rows=10210 width=0) (actual time=2.887..2.887 rows=10298 loops=1)
                     Index Cond: (ts @@ '''statist'''::tsquery)
 Total runtime: 20.037 ms
(9 rows)

With patch:
postgres=# explain analyze select id, s from dblp_titles2 where ts @@
to_tsquery('english', 'statistics') order by ts >< to_tsquery('english',
'statistics') limit 10;
                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=12.00..48.60 rows=10 width=134) (actual time=3.469..3.485 rows=10 loops=1)
   ->  Index Scan using dblp_titles2_idx on dblp_titles2  (cost=12.00..38919.15 rows=10629 width=134) (actual time=3.469..3.483 rows=10 loops=1)
         Index Cond: (ts @@ '''statist'''::tsquery)
         Order By: (ts >< '''statist'''::tsquery)
 Total runtime: 3.575 ms
(5 rows)

I didn't do detailed profiling to know exactly, but I can suggest the
following reasons for the performance gain:
1) No bsearch in the tsvector; the required word positions are fetched immediately.
2) No visibility checks except for the resulting tuples. However, with a
lot of invisible tuples we have to calculate more relevances.
3) Less switching between planner nodes.
4) More sequential access to RAM.
Actually, I'm not sure about these reasons and probably there is another
one. But there is no magic: with the patch it still honestly gets the word
positions and calculates the relevance. Replacing qsort with tuplesort will
add some overhead, but I doubt it will change the situation dramatically.

> What parts of the previous two patches does this rely on?

It would be possible to implement a similar patch completely independently
of the other patches. But without the additional information you wouldn't
be able to calculate anything useful to sort by. So, logically it relies on
part 1: additional information.
"Physically", the patches I posted apply only sequentially: part 1 =>
part 2 => part 3.

BTW, a patch with more bug fixes is attached.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
gin_ordering.3.patch.gz application/x-gzip 12.0 KB

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-06-24 22:24:37
Message-ID: CAPpHfdtthB9Pooks7m4XTi=mBmTrfhBBYBOBfe4F8Uv1MiHgnA@mail.gmail.com

On Wed, Jun 19, 2013 at 1:21 AM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>wrote:

> On Mon, Jun 17, 2013 at 10:27 PM, Heikki Linnakangas <
> hlinnakangas(at)vmware(dot)com> wrote:
>
>> On 17.06.2013 15:56, Alexander Korotkov wrote:
>>
>>> On Sat, Jun 15, 2013 at 3:02 AM, Alexander Korotkov<aekorotkov(at)gmail(dot)com
>>> wrote:
>>>
>>> This patch introduces new interface method of GIN which takes same
>>>> arguments as consistent but returns float8.
>>>> float8 gin_ordering(bool check[], StrategyNumber n, Datum query, int32
>>>> nkeys, Pointer extra_data[], bool *recheck, Datum queryKeys[], bool
>>>> nullFlags[], Datum addInfo[], bool addInfoIsNull[])
>>>> This patch implements gingettuple method which can return ordering data
>>>> using KNN infrastructure. Also it introduces>< operator for fts which
>>>> support ordering in GIN index. Some example:
>>>>
>>>> postgres=# explain analyze select * from dblp_titles2 where tsvector @@
>>>> to_tsquery('english', 'statistics') order by tsvector><
>>>> to_tsquery('english', 'statistics') limit 10;
>>>>
>>>> QUERY
>>>> PLAN
>>>>
>>>> -------------------------------------------------------------------------------------------------------------------------------------------------
>>>> Limit (cost=12.00..48.22 rows=10 width=136) (actual time=6.999..7.120
>>>> rows=10 loops=1)
>>>> -> Index Scan using dblp_titles2_idx on dblp_titles2
>>>> (cost=12.00..43003.03 rows=11868 width=136) (actual time=6.996..7.115
>>>> rows=10 loops=1)
>>>> Index Cond: (tsvector @@ '''statist'''::tsquery)
>>>> Order By: (tsvector>< '''statist'''::tsquery)
>>>> Total runtime: 7.556 ms
>>>> (5 rows)
>>>>
>>>>
>>> Attached version of patch has some refactoring and bug fixes.
>>>
>>
>> Thanks. There are no docs changes and not many comments, that needs to be
>> fixed, but I think I understand how it works:
>>
>> On the first call to gingettuple, the index is first scanned for all the
>> matches, which are collected in an array in memory. Then, the array is
>> sorted with qsort(), and the matches are returned one by one from the
>> sorted array.
>>
>
> Right.
>
> That has some obvious limitations. First of all, you can run out of
>> memory.
>
>
> Yes, it is so. qsort should be replaced with tuplesort.
>

In the attached patch, qsort is replaced with tuplesort. As expected, this
leads to some performance degradation, but it's not dramatic.
Also, some documentation is added for the new distance method of GIN.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
gin_ordering.4.patch.gz application/x-gzip 14.2 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-06-25 15:31:39
Message-ID: 51C9B7DB.1080706@vmware.com

On 25.06.2013 01:24, Alexander Korotkov wrote:
> On Wed, Jun 19, 2013 at 1:21 AM, Alexander Korotkov<aekorotkov(at)gmail(dot)com>wrote:
>
>> On Mon, Jun 17, 2013 at 10:27 PM, Heikki Linnakangas<
>> hlinnakangas(at)vmware(dot)com> wrote:
>>
>> That has some obvious limitations. First of all, you can run out of
>>> memory.
>>
>> Yes, it is so. qsort should be replaced with tuplesort.
>
> In attached patch qsort is replaced with tuplesort. As expected, it leads
> to some performance drawback, but it's not dramatic.
> Also, some doc is added for new distance method of GIN.

Thanks. But the fact remains that with this patch, you fetch all the
tuples and then you sort them, which is exactly the same thing you do
without the patch. The way it happens without the patch just seems to be
slower. Time to break out the profiler...

I downloaded what I believe to be the same DBLP titles dataset that you
tested with. Without the patch:

postgres=# explain analyze select * from dblp_titles where tsvector @@
to_tsquery('english', 'statistics') order by ts_rank(tsvector,
to_tsquery('english', 'statistics')) limit 10;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=42935.81..42935.84 rows=10 width=64) (actual time=57.945..57.948 rows=10 loops=1)
   ->  Sort  (cost=42935.81..42980.86 rows=18020 width=64) (actual time=57.943..57.944 rows=10 loops=1)
         Sort Key: (ts_rank(tsvector, '''statist'''::tsquery))
         Sort Method: top-N heapsort  Memory: 28kB
         ->  Bitmap Heap Scan on dblp_titles  (cost=211.66..42546.41 rows=18020 width=64) (actual time=13.061..46.358 rows=15142 loops=1)
               Recheck Cond: (tsvector @@ '''statist'''::tsquery)
               ->  Bitmap Index Scan on gin_title  (cost=0.00..207.15 rows=18020 width=0) (actual time=8.339..8.339 rows=15142 loops=1)
                     Index Cond: (tsvector @@ '''statist'''::tsquery)
 Total runtime: 57.999 ms
(9 rows)

And the profile looks like this:

6,94% postgres tbm_iterate
6,12% postgres hash_search_with_hash_value
4,40% postgres tbm_comparator
3,79% libc-2.17.so __memcpy_ssse3_back
3,68% postgres heap_hot_search_buffer
2,62% postgres slot_deform_tuple
2,47% postgres nocachegetattr
2,37% postgres heap_page_prune_opt
2,27% libc-2.17.so __memcmp_sse4_1
2,21% postgres heap_fill_tuple
2,18% postgres pg_qsort
1,96% postgres tas
1,88% postgres palloc0
1,83% postgres calc_rank_or

Drilling into that, the tbm_iterate, tbm_comparator and pg_qsort calls
come from the tidbitmap code, as well as about 1/3 of the
hash_search_with_hash_value calls. There seems to be a fair amount of
overhead in building and iterating the tid bitmap. Is that what's
killing us?

For comparison, this is the same with your patch:

postgres=# explain analyze select * from dblp_titles where tsvector @@
to_tsquery('english', 'statistics') order by tsvector ><
to_tsquery('english', 'statistics') limit 10;
                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=16.00..52.81 rows=10 width=136) (actual time=9.957..9.980 rows=10 loops=1)
   ->  Index Scan using gin_title on dblp_titles  (cost=16.00..52198.94 rows=14175 width=136) (actual time=9.955..9.977 rows=10 loops=1)
         Index Cond: (tsvector @@ '''statist'''::tsquery)
         Order By: (tsvector >< '''statist'''::tsquery)
 Total runtime: 10.084 ms
(5 rows)

9,57% postgres scanGetItemFast
7,02% postgres calc_rank_or
5,71% postgres FunctionCall10Coll
5,59% postgres entryGetNextItem
5,19% postgres keyGetOrdering
5,13% postgres ginDataPageLeafReadItemPointer
4,89% postgres entryShift
4,85% postgres ginCompareItemPointers
3,44% postgres ginDataPageLeafRead
3,28% postgres AllocSetAlloc
3,27% postgres insertScanItem
3,18% postgres gin_tsquery_distance
2,38% postgres puttuple_common
2,26% postgres checkcondition_gin
2,20% postgres cmpEntries
2,17% postgres AllocSetFreeIndex
2,11% postgres calc_rank

Unsurprisingly, the tidbitmap overhead is not visible in the profile
with your patch.

To see how much of the difference is caused by the tidbitmap overhead, I
wrote the attached quick & dirty patch (it will crash and burn with
queries that require tidbitmap unions or intersects etc.). When there
are fewer than 10 items on a page, the tidbitmap keeps the offsets of
those items in an ordered array of offsets, instead of setting the bits
in the bitmap. In the test query, most pages have only 1, maybe 2 hits,
so that's a lot cheaper. Here's what the results look like with that patch:

postgres=# explain analyze select * from dblp_titles where tsvector @@
to_tsquery('english', 'statistics') order by ts_rank(tsvector,
to_tsquery('english', 'statistics')) limit 10;
                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=36396.80..36396.82 rows=10 width=136) (actual time=19.532..19.532 rows=0 loops=1)
   ->  Sort  (cost=36396.80..36432.23 rows=14175 width=136) (actual time=19.531..19.531 rows=0 loops=1)
         Sort Key: (ts_rank(tsvector, '''statist'''::tsquery))
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on dblp_titles  (cost=169.86..36090.48 rows=14175 width=136) (actual time=19.525..19.525 rows=0 loops=1)
               Recheck Cond: (tsvector @@ '''statist'''::tsquery)
               ->  Bitmap Index Scan on gin_title  (cost=0.00..166.31 rows=14175 width=0) (actual time=8.310..8.310 rows=15142 loops=1)
                     Index Cond: (tsvector @@ '''statist'''::tsquery)
 Total runtime: 19.583 ms
(9 rows)

So, that's about 3x faster than with no patch, but your patch is still
2x faster than this. However, I believe there's still room for
improvement with this approach. The profile with this quick & dirty
patch looks like this:

13,01% postgres hash_search_with_hash_value
11,03% postgres tbm_comparator
7,10% postgres heap_page_prune_opt
6,60% postgres pg_qsort
4,47% postgres expand_table
4,06% postgres scanGetItemFast
3,82% postgres tas
3,40% postgres tbm_iterate
2,70% postgres PinBuffer
2,30% postgres hash_any
2,16% postgres hash_seq_search
2,07% postgres tbm_get_pageentry
1,86% postgres calc_bucket

So, a lot of time is still spent in the tidbitmap. The overhead of
tbm_iterate is gone, but much overhead remains elsewhere. The crux of
the overhead is that when you have only 1-2 hits per page, the tidbitmap
is really expensive, as it allocates a PagetableEntry struct for each
page. I believe it would be much cheaper to just accumulate the
ItemPointers into a simple array in tbm_add_tuples, sort it once, and
return the results.
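The suggested tbm_add_tuples change could look roughly like this toy model
(TidItem is a stand-in for ItemPointerData; the names are hypothetical, not
the actual tidbitmap API):

```c
#include <stdlib.h>
#include <assert.h>

/* Toy model of the suggestion above: accumulate (block, offset) pairs in
 * one flat array instead of allocating a PagetableEntry per page, then
 * sort the array once before iteration begins. */
typedef struct
{
    unsigned int   blockno;
    unsigned short offset;
} TidItem;

static int
tid_cmp(const void *a, const void *b)
{
    const TidItem *x = (const TidItem *) a;
    const TidItem *y = (const TidItem *) b;

    if (x->blockno != y->blockno)
        return x->blockno < y->blockno ? -1 : 1;
    if (x->offset != y->offset)
        return x->offset < y->offset ? -1 : 1;
    return 0;
}

/* Sort accumulated TIDs into heap order (block first, then offset) so a
 * bitmap heap scan can still visit pages sequentially. */
void
sort_tids(TidItem *tids, size_t ntids)
{
    qsort(tids, ntids, sizeof(TidItem), tid_cmp);
}
```

With 1-2 hits per page this does one allocation and one sort instead of a
hash-table lookup and a PagetableEntry per page.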

In summary: The test case you presented as motivation for this patch is
a bit of a worst-case scenario for the current tidbitmap implementation.
The speedup from your patch comes from avoiding the tidbitmap. However,
it would be fairly easy to optimize the tidbitmap to handle this
scenario better, which would benefit all kinds of queries that use
bitmap scans. There is really no reason to complicate the GIN API for
this. Let's just optimize tidbitmap.

I'm not sure if I fully understand your patch, though. Is there some
other test scenario where it performs significantly better, which cannot
be attributed to tidbitmap overhead? I'm assuming 'no' for now,
and marking this patch as rejected in the commitfest app, but feel free
to reopen if there is.

- Heikki

Attachment Content-Type Size
tidbitmap-hack-1.patch text/x-diff 4.6 KB

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-06-25 18:18:08
Message-ID: CAPpHfduOeH75eXxmHc_UvyREa5=hvgkJE0B0tJA-wsAjsmR12Q@mail.gmail.com

On Tue, Jun 25, 2013 at 7:31 PM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com
> wrote:

> On 25.06.2013 01:24, Alexander Korotkov wrote:
>
>> On Wed, Jun 19, 2013 at 1:21 AM, Alexander Korotkov<aekorotkov(at)gmail(dot)com>
>> wrote:
>>
>> On Mon, Jun 17, 2013 at 10:27 PM, Heikki Linnakangas<
>>> hlinnakangas(at)vmware(dot)com> wrote:
>>>
>>> That has some obvious limitations. First of all, you can run out of
>>>
>>>> memory.
>>>>
>>>
>>> Yes, it is so. qsort should be replaced with tuplesort.
>>>
>>
>> In attached patch qsort is replaced with tuplesort. As expected, it leads
>> to some performance drawback, but it's not dramatic.
>> Also, some doc is added for new distance method of GIN.
>>
>
> Thanks. But the fact remains that with this patch, you fetch all the
> tuples and then you sort them, which is exactly the same thing you do
> without the patch. The way it happens without the patch just seems to be
> slower. Time to break out the profiler..
>

Actually, that's not exactly so. gingettuple doesn't touch any heap
tuples; it gets all the information required to calculate the relevance
directly from the index itself. This is possible thanks to the additional
information, which for FTS is the word positions. So the patch does
something similar, but with a significant difference: the source of the
ranking data changes from the heap to the index. The information required
for ranking is much better localized in the index than it is in the heap.
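The point about ranking straight from index-stored positions can be
illustrated with a toy function (the weighting formula here is invented
for illustration and is not ts_rank's actual formula; the only claim is
that positions alone suffice, with no heap access):

```c
#include <stddef.h>
#include <assert.h>

/* Toy rank computed purely from word positions, such as could be stored
 * as an entry's additional information in the index.  Earlier occurrences
 * weigh more, and every extra occurrence adds something. */
double
toy_rank(const unsigned short *positions, size_t npos)
{
    double rank = 0.0;

    for (size_t i = 0; i < npos; i++)
        rank += 1.0 / (1.0 + positions[i]);
    return rank;
}
```

The real patch evaluates the actual rank formula over the stored positions,
but the data-flow is the same: index in, relevance out, heap untouched.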

> I downloaded what I believe to be the same DBLP titles dataset that you
> tested with. Without the patch:
>
> postgres=# explain analyze select * from dblp_titles where tsvector @@
> to_tsquery('english', 'statistics') order by ts_rank(tsvector,
> to_tsquery('english', 'statistics')) limit 10;
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=42935.81..42935.84 rows=10 width=64) (actual
> time=57.945..57.948 ro
> ws=10 loops=1)
> -> Sort (cost=42935.81..42980.86 rows=18020 width=64) (actual
> time=57.943..5
> 7.944 rows=10 loops=1)
> Sort Key: (ts_rank(tsvector, '''statist'''::tsquery))
>
> Sort Method: top-N heapsort Memory: 28kB
> -> Bitmap Heap Scan on dblp_titles (cost=211.66..42546.41
> rows=18020 w
> idth=64) (actual time=13.061..46.358 rows=15142 loops=1)
> Recheck Cond: (tsvector @@ '''statist'''::tsquery)
> -> Bitmap Index Scan on gin_title (cost=0.00..207.15
> rows=18020
> width=0) (actual time=8.339..8.339 rows=15142 loops=1)
>
> Index Cond: (tsvector @@ '''statist'''::tsquery)
> Total runtime: 57.999 ms
> (9 rows)
>
> And the profile looks like this:
>
> 6,94% postgres tbm_iterate
> 6,12% postgres hash_search_with_hash_value
> 4,40% postgres tbm_comparator
> 3,79% libc-2.17.so __memcpy_ssse3_back
> 3,68% postgres heap_hot_search_buffer
> 2,62% postgres slot_deform_tuple
> 2,47% postgres nocachegetattr
> 2,37% postgres heap_page_prune_opt
> 2,27% libc-2.17.so __memcmp_sse4_1
> 2,21% postgres heap_fill_tuple
> 2,18% postgres pg_qsort
> 1,96% postgres tas
> 1,88% postgres palloc0
> 1,83% postgres calc_rank_or
>
> Drilling into that, tbm_iterate, tbm_comparator and pq_sort calls come
> from the Tidbitmap code, as well as about 1/3 of the
> hash_search_with_hash_value calls. There seems to be a fair amount of
> overhead in building and iterating the tid bitmap. Is that what's killing
> us?
>
> For comparison, this is the same with your patch:
>
> postgres=# explain analyze select * from dblp_titles where tsvector @@
>
> to_tsquery('english', 'statistics') order by tsvector ><
> to_tsquery('english', 'statistics') limit 10;
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=16.00..52.81 rows=10 width=136) (actual time=9.957..9.980
> rows=10 l
> oops=1)
> -> Index Scan using gin_title on dblp_titles (cost=16.00..52198.94
> rows=1417
> 5 width=136) (actual time=9.955..9.977 rows=10 loops=1)
>
> Index Cond: (tsvector @@ '''statist'''::tsquery)
> Order By: (tsvector >< '''statist'''::tsquery)
> Total runtime: 10.084 ms
> (5 rows)
>
> 9,57% postgres scanGetItemFast
> 7,02% postgres calc_rank_or
> 5,71% postgres FunctionCall10Coll
> 5,59% postgres entryGetNextItem
> 5,19% postgres keyGetOrdering
> 5,13% postgres ginDataPageLeafReadItemPointer
> 4,89% postgres entryShift
> 4,85% postgres ginCompareItemPointers
> 3,44% postgres ginDataPageLeafRead
> 3,28% postgres AllocSetAlloc
> 3,27% postgres insertScanItem
> 3,18% postgres gin_tsquery_distance
> 2,38% postgres puttuple_common
> 2,26% postgres checkcondition_gin
> 2,20% postgres cmpEntries
> 2,17% postgres AllocSetFreeIndex
> 2,11% postgres calc_rank
>
> Unsurprisingly, the tidbitmap overhead is not visible in the profile with
> your patch.
>
> To see how much of the difference is caused by the tidbitmap overhead, I
> wrote the attached quick & dirty patch (it will crash and burn with queries
> that require tidbitmap unions or intersects etc.). When there are fewer
> than 10 items on a page, the tidbitmap keeps the offsets of those items in
> an ordered array of offsets, instead of setting the bits in the bitmap. In
> the test query, most pages have only 1, maybe 2 hits, so that's a lot
> cheaper. Here's what the results look like with that patch:
>
> postgres=# explain analyze select * from dblp_titles where tsvector @@
> to_tsquery('english', 'statistics') order by ts_rank(tsvector,
> to_tsquery('english', 'statistics')) limit 10;
> QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=36396.80..36396.82 rows=10 width=136) (actual
> time=19.532..19.532 r
> ows=0 loops=1)
> -> Sort (cost=36396.80..36432.23 rows=14175 width=136) (actual
> time=19.531..
> 19.531 rows=0 loops=1)
> Sort Key: (ts_rank(tsvector, '''statist'''::tsquery))
> Sort Method: quicksort Memory: 25kB
> -> Bitmap Heap Scan on dblp_titles (cost=169.86..36090.48
> rows=14175 w
> idth=136) (actual time=19.525..19.525 rows=0 loops=1)
> Recheck Cond: (tsvector @@ '''statist'''::tsquery)
> -> Bitmap Index Scan on gin_title (cost=0.00..166.31
> rows=14175
> width=0) (actual time=8.310..8.310 rows=15142 loops=1)
>
> Index Cond: (tsvector @@ '''statist'''::tsquery)
> Total runtime: 19.583 ms
> (9 rows)
>
> So, that's about 3x faster than with no patch, but your patch is still 2x
> faster than this. However, I believe there's still room for improvement
> with this approach. The profile with this quick & dirty patch looks like
> this:
>
> 13,01% postgres hash_search_with_hash_value
> 11,03% postgres tbm_comparator
> 7,10% postgres heap_page_prune_opt
> 6,60% postgres pg_qsort
> 4,47% postgres expand_table
> 4,06% postgres scanGetItemFast
> 3,82% postgres tas
> 3,40% postgres tbm_iterate
> 2,70% postgres PinBuffer
> 2,30% postgres hash_any
> 2,16% postgres hash_seq_search
> 2,07% postgres tbm_get_pageentry
> 1,86% postgres calc_bucket
>
> So, a lot of time is still spent in tidbitmap. The overhead of tbm_iterate
> is gone, but much overhead remains elsewhere. The crux of the overhead is
> that when you have only 1-2 hits per page, the tidbitmap is really
> expensive, as it allocates a PagetableEntry struct for each page. I
> believe it would be much cheaper to just accumulate the ItemPointers into a
> simple array in tbm_add_tuples, sort it once, and return results.
>
> In summary: The test case you presented as motivation for this patch is a
> bit of a worst-case scenario for the current tidbitmap implementation. The
> speedup from your patch comes from avoiding the tidbitmap. However, it
> would be fairly easy to optimize the tidbitmap to handle this scenario
> better, which would benefit all kinds of queries that use bitmap scans.
> There is really no reason to complicate the GIN API for this. Let's just
> optimize tidbitmap.
>
> I'm not sure if I fully understand your patch, though. Is there some
> other test scenario where it performs significantly better, which can not
> be attributed to a tidbitmap overhead? I'm assuming 'no' for now, and
> marking this patch as rejected in the commitfest app, but feel free to
> reopen if there is.

So it's likely I positioned this patch wrongly from the beginning, because
my examples were focused on CPU time improvement. But the initial purpose
of this patch was to decrease I/O; the CPU time improvement just came as a
bonus, which you have now unmasked. On this simple example you can see how
dramatically the number of pages used for query execution differs.

Without patch it's 8002.

postgres=# explain (analyze, buffers) select * from dblp_titles2 where ts
@@ to_tsquery('english', 'statistics') order by ts_rank(ts,
to_tsquery('english', 'statistics')) limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=28651.81..28651.84 rows=10 width=134) (actual time=18.133..18.135 rows=10 loops=1)
   Buffers: shared hit=*8002*
   ->  Sort  (cost=28651.81..28681.13 rows=11728 width=134) (actual time=18.132..18.134 rows=10 loops=1)
         Sort Key: (ts_rank(ts, '''statist'''::tsquery))
         Sort Method: top-N heapsort  Memory: 27kB
         Buffers: shared hit=8002
         ->  Bitmap Heap Scan on dblp_titles2  (cost=138.89..28398.37 rows=11728 width=134) (actual time=4.005..14.866 rows=10298 loops=1)
               Recheck Cond: (ts @@ '''statist'''::tsquery)
               Buffers: shared hit=8002
               ->  Bitmap Index Scan on dblp_titles2_idx  (cost=0.00..135.96 rows=11728 width=0) (actual time=2.546..2.546 rows=10298 loops=1)
                     Index Cond: (ts @@ '''statist'''::tsquery)
                     Buffers: shared hit=13
 Total runtime: 18.177 ms
(13 rows)

And with patch it's 34.

postgres=# explain (analyze, buffers) select * from dblp_titles2 where ts
@@ to_tsquery('english', 'statistics') order by ts >< to_tsquery('english',
'statistics') limit 10;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=12.00..48.80 rows=10 width=136) (actual time=3.238..3.254 rows=10 loops=1)
   Buffers: shared hit=*34*
   ->  Index Scan using dblp_titles2_idx on dblp_titles2  (cost=12.00..36929.74 rows=10033 width=136) (actual time=3.237..3.252 rows=10 loops=1)
         Index Cond: (ts @@ '''statist'''::tsquery)
         Order By: (ts >< '''statist'''::tsquery)
         Buffers: shared hit=34
 Total runtime: 3.448 ms
(7 rows)

So we need about 235 times fewer pages (8002 vs. 34) to execute the query.
DBLP titles is a small dataset, but on a large dataset the ratio of pages
used will be similar, and on large databases that don't fit entirely in
cache the difference is dramatic. I don't have the results of my
experiments on a large database at hand now (will provide them a bit
later), but I think you see the point. I'm returning the patch to "needs
review" because the I/O aspect is worth reviewing.

------
With best regards,
Alexander Korotkov.


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-06-30 11:09:21
Message-ID: 51D011E1.6030108@vmware.com
Lists: pgsql-hackers

On 25.06.2013 21:18, Alexander Korotkov wrote:
> On Tue, Jun 25, 2013 at 7:31 PM, Heikki Linnakangas<hlinnakangas(at)vmware(dot)com
>> wrote:
>
>> In summary: The test case you presented as motivation for this patch is a
>> bit of a worst-case scenario for the current tidbitmap implementation. The
>> speedup from your patch comes from avoiding the tidbitmap. However, it
>> would be fairly easy to optimize the tidbitmap to handle this scenario
>> better, which would benefit all kinds of queries that use bitmap scans.
>> There is really no reason to complicate the GIN API for this. Let's just
>> optimize tidbitmap.
>>
>> I'm not sure if I fully understand your patch, though. Is there some
>> other test scenario where it performs significantly better, which can not
>> be attributed to a tidbitmap overhead? I'm assuming 'no' for now, and
>> marking this patch as rejected in the commitfest app, but feel free to
>> reopen if there is.
>
> So, it's likely I've positioned this patch wrong from the beginning, because
> my examples were focused on CPU time improvement. But initial purpose of
> this patch was to decrease IO.

Ok. Storing the additional information bloats the index considerably, so
it's clearly not going to be a win in all cases. So whether you store
the additional information or not needs to be configurable somehow.

I'm marking this as "returned with feedback", as we need new performance
testing from I/O point of view. The comparison should be with the base
"additional information" patch or at least the part of that that packs
the item pointers more tightly. Also, this depends on the additional
information patch, so we need to get that committed before this one, and
I just returned that patch.

- Heikki


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-07-01 10:18:05
Message-ID: CAPpHfdtW87tqRrXUFUxGPMSjKVcZVer_MmF7H3AtwFbPW9Gf6w@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jun 30, 2013 at 3:09 PM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com
> wrote:

> On 25.06.2013 21:18, Alexander Korotkov wrote:
>
>> On Tue, Jun 25, 2013 at 7:31 PM, Heikki Linnakangas
>> <hlinnakangas(at)vmware(dot)com> wrote:
>>
>> In summary: The test case you presented as motivation for this patch is a
>>> bit of a worst-case scenario for the current tidbitmap implementation.
>>> The
>>> speedup from your patch comes from avoiding the tidbitmap. However, it
>>> would be fairly easy to optimize the tidbitmap to handle this scenario
>>> better, which would benefit all kinds of queries that use bitmap scans.
>>> There is really no reason to complicate the GIN API for this. Let's just
>>> optimize tidbitmap.
>>>
>>> I'm not sure if I fully understand your patch, though. Is there some
>>> other test scenario where it performs significantly better, which can not
>>> be attributed to a tidbitmap overhead? I'm assuming 'no' for now, and
>>> marking this patch as rejected in the commitfest app, but feel free to
>>> reopen if there is.
>>>
>>
>> So, it's likely I've positioned this patch wrong from the beginning, because
>> my examples were focused on CPU time improvement. But initial purpose of
>> this patch was to decrease IO.
>>
>
> Ok. Storing the additional information bloats the index considerably, so
> it's clearly not going to be a win in all cases. So whether you store the
> additional information or not needs to be configurable somehow.
>

Yes, I think we should have two distinct opclasses.
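For illustration, the user-visible split could look roughly like this; the opclass name `tsvector_ord_ops` is hypothetical, shown only to sketch the idea of opting into the larger index:

```sql
-- Default opclass: compact GIN index, no additional information stored;
-- supports @@ searches but not ordering by ><.
CREATE INDEX dblp_titles2_idx ON dblp_titles2 USING gin (ts);

-- Hypothetical opclass that stores the additional (positional)
-- information, enabling ORDER BY ts >< query at the cost of a
-- considerably larger index.
CREATE INDEX dblp_titles2_ord_idx ON dblp_titles2
    USING gin (ts tsvector_ord_ops);
```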

------
With best regards,
Alexander Korotkov.


From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: GIN improvements part 3: ordering in index
Date: 2013-07-06 17:02:11
Message-ID: 51D84D93.3040709@fuzzy.cz
Lists: pgsql-hackers

Hi,

this is a follow-up to the message I posted to the thread about
additional info in GIN.

I've applied all three patches (ginaddinfo7.patch, gin_fast_scan.4.patch
and gin_ordering.4.patch) onto commit b8fd1a09. I ended up with two
definitions of ‘cmpEntries’ in ginget.c, but I suppose this is due to the
split of the patch into multiple pieces. The definitions are exactly the
same, so I've commented out the second one.

After applying fast scan the queries fail with 'buffer is not owned by
resource owner Portal' errors, and the ordering patch causes
segmentation faults when loading the data.

Loading the data is basically a bunch of INSERT statements into
"messages" table, with a GIN index on the message body. So the table and
index are defined like this:

CREATE TABLE messages (

id SERIAL PRIMARY KEY,
parent_id INT REFERENCES messages(id),
thread_id INT,
level INT,
hash_id VARCHAR(32) NOT NULL UNIQUE,

list VARCHAR(32) NOT NULL REFERENCES lists(id),
message_id VARCHAR(200),
in_reply_to TEXT[],
refs TEXT[],
sent TIMESTAMP,

subject TEXT,
author TEXT,

body_plain TEXT,

body_tsvector tsvector,
subject_tsvector tsvector,

headers HSTORE,
raw_message TEXT
);

CREATE INDEX message_body_idx on messages using gin(body_tsvector);

I've observed about three failure scenarios:

1) autovacuum runs VACUUM on the 'messages' table and fails, killing
all the connections, with this message in the server log

LOG: server process (PID 16611) was terminated by signal
11: Segmentation fault
DETAIL: Failed process was running: autovacuum: ANALYZE
public.messages

2) manual run of VACUUM on the table, with about the same result and
this output on the console (and the same segfault in the server log)

archie=# vacuum messages;
WARNING: relation "messages" page 6226 is uninitialized --- fixing
WARNING: relation "messages" page 6227 is uninitialized --- fixing
WARNING: relation "messages" page 6228 is uninitialized --- fixing
WARNING: relation "messages" page 6229 is uninitialized --- fixing
WARNING: relation "messages" page 6230 is uninitialized --- fixing
WARNING: relation "messages" page 6231 is uninitialized --- fixing
WARNING: relation "messages" page 6232 is uninitialized --- fixing
WARNING: relation "messages" page 6233 is uninitialized --- fixing
The connection to the server was lost. Attempting reset: Failed.

3) with autovacuum disabled, the load fails (always at exactly the same
place) - I have collected a backtrace from gdb (after recompiling
with optimization disabled); see the attachment.

All three scenarios might actually be caused by the same bug, as I've
checked the backtrace for the VACUUM and it fails at exactly the same
place as the third case.

regards
Tomas

Attachment Content-Type Size
backtrace.log text/plain 5.0 KB