Re: GIN improvements part2: fast scan

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part2: fast scan
Date: 2014-01-27 10:32:29
Message-ID: CAPpHfdticvY2+RGTxzQdrr4YSQie+vYGRDRBTuG1ZxBmjwGrOA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 26, 2014 at 8:14 PM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com
> wrote:

> On 01/26/2014 08:24 AM, Tomas Vondra wrote:
>
>> Hi!
>>
>> On 25.1.2014 22:21, Heikki Linnakangas wrote:
>>
>>> Attached is a new version of the patch set, with those bugs fixed.
>>>
>>
>> I've done a bunch of tests with all the 4 patches applied, and it seems
>> to work now. I've done tests with various conditions (AND/OR, number of
>> words, number of conditions) and I so far I did not get any crashes,
>> infinite loops or anything like that.
>>
>> I've also compared the results to 9.3 - by dumping the database and
>> running the same set of queries on both machines, and indeed I got 100%
>> match.
>>
>> I also did some performance tests, and that's when I started to worry.
>>
>> For example, I generated and ran 1000 queries that look like this:
>>
>> SELECT id FROM messages
>> WHERE body_tsvector @@ to_tsquery('english','(header & 53 & 32 &
>> useful & dropped)')
>> ORDER BY ts_rank(body_tsvector, to_tsquery('english','(header & 53 &
>> 32 & useful & dropped)')) DESC;
>>
>> i.e. in this case the query always was 5 words connected by AND. This
>> query is a pretty common pattern for fulltext search - sort by a list of
>> words and give me the best ranked results.
>>
>> On 9.3, the script was running for ~23 seconds, on patched HEAD it was
>> ~40. It's perfectly reproducible, I've repeated the test several times
>> with exactly the same results. The test is CPU bound, there's no I/O
>> activity at all. I got the same results with more queries (~100k).
>>
>> Attached is a simple chart with x-axis used for durations measured on
>> 9.3.2, y-axis used for durations measured on patched HEAD. It's obvious
>> a vast majority of queries is up to 2x slower - that's pretty obvious
>> from the chart.
>>
>> Only about 50 queries are faster on HEAD, and >700 queries are more than
>> 50% slower on HEAD (i.e. if the query took 100ms on 9.3, it takes >150ms
>> on HEAD).
>>
>> Typically, the EXPLAIN ANALYZE looks something like this (on 9.3):
>>
>> http://explain.depesz.com/s/5tv
>>
>> and on HEAD (same query):
>>
>> http://explain.depesz.com/s/1lI
>>
>> Clearly the main difference is in the "Bitmap Index Scan" which takes
>> 60ms on 9.3 and 120ms on HEAD.
>>
>> On 9.3 the "perf top" looks like this:
>>
>> 34.79% postgres [.] gingetbitmap
>> 28.96% postgres [.] ginCompareItemPointers
>> 9.36% postgres [.] TS_execute
>> 5.36% postgres [.] check_stack_depth
>> 3.57% postgres [.] FunctionCall8Coll
>>
>> while on 9.4 it looks like this:
>>
>> 28.20% postgres [.] gingetbitmap
>> 21.17% postgres [.] TS_execute
>> 8.08% postgres [.] check_stack_depth
>> 7.11% postgres [.] FunctionCall8Coll
>> 4.34% postgres [.] shimTriConsistentFn
>>
>> Not sure how to interpret that, though. For example where did the
>> ginCompareItemPointers go? I suspect it's thanks to inlining, and that
>> it might be related to the performance decrease. Or maybe not.
>>
>
> Yeah, inlining makes it disappear from the profile, and spreads that time
> to the functions calling it.
>
> The profile tells us that the consistent function is called a lot more
> than before. That is expected - with the fast scan feature, we're calling
> consistent not only for potential matches, but also to refute TIDs based on
> just a few entries matching. If that's effective, it allows us to skip many
> TIDs and avoid consistent calls, which compensates, but if it's not
> effective, it's just overhead.
>
> I would actually expect it to be fairly effective for that query, so
> that's a bit surprising. I added counters to see where the calls are coming
> from, and it seems that about 80% of the calls are actually coming from
> this little the feature I explained earlier:
>
>
> In addition to that, I'm using the ternary consistent function to check
>> if minItem is a match, even if we haven't loaded all the entries yet.
>> That's less important, but I think for something like "rare1 | (rare2 &
>> frequent)" it might be useful. It would allow us to skip fetching
>> 'frequent', when we already know that 'rare1' matches for the current
>> item. I'm not sure if that's worth the cycles, but it seemed like an
>> obvious thing to do, now that we have the ternary consistent function.
>>
>
> So, that clearly isn't worth the cycles :-). At least not with an
> expensive consistent function; it might be worthwhile if we pre-build the
> truth-table, or cache the results of the consistent function.
>
> Attached is a quick patch to remove that, on top of all the other patches,
> if you want to test the effect.

Every single change you did in fast scan seems to be reasonable, but
testing shows that something went wrong. Simple test with 3 words of
different selectivities.

After applying your patches:

# select count(*) from fts_test where fti @@ plainto_tsquery('english',
'gin index select');
count
───────
627
(1 row)

Time: 21,252 ms

In original fast-scan:

# select count(*) from fts_test where fti @@ plainto_tsquery('english',
'gin index select');
count
───────
627
(1 row)

Time: 3,382 ms

I'm trying to get deeper into it.

------
With best regards,
Alexander Korotkov.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Marti Raudsepp 2014-01-27 10:40:29 Re: plpgsql.warn_shadow
Previous Message Simon Riggs 2014-01-27 09:27:17 Re: A better way than tweaking NTUP_PER_BUCKET