Re: GIN improvements part 1: additional information

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 1: additional information
Date: 2014-01-14 08:59:20
Message-ID: CAPpHfdsjoWKRr0KA+RAgaf4b-7NuMeLaBfy=PrOdOC0uq9oRew@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 14, 2014 at 12:34 PM, Heikki Linnakangas <
hlinnakangas(at)vmware(dot)com> wrote:

> On 01/13/2014 07:07 PM, Alexander Korotkov wrote:
>
>> I've fixed this bug and many other bugs. Now the patch passes the test
>> suite that I used earlier. The results are as follows:
>>
>> Operations time:
>>          event         |     period
>> -----------------------+-----------------
>>  index_build           | 00:01:47.53915
>>  index_build_recovery  | 00:00:04
>>  index_update          | 00:05:24.388163
>>  index_update_recovery | 00:00:53
>>  search_new            | 00:24:02.289384
>>  search_updated        | 00:27:09.193343
>> (6 rows)
>>
>> Index sizes:
>>      label     |   size
>> ---------------+-----------
>>  new           | 384761856
>>  after_updates | 667942912
>> (2 rows)
>>
>> Also, I made the following changes to the algorithms:
>>
>> - Now, there is a limit on the number of uncompressed TIDs in a page.
>>   After reaching this limit, they are encoded regardless of whether they
>>   can fit in the page. That seems to me the more desirable behaviour, and
>>   somehow it accelerates search speed. Before this change the times were
>>   as follows:
>>
>>          event         |     period
>> -----------------------+-----------------
>>  index_build           | 00:01:51.467888
>>  index_build_recovery  | 00:00:04
>>  index_update          | 00:05:03.315155
>>  index_update_recovery | 00:00:51
>>  search_new            | 00:24:43.194882
>>  search_updated        | 00:28:36.316784
>> (6 rows)
>>
>
> Hmm, that's strange. Any idea why that's happening? One theory is that
> when you re-encode the pages more aggressively, there are fewer pages with
> a mix of packed and unpacked items. Mixed pages are somewhat slower to scan
> than fully packed or fully unpacked pages, because
> GinDataLeafPageGetItems() has to merge the packed and unpacked items into a
> single list. But I wouldn't expect that effect to be large enough to
> explain the results you got.
>

Probably, it's because of less work in ginMergeItemPointers.
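
To illustrate the kind of work being discussed, here is a minimal sketch of
merging two sorted TID lists (for example, the packed and unpacked items of
one page) into a single list. This is not the actual ginMergeItemPointers
code; DemoTid, demo_tid_cmp and demo_merge_tids are hypothetical names used
only for this example, and dropping duplicates is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an item pointer (TID): block number plus offset. */
typedef struct
{
    uint32_t block;
    uint16_t offset;
} DemoTid;

/* Order TIDs by block number, then by offset within the block. */
static int
demo_tid_cmp(const DemoTid *a, const DemoTid *b)
{
    if (a->block != b->block)
        return (a->block < b->block) ? -1 : 1;
    if (a->offset != b->offset)
        return (a->offset < b->offset) ? -1 : 1;
    return 0;
}

/*
 * Merge two sorted arrays into dst, keeping one copy of a TID that appears
 * in both inputs. dst must have room for na + nb entries.
 * Returns the number of entries written to dst.
 */
static size_t
demo_merge_tids(DemoTid *dst,
                const DemoTid *a, size_t na,
                const DemoTid *b, size_t nb)
{
    size_t i = 0, j = 0, n = 0;

    assert(dst != NULL);

    while (i < na && j < nb)
    {
        int cmp = demo_tid_cmp(&a[i], &b[j]);

        if (cmp < 0)
            dst[n++] = a[i++];
        else if (cmp > 0)
            dst[n++] = b[j++];
        else
        {
            /* Same TID in both lists: emit it once, advance both. */
            dst[n++] = a[i++];
            j++;
        }
    }

    /* Copy whatever remains of the longer list. */
    while (i < na)
        dst[n++] = a[i++];
    while (j < nb)
        dst[n++] = b[j++];

    return n;
}
```

The cost is linear in the total number of items, so any page where this step
can be skipped (because the page is fully packed or fully unpacked) saves one
full pass over its items.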

>> - Pages are not fully re-encoded if it's enough to re-encode just the last
>> segment.
>>
>
> Great! We should also take advantage of that in the WAL record that's
> written; no point in WAL-logging all the segments, if we know that only
> last one was modified.

Already.

------
With best regards,
Alexander Korotkov.
