Re: Patch for fast gin cache performance improvement

From: Ian Link <ian(at)ilink(dot)io>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: 'pgsql-hackers' <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch for fast gin cache performance improvement
Date: 2013-09-30 22:09:35
Message-ID: 5249F69F.10305@ilink.io
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Etsuro,
Sorry for the delay but I have been very busy with work. I have been
away from postgres for a while, so I will need a little time to review
the code and make sure I give you an informed response. I'll get back to
you as soon as I am able. Thanks for understanding.
Ian Link

> Etsuro Fujita <mailto:fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
> Friday, September 27, 2013 2:24 AM
> I wrote:
>> I had a look over this patch. I think this patch is interesting and very
> useful.
>> Here are my review points:
>
>> 8. I think there are no issues in this patch. However, I have one question:
>> how this patch works in the case where gin_fast_limit/fast_cache_size = 0? In
>> this case, in my understanding, this patch inserts new entries into the
> pending
>> list temporarily and immediately moves them to the main GIN data structure
> using
>> ginInsertCleanup(). Am I right? If so, that is obviously inefficient.
>
> Sorry, There are incorrect expressions. I mean gin_fast_limit> 0 and
> fast_cache_size = 0.
>
> Although I asked this question, I've reconsidered about these parameters, and it
> seems that these parameters not only make code rather complex but are a little
> confusing to users. So I'd like to propose to introduce only one parameter:
> fast_cache_size. While users that give weight to update performance for the
> fast update technique set this parameter to a large value, users that give
> weight not only to update performance but to search performance set this
> parameter to a small value. What do you think about this?
>
> Thanks,
>
> Best regards,
> Etsuro Fujita
>
>
> Etsuro Fujita <mailto:fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
> Thursday, September 26, 2013 6:02 AM
> Hi Ian,
>
>> This patch contains a performance improvement for the fast gin cache. As you
>> may know, the performance of the fast gin cache decreases with its size.
>> Currently, the size of the fast gin cache is tied to work_mem. The size of
>> work_mem can often be quite high. The large size of work_mem is inappropriate
>> for the fast gin cache size. Therefore, we created a separate cache size
> called
>> gin_fast_limit. This global variable controls the size of the fast gin cache,
>> independently of work_mem. Currently, the default gin_fast_limit is set to
> 128kB.
>> However, that value could need tweaking. 64kB may work better, but it's hard
>> to say with only my single machine to test on.
>
>> On my machine, this patch results in a nice speed up. Our test queries improve
>> from about 0.9 ms to 0.030 ms. Please feel free to use the test case yourself:
>> it should be attached. I can look into additional test cases (tsvectors) if
>> anyone is interested.
>
>> In addition to the global limit, we have provided a per-index limit:
>> fast_cache_size. This per-index limit begins at -1, which means that it is
>> disabled. If the user does not specify a per-index limit, the index will
> simply
>> use the global limit.
>
> I had a look over this patch. I think this patch is interesting and very
> useful. Here are my review points:
>
> 1. Patch applies cleanly.
> 2. make, make install and make check is good.
> 3. I did performance evaluation using your test queries with 64kB and 128kB of
> gin_fast_limit (or fast_cache_size), and saw that both values achieved the
> performance gains over gin_fast_limit = '256MB'. 64kB worked better than 128kB.
> 64kB improved from 1.057 ms to 0.075 ms. Great!
> 4. In my understanding, the small value of gin_fast_limit/fast_cache_size leads
> to the increase in GIN search performance, which, however, leads to the decrease
> in GIN update performance. Am I right? If so, I think the tradeoff should be
> noted in the documentation.
> 5. The following documents in Chapter 57. GIN Indexes need to be updated:
> * 57.3.1. GIN Fast Update Technique
> * 57.4. GIN Tips and Tricks
> 6. I would like to see the results for the additional test cases (tsvectors).
> 7. The commented-out elog() code should be removed.
> 8. I think there are no issues in this patch. However, I have one question: how
> this patch works in the case where gin_fast_limit/fast_cache_size = 0? In this
> case, in my understanding, this patch inserts new entries into the pending list
> temporarily and immediately moves them to the main GIN data structure using
> ginInsertCleanup(). Am I right? If so, that is obviously inefficient.
>
> Sorry for the delay.
>
> Best regards,
> Etsuro Fujita
>
>
> Ian Link <mailto:ian(at)ilink(dot)io>
> Monday, June 17, 2013 9:42 PM
> *
>
> This patch contains a performance improvement for the fast gin cache.
> As you may know, the performance of the fast gin cache decreases with
> its size. Currently, the size of the fast gin cache is tied to
> work_mem. The size of work_mem can often be quite high. The large size
> of work_mem is inappropriate for the fast gin cache size. Therefore,
> we created a separate cache size called gin_fast_limit. This global
> variable controls the size of the fast gin cache, independently of
> work_mem. Currently, the default gin_fast_limit is set to 128kB.
> However, that value could need tweaking. 64kB may work better, but
> it's hard to say with only my single machine to test on.
>
>
> On my machine, this patch results in a nice speed up. Our test queries
> improve from about 0.9 ms to 0.030 ms. Please feel free to use the
> test case yourself: it should be attached. I can look into additional
> test cases (tsvectors) if anyone is interested.
>
>
> In addition to the global limit, we have provided a per-index limit:
> fast_cache_size. This per-index limit begins at -1, which means that
> it is disabled. If the user does not specify a per-index limit, the
> index will simply use the global limit.
>
>
> I would like to thank Andrew Gierth for all his help on this patch. As
> this is my first patch he was extremely helpful. The idea for this
> performance improvement was entirely his. I just did the
> implementation. Thanks for reading and considering this patch!*
>
>
> Ian Link

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bernd Helmle 2013-09-30 22:28:55 Re: Wait free LW_SHARED acquisition
Previous Message Kevin Grittner 2013-09-30 22:00:41 Re: logical changeset generation v6.1