Re: Scaling shared buffer eviction

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling shared buffer eviction
Date: 2014-09-26 14:31:46
Message-ID: CAA4eK1+r=7uX+2f5dC6T3dW1wLM-K7cwtPF8rhyW6TJak+D=pg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 26, 2014 at 7:04 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Sep 26, 2014 at 7:40 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
>
>> First of all, thanks for committing part-1 of these changes. It
>> seems you are planning to commit part-3 based on the results of the
>> tests which Andres is planning to do. For the remaining part
>> (part-2), today I have tried some tests, the results of which are
>> as follows:
>>
>> Scale Factor - 3000, Shared_buffer - 8GB
>>
>> Patch_Ver/Client_Count                      16      32      64     128
>> reduce-replacement-locking.patch
>>   + 128 Buf Partitions                  157732  229547  271536  245295
>> scalable_buffer_eviction_v9.patch       163762  230753  275147  248309
>>
>>
>> Scale Factor - 3000, Shared_buffer - 8GB
>>
>>
Typo here: it should read Scale Factor - 3000, Shared_buffer - *2*GB.

>> Patch_Ver/Client_Count                      16      32      64     128
>> reduce-replacement-locking.patch
>>   + 128 Buf Partitions                  157781  212134  202209  171176
>> scalable_buffer_eviction_v9.patch       160301  213922  208680  172720
>>
>>
>> The results indicate that in all cases there is a benefit from doing
>> part-2 (bgreclaimer). Though the benefit at this configuration is
>> not high, there might be more benefit at some higher configurations
>> (scale factor - 10000). Do you see any merit in pursuing this
>> further to accomplish part-2 as well?
>>
>
> Interesting results. Thanks for gathering this data.
>
>
One more point I missed: the above data was collected using the
"-M prepared" option of pgbench.

> If this is the best we can do with the bgreclaimer, I think the case for
> pursuing it is somewhat marginal. The biggest jump you've got seems to be
> at scale factor 3000 with 64 clients, where you picked up about 4%. 4%
> isn't nothing, but it's not a lot, either. On the other hand, this might
> not be the best we can do. There may be further improvements to
> bgreclaimer that make the benefit larger.
>
> Backing up a bit, to what extent have we actually solved the problem here?
> If we had perfectly removed all of the scalability bottlenecks, what would
> we expect to see? You didn't say which machine this testing was done on
>

It was the IBM POWER7 machine; sorry, I should have mentioned it.

> , or how many cores it had, but for example on the IBM POWER7 machine, we
> probably wouldn't expect the throughput at 64 clients to be 4 times the
> throughput at 16 clients, because up to 16 clients each one can have a full
> CPU core, whereas after that and out to 64 each is getting a hardware
> thread, which is not quite as good. Still, we'd expect performance to go
> up, or at least not go down. Your data shows a characteristic performance
> knee: between 16 and 32 clients we go up, but then between 32 and 64 we go
> down,
>

Another point worth noting here is that it goes down between 32 and
64 clients when shared_buffers is 2GB; however, when shared_buffers
is 8GB it doesn't go down between 32 and 64.

> and between 64 and 128 we go down more. You haven't got enough data
> points there to show very precisely where the knee is, but unless you
> tested this on a smaller box than what you have been using, we're certainly
> hitting the knee sometime before we run out of physical cores. That
> implies a remaining contention bottleneck.
>
> My results from yesterday were a bit different. I tested 1 client, 8
> clients, and multiples of 16 clients out to 96. With
> reduce-replacement-locking.patch + 128 buffer mapping partitions,
> performance continued to rise all the way out to 96 clients. It definitely
> wasn't linear, but it went up, not down. I don't know why this is
> different from what you are seeing.
>

I think it is almost the same if we consider the same configuration
(scale_factor - 3000, shared_buffers - 8GB).

> Anyway there's a little more ambiguity there about how much contention
> remains, but my bet is that there is at least some contention that we could
> still hope to remove. We need to understand where that contention is. Are
> the buffer mapping locks still contended? Is the new spinlock contended?
> Are there other contention points? I won't be surprised if it turns out
> that the contention is on the new spinlock and that a proper design for
> bgreclaimer is the best way to remove that contention .... but I also won't
> be surprised if it turns out that there are bigger wins elsewhere. So I
> think you should try to figure out where the remaining contention is first,
> and then we can discuss what to do about it.
>
>
Makes sense.
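
As a first step, I am thinking of instrumenting the runs to see where
the waits actually are. A rough sketch of what I have in mind (assuming
a build with -DLWLOCK_STATS; the exact options and output format can
differ, so treat this as illustrative only):

  # build with per-LWLock statistics; each backend prints counts
  # (shared/exclusive acquisitions, blocks, spindelays) at exit
  ./configure CPPFLAGS="-DLWLOCK_STATS"
  make && make install

  # or sample a busy backend during the pgbench run with perf
  perf record -g -p <backend_pid> -- sleep 30
  perf report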

> On another point, I think it would be a good idea to rebase the
> bgreclaimer patch over what I committed, so that we have a clean patch
> against master to test with.
>
>
I think this also makes sense; however, it is better to first see
where the bottleneck is.
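
When I do the rebase, the steps would be roughly as below (just a
sketch; the output file name is made up, and git apply may be needed
instead of git am if the patch is a plain diff):

  git checkout master && git pull
  git checkout -b bgreclaimer-rebase
  git am scalable_buffer_eviction_v9.patch
  git format-patch master --stdout > scalable_buffer_eviction_rebased.patch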

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
