Re: CLOG contention, part 2

From: Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, simon(at)2ndquadrant(dot)com, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CLOG contention, part 2
Date: 2012-02-10 19:01:50
Message-ID: CA+CSw_ubRBKdC2Ue-0RzOwKpFObs4nW=BLv6GJS=eFdvKH6N+w@mail.gmail.com
Lists: pgsql-hackers

On Feb 9, 2012 1:27 AM, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:
> However, there is a potential fly in the ointment: in other cases in
> which we've reduced contention at the LWLock layer, we've ended up
> with very nasty contention at the spinlock layer that can sometimes
> eat more CPU time than the LWLock contention did. In that light, it
> strikes me that it would be nice to be able to partition the
> contention N ways rather than just 2 ways. I think we could do that
> as follows. Instead of having one control lock per SLRU, have N
> locks, where N is probably a power of 2. Divide the buffer pool for
> the SLRU N ways, and decree that each slice of the buffer pool is
> controlled by one of the N locks. Route all requests for a page P to
> slice P mod N. Unlike this approach, that wouldn't completely
> eliminate contention at the LWLock level, but it would reduce it
> proportional to the number of partitions, and it would reduce spinlock
> contention according to the number of partitions as well. A down side
> is that you'll need more buffers to get the same hit rate, but this
> proposal has the same problem: it doubles the amount of memory
> allocated for CLOG.
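
For concreteness, a rough sketch of the routing described above (the struct
and function names are invented for illustration, not the existing SLRU API;
N is chosen as a power of two):

    #define SLRU_NUM_SLICES 8       /* N, a power of two */

    typedef struct SlruSlice
    {
        LWLockId    control_lock;   /* per-slice control lock (storage/lwlock.h) */
        /* this slice's share of the SLRU buffer pool would hang off here */
    } SlruSlice;

    static SlruSlice slices[SLRU_NUM_SLICES];

    /* Route all requests for page P to slice P mod N. */
    static inline SlruSlice *
    slice_for_page(int pageno)
    {
        return &slices[pageno % SLRU_NUM_SLICES];
    }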

Splitting the SLRU into different parts is exactly the same approach as
associativity used in CPU caches. I found some numbers that analyze cache
hit rate with different associativities:

http://research.cs.wisc.edu/multifacet/misc/spec2000cache-data/

Now obviously CPU cache access patterns are different from CLOG patterns,
but I think that the numbers strongly suggest that the reduction in hit rate
might be less than what you fear. For example, the harmonic mean of data
cache miss rates over all benchmarks for 16, 32 and 64 cache lines:
| Size | Direct    | 2-way LRU | 4-way LRU | 8-way LRU | Full LRU  |
|------+-----------+-----------+-----------+-----------+-----------|
| 1KB  | 0.0863842 | 0.0697167 | 0.0634309 | 0.0563450 | 0.0533706 |
| 2KB  | 0.0571524 | 0.0423833 | 0.0360463 | 0.0330364 | 0.0305213 |
| 4KB  | 0.0370053 | 0.0260286 | 0.0222981 | 0.0202763 | 0.0190243 |

As you can see, the reduction in hit rate is rather small down to 4-way
associative caches.
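
To make the analogy concrete, a rough sketch of a set-associative lookup
(invented types and names, not PostgreSQL code): the page number picks a
set, and only the slots within that set are searched or considered for
eviction.

    #include <stdbool.h>

    #define NUM_SETS  4     /* number of sets = buffers / ways */
    #define NUM_WAYS  4     /* associativity */

    typedef struct
    {
        int     pageno;     /* which CLOG page this slot holds, -1 if empty */
        long    last_used;  /* recency counter for LRU within the set */
        char   *data;       /* page image */
    } CacheSlot;

    static CacheSlot cache[NUM_SETS][NUM_WAYS];

    /*
     * Find pageno in its set, or return that set's least recently used
     * slot as the eviction victim.  Only NUM_WAYS slots are ever examined.
     */
    static CacheSlot *
    lookup_or_victim(int pageno, bool *hit)
    {
        CacheSlot  *set = cache[pageno % NUM_SETS];
        CacheSlot  *victim = &set[0];
        int         i;

        for (i = 0; i < NUM_WAYS; i++)
        {
            if (set[i].pageno == pageno)
            {
                *hit = true;
                return &set[i];
            }
            if (set[i].last_used < victim->last_used)
                victim = &set[i];
        }
        *hit = false;
        return victim;
    }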

There may be a performance problem when multiple CLOG pages that happen to
map to the same set become hot at the same time. The most likely case that
I can come up with is multiple scans going over unhinted pages created at
different time periods. If that is something to worry about, a technique
used for CPUs is to employ a fully associative victim cache behind
the main cache. If a CLOG page is evicted, it is transferred into the
victim cache, evicting a page from there. When a page isn't found in the
main cache, the victim cache is first checked for a possible hit. The
movement between the two caches doesn't need to involve any memory copying
- just swap pointers in metadata.
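
A rough sketch of that pointer swap, with invented metadata structs for
illustration (not the actual SLRU structures):

    typedef struct
    {
        int     pageno;             /* which CLOG page this entry describes */
        char   *data;               /* pointer to the page image */
    } BufferMeta;

    /*
     * On a main-cache miss that hits the victim cache: swap the metadata of
     * the main-cache slot being evicted with the victim-cache slot holding
     * the wanted page.  Only pointers move; page contents are never copied.
     */
    static void
    promote_from_victim_cache(BufferMeta *main_slot, BufferMeta *victim_slot)
    {
        BufferMeta  tmp = *main_slot;

        *main_slot = *victim_slot;  /* wanted page becomes resident */
        *victim_slot = tmp;         /* evicted page parks in the victim cache */
    }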

The victim cache will bring back concurrency issues when the hit rate of
the main cache is small - like the pgbench example you mentioned. In that
case, a simple associative cache will allow multiple reads of clog pages
simultaneously. On the other hand - in that case lock contention seems to
be the symptom, rather than the disease. I think that those cases would be
better handled by increasing the maximum CLOG SLRU size. The increase in
memory usage should be a drop in the bucket for systems that have enough
transaction processing velocity for that to be a problem.
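
For what it's worth, the number of CLOG buffers has historically been a
small compile-time constant (NUM_CLOG_BUFFERS in clog.h). A rough sketch of
what scaling it with shared_buffers could look like; the divisor and clamps
below are assumptions for illustration, not values from the actual source or
from any concrete proposal:

    /*
     * Illustrative only: derive the CLOG SLRU size from shared_buffers
     * instead of a fixed constant.  Divisor and limits are assumptions.
     */
    static int
    clog_buffers(int shared_buffers)
    {
        int     nbuffers = shared_buffers / 512;

        if (nbuffers < 8)
            nbuffers = 8;
        if (nbuffers > 64)
            nbuffers = 64;
        return nbuffers;
    }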

--
Ants Aasma
