
Re: WIP: dynahash replacement for buffer table

From:	Ryan Johnson <ryan(dot)johnson(at)cs(dot)utoronto(dot)ca>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: WIP: dynahash replacement for buffer table
Date:	2014-10-16 15:33:03
Message-ID:	543FE52F.9050300@cs.utoronto.ca
Lists:	pgsql-hackers

On 16/10/2014 7:19 AM, Robert Haas wrote:
> On Thu, Oct 16, 2014 at 8:03 AM, Ryan Johnson
> <ryan(dot)johnson(at)cs(dot)utoronto(dot)ca> wrote:
>> Why not use an RCU mechanism [1] and ditch the hazard pointers? Seems like
>> an ideal fit...
>>
>> In brief, RCU has the following requirements:
>>
>> - Read-heavy access pattern
>> - Writers must be able to make dead objects unreachable to new readers
>>   (easily done for most data structures)
>> - Writers must be able to mark dead objects in such a way that existing
>>   readers know to ignore their contents but can still traverse the data
>>   structure properly (usually straightforward)
>> - Readers must occasionally inform the system that they are not
>>   currently using any RCU-protected pointers (to allow resource
>>   reclamation)
> Have a look at http://lwn.net/Articles/573424/ and specifically the
> "URCU overview" section. Basically, that last requirement - that
> readers inform the system that they are not currently using any
> RCU-protected pointers - turns out to require either memory barriers
> or signals.
> All of the many techniques that have been developed in this area are
> merely minor variations on a very old theme: set some kind of flag
> variable in shared memory to let people know that you are reading a
> shared data structure, and clear it when you are done. Then, other
> people can figure out when it's safe to recycle memory that was
> previously part of that data structure.
Sure, but RCU has the key benefit of decoupling its machinery (esp. that
flag update) from the actual critical section(s) it protects. In a DBMS
setting, for example, updating it once per transaction or SQL statement
would do just fine. The notification can also be much better than a
simple flag: you want to know whether the thread has ever quiesced since
the last reclaim cycle began, not whether it is currently quiesced
(which it usually isn't). In the implementation I use, a busy thread
(i.e. one not about to go idle) can "chain" its RCU "transactions." In
the common case, a chained quiesce call comes when the RCU epoch is not
trying to change, and the "flag update" degenerates to a simple load.
Further, the only time it's critical to have that memory barrier is if
the quiescing thread is about to go idle. Otherwise, missing a flag just
imposes a small delay on resource reclamation (and that's assuming the
flag in question even belonged to a straggler process).

How you implement epoch management, especially the handling of
stragglers, is the deciding factor in whether RCU works well. The early
URCU techniques were pretty terrible, and maybe general-purpose URCU is
doomed to stay that way, but in a DBMS core it can be done very cleanly
and efficiently because we can easily add the quiescent points at
appropriate locations in the code.
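
To make the "chained quiesce degenerates to a load" point concrete,
here is a minimal sketch of one possible epoch scheme. It is not the
implementation referred to above; every name in it (global_epoch,
reader_epoch, rcu_quiesce, rcu_go_idle, rcu_advance_epoch,
rcu_epoch_reached, MAX_READERS) is hypothetical, and it assumes a fixed
number of reader slots and a single reclaiming writer. It is a
conservative variant: the slow path still uses a sequentially
consistent store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_READERS 4   /* hypothetical fixed number of reader slots */

/* Epoch the reclaimer is currently waiting on; starts at 1. */
static _Atomic uint64_t global_epoch = 1;

/* Last epoch each reader announced; 0 means "idle, holds no pointers". */
static _Atomic uint64_t reader_epoch[MAX_READERS];

/*
 * Reader: called at a quiescent point (say, once per statement), not
 * once per lookup. If the epoch has not moved since the last call, this
 * is two loads and no store: the "chained" fast path.
 */
static void rcu_quiesce(int id)
{
    assert(id >= 0 && id < MAX_READERS);
    uint64_t g = atomic_load_explicit(&global_epoch, memory_order_acquire);
    if (atomic_load_explicit(&reader_epoch[id], memory_order_relaxed) != g)
        atomic_store(&reader_epoch[id], g);   /* slow path, seq_cst */
}

/* Reader: about to go idle, so publish that with a full barrier. */
static void rcu_go_idle(int id)
{
    assert(id >= 0 && id < MAX_READERS);
    atomic_store(&reader_epoch[id], 0);       /* seq_cst store */
}

/* Writer: start a reclaim cycle and return the epoch readers must reach. */
static uint64_t rcu_advance_epoch(void)
{
    return atomic_fetch_add(&global_epoch, 1) + 1;
}

/*
 * Writer: true once every reader is idle or has quiesced since `target`
 * began. A straggler only delays reclamation; it never makes it unsafe.
 */
static bool rcu_epoch_reached(uint64_t target)
{
    for (int i = 0; i < MAX_READERS; i++) {
        uint64_t e = atomic_load(&reader_epoch[i]);
        if (e != 0 && e < target)
            return false;
    }
    return true;
}
```

In this sketch the writer would unlink objects, call
rcu_advance_epoch(), and free them only after rcu_epoch_reached()
returns true for the returned epoch.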

> In Linux's RCU, the flag
> variable is "whether the process is currently scheduled on a CPU",
> which is obviously not workable from user-space. Lacking that, you
> need an explicit flag variable, which means you need memory barriers,
> since the protected operation is a load and the flag variable is
> updated via a store. You can try to avoid some of the overhead by
> updating the flag variable less often (say, when a signal arrives) or
> you can make it more fine-grained (in my case, we only prevent reclaim
> of a fraction of the data structure at a time, rather than all of it)
> or various other variants, but none of this is unfortunately so simple
> as "apply technique X and your problem just goes away".
Magic wand, no (does nothing for update contention, for example, and
requires some care to apply). But from a practical perspective RCU,
properly implemented, does make an awful lot of problems an awful lot
simpler to tackle. Especially for the readers.
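
As an illustration of how little the reader side has to do, here is a
rough sketch of a hash-chain lookup under the requirements listed
earlier in the thread: removed nodes are unlinked, marked dead, and
keep their next pointer so in-flight readers can continue. The node
layout and the names chain_lookup and chain_remove are hypothetical,
writers are assumed to be serialized by some external lock, and freeing
the removed node is left to whatever reclamation scheme is in use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
    int key;
    int value;
    atomic_bool dead;              /* set by the writer on removal */
    _Atomic(struct node *) next;   /* left intact after removal */
} node;

/* Reader: no locks and no stores to shared memory, only loads. */
static bool chain_lookup(_Atomic(node *) *head, int key, int *value_out)
{
    assert(head != NULL && value_out != NULL);
    for (node *n = atomic_load_explicit(head, memory_order_acquire);
         n != NULL;
         n = atomic_load_explicit(&n->next, memory_order_acquire)) {
        /* Skip nodes a writer has already marked dead. */
        if (n->key == key &&
            !atomic_load_explicit(&n->dead, memory_order_acquire)) {
            *value_out = n->value;
            return true;
        }
    }
    return false;
}

/*
 * Writer (assumed to hold an external lock): mark the node dead, then
 * unlink it so new readers cannot reach it. The node is returned, not
 * freed; the caller must wait for a grace period before reusing it.
 */
static node *chain_remove(_Atomic(node *) *head, int key)
{
    assert(head != NULL);
    _Atomic(node *) *link = head;
    for (node *n = atomic_load(link); n != NULL; n = atomic_load(link)) {
        if (n->key == key) {
            atomic_store(&n->dead, true);
            atomic_store(link, atomic_load(&n->next));
            return n;
        }
        link = &n->next;
    }
    return NULL;
}
```

A reader that was already standing on the removed node still finds a
valid next pointer and simply walks on.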

Ryan

In response to

Re: WIP: dynahash replacement for buffer table at 2014-10-16 13:19:16 from Robert Haas
