Re: group locking: incomplete patch, just for discussion

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: group locking: incomplete patch, just for discussion
Date: 2014-10-31 14:38:52
Message-ID: 20141031143852.GJ13584@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-10-31 10:08:59 -0400, Robert Haas wrote:
> On Fri, Oct 31, 2014 at 9:07 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-10-31 08:54:54 -0400, Robert Haas wrote:
> >> On Fri, Oct 31, 2014 at 6:41 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >> > Is it genuinely required for most parallel operations? I think it's
> >> > clear that none of us knows the answer. Sure, the general case needs
> >> > it, but is the general case the same thing as the reasonably common
> >> > case?
> >>
> >> Well, I think that the answer is pretty clear. Most of the time,
> >> perhaps in 99.9% of cases, group locking will make no difference as to
> >> whether a parallel operation succeeds or fails. Occasionally,
> >> however, it will cause an undetected deadlock. I don't hear anyone
> >> arguing that that's OK. Does anyone wish to make that argument?
> >
> > Maybe we can, as a first step, make those edges in the lock graph
> > visible to the deadlock detector? It's pretty clear that undetected
> > deadlocks aren't ok, but detectable deadlocks in a couple corner cases
> > might be acceptable.
>
> Interesting idea. I agree that would be more acceptable. It would be
> kind of sad, though, if you got a deadlock from this:
>
> BEGIN;
> LOCK TABLE foo;
> SELECT * FROM foo;

I have serious doubts about the number of cases where it's correct to
access relations in a second backend that are exclusively locked. Not so
much when that happens for a user issued LOCK statement of course, but
when the system has done so. Personally I think to stay sane we'll have
to forbid access to anything normal other backends can't access either -
otherwise things will get too hard to reason about.

So just refusing parallelism as soon as anything has taken an access
exclusive lock doesn't sound too bad to me.

> You can, of course, avoid that, but it's basically transferring the
> burden from the lock manager to every bit of parallel code we ever
> write. If it turns out to be easier to do this than what I'm
> currently proposing, I'm definitely amenable to going this route for
> version 1, but I don't think it's going to be appealing as a permanent
> solution.

It's probably not the solution forever, I agree. But it might be the
simplest way forward till we have more actual users.

> >> 1. Turing's theorem being what it is, predicting what catalog tables
> >> the child might lock is not necessarily simple.
> >
> > Well, there's really no need to be absolutely general here. We're only
> > going to support a subset of functionality as parallelizable. And in
> > that case we don't need anything complicated to acquire all locks.
>
> See, that's what I don't believe. Even very simple things like
> equality operators do a surprising amount of catalog locking.

So what? I don't see catalog locking as something problematic? Maybe I'm
missing something, but I don't see cases would you expect deadlocks or
anything like it on catalog relations?

> >> So, I am still of the opinion that group locking makes sense. While
> >> I appreciate the urge to avoid solving difficult problems where it's
> >> reasonably avoidable, I think we're in danger of spending more effort
> >> trying to avoid solving this particular problem than it would take to
> >> actually solve it. Based on what I've done so far, I'm guessing that
> >> a complete group locking patch will be between 1000 and 1500 lines of
> >> code and that nearly all of the new logic will be skipped when it's
> >> not in use (i.e. no parallelism). That sounds to me like a hell of a
> >> deal compared to trying to predict what locks the child might
> >> conceivably take and preemptively acquire them all, which sounds
> >> annoyingly tedious even for simple things (like equality operators)
> >> and nearly impossible for anything more complicated.
> >
> > What I'm worried about is the performance impact of group locking when
> > it's not in use. The heavyweight locking code is already quite complex
> > and often a noticeable bottleneck...
>
> I certainly agree that it would be unacceptable for group locking to
> significantly regress the normal case. I'm pretty optimistic about
> reducing the overhead to near-zero, though - a few extra branch
> instructions on the non-fast-path case should be lost in the noise.

I'm less worried about the additional branches than about increasing the
size of the datastructures. They're already pretty huge.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2014-10-31 14:40:19 Re: CREATE INDEX CONCURRENTLY?
Previous Message Simon Riggs 2014-10-31 14:35:11 Re: Column Redaction