Re: group locking: incomplete patch, just for discussion

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: group locking: incomplete patch, just for discussion
Date: 2014-10-31 19:32:33
Message-ID: 20141031193233.GL13584@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-10-31 14:29:57 -0400, Robert Haas wrote:
> On Fri, Oct 31, 2014 at 10:38 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > I have serious doubts about the number of cases where it's correct to
> > access relations in a second backend that are exclusively locked. Not so
> > much when that happens for a user-issued LOCK statement of course, but
> > when the system has done so. Personally I think to stay sane we'll have
> > to forbid access to anything normal other backends can't access either -
> > otherwise things will get too hard to reason about.
> >
> > So just refusing parallelism as soon as anything has taken an access
> > exclusive lock doesn't sound too bad to me.
>
> That restriction seems onerous to me; for example, it would prevent a
> parallel sort for CLUSTER or a parallel index rebuild for VACUUM FULL.
> Those seem like worthy targets for parallelism, and maybe even early
> targets, since I expect a lot of the real complexity here to be in the
> query planner, and you can avoid most of that complexity when planning
> DDL.

Ugh. I think that's aiming too high. You'll suddenly need to share cache
invalidations between the backends. I think we should start by requiring
that there's no DDL done *at all* in the transaction that starts the
parallel activity. For CREATE INDEX PARALLEL that can be done by reusing
the logic we have for CONCURRENTLY, thereby moving the parallelized part
into a transaction that hasn't done any DDL yet.
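
To illustrate the transaction structure I mean, here's a rough sketch
(StartTransactionCommand()/CommitTransactionCommand() are the real
xact.c entry points; all the lowercase helpers are invented
placeholders, not from any patch):

extern void StartTransactionCommand(void);
extern void CommitTransactionCommand(void);
extern void define_index_catalog_entries(void);  /* placeholder */
extern void launch_parallel_index_build(void);   /* placeholder */
extern void wait_for_parallel_workers(void);     /* placeholder */

void
create_index_parallel_sketch(void)
{
    /* Transaction 1: all the catalog changes (the DDL) happen here,
     * confined to a single backend. */
    StartTransactionCommand();
    define_index_catalog_entries();
    CommitTransactionCommand();

    /* Transaction 2: no DDL has been done in this transaction, so
     * the scan/sort can be fanned out to parallel workers without
     * having to share cache invalidations between backends. */
    StartTransactionCommand();
    launch_parallel_index_build();
    wait_for_parallel_workers();
    CommitTransactionCommand();
}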

> I also think it's unnecessary. It's wrong to think of two parallel
> backends that both want AccessExclusiveLock on a given target relation
> as conflicting with each other; if the process were not running in
> parallel, those lock requests would both have come from the same
> process and both requests would have been granted.

I don't think that's a realistic view. There's lots of backend-private
state around. If we try to make all that coherent between backends we'll
fail.

> Parallelism is
> generally not about making different things happen; it's about
> spreading the stuff that would happen anyway across multiple backends,
> and things that would succeed if done in a single backend shouldn't
> fail when spread across 2 or more without some compelling
> justification. The cases where it's actually unsafe for a table to be
> accessed even by some other code within the same backend are handled
> not by the lock manager, but by CheckTableNotInUse(). That call
> should probably fail categorically if issued while parallelism is
> active.

Which will e.g. prohibit CLUSTER and VACUUM FULL.

> >> >> 1. Turing's theorem being what it is, predicting what catalog tables
> >> >> the child might lock is not necessarily simple.
> >> >
> >> > Well, there's really no need to be absolutely general here. We're only
> >> > going to support a subset of functionality as parallelizable. And in
> >> > that case we don't need anything complicated to acquire all locks.
> >>
> >> See, that's what I don't believe. Even very simple things like
> >> equality operators do a surprising amount of catalog locking.
> >
> > So what? I don't see catalog locking as something problematic. Maybe I'm
> > missing something, but I don't see cases where you would expect deadlocks
> > or anything like that on catalog relations.
>
> Suppose somebody fires off a parallel sort on a text column, or a
> parallel sequential scan involving a qual of the form textcol = 'zog'.
> We launch a bunch of workers to do comparisons; they do lookups
> against pg_collation. After some but not all of them have loaded the
> collation information they need into their local cache, the DBA types
> "cluster pg_collate". It queues up for an AccessExclusiveLock. The
> remaining parallel workers join the wait queue waiting for their
> AccessShareLock. The system is now deadlocked; the only solution is to
> move the parallel workers ahead of the AccessExclusiveLock request,
> but the deadlock detector won't do that unless it knows about the
> relationship among the parallel workers.

I think you would need to make the case more complex - we only hold
locks on system objects for a very short time, and mostly not in a nested
fashion.

But generally I think it's perfectly alright to fail in such
cases. We need to fail reliably, but *none* of this is a new hazard. You
can have pretty similar issues today, without any parallelism.

> It's possible to construct more obscure cases as well. For example,
> suppose a user (for reasons known only to himself and God) does LOCK
> TABLE pg_opclass and then begins a sort of a column full of enums.
> Meanwhile, another user tries to CLUSTER pg_enum. This will deadlock
> if, and only if, some of the enum OIDs are odd. I mention this
> example not because I think it's a particularly useful case but
> because it illustrates just how robust the current system is: it will
> catch that case, and break the deadlock somehow, and you as a
> PostgreSQL backend hacker do not need to think about it. The guy who
> added pg_enum did not need to think about it. It just works. If we
> decide that parallelism is allowed to break that guarantee, then every
> patch that does parallel anything has got to worry about what
> undetected deadlocks might result, and then argue about which of them
> are obscure enough that we shouldn't care and which are not. That
> doesn't sound like a fun way to spend my time.

Well, that's why I think we should teach the deadlock detector to catch
this kind of case. That doesn't sound particularly hard.
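
Something along these lines, maybe (purely a sketch, not code from the
patch; the lockGroupLeader field and RecordEdge() are assumed names):
collapse all members of a lock group onto their leader when building
the wait-for graph, so that waits inside one group never form an edge
and waits involving outsiders are charged to the group as a whole.

typedef struct PGPROC PGPROC;
struct PGPROC
{
    PGPROC     *lockGroupLeader;    /* NULL when not in a lock group;
                                     * assumed field, the real PGPROC
                                     * has many more members */
};

extern void RecordEdge(PGPROC *waiter, PGPROC *holder); /* assumed */

static PGPROC *
group_representative(PGPROC *proc)
{
    return proc->lockGroupLeader ? proc->lockGroupLeader : proc;
}

static void
add_wait_edge(PGPROC *waiter, PGPROC *holder)
{
    PGPROC *w = group_representative(waiter);
    PGPROC *h = group_representative(holder);

    /* Only inter-group edges can participate in a cycle the
     * detector needs to break. */
    if (w != h)
        RecordEdge(w, h);
}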

I'm sorry to be a bit pessimistic here. But my intuition says that
starting to do group locking opens a can of worms that'll take us a long
time to close again. Maybe I'm just imagining complexity that won't
actually be there. But I have a hard time believing that.

I wonder if we couldn't implement a halfway measure by allowing parallel
workers to jump the lock queue if the parent process already holds the
lock. That'd only work for nonconflicting lock modes, but I think that's
actually good.
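
Concretely, something like this (a sketch under assumptions: the
lockGroupLeader field and GroupHoldMask() are invented, while
conflictTab and grantMask are modeled on lock.h):

#include <stdbool.h>

typedef int LOCKMASK;               /* one bit per lock mode */
typedef int LOCKMODE;

typedef struct PGPROC PGPROC;
struct PGPROC
{
    PGPROC     *lockGroupLeader;    /* assumed; NULL if no group */
};

typedef struct LOCK
{
    LOCKMASK    grantMask;          /* modes currently granted */
} LOCK;

/* Assumed helper: modes the leader already holds on this lock. */
extern LOCKMASK GroupHoldMask(LOCK *lock, PGPROC *leader);

/* Let a worker skip past queued waiters (e.g. a pending
 * AccessExclusiveLock) iff its leader already holds the lock and the
 * requested mode conflicts with nothing actually *granted*. */
static bool
worker_may_jump_queue(const LOCKMASK *conflictTab, LOCK *lock,
                      LOCKMODE reqmode, PGPROC *worker)
{
    PGPROC *leader = worker->lockGroupLeader;

    if (leader == NULL)
        return false;               /* not a parallel worker */
    if (GroupHoldMask(lock, leader) == 0)
        return false;               /* leader doesn't hold this lock */

    return (conflictTab[reqmode] & lock->grantMask) == 0;
}

In your pg_collation example above that would be enough: the workers'
AccessShareLock requests conflict only with the *queued*
AccessExclusiveLock, not with anything granted, so they'd be granted
immediately and the CLUSTER just waits a little longer.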

> > I'm less worried about the additional branches than about increasing the
> > size of the data structures. They're already pretty huge.
>
> I share that concern. I aimed for a design which would be
> memory-efficient and have low impact on the non-group-locking case
> rather than a design which would be ideal for group locking. The
> patch could be made to handle large locking groups more efficiently by
> changing the shared memory structure around; e.g. by replacing
> PROCLOCK's holdMask and waitMask fields, now both of type LOCKMASK,
> with int [MAX_LOCKMODES] so that we can represent the entire group's
> locks held and awaited on a single object with a single PROCLOCK.
> That representation would be significantly more convenient than the
> one I actually chose, but it would also use up a LOT more shared
> memory and would penalize LockReleaseAll(). The design embodied by
> the proposed patch adds one additional pointer per PROCLOCK, which is
> still not great, but I haven't been able to think of a reasonable way
> to do better.

Is that in the current prototype you sent, or in something newer that you
have privately?
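
For reference, the two layouts being compared would look roughly like
this (a sketch; holdMask/waitMask follow your description, while the
per-mode-count variant and both struct names are hypothetical):

#define MAX_LOCKMODES 10            /* as in lock.h */

typedef int LOCKMASK;               /* one bit per lock mode */

/* Today's shape: one PROCLOCK per backend per locked object, so a
 * group of N workers holding the same lock needs N PROCLOCKs. */
typedef struct ProcLockPerBackend
{
    LOCKMASK    holdMask;           /* modes held by this backend */
    LOCKMASK    waitMask;           /* modes awaited by this backend */
} ProcLockPerBackend;

/* The rejected alternative: per-mode counters summarize the whole
 * group in one PROCLOCK, but grow two single-word masks into
 * 2 * MAX_LOCKMODES ints and force LockReleaseAll() to walk and
 * decrement counters instead of clearing one bitmask. */
typedef struct ProcLockPerGroup
{
    int         holdCounts[MAX_LOCKMODES];
    int         waitCounts[MAX_LOCKMODES];
} ProcLockPerGroup;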

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
