Re: alternative model for handling locking in parallel groups

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: alternative model for handling locking in parallel groups
Date: 2014-11-18 13:53:13
Message-ID: CAA4eK1+ryoYon0wC-EU9S4sdMQV6M73sObcwWjq-uYSGYQYLcg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 14, 2014 at 2:29 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Discussion of my incomplete group locking patch seems to have
> converged around two points: (1) Everybody agrees that undetected
> deadlocks are unacceptable. (2) Nobody agrees with my proposal to
> treat locks held by group members as mutually non-conflicting. As was
> probably evident from the emails on the other thread, it was not
> initially clear to me why you'd EVER want heavyweight locks held by
> different group members to mutually conflict, but after thinking it
> over for a while, I started to think of cases where you would
> definitely want that:
>
> 1. Suppose two or more parallel workers are doing a parallel COPY.
> Each time the relation needs to be extended, one backend or the other
> will need to take the relation extension lock in Exclusive mode.
> Clearly, taking this lock needs to exclude both workers in the same
> group and also unrelated processes.
>
> 2. Suppose two or more parallel workers are doing a parallel
> sequential scan, with a filter condition of myfunc(a.somecol), and
> that myfunc(a.somecol) updates a tuple in some other table. Access to
> update that tuple will be serialized using tuple locks, and it's no
> safer for two parallel workers to do this at the same time than it
> would be for two unrelated processes to do it at the same time.
>

Won't this be handled already, because the updates issued from myfunc()
are treated as separate commands? With respect to locking, it should
behave like two different updates in the same transaction. I also suspect
tuple locks are not the only thing needed to make updates via parallel
workers possible.
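To make the point about case 2 concrete, here is a toy sketch (plain Python, not PostgreSQL's actual lock manager code; the function and group representation are illustrative) of why tuple locks must keep conflicting even between members of the same parallel group:

```python
# Toy illustration: if locks held by members of the same parallel group
# were treated as mutually non-conflicting, two workers could both hold
# the exclusive tuple lock at once, and concurrent updates of the same
# tuple could lose a write. With normal conflict rules, the second
# worker must wait, exactly like an unrelated backend.

def try_lock(holders, requester, group, group_locks_nonconflicting):
    """Return True if requester may take the exclusive tuple lock."""
    for h in holders:
        same_group = group.get(h) == group.get(requester)
        if h != requester and not (group_locks_nonconflicting and same_group):
            return False  # conflicts with an existing holder
    return True

group = {"worker1": "g1", "worker2": "g1"}
holders = ["worker1"]  # worker1 already holds the tuple lock

# Group-nonconflicting rule: worker2 gets the lock too -- unsafe for updates.
assert try_lock(holders, "worker2", group, group_locks_nonconflicting=True)

# Normal rule: worker2 is blocked, same as an unrelated process would be.
assert not try_lock(holders, "worker2", group, group_locks_nonconflicting=False)
```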

> On the other hand, I think there are also some cases where you pretty
> clearly DO want the locks among group members to be mutually
> non-conflicting, such as:
>
> 3. Parallel VACUUM. VACUUM takes ShareUpdateExclusiveLock, so that
> only one process can be vacuuming a relation at the same time. Now,
> if you've got several processes in a group that are collaborating to
> vacuum that relation, they clearly need to avoid excluding each other,
> but they still need to exclude other people. And in particular,
> nobody else should get to start vacuuming that relation until the last
> member of the group exits. So what you want is a
> ShareUpdateExclusiveLock that is, in effect, shared across the whole
> group, so that it's only released when the last process exits.
>
> 4. Parallel query on a locked relation. Parallel query should work on
> a table created in the current transaction, or one explicitly locked
> by user action. It's not acceptable for that to just randomly
> deadlock, and skipping parallelism altogether, while it'd probably be
> acceptable for a first version, is not going to be a good long-term
> solution. It also sounds buggy and fragile for the query planner to
> try to guess whether the lock requests in the parallel workers will
> succeed or fail when issued. Figuring such details out is the job of
> the lock manager or the parallelism infrastructure, not the query
> planner.
>
> After thinking about these cases for a bit, I came up with a new
> possible approach to this problem. Suppose that, at the beginning of
> parallelism, when we decide to start up workers, we grant all of the
> locks already held by the master to each worker (ignoring the normal
> rules for lock conflicts). Thereafter, we do everything the same as
> now, with no changes to the deadlock detector. That allows the lock
> conflicts to happen normally in the first two cases above, while
> preventing the unwanted lock conflicts in the second two cases.
>

Here I think we have to consider how to pass the information about
all the locks held by the master to the worker backends. Also, even
assuming such information is available, it will still be considerable
work to grant the locks, given the number of locks we acquire [1] (based
on Simon's analysis) and the additional memory they require. Finally,
I think the deadlock detector's work might also increase, as there will
now be more procs to visit.
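As a sanity check of the proposed scheme, here is a minimal sketch (a simplified Python model with made-up lock modes and structures, not PostgreSQL's real lock manager API) of "grant the master's existing locks to each worker at startup, then apply normal conflict rules afterwards":

```python
# Toy model of the proposal: at worker startup, copy every lock the
# master already holds to the worker, bypassing conflict checks; all
# later requests go through the normal conflict path. This preserves
# the wanted sharing in cases 3/4 while keeping conflicts in cases 1/2.

# Simplified conflict table: which (requested, held) mode pairs conflict.
CONFLICTS = {
    ("Exclusive", "Exclusive"),
    ("ShareUpdateExclusive", "ShareUpdateExclusive"),
}

class LockManager:
    def __init__(self):
        self.held = {}  # object -> list of (holder, mode)

    def acquire(self, obj, holder, mode):
        """Normal path: fail if any other holder's lock conflicts."""
        for other, other_mode in self.held.get(obj, []):
            if other != holder and (mode, other_mode) in CONFLICTS:
                return False
        self.held.setdefault(obj, []).append((holder, mode))
        return True

    def start_worker(self, master, worker):
        """Startup path: copy the master's locks, skipping conflict checks."""
        for obj, holders in self.held.items():
            for h, mode in list(holders):
                if h == master:
                    holders.append((worker, mode))

lm = LockManager()

# Case 3: master vacuums rel under ShareUpdateExclusiveLock; the worker
# inherits a copy, so the group shares it, but an unrelated backend is
# still excluded until every copy is released.
lm.acquire("rel", "master", "ShareUpdateExclusive")
lm.start_worker("master", "worker1")
assert not lm.acquire("rel", "other_backend", "ShareUpdateExclusive")

# Case 1: the relation extension lock was NOT held at startup, so group
# members contend for it normally.
assert lm.acquire("rel_extension", "worker1", "Exclusive")
assert not lm.acquire("rel_extension", "master", "Exclusive")
```

Even in this toy form, the per-worker copy of the master's lock list hints at the memory and bookkeeping cost mentioned above: each startup duplicates every held lock once per worker.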

In general, I think this scheme will work, but I am not sure it is worth
the effort at this stage (considering that the initial goal is to use
parallel workers for read operations).

[1] :
http://www.postgresql.org/message-id/CA+U5nMJLuBGduWjqikt6UmQRFMrmRQdhpNDb6Z5Xzdtb0pH2vQ@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
