Detail part for still waiting for lock log message

Lists: pgsql-hackers
From: Tarvi Pillessaar <tarvip(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Detail part for still waiting for lock log message
Date: 2013-08-20 16:21:41
Message-ID: 52139795.1050105@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello!

From time to time when investigating different locking issues using
postgres log i have thought that "process x is still waiting for"
message could be more informative, for example at the moment it is quite
painful to find out which session was actually holding that particular lock.

I would like to add detail part for this message to show information
about the lock holder and also show what the lock holder is actually
doing (yes, i know that i could get the lock holder's statement from the
log, but this not the same, maybe lock holder was idle in transaction).
So, i wrote a small patch, but i am not sure that this is the
best/correct way to do it.
Maybe someone can take a look at my patch and give some feedback.
Even if this idea won't reach to upstream, i would still like to get
feedback.

About patch:
Patch is tested against 9.2.4.
I was not sure that i should check if the lock holder's proclock was
found (as lock holder's proclock should be always there), check is there
to be on the safe side, but maybe it's unnecessary.
If it's not needed then fallback to old behavior (logging without
detail) is not needed as well.
And yes, i know that the lock holding time is not actually correct and
it actually shows milliseconds since transaction start.

Regards,
Tarvi Pillessaar

Attachment Content-Type Size
still_waiting_for_lock-v1.patch text/x-patch 2.4 KB

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tarvi Pillessaar <tarvip(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detail part for still waiting for lock log message
Date: 2013-08-20 18:11:14
Message-ID: 1377022274.34562.YahooMailNeo@web162906.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tarvi Pillessaar <tarvip(at)gmail(dot)com> wrote:

> Maybe someone can take a look at my patch and give some feedback.

> Even if this idea won't reach to upstream, i would still like to get
> feedback.

Please add the patch to the open CommitFest.

You can read about the process here:

http://wiki.postgresql.org/wiki/CommitFest

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tarvi Pillessaar <tarvip(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Detail part for still waiting for lock log message
Date: 2013-08-24 14:58:43
Message-ID: 1377356323.8206.8.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2013-08-20 at 19:21 +0300, Tarvi Pillessaar wrote:
> About patch:
> Patch is tested against 9.2.4.
> I was not sure that i should check if the lock holder's proclock was
> found (as lock holder's proclock should be always there), check is there
> to be on the safe side, but maybe it's unnecessary.
> If it's not needed then fallback to old behavior (logging without
> detail) is not needed as well.
> And yes, i know that the lock holding time is not actually correct and
> it actually shows milliseconds since transaction start.
>

Please fix this compiler warning:

proc.c: In function ‘ProcSleep’:
proc.c:1258:6: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]


From: Tarvi Pillessaar <tarvip(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Detail part for still waiting for lock log message
Date: 2013-08-26 12:18:34
Message-ID: 521B479A.2020502@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Fixed patch attached.

Regards,
Tarvi Pillessaar

On 24.08.2013 17:58, Peter Eisentraut wrote:
> On Tue, 2013-08-20 at 19:21 +0300, Tarvi Pillessaar wrote:
>> About patch:
>> Patch is tested against 9.2.4.
>> I was not sure that i should check if the lock holder's proclock was
>> found (as lock holder's proclock should be always there), check is there
>> to be on the safe side, but maybe it's unnecessary.
>> If it's not needed then fallback to old behavior (logging without
>> detail) is not needed as well.
>> And yes, i know that the lock holding time is not actually correct and
>> it actually shows milliseconds since transaction start.
>>
> Please fix this compiler warning:
>
> proc.c: In function ‘ProcSleep’:
> proc.c:1258:6: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
>
>

Attachment Content-Type Size
still_waiting_for_lock-v2.patch text/x-patch 2.4 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tarvi Pillessaar <tarvip(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Detail part for still waiting for lock log message
Date: 2013-08-26 19:20:54
Message-ID: 20130826192054.GB6206@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tarvi Pillessaar escribió:
> Fixed patch attached.

1. this assumes there is only one holder, which is not correct.
(Consider two backends holding shared lock on something and another one
stuck trying to acquire exclusive)

2. I think pgstat_get_backend_current_activity() can be helpful.

3. Doesn't this risk excessive overhead? Can the other backends be gone
(or done with the lock) before the report has completed? If this
happens, is there a problem?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tarvi Pillessaar <tarvip(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Detail part for still waiting for lock log message
Date: 2013-08-27 16:40:40
Message-ID: 521CD688.60105@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thank you for feedback.

On 26.08.2013 22:20, Alvaro Herrera wrote:

> 1. this assumes there is only one holder, which is not correct.
> (Consider two backends holding shared lock on something and another one
> stuck trying to acquire exclusive)
Hmm, true. Then it's not that simple as I thought in first place.
I guess it is possible to find all those backends that are holding that
shared lock, but i'm not sure about the usefulness anymore as this list
can be huge.

> 2. I think pgstat_get_backend_current_activity() can be helpful.
Yes, I saw that function, but i also wanted tx start time.
> 3. Doesn't this risk excessive overhead?
About the overhead, i may be wrong, but i was thinking that that
particular backend will be put to sleep anyway and also in normal
workload such log messages are not very common (or at least shouldn't be).
> Can the other backends be gone
> (or done with the lock) before the report has completed? If this
> happens, is there a problem?
That's why i have added check if other backend is found etc, but maybe i
missed something. This is one of reasons why i wanted feedback.

Regards
Tarvi Pillessaar


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tarvi Pillessaar <tarvip(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Detail part for still waiting for lock log message
Date: 2013-08-27 16:46:41
Message-ID: 20130827164640.GB4933@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tarvi Pillessaar escribió:
> Thank you for feedback.
>
> On 26.08.2013 22:20, Alvaro Herrera wrote:
>
> >1. this assumes there is only one holder, which is not correct.
> >(Consider two backends holding shared lock on something and another one
> >stuck trying to acquire exclusive)
> Hmm, true. Then it's not that simple as I thought in first place.
> I guess it is possible to find all those backends that are holding
> that shared lock, but i'm not sure about the usefulness anymore as
> this list can be huge.

We already do this for deadlock reports, so I think it's fine.

> >2. I think pgstat_get_backend_current_activity() can be helpful.
> Yes, I saw that function, but i also wanted tx start time.

So add that. But note the existing function has checks for
"changecount" and stuff that your code didn't. I think for robustness
you need to add that too.

> >3. Doesn't this risk excessive overhead?
> About the overhead, i may be wrong, but i was thinking that that
> particular backend will be put to sleep anyway and also in normal
> workload such log messages are not very common (or at least
> shouldn't be).

I guess there's no way to know how long the process is going to sleep
after logging this message.

> > Can the other backends be gone
> >(or done with the lock) before the report has completed? If this
> >happens, is there a problem?
> That's why i have added check if other backend is found etc, but
> maybe i missed something. This is one of reasons why i wanted
> feedback.

Make sure you understand the "changecount" stuff and the reasons to use
it in pgstat_get_backend_current_activity.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services