Re: Lock Wait Statistics (next commitfest)

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock Wait Statistics (next commitfest)
Date: 2009-10-04 20:14:32
Message-ID: f67928030910041314t22e9dd83t9b85ba8e90648c67@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 28, 2009 at 12:14 AM, Jaime Casanova
<jcasanov(at)systemguards(dot)com(dot)ec> wrote:
> On Sat, Aug 8, 2009 at 7:47 PM, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz> wrote:
>>>>
>>>>
>>> Patch with max(wait time).
>>>
>>> Still TODO
>>>
>>> - amalgamate individual transaction lock waits
>>> - redo (rather ugly) temporary pg_stat_lock_waits in a form more like
>>> pg_locks
>>>
>> This version has the individual transaction lock waits amalgamated.
>>
>> Still TODO: redo pg_stat_lock_waits ...
>>
>
> it applies with some hunks, compiles fine and seems to work...
> i'm still not sure this is what we need, some more comments could be helpful.
>
> what kind of questions are we capable of answer with this and and what
> kind of questions are we still missing?
>
> for example, now we know "number of locks that had to wait", "total
> time waiting" and "max time waiting for a single lock"... but still we
> can have an inaccurate understanding if we have lots of locks waiting
> little time and a few waiting a huge amount of time...

Aren't the huge ones already loggable from the deadlock detector?

With the max, we can at least put an upper limit on how long the
longest ones could have been. However, is there a way to reset the
max? I tried deleting data/pg_stat_tmp, but that didn't work. With
cumulative values, you can you take snapshots and then take the
difference of them, that won't work with max. If the max can't be
reset except with an initdb, I think that makes it barely usable.

> something i have been asked when system starts to slow down is "can we
> know if there were a lock contention on that period"? for now the only
> way to answer that is logging locks

I was surprised to find that running with track_locks on did not cause
a detectable difference in performance, so you could just routinely do
regularly scheduled snapshots and go back and mine them over the time
that a problem was occurring. I just checked with pgbench over
various levels of concurrency and fsync settings. If potential
slowness wouldn't show up there, I don't know how else to look for it.

Cheers,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2009-10-04 20:22:27 Re: Unicode UTF-8 table formatting for psql text output
Previous Message Pavel Stehule 2009-10-04 20:08:19 Re: Rules: A Modest Proposal