Re: Lock Wait Statistics (next commitfest)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock Wait Statistics (next commitfest)
Date: 2009-10-06 01:03:34
Message-ID: 603c8f070910051803n26c69260wd6735431291fc0d5@mail.gmail.com
Lists: pgsql-hackers

On Sun, Oct 4, 2009 at 4:14 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Mon, Sep 28, 2009 at 12:14 AM, Jaime Casanova
> <jcasanov(at)systemguards(dot)com(dot)ec> wrote:
>> On Sat, Aug 8, 2009 at 7:47 PM, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz> wrote:
>>>>>
>>>>>
>>>> Patch with max(wait time).
>>>>
>>>> Still TODO
>>>>
>>>> - amalgamate individual transaction lock waits
>>>> - redo (rather ugly) temporary pg_stat_lock_waits in a form more like
>>>> pg_locks
>>>>
>>> This version has the individual transaction lock waits amalgamated.
>>>
>>> Still TODO: redo pg_stat_lock_waits ...
>>>
>>
>> it applies with some hunks, compiles fine and seems to work...
>> i'm still not sure this is what we need, some more comments could be helpful.
>>
>> what kind of questions are we capable of answering with this, and what
>> kind of questions are we still missing?
>>
>> for example, now we know "number of locks that had to wait", "total
>> time waiting" and "max time waiting for a single lock"... but we can
>> still get an inaccurate picture if we have lots of locks waiting a
>> little time and a few waiting a huge amount of time...
>
> Aren't the huge ones already loggable from the deadlock detector?
>
> With the max, we can at least put an upper limit on how long the
> longest ones could have been.  However, is there a way to reset the
> max?  I tried deleting data/pg_stat_tmp, but that didn't work.  With
> cumulative values, you can take snapshots and then take the
> difference of them, but that won't work with max.  If the max can't be
> reset except with an initdb, I think that makes it barely usable.
>
>> something i have been asked when the system starts to slow down is "can
>> we know if there was lock contention during that period?" for now the
>> only way to answer that is logging locks
>
> I was surprised to find that running with track_locks on did not cause
> a detectable difference in performance, so you could just routinely do
> regularly scheduled snapshots and go back and mine them over the time
> that a problem was occurring.  I just checked with pgbench over
> various levels of concurrency and fsync settings.  If potential
> slowness wouldn't show up there, I don't know how else to look for it.
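The snapshot-differencing point above can be sketched in a few lines. This is only an illustration of the arithmetic being discussed, not code from the patch; the field names and values are made up:

```python
# Why differencing two cumulative snapshots recovers per-interval
# activity for counters, but cannot recover a per-interval max.
# All names and numbers here are hypothetical.

def interval_stats(snap_a, snap_b):
    """Activity between two cumulative snapshots (snap_a taken first)."""
    return {
        "lock_waits": snap_b["lock_waits"] - snap_a["lock_waits"],
        "total_wait_ms": snap_b["total_wait_ms"] - snap_a["total_wait_ms"],
        # max_wait_ms is deliberately omitted: the max in snap_b may come
        # from a wait that happened before snap_a, so without a reset
        # there is no way to compute the interval's own maximum.
    }

snap_9am  = {"lock_waits": 120, "total_wait_ms": 4500, "max_wait_ms": 900}
snap_10am = {"lock_waits": 150, "total_wait_ms": 5200, "max_wait_ms": 900}

print(interval_stats(snap_9am, snap_10am))
# {'lock_waits': 30, 'total_wait_ms': 700}
```

Between the two snapshots, 30 waits accumulated 700 ms of wait time; but whether the 900 ms maximum occurred in that hour or earlier is unknowable from the snapshots alone, which is the objection to an unresettable max.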

It seems that this patch had open TODO items at the beginning of the
CommitFest (so perhaps we should have bounced it immediately), and I
think that's still the case now, so I am going to mark this as
Returned with Feedback. A lot of good reviewing has been done,
though, so maybe this can be submitted for a future CommitFest in
something close to a final form.

Thanks,

...Robert
