Re: pg_stat_lwlocks view - lwlocks statistics, round 2

From: Satoshi Nagayasu <snaga(at)uptime(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Qi Huang <huangqiyx(at)hotmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: pg_stat_lwlocks view - lwlocks statistics, round 2
Date: 2012-10-13 14:34:12
Message-ID: 50797BE4.4070508@uptime.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

2012/10/13 23:05, Satoshi Nagayasu wrote:
> Hi all,
>
> I have fixed my previous patch for pg_stat_lwlocks view, and
> as Josh commented, it now supports local and global (shared)
> statistics in the same system view.

Sorry, I found my mistakes. New fixed one is attached to this mail.

Regards,

>
> Local statistics means the counters are only effective in the
> same session, and shared ones means the counters are shared within
> the entire cluster.
>
> Also the global statistics would be collected via pgstat collector
> process like other statistics do.
>
> Now, the global statistics struct has been splitted into two parts
> for different use, for bgwriter stats and lwlock stats.
>
> Therefore, calling pg_stat_reset_shared('bgwriter') or
> pg_stat_reset_shared('lwlocks') would reset dedicated struct,
> not entire PgStat_GlobalStats.
>
> Comments and review are always welcome.
>
> Regards,
>
> ------------------------------------------------------------------------------
> postgres=# SELECT * FROM pg_stat_lwlocks;
> lwlockid | local_calls | local_waits | local_time_ms | shared_calls |
> shared_waits | shared_time_ms
> ----------+-------------+-------------+---------------+--------------+--------------+----------------
> 0 | 0 | 0 | 0 | 4268 |
> 0 | 0
> 1 | 43 | 0 | 0 | 387 |
> 0 | 0
> 2 | 0 | 0 | 0 | 19 |
> 0 | 0
> 3 | 0 | 0 | 0 | 28 |
> 0 | 0
> 4 | 3 | 0 | 0 | 315 |
> 0 | 0
> 5 | 0 | 0 | 0 | 24 |
> 0 | 0
> 6 | 1 | 0 | 0 | 76 |
> 0 | 0
> 7 | 0 | 0 | 0 | 16919 |
> 0 | 0
> 8 | 0 | 0 | 0 | 0 |
> 0 | 0
> 9 | 0 | 0 | 0 | 0 |
> 0 | 0
> 10 | 0 | 0 | 0 | 0 |
> 0 | 0
> 11 | 0 | 0 | 0 | 75 |
> 0 | 0
> 12 | 0 | 0 | 0 | 0 |
> 0 | 0
> 13 | 0 | 0 | 0 | 0 |
> 0 | 0
> 14 | 0 | 0 | 0 | 0 |
> 0 | 0
> 15 | 0 | 0 | 0 | 0 |
> 0 | 0
> 16 | 0 | 0 | 0 | 0 |
> 0 | 0
> 17 | 0 | 0 | 0 | 61451 |
> 6 | 0
> 18 | 0 | 0 | 0 | 0 |
> 0 | 0
> 19 | 0 | 0 | 0 | 0 |
> 0 | 0
> 20 | 0 | 0 | 0 | 0 |
> 0 | 0
> 21 | 1 | 0 | 0 | 9 |
> 0 | 0
> 22 | 0 | 0 | 0 | 0 |
> 0 | 0
> 23 | 0 | 0 | 0 | 0 |
> 0 | 0
> 24 | 0 | 0 | 0 | 1 |
> 0 | 0
> 25 | 0 | 0 | 0 | 0 |
> 0 | 0
> 26 | 2 | 0 | 0 | 18 |
> 0 | 0
> 27 | 0 | 0 | 0 | 0 |
> 0 | 0
> 28 | 0 | 0 | 0 | 0 |
> 0 | 0
> 29 | 0 | 0 | 0 | 0 |
> 0 | 0
> 30 | 0 | 0 | 0 | 0 |
> 0 | 0
> 31 | 0 | 0 | 0 | 0 |
> 0 | 0
> 32 | 0 | 0 | 0 | 0 |
> 0 | 0
> 33 | 4 | 0 | 0 | 207953 |
> 0 | 0
> 50 | 8 | 0 | 0 | 33388 |
> 0 | 0
> 67 | 0 | 0 | 0 | 0 |
> 0 | 0
> (36 rows)
>
> postgres=#
> ------------------------------------------------------------------------------
>
>
> 2012/06/26 21:11, Satoshi Nagayasu wrote:
>> Hi all,
>>
>> I've modified the pg_stat_lwlocks patch to be able to work with
>> the latest PostgreSQL Git code.
>>
>> This patch provides:
>> pg_stat_lwlocks New system view to show lwlock statistics.
>> pg_stat_get_lwlocks() New function to retrieve lwlock statistics.
>> pg_stat_reset_lwlocks() New function to reset lwlock statistics.
>>
>> Please try it out.
>>
>> Regards,
>>
>> 2012/06/26 5:29, Satoshi Nagayasu wrote:
>>> Hi all,
>>>
>>> I've been working on a new system view, pg_stat_lwlocks, to observe
>>> LWLock, and just completed my 'proof-of-concept' code that can work
>>> with version 9.1.
>>>
>>> Now, I'd like to know the possibility of this feature for future
>>> release.
>>>
>>> With this patch, DBA can easily determine a bottleneck around lwlocks.
>>> --------------------------------------------------
>>> postgres=# SELECT * FROM pg_stat_lwlocks ORDER BY time_ms DESC LIMIT 10;
>>> lwlockid | calls | waits | time_ms
>>> ----------+--------+-------+---------
>>> 49 | 193326 | 32096 | 23688
>>> 8 | 3305 | 133 | 1335
>>> 2 | 21 | 0 | 0
>>> 4 | 135188 | 0 | 0
>>> 5 | 57935 | 0 | 0
>>> 6 | 141 | 0 | 0
>>> 7 | 24580 | 1 | 0
>>> 3 | 3282 | 0 | 0
>>> 1 | 41 | 0 | 0
>>> 9 | 3 | 0 | 0
>>> (10 rows)
>>>
>>> postgres=#
>>> --------------------------------------------------
>>>
>>> In this view,
>>> 'lwlockid' column represents LWLockId used in the backends.
>>> 'calls' represents how many times LWLockAcquire() was called.
>>> 'waits' represents how many times LWLockAcquire() needed to wait
>>> within it before lock acquisition.
>>> 'time_ms' represents how long LWLockAcquire() totally waited on
>>> a lwlock.
>>>
>>> And lwlocks that use a LWLockId range, such as BufMappingLock or
>>> LockMgrLock, would be grouped and summed up in a single record.
>>> For example, lwlockid 49 in the above view represents LockMgrLock
>>> statistics.
>>>
>>> Now, I know there are some considerations.
>>>
>>> (1) Performance
>>>
>>> I've measured LWLock performance both with and without the patch,
>>> and confirmed that this patch does not affect the LWLock perfomance
>>> at all.
>>>
>>> pgbench scores with the patch:
>>> tps = 900.906658 (excluding connections establishing)
>>> tps = 908.528422 (excluding connections establishing)
>>> tps = 903.900977 (excluding connections establishing)
>>> tps = 910.470595 (excluding connections establishing)
>>> tps = 909.685396 (excluding connections establishing)
>>>
>>> pgbench scores without the patch:
>>> tps = 909.096785 (excluding connections establishing)
>>> tps = 894.868712 (excluding connections establishing)
>>> tps = 910.074669 (excluding connections establishing)
>>> tps = 904.022770 (excluding connections establishing)
>>> tps = 895.673830 (excluding connections establishing)
>>>
>>> Of course, this experiment was not I/O bound, and the cache hit ratio
>>> was>99.9%.
>>>
>>> (2) Memory space
>>>
>>> In this patch, I added three new members to LWLock structure
>>> as uint64 to collect statistics.
>>>
>>> It means that those members must be held in the shared memory,
>>> but I'm not sure whether it's appropriate.
>>>
>>> I think another possible option is holding those statistics
>>> values in local (backend) process memory, and send them through
>>> the stat collector process (like other statistics values).
>>>
>>> (3) LWLock names (or labels)
>>>
>>> Now, pg_stat_lwlocks view shows LWLockId itself. But LWLockId is
>>> not easy for DBA to determine actual lock type.
>>>
>>> So, I want to show LWLock names (or labels), like 'WALWriteLock'
>>> or 'LockMgrLock', but how should I implement it?
>>>
>>> Any comments?
>>>
>>> Regards,
>>
>>
>
>

--
Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Uptime Technologies, LLC. http://www.uptime.jp

Attachment Content-Type Size
pg_stat_lwlocks_20121013_2.diff text/plain 25.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Guillaume Lelarge 2012-10-13 14:47:06 Bug in -c CLI option of pg_dump/pg_restore
Previous Message Satoshi Nagayasu 2012-10-13 14:05:00 pg_stat_lwlocks view - lwlocks statistics, round 2