Re: pg_stat_lwlocks view - lwlocks statistics, round 2

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Qi Huang <huangqiyx(at)hotmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: pg_stat_lwlocks view - lwlocks statistics, round 2
Date: 2012-10-18 19:36:26
Message-ID: CA+TgmoZXvaOfQbkMkE4qBA1W6XwRDdGLyZWqjSpkbSK1jfmT7g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 16, 2012 at 11:31 AM, Satoshi Nagayasu <snaga(at)uptime(dot)jp> wrote:
> A flight-recorder must not be disabled. Collecting
> performance data must be top priority for DBA.

This analogy is inapposite, though, because a flight recorder rarely
crashes the aircraft. If it did, people might have second thoughts
about the "never disable the flight recorder" rule. I have had a
couple of different excuses to look into the overhead of timing
lately, and it does indeed seem that on many modern Linux boxes even
extremely frequent gettimeofday calls produce only very modest amounts
of overhead. Sadly, the situation on Windows doesn't look so good. I
don't remember the exact numbers, but I think it was something like
40, 60, or 80 times slower on the Windows box one of my colleagues
tested than it is on Linux. And it turns out that this overhead
really is measurable and does matter if you do it in a code path that
gets run frequently. Of course I am enough of a Linux geek that I
don't use Windows myself and curse my fate when I do have to use it,
but the reality is that we have a huge base of users who only use
PostgreSQL at all because it runs on Windows, and we can't just throw
those people under the bus. I think that older platforms like HP-UX
likely have problems in this area as well, although I confess to not
having tested.
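
For anyone who wants a rough number for their own platform, here is a
minimal sketch of the kind of measurement I mean, assuming a POSIX
system. The function name gettimeofday_cost_ns is made up for
illustration; this is not what pg_test_timing does internally, just a
crude loop that reports the average cost of one gettimeofday() call.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/*
 * Hypothetical helper: average cost of one gettimeofday() call, in
 * nanoseconds, measured over "calls" back-to-back invocations.
 * gettimeofday() reads the wall clock, so a clock adjustment during
 * the run would skew the result.
 */
static double
gettimeofday_cost_ns(long calls)
{
    struct timeval start, end, scratch;
    double elapsed_usec;
    long i;

    assert(calls > 0);

    gettimeofday(&start, NULL);
    for (i = 0; i < calls; i++)
        gettimeofday(&scratch, NULL);   /* the call being measured */
    gettimeofday(&end, NULL);

    elapsed_usec = (double) (end.tv_sec - start.tv_sec) * 1e6
                 + (double) (end.tv_usec - start.tv_usec);

    /* microseconds -> nanoseconds, then divide by the number of calls */
    return elapsed_usec * 1000.0 / (double) calls;
}
```

Calling gettimeofday_cost_ns(1000000) on two different machines gives
a first-order comparison, nothing more; the loop overhead is included
in the figure.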

That having been said, if we're going to do this, this is probably the
right approach, because it only calls gettimeofday() in the case where
the lock acquisition is contended, and that is a lot cheaper than
calling it in all cases. Maybe it's worth finding a platform where
pg_test_timing reports that timing is very slow and then measuring how
much impact this has on something like a pgbench or pgbench -S
workload. We might find that it is in fact negligible. I'm pretty
certain that it will be almost if not entirely negligible on Linux but
that's not really the case we need to worry about.
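
To make the "only pay for timing when contended" idea concrete, here
is a rough sketch using a plain pthread mutex rather than an LWLock.
The TimedLock struct and the timed_lock_* functions are hypothetical
names for illustration and are not taken from the patch; the point is
only that the uncontended path never touches the clock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical per-lock statistics; not PostgreSQL's actual structures. */
typedef struct TimedLock
{
    pthread_mutex_t mutex;
    uint64_t    acquire_count;      /* all acquisitions */
    uint64_t    contended_count;    /* acquisitions that had to wait */
    uint64_t    wait_usec;          /* total time spent waiting */
} TimedLock;

static void
timed_lock_init(TimedLock *lock)
{
    int         rc = pthread_mutex_init(&lock->mutex, NULL);

    assert(rc == 0);
    (void) rc;
    lock->acquire_count = 0;
    lock->contended_count = 0;
    lock->wait_usec = 0;
}

static void
timed_lock_acquire(TimedLock *lock)
{
    /* Fast path: an uncontended acquisition makes no timing calls. */
    if (pthread_mutex_trylock(&lock->mutex) != 0)
    {
        struct timeval start, end;
        int64_t     delta;

        /* Slow path: we are about to block, so timing is cheap by comparison. */
        gettimeofday(&start, NULL);
        pthread_mutex_lock(&lock->mutex);
        gettimeofday(&end, NULL);

        delta = (int64_t) (end.tv_sec - start.tv_sec) * 1000000
              + (int64_t) (end.tv_usec - start.tv_usec);

        lock->contended_count++;
        if (delta > 0)              /* wall clock may step backwards */
            lock->wait_usec += (uint64_t) delta;
    }

    /* Counters are only modified while the mutex is held. */
    lock->acquire_count++;
}

static void
timed_lock_release(TimedLock *lock)
{
    pthread_mutex_unlock(&lock->mutex);
}
```

With a single thread, every acquisition takes the fast path, so
contended_count and wait_usec stay at zero; the two gettimeofday()
calls are paid only by a backend that would have slept anyway.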

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
