Re: review - pg_stat_statements

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>
Subject: Re: review - pg_stat_statements
Date: 2013-11-24 16:29:21
Message-ID: CAFj8pRD=RfrzwsbgSr0oqY4K-3CougZywg9JKFOd9qkdHxxMBw@mail.gmail.com
Lists: pgsql-hackers

2013/11/24 Peter Geoghegan <pg(at)heroku(dot)com>

> On Sun, Nov 24, 2013 at 1:18 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> > I got a compilation warning:
>
> I'll look into it.
>
> > * I tried to do some basic benchmarking, and I didn't see any negative
> > impact on performance related to the implemented feature
>
> You're never going to see any performance impact with something like a
> regular pgbench workload. That's because you'll only ever write query
> texts out while a shared lock is held. With only extreme exceptions (a
> garbage collection), exclusive locking will never be exercised by what
> you're doing; you will only block wholly new entries from being added -
> existing stat counters are just protected by a shared lock + spinlock.
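
To make that locking pattern concrete, a minimal sketch (illustrative only,
not the literal pg_stat_statements source; entry_alloc() stands in for the
real allocation path):

    pgssEntry  *entry;

    /* Updating an existing entry needs only the shared LWLock plus the
     * entry's spinlock; only creating a new entry escalates to the
     * exclusive LWLock. */
    LWLockAcquire(pgss->lock, LW_SHARED);
    entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
    if (entry)
    {
        /* existing counters: shared lock + per-entry spinlock only */
        SpinLockAcquire(&entry->mutex);
        entry->counters.calls += 1;
        entry->counters.total_time += total_time;
        SpinLockRelease(&entry->mutex);
        LWLockRelease(pgss->lock);
    }
    else
    {
        /* new entry: re-acquire exclusively, blocking other new entries */
        LWLockRelease(pgss->lock);
        LWLockAcquire(pgss->lock, LW_EXCLUSIVE);
        entry = entry_alloc(&key);      /* hypothetical helper */
        LWLockRelease(pgss->lock);
    }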
>
> You might consider rigging pg_stat_statements to create a new hash
> value randomly (consisting of a random integer for a queryid "hash")
> maybe 1% - 10% of the time. That would simulate the cache being filled
> quickly, I guess, where that shared lock will conflict with the
> exclusive lock, potentially showing where what I've done can hurt. I
> recommend in the interest of fairness not letting it get so far as to
> put the cache under continual pressure.
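
A hypothetical way to rig that, just as a sketch (the variable name and
percentage are illustrative):

    /* Hypothetical benchmarking hack: randomize the query id for a small
     * fraction of statements so they miss the hash table and exercise the
     * new-entry (exclusive lock) path. */
    if (random() % 100 < 5)              /* roughly 5% of statements */
        queryId = (uint32) random() | 1; /* avoid 0, which means "unset" */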
>
> Now, this isn't that important a case, because having a larger hash
> table makes exclusive locking/new entries less necessary, and this
> work enables people to have larger hash tables. But it does matter to
> some degree.
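
For anyone following along, the hash table size is controlled by
pg_stat_statements.max, e.g. in postgresql.conf:

    shared_preload_libraries = 'pg_stat_statements'
    # more entries means fewer evictions, and so fewer exclusive lock
    # acquisitions for wholly new entries
    pg_stat_statements.max = 10000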
>
>
How is the size of the hash table related to exclusive locks in pgss_store? I
am not worried about the performance of pg_stat_statements(); I am worried
about the performance of creating a new entry and appending it to the file.

I didn't expect any slowdown - the benchmark was "only" a verification
against some hidden cost. It is not difficult to write a synthetic benchmark
that will find some differences, but the result of such a benchmark would
probably be useless, since it would bear minimal relation to reality.

> > * We would like this patch - the idea of the proposal is right - shared
> > memory can be used better than as storage for possibly extra long queries
>
> Right. Plus the amount of storage used is pretty modest, even compared
> to previous shared memory use. Each entry doesn't need to have
> track_activity_query_size bytes of storage, only what it needs (though
> garbage can accrue, which is a concern).
>
> > Peter gave some warnings about performance in the feature proposal:
> >
> > http://www.postgresql.org/message-id/CAM3SWZRYYnfwXi3r3eDAKWBOYtaf1_PwGJxTAyddNSbjAfDzjA@mail.gmail.com
>
> I'm mostly talking about the cost of the shared lock for *reading*
> here, when pg_stat_statements() is called. If that was happening at
> the same time as I/O for reading the query texts from file, that could
> be a concern. Not enough of a concern to worry about humans doing
> this, I think, but maybe for scripts.
>
> Maybe the trick would be to copy the stats into local memory, and
> only then read texts from disk (with the shared lock released). We'd
> have to be concerned about a garbage collection occurring, so we'd
> need to acquire a shared lock again (plus a spinlock) to check that it
> didn't happen (which is generally very unlikely). Only when we know it
> didn't happen could we display costs to the user, and that might mean
> keeping all the query texts in memory, and that might matter in
> extreme situations. The reason I haven't done all this already is
> because it's far from clear that it's worth the trouble.
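
A rough sketch of that scheme, only to make the ordering concrete (the helper
names and the gc_count field are made up here):

    LWLockAcquire(pgss->lock, LW_SHARED);
    local_stats = copy_counters_to_local_memory();   /* hypothetical helper */
    gc_before = pgss->gc_count;                      /* hypothetical counter */
    LWLockRelease(pgss->lock);

    qbuffer = read_external_query_texts();           /* file I/O, no lock held */

    LWLockAcquire(pgss->lock, LW_SHARED);
    SpinLockAcquire(&pgss->mutex);
    gc_ok = (pgss->gc_count == gc_before);
    SpinLockRelease(&pgss->mutex);
    LWLockRelease(pgss->lock);

    if (!gc_ok)
    {
        /* a garbage collection moved the texts; the copied offsets are stale */
        free(qbuffer);
        /* ... retry, or display the entries without their query texts ... */
    }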
>
>
I don't expect pg_stat_statements to be on an application's critical path, so
I prefer a mostly simple design.

> > Peter mentioned race conditions under high load. Are they fixed?
>
> Yes, the version you mentioned had the fix.
>
>
ok

Regards

Pavel

> --
> Peter Geoghegan
>
