Re: review - pg_stat_statements

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>
Subject: Re: review - pg_stat_statements
Date: 2013-11-30 23:34:50
Message-ID: CAM3SWZSVFf-vBNC8sc-0CpWZxVQH49REYBcHb95t95fcrGSa6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've produced another revision, attached. Changes:

* Fixes the compiler warnings on your environment.

* Normalizes query string with the shared lock held (not always, just
in case its needed). In master, this doesn't have to happen with a
shared lock held, so because of this, but also because of the file
writing there is probably *some* small regression against master when
the cache is under lots of pressure. Since we have to append the file,
and need to do so with the shared lock held, there is no way to check
that the normalized text is really needed (i.e. that the entry doesn't
exist already) while taking the opportunity, with the shared lock
already held, to create a shared-lock-only query text file entry. This
is logical; we shouldn't normalize query texts unless we know we'll
need them, even if this implies that we must do so with a shared lock
held when we do, because that ought to be highly infrequent.

* pg_stat_statemens.max default is increased to 5,000. It feels like
this should be increased in light of the drastically reduced memory
footprint of pg_stat_statements. Besides, the best way of handling
whatever modest regression I may have created is to simply make cache
pressure very light, so that new entries are created infrequently. As
I've mentioned previously, any regression I may have created could not
possibly negatively affect something like a pgbench workload with only
a handful of distinct queries, because the price is only paid once per
new entry, and even then only the duration of holding a shared lock is
potentially increased (the shared lock alone doesn't prevent existing
entries from being updated, so are not too costly). Actually, this
isn't 100% true - there is an extremely remote chance of writing a
query text to file while an exclusive lock is held - but it really is
exceedingly rare and unlikely.

* Adds a new, deliberately undocumented parameter to the
pg_stat_statements() function. This is currently completely useless,
because of course we don't expose the query hash, which is why I
haven't created a new extension version (The queryid-exposing patch I
recently marked "ready for committer" has that - pgss version 1.2 -
and since this change's inclusion is entirely predicated on getting
that other patch in, it wouldn't have made sense to do anything more
than just show what I meant by modifying the existing pgss version,
1.1). If we have the queryid to work off of, this becomes a useful
feature for authors of third party snapshot-consuming tools, that now
need only lazily fetch query texts for new entries (so when they
observe a new entry, they know that there next snapshot should ask for
query texts to fill in what they need). It's a performance feature for
their use-case, since I feel that supporting those kinds of tools is a
really useful direction for pg_stat_statements. The user-visible
behavior of the pg_stat_statements view is unaffected.

--
Peter Geoghegan

Attachment Content-Type Size
pg_stat_statements_ext_text.v3.2013_11_30.patch.gz application/x-gzip 13.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2013-11-30 23:48:02 Re: [GENERAL] pg_upgrade ?deficiency
Previous Message Kevin Grittner 2013-11-30 23:21:08 Re: [GENERAL] pg_upgrade ?deficiency