Re: pg_stat_statements: calls under-estimation propagation

From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Daniel Farina <drfarina(at)acm(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_stat_statements: calls under-estimation propagation
Date: 2012-12-30 02:37:46
Message-ID: CAEYLb_U4r58zwvWJz49=fp2X9uPvobpb8xbQiB_KMDGLgmpvZw@mail.gmail.com
Lists: pgsql-hackers

On 29 December 2012 12:21, Daniel Farina <drfarina(at)acm(dot)org> wrote:
> These were not express goals of the patch, but so long as you are
> inviting features, attached is a bonus patch that exposes the queryid
> and also the notion of a "statistics session" that is re-rolled
> whenever the stats file could not be read or the stats are reset, able
> to fully explain all obvious causes of retrograde motion in
> statistics. It too is cumulative, so it includes the under-estimation
> field.

Cool.

I had a thought about Tom's objection to exposing the hash value. I
would like to propose a compromise between exposing the hash value and
not doing so at all.

What if we expose the hash value without documenting it, in a way not
apparent to normal users, while letting experts willing to make an
executive decision about its stability use it? What I have in mind is
to expose the hash value from the pg_stat_statements function, and yet
to avoid exposing it within the pg_stat_statements view definition.
The existence of the hash value would not need to be documented, since
the pg_stat_statements function is an undocumented implementation
detail.
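
To make the shape of that proposal concrete, here is a rough sketch in
Python rather than SQL. It only models the idea: the function returns
every field, and the view selects an explicit column list that leaves
the hash out. The column names, the "queryid" label and the sample row
are assumptions for illustration, not the actual definitions.

```python
# Hypothetical column list for the view; "queryid" (the hash) is left out on purpose.
VIEW_COLUMNS = ("userid", "dbid", "query", "calls")


def pg_stat_statements_function():
    """Stand-in for the set-returning function: returns every field."""
    return [
        {"userid": 10, "dbid": 1, "queryid": 0x5EED, "query": "SELECT ?", "calls": 4},
    ]


def pg_stat_statements_view():
    """Stand-in for the view: projects only the columns meant for normal users."""
    return [
        {col: row[col] for col in VIEW_COLUMNS}
        for row in pg_stat_statements_function()
    ]
```

Someone querying the view never sees the hash, while someone who goes
out of their way to call the function directly does.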

Precedent for this exists, I think - the undocumented system hash
functions are exposed via an SQL interface. Some satellite projects
rely on this (apparently the pl/proxy documentation shows the use of
hashtext(), which is a thin wrapper on hash_any(), and there is
chatter about it elsewhere). So it is already the case that people are
using hashtext(), which should not be problematic if the applications
that do so have a reasonable set of expectations about its stability
(i.e. it's not going to change in a point release, because that would
break hash indexes, but may well change across major releases). We've
already in effect promised to not break hashtext() in a point release,
just as we've already in effect promised to not break the hash values
that pg_stat_statements uses internally (to do any less would
invalidate the on-disk representation, and necessitate bumping
PGSS_FILE_HEADER to wipe the stored stats).
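
As a loose illustration of why a changed hash forces a header bump,
here is a minimal Python sketch of a stats file guarded by a format
header. It is not the pg_stat_statements code: the header value, the
record layout and the function names are all made up for the example.
The point is only that entries keyed by hash are discarded wholesale
when the header the reader expects differs from the one on disk.

```python
import io
import struct

# Hypothetical magic number; not the real PGSS_FILE_HEADER value.
FILE_HEADER = 0x20121230
RECORD = "<IQ"  # assumed layout: 32-bit query hash, 64-bit call count


def save_stats(stats, header=FILE_HEADER):
    """Serialize {query_hash: calls} behind a header and an entry count."""
    buf = io.BytesIO()
    buf.write(struct.pack("<II", header, len(stats)))
    for query_hash, calls in stats.items():
        buf.write(struct.pack(RECORD, query_hash, calls))
    return buf.getvalue()


def load_stats(blob, expected_header=FILE_HEADER):
    """Return the stored stats, or an empty dict if the header does not match."""
    if len(blob) < 8:
        return {}
    header, count = struct.unpack_from("<II", blob, 0)
    if header != expected_header:
        # Written with a different format or hash function: start from scratch.
        return {}
    stats = {}
    offset = 8
    for _ in range(count):
        query_hash, calls = struct.unpack_from(RECORD, blob, offset)
        stats[query_hash] = calls
        offset += struct.calcsize(RECORD)
    return stats
```

A reader that expects a newer header simply gets no entries back, which
is the "wipe the stored stats" behaviour described above.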

Thoughts?

> Notably, I also opted to nullify extra pg_stat_statements
> fields when they'd also show "insufficient privileges" (that one is
> spared from this censorship), because I feel as though a bit too much
> information leaks from pg_stat_statements' statistics to ignore,
> especially after adding the query id.

That seems sensible.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
