From: | Daniel Farina <daniel(at)heroku(dot)com> |
---|---|
To: | Peter Geoghegan <peter(at)2ndquadrant(dot)com> |
Cc: | Daniel Farina <drfarina(at)acm(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: pg_stat_statements: calls under-estimation propagation |
Date: | 2012-12-30 06:31:47 |
Message-ID: | CAAZKuFYq_WnCX9zKMjJ2qf9TwoGfXQ81OG8i2bG8KhKG-Bt3wQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, Dec 29, 2012 at 7:16 PM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> On Sat, Dec 29, 2012 at 7:12 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
>> On 30 December 2012 02:45, Daniel Farina <daniel(at)heroku(dot)com> wrote:
>>> As I recall, the gist of this objection had to do with a false sense
>>> of stability of the hash value, and the desire to enforce the ability
>>> to alter it. Here's an option: xor the hash value with the
>>> 'statistics session id', so it's *known* to be unstable between
>>> sessions. That gets you continuity in the common case and sound
>>> deprecation in the less-common cases (crashes, format upgrades, stat
>>> resetting).
>>
>> Hmm. I like the idea, but a concern there would be that you'd
>> introduce additional scope for collisions in the third-party utility
>> building time-series data from snapshots. I currently put the
>> probability of a collision within pg_stat_statements as 1% in the
>> event of a pg_stat_statements.max of 10,000.
>
> We can use a longer session key and duplicate the queryid (effectively
> padding) a couple of times to complete the XOR. I think that makes
> the cases of collisions introduced by this astronomically low, as an
> increase over the base collision rate.

A version implementing that is attached, except that I generate an
additional 64-bit session key that is not exposed to the client, so
that even casual de-xoring of the exposed value is not possible. That
may seem absurd, until someone writes a tool that de-xors things and
relies on it, and then nobody feels inclined to break it. It also keeps
the public session number short.
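
As a rough illustration of the scheme described above (not the code in the attached patch), the masking can be sketched as follows. The 32-bit width of the internal hash, the 32-bit public session number, and all function names here are assumptions made for the example; only the 64-bit private key and the idea of duplicating the queryid as padding come from this thread.

```python
import secrets

MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def new_session():
    """Create a hypothetical (public_session_id, private_key) pair.

    The public id is what a client would see; the 64-bit private key
    is never exposed and is regenerated whenever the statistics are
    reset, the file format changes, or the server crashes.
    """
    public_session_id = secrets.randbits(32)  # assumed width
    private_key = secrets.randbits(64)
    return public_session_id, private_key


def pad_queryid(queryid32):
    """Duplicate an (assumed) 32-bit hash to fill 64 bits."""
    queryid32 &= MASK32
    return (queryid32 << 32) | queryid32


def exposed_queryid(queryid32, private_key):
    """XOR the padded hash with the private session key."""
    return pad_queryid(queryid32) ^ (private_key & MASK64)


if __name__ == "__main__":
    session_id, key = new_session()
    print(session_id, hex(exposed_queryid(0xDEADBEEF, key)))
```

Within one session the XOR is a one-to-one mapping, so two distinct hashes never map to the same exposed value, and the same query keeps the same identifier. Once the private key changes, the exposed values change with it, which is what makes the identifier visibly unstable across sessions.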

I also opted to save the underestimate, since I'm adding a handful of
fixed-width fields to the file format anyway.

--
fdr
Attachment | Content-Type | Size |
---|---|---|
pg_stat_statements-identification-v3.patch.gz | application/x-gzip | 6.5 KB |