Re: pg_stat_statements: calls under-estimation propagation

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Sameer Thakur <samthakur74(at)gmail(dot)com>
Cc: Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements: calls under-estimation propagation
Date: 2013-10-10 14:40:14
Message-ID: CAHGQGwE4P9PUTdf1fkWzBE9JsMmJZvpv3Eg4SvusSFop==83vQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 10, 2013 at 7:20 PM, Sameer Thakur <samthakur74(at)gmail(dot)com> wrote:
> Please find patch attached which adds documentation for session_start
> and introduced fields and corrects documentation for queryid to be
> query_id. session_start remains in the view as agreed.

Thanks for updating the document!

I'm not clear about the use case of 'session_start' and 'introduced' yet.
Could you elaborate it? Maybe we should add that explanation into
the document?

In my test, I found that pg_stat_statements.total_time always indicates a zero.
I guess that the patch might handle pg_stat_statements.total_time wrongly.

+ values[i++] = DatumGetTimestamp(
+ instr_get_timestamptz(pgss->session_start));
+ values[i++] = DatumGetTimestamp(
+ instr_get_timestamptz(entry->introduced));

These should be executed only when detected_version >= PGSS_TUP_LATEST?

+ <entry><structfield>session_start</structfield></entry>
+ <entry><type>text</type></entry>
+ <entry></entry>
+ <entry>Start time of a statistics session.</entry>
+ </row>
+
+ <row>
+ <entry><structfield>introduced</structfield></entry>
+ <entry><type>text</type></entry>

The data type of these columns is not text.

+ instr_time session_start;
+ uint64 private_stat_session_key;

it's better to add the comments explaining these fields.

+ microsec = INSTR_TIME_GET_MICROSEC(t) -
+ ((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * USECS_PER_DAY);

HAVE_INT64_TIMESTAMP should be considered here.

+ if (log_cannot_read)
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not read pg_stat_statement file \"%s\": %m",
+ PGSS_DUMP_FILE)));

Is it better to use WARNING instead of LOG as the log level here?

+ /*
+ * The role calling this function is unable to see
+ * sensitive aspects of this tuple.
+ *
+ * Nullify everything except the "insufficient privilege"
+ * message for this entry
+ */
+ memset(nulls, 1, sizeof nulls);
+
+ nulls[i] = 0;
+ values[i] = CStringGetTextDatum("<insufficient privilege>");

Why do we need to hide *all* the fields in pg_stat_statements, when
it's accessed by improper user? This is a big change of pg_stat_statements
behavior, and it might break the compatibility.

>> >This is not directly related to the patch itself, but why does the
>> > queryid
>> >need to be calculated based on also the "statistics session"?
>
> If we expose hash value of query tree, without using statistics session,
> it is possible that users might make wrong assumption that this value
> remains stable across version upgrades, when in reality it depends on
> whether the version has make changes to query tree internals. So to
> explicitly ensure that users do not make this wrong assumption, hash value
> generation use statistics session id, which is newly created under
> situations described above.

So, ISTM that we can use, for example, the version number to calculate
the query_id. Right?

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message MauMau 2013-10-10 14:44:39 Re: Auto-tuning work_mem and maintenance_work_mem
Previous Message Kevin Grittner 2013-10-10 14:24:26 Re: Auto-tuning work_mem and maintenance_work_mem