Re: Inconsistency between pg_stat_activity and log_duration

Lists: pgsql-hackers
From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-04 04:06:37
Message-ID: 20140204.130637.400047374338726225.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I found an interesting inconsistency between pg_stat_activity and
log_duration.

-[ RECORD 1 ]----+---------------------------------------------------------------------------------------------------------------------------
datid | 16392
datname | test
pid | 21815
usesysid | 10
usename | t-ishii
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2014-02-04 12:46:55.178688+09
xact_start | 2014-02-04 12:47:27.210976+09
query_start | 2014-02-04 12:47:27.210976+09
state_change | 2014-02-04 12:47:27.210981+09
waiting | f
state | active
query | select * from pg_stat_activity;

[snip]

-[ RECORD 3 ]----+---------------------------------------------------------------------------------------------------------------------------
datid | 16392
datname | test
pid | 21850
usesysid | 10
usename | t-ishii
application_name | pgbench
client_addr |
client_hostname |
client_port | -1
backend_start | 2014-02-04 12:47:11.233245+09
xact_start | 2014-02-04 12:47:11.235236+09
query_start | 2014-02-04 12:47:11.241084+09
state_change | 2014-02-04 12:47:11.241085+09
waiting | f
state | active
query | SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.oid = pgpool_regclass('pgbench_accounts') AND c.relpersistence = 'u'

As you can see, at 2014-02-04 12:47:27.210981+09 the query "SELECT
count(*) FROM pg_catalog.pg_class..." is "active" and it seems still
running.

On the other side, Here is an excerpt from PostgreSQL log:

21850 2014-02-04 12:47:11.241 JST LOG: execute pgpool21805/pgpool21805: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.oid = pgpool_regclass('pgbench_accounts') AND c.relpersistence = 'u'
21850 2014-02-04 12:47:11.241 JST LOG: duration: 0.078 ms

The duration was shown as "0.078 ms" thus it seems the query has been
already finished.

The reason why pg_stat_activity thinks that the query in question is,
"sync" message has not been sent to the backend yet (at least from
what I read from postgres.c).

I think this inconsistency is not very intutive to users...

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-07 05:13:56
Message-ID: CAA4eK1Kw5b2xGAXqXWQyRrWObL_Exgb9_1ZYgL_ngUET-LmUEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 4, 2014 at 9:36 AM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
> As you can see, at 2014-02-04 12:47:27.210981+09 the query "SELECT
> count(*) FROM pg_catalog.pg_class..." is "active" and it seems still
> running.
>
> On the other side, Here is an excerpt from PostgreSQL log:
>
> 21850 2014-02-04 12:47:11.241 JST LOG: execute pgpool21805/pgpool21805: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.oid = pgpool_regclass('pgbench_accounts') AND c.relpersistence = 'u'
> 21850 2014-02-04 12:47:11.241 JST LOG: duration: 0.078 ms
>
> The duration was shown as "0.078 ms" thus it seems the query has been
> already finished.
>
> The reason why pg_stat_activity thinks that the query in question is,
> "sync" message has not been sent to the backend yet (at least from
> what I read from postgres.c).

I think that is the probable reason for the above mentioned behaviour.
As I understand here, the problem is that 'state' of backend is shown as
active along with 'query' which according to docs (If state is active this field
shows the currently executing query.) means that query is executing.

This statement holds true for simple query but for prepared statement
(using message 'P', 'B', 'D', 'E', 'S') it might not be completely right as
we update the state only after sync message which can confuse some
users as you have stated. However I don't think it is good idea to change
state in between different messages or at least with the current set of
states.

> I think this inconsistency is not very intutive to users...

Do you think we can fix it in any easy way, or might be updating docs
can make users understand the current situation better?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-07 05:28:36
Message-ID: CAA4eK1+5TysPa-pdoVKkjRJHajVZ0wd6WZfPDQH46SdF9VhNRQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 7, 2014 at 10:43 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Tue, Feb 4, 2014 at 9:36 AM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>> As you can see, at 2014-02-04 12:47:27.210981+09 the query "SELECT
>> count(*) FROM pg_catalog.pg_class..." is "active" and it seems still
>> running.
>>
>> On the other side, Here is an excerpt from PostgreSQL log:
>>
>> 21850 2014-02-04 12:47:11.241 JST LOG: execute pgpool21805/pgpool21805: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.oid = pgpool_regclass('pgbench_accounts') AND c.relpersistence = 'u'
>> 21850 2014-02-04 12:47:11.241 JST LOG: duration: 0.078 ms
>>
>> The duration was shown as "0.078 ms" thus it seems the query has been
>> already finished.
>>
>> The reason why pg_stat_activity thinks that the query in question is,
>> "sync" message has not been sent to the backend yet (at least from
>> what I read from postgres.c).
>
> I think that is the probable reason for the above mentioned behaviour.

One more catch is that I think it can happen for simple query statements
as well, because we first logs the query end duration and then updates
the state and both are not related in code, so there is always a small
window during which this can happen.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: amit(dot)kapila16(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-07 06:16:22
Message-ID: 20140207.151622.517892175662113916.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I think that is the probable reason for the above mentioned behaviour.
> As I understand here, the problem is that 'state' of backend is shown as
> active along with 'query' which according to docs (If state is active this field
> shows the currently executing query.) means that query is executing.
>
> This statement holds true for simple query but for prepared statement
> (using message 'P', 'B', 'D', 'E', 'S') it might not be completely right as
> we update the state only after sync message which can confuse some
> users as you have stated. However I don't think it is good idea to change
> state in between different messages or at least with the current set of
> states.
>
>> I think this inconsistency is not very intutive to users...
>
> Do you think we can fix it in any easy way, or might be updating docs
> can make users understand the current situation better?

One idea is, calling pgstat_report_activity(STATE_IDLE) in
exec_execute_message() of postgres.c. The function has already called
pgstat_report_activity(STATE_RUNNING) which shows "active" state in
pg_stat_actviity view. So why cann't we call
pgstat_report_activity(STATE_IDLE) here.

Somebody might claim that "idle" is a transaction state term. In the
case, I propose to add new state name, say "finished". So above
proposal would calling pgstat_report_activity(STATE_FINISHED) instead.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-07 07:05:05
Message-ID: CAA4eK1Jk33cSUUNppfMr85PPxEpiQH33oL6G58iniz5e=3tCXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 7, 2014 at 11:46 AM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>> I think that is the probable reason for the above mentioned behaviour.
>> As I understand here, the problem is that 'state' of backend is shown as
>> active along with 'query' which according to docs (If state is active this field
>> shows the currently executing query.) means that query is executing.
>>
>> This statement holds true for simple query but for prepared statement
>> (using message 'P', 'B', 'D', 'E', 'S') it might not be completely right as
>> we update the state only after sync message which can confuse some
>> users as you have stated. However I don't think it is good idea to change
>> state in between different messages or at least with the current set of
>> states.
>>
>>> I think this inconsistency is not very intutive to users...
>>
>> Do you think we can fix it in any easy way, or might be updating docs
>> can make users understand the current situation better?
>
> One idea is, calling pgstat_report_activity(STATE_IDLE) in
> exec_execute_message() of postgres.c. The function has already called
> pgstat_report_activity(STATE_RUNNING) which shows "active" state in
> pg_stat_actviity view. So why cann't we call
> pgstat_report_activity(STATE_IDLE) here.
>
> Somebody might claim that "idle" is a transaction state term.

Idle means "The backend is waiting for a new client command.", which
is certainly not true especially in case of 'E' message as still sync
message processing is left.

> In the
> case, I propose to add new state name, say "finished". So above
> proposal would calling pgstat_report_activity(STATE_FINISHED) instead.

Okay, so by state finish, it can mean "The backend has finished execution
of a query.". In that case I think this might need to be called at end
of exec_simple_query() as well, but then there will be very less difference
between idle and finish for simple query.

The argument here could be do we really need a new state for such a short
window between completion of 'E' message and processing of 'S' sync
message considering updation of state is not a very light call which can
be called between processing of 2 messages. It might make sense for cases
where sync message processing can take longer time.

Would it be not sufficient, If we just explain this in docs. Do users really
face any inconvenience or it's a matter of clear understanding for users?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: amit(dot)kapila16(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-07 09:14:09
Message-ID: 20140207.181409.1662680332531859493.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> One idea is, calling pgstat_report_activity(STATE_IDLE) in
>> exec_execute_message() of postgres.c. The function has already called
>> pgstat_report_activity(STATE_RUNNING) which shows "active" state in
>> pg_stat_actviity view. So why cann't we call
>> pgstat_report_activity(STATE_IDLE) here.
>>
>> Somebody might claim that "idle" is a transaction state term.
>
> Idle means "The backend is waiting for a new client command.", which
> is certainly not true especially in case of 'E' message as still sync
> message processing is left.
>
>> In the
>> case, I propose to add new state name, say "finished". So above
>> proposal would calling pgstat_report_activity(STATE_FINISHED) instead.
>
> Okay, so by state finish, it can mean "The backend has finished execution
> of a query.". In that case I think this might need to be called at end
> of exec_simple_query() as well, but then there will be very less difference
> between idle and finish for simple query.

Of course.

> The argument here could be do we really need a new state for such a short
> window between completion of 'E' message and processing of 'S' sync
> message considering updation of state is not a very light call which can
> be called between processing of 2 messages. It might make sense for cases
> where sync message processing can take longer time.
>
> Would it be not sufficient, If we just explain this in docs. Do users really
> face any inconvenience or it's a matter of clear understanding for users?

Well... maybe it's a matter of doc.

Pgpool-II issues such SELECTs intenally to retrieve system catalog
info.

The query is piggy backed on the same connection to PostgreSQL opend
by user (pgpool-II cannot issue "sync" because it closes the
transaction, which in turn closes user's unnamed portal).

If user's query is SELECT, it can be sent to standbys because of load
balance. After such internal queries are sent to master, which will
remain "active" for long time because "sync" is not issued.

I got many inquries from pgpool-II users "Hey, when I looked at
pg_stat_activity, I noticed querys is running for so long time. why?".

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: amit(dot)kapila16(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-07 15:21:16
Message-ID: 11167.1391786476@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
>> The argument here could be do we really need a new state for such a short
>> window between completion of 'E' message and processing of 'S' sync
>> message considering updation of state is not a very light call which can
>> be called between processing of 2 messages. It might make sense for cases
>> where sync message processing can take longer time.

I concur with this objection --- we are not going to add reasons for
extended query mode to be slower than plain mode, especially not reasons
that are totally useless in normal usage.

> The query is piggy backed on the same connection to PostgreSQL opend
> by user (pgpool-II cannot issue "sync" because it closes the
> transaction, which in turn closes user's unnamed portal).

This argument (and usage) seems pretty broken. If you don't issue
sync then how do you know you've gotten all of the command's output?

If you're issuing a flush instead, maybe we could consider whether it's
reasonable to do an extra pgstat_report_activity() upon receipt of a
flush message. But -1 for putting it into the normal control flow.

I'd also vote against inventing a new pgstat state code for this.
Most people will never see it, which would prompt questions in itself.

regards, tom lane


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: amit(dot)kapila16(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-07 23:31:47
Message-ID: 20140208.083147.7144463289061930.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> The query is piggy backed on the same connection to PostgreSQL opend
>> by user (pgpool-II cannot issue "sync" because it closes the
>> transaction, which in turn closes user's unnamed portal).
>
> This argument (and usage) seems pretty broken. If you don't issue
> sync then how do you know you've gotten all of the command's output?
>
> If you're issuing a flush instead, maybe we could consider whether it's
> reasonable to do an extra pgstat_report_activity() upon receipt of a
> flush message. But -1 for putting it into the normal control flow.

Pgpool-II issues "flush" of course.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inconsistency between pg_stat_activity and log_duration
Date: 2014-02-10 04:10:08
Message-ID: CAA4eK1+qUEYs921W912+q_W=3RhK+verBvE2q0r6R34YwPr3zQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 7, 2014 at 2:44 PM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>>> One idea is, calling pgstat_report_activity(STATE_IDLE) in
>>> exec_execute_message() of postgres.c. The function has already called
>>> pgstat_report_activity(STATE_RUNNING) which shows "active" state in
>>> pg_stat_actviity view. So why cann't we call
>>> pgstat_report_activity(STATE_IDLE) here.
>>>
>>> Somebody might claim that "idle" is a transaction state term.
>>
>> Idle means "The backend is waiting for a new client command.", which
>> is certainly not true especially in case of 'E' message as still sync
>> message processing is left.
>>
>>> In the
>>> case, I propose to add new state name, say "finished". So above
>>> proposal would calling pgstat_report_activity(STATE_FINISHED) instead.
>>
>> Okay, so by state finish, it can mean "The backend has finished execution
>> of a query.". In that case I think this might need to be called at end
>> of exec_simple_query() as well, but then there will be very less difference
>> between idle and finish for simple query.
>
> Of course.
>
>> The argument here could be do we really need a new state for such a short
>> window between completion of 'E' message and processing of 'S' sync
>> message considering updation of state is not a very light call which can
>> be called between processing of 2 messages. It might make sense for cases
>> where sync message processing can take longer time.
>>
>> Would it be not sufficient, If we just explain this in docs. Do users really
>> face any inconvenience or it's a matter of clear understanding for users?
>
> Well... maybe it's a matter of doc.
>
> Pgpool-II issues such SELECTs intenally to retrieve system catalog
> info.
>
> The query is piggy backed on the same connection to PostgreSQL opend
> by user (pgpool-II cannot issue "sync" because it closes the
> transaction, which in turn closes user's unnamed portal).
>
> If user's query is SELECT, it can be sent to standbys because of load
> balance. After such internal queries are sent to master, which will
> remain "active" for long time because "sync" is not issued.

In that case, will it not be better if pgpool-II start a transaction explicitly
(BEGIN/START TRANSACTION) rather than relying on automatic commit
mode?

> If you're issuing a flush instead, maybe we could consider whether it's
> reasonable to do an extra pgstat_report_activity() upon receipt of a
> flush message.

I think it might be reasonable to call pgstat_report_activity(), as sending
Flush message indicates user got the complete response of the command
sent to backend, but I am not sure if it is good idea to set
send_ready_for_query = true; and sending ReadyForQuery() message to
FE (Frontend), as it is expected that FE can send more messages like
'B', 'E', so we might need to set the status bases on current state of
backend.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com