Re: [9.3 bug fix] backends emit hundreds of messages when statistics file is inaccessible

Lists: pgsql-hackers
From: "MauMau" <maumau307(at)gmail(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: [9.3 bug fix] backends emit hundreds of messages when statistics file is inaccessible
Date: 2013-07-08 11:34:59
Message-ID: F678F6772E314F38814968F6182753EE@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I've found a bug in PostgreSQL 9.3 beta 2 which is related to statistics
collection which emits excesssive amount of error messages. Please find
attached the patch to fix this. I confirmed this problem occurs in
PostgreSQL 9.2, too. Could you backport this?

[Problem]
If the directory specified by stats_temp_directory becomes inaccessible to
postgres, enormous amount of WARNING messages are output. For example, when
I set wrong permission on the directory by mistake while the database server
is running and then started psql, more than 600 messages were emitted in a
few seconds like this:

WARNING: could not open statistics file
"/work/maumau/stats_tmp/pgstat.stat": Permission denied
LOG: could not open temporary statistics file
"/work/maumau/stats_tmp/pgstat.tmp": Permission denied
WARNING: could not open statistics file
"/work/maumau/stats_tmp/pgstat.stat": Permission denied
WARNING: could not open statistics file
"/work/maumau/stats_tmp/pgstat.stat": Permission denied
WARNING: could not open statistics file
"/work/maumau/stats_tmp/pgstat.stat": Permission denied
...

Even when there's no client, these messages are output regularly. I guess
this is because autovacuum runs.

[Cause]
In backend_read_statsfile(), the backend loops PGSTAT_POLL_LOOP_COUNT
(=1000) times trying to open the statistics file. The backend emits a
WARNING message if it fails to open the file in every iteration of the loop.

[How to fix]
Emit the message upon the first failed attempt. For some diagnostics
purposes, emit the message with DEBUG2 level on the subsequent attempts.
DEBUG2 is arbitrary.

Regards
MauMau

Attachment Content-Type Size
reduce_statsfile_error.patch application/octet-stream 4.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "MauMau" <maumau307(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [9.3 bug fix] backends emit hundreds of messages when statistics file is inaccessible
Date: 2013-07-08 13:08:57
Message-ID: 27433.1373288937@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"MauMau" <maumau307(at)gmail(dot)com> writes:
> I've found a bug in PostgreSQL 9.3 beta 2 which is related to statistics
> collection which emits excesssive amount of error messages. Please find
> attached the patch to fix this. I confirmed this problem occurs in
> PostgreSQL 9.2, too. Could you backport this?

It's not apparent that this is a good idea, let alone a bug fix.

> If the directory specified by stats_temp_directory becomes inaccessible to
> postgres, enormous amount of WARNING messages are output.

Well, yeah, because all of that functionality just broke. Not warning
about it doesn't seem like a good idea. AFAICT, your patch would result
in the situation not being remarked on anywhere except in the postmaster
log.

regards, tom lane


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [9.3 bug fix] backends emit hundreds of messages when statistics file is inaccessible
Date: 2013-07-08 21:22:11
Message-ID: 843E3533897C408ABB614B6D55A13910@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> "MauMau" <maumau307(at)gmail(dot)com> writes:
>> If the directory specified by stats_temp_directory becomes inaccessible
>> to
>> postgres, enormous amount of WARNING messages are output.
>
> Well, yeah, because all of that functionality just broke. Not warning
> about it doesn't seem like a good idea. AFAICT, your patch would result
> in the situation not being remarked on anywhere except in the postmaster
> log.

No, my patch makes backend_read_statsfile_timestamp() emit just one warning
message instead of outputing the same message 1,000 times in one call. So
your concern does not exist. Is this OK?

Regards
MauMau