Re: log_truncate_on_rotation=on is not truncating

Lists: pgsql-bugs
From: Chan Pham <chan(dot)pham(at)iovation(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: log_truncate_on_rotation=on is not truncating
Date: 2012-06-20 18:23:44
Message-ID: CADnKLtYdYujf1Narz0ChL+=YuEMWxY8joungPUiDz3H+K_7+xA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

We discovered that on our servers where we do daily shutdown and startup
for a cold backup, the pg_log logs are not truncated when it switches to
the new day, when log_truncation_on_rotation=on. The shutdown/startup of
postgresql happens daily at 16:00 system time. Also observed that only when
the shutdown event happen at 16:00 pm, it is only when the log is rotated
for the current day, and the event message is appended to last week's
postgreql-%a.log.
On our other servers where there is not a daily shutdown/startup, the
postgresql-%a.log is rotated to the new day at 00:00 hour and is truncated
of all last week's content. We first noticed this only our standby servers
where we do daily cold backup and restart postgresql, and no other
activities. Then I was able to reproduce this issue on a standalone server
by scheduling a daily shutdown/restart.
*
*
*Here are settings and log details:*
*
*
Postgresql is shutdown/startup via service:
/sbin/service postgresql-9.1 stop/start -- calls pg_ctl

OS versions:
centos-release-notes-5.5-0
and
centos-release-6-2.el6.centos.7.x86_64

PostgreSQL version:
PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit
and
PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit

Postgresql.conf settings for pg_logs related:
logging_collector=on
log_directory='pg_log'
log_filename='postgresql-%a.log'
log_truncate_on_rotation=on
log_rotation_age=1d
log_rotation_size=0

timezone = 'UTC'

Service timezone:
>date
Tue Jun 19 11:25:56 PDT 2012

Content of a postgresql-Tue.log in its entirety. Note this is a standby
slave so there are not much activities other than shutdown/restart
performed for a cold backup. The *bold *date is just to de-mark the current
day where the entries were appended. The entries of 2012-06-19 was an
intentional syntax error to get the an entry for the current day.

2012-06-12 16:00:02 PDT 23761 4fd67fc4.5cd1 LOG: received fast shutdown
request
2012-06-12 16:00:02 PDT 23761 4fd67fc4.5cd1 LOG: aborting any active
transactions
2012-06-12 16:00:02 PDT 23800 4fd67fc8.5cf8 FATAL: terminating
walreceiver process due to administrator command
2012-06-12 16:00:02 PDT 23765 4fd67fc5.5cd5 LOG: shutting down
2012-06-12 16:00:02 PDT 23765 4fd67fc5.5cd5 LOG: database system is shut
down
2012-06-12 16:31:08 PDT 30458 4fd7d13c.76fa LOG: database system was
shut down in recovery at 2012-06-12 16:00:02 PDT
2012-06-12 16:31:08 PDT 30458 4fd7d13c.76fa LOG: entering standby mode
2012-06-12 16:31:08 PDT 30458 4fd7d13c.76fa LOG: redo starts at
6B/D3E1EA70
2012-06-12 16:31:10 PDT 30458 4fd7d13c.76fa LOG: consistent recovery
state reached at 6B/DB8FDFE8
2012-06-12 16:31:10 PDT 30458 4fd7d13c.76fa LOG: unexpected pageaddr
69/D68FE000 in log file 107, segment 219, offset 9428992
2012-06-12 16:31:10 PDT 30454 4fd7d13a.76f6 LOG: database system is
ready to accept read only connections
2012-06-12 16:31:11 PDT 30494 4fd7d13e.771e LOG: streaming replication
successfully connected to primary
*2012-06-19 *11:22:04 PDT postgres [local] 30698 4fe0c34c.77ea ERROR:
syntax error at or near ":" at character 17
*2012-06-19 *11:22:04 PDT postgres [local] 30698 4fe0c34c.77ea STATEMENT:
select version():

Listing of pg_log directory. You can see here that the last entry of each
day (file timestamp) is when postgresql is restarted at 16:30. Except June
19, I just force a syntax error to create an entry for today, June 19:

-rw------- 1 postgres postgres 1403 Jun 13 16:30 postgresql-Wed.log
-rw------- 1 postgres postgres 2868 Jun 14 16:31 postgresql-Thu.log
-rw------- 1 postgres postgres 2418 Jun 15 16:31 postgresql-Fri.log
-rw------- 1 postgres postgres 2222 Jun 16 16:31 postgresql-Sat.log
-rw------- 1 postgres postgres 2208 Jun 17 16:30 postgresql-Sun.log
-rw------- 1 postgres postgres 2832 Jun 18 16:31 postgresql-Mon.log
-rw------- 1 postgres postgres 1355 Jun 19 11:22 postgresql-Tue.log

Here's normal behavior on a server that does NOT get restarted daily. The
logs are all rotated and truncated at 00:00 as expected:
$ pwd
/data/postgres/9.1/pg_log
ls -ltr
total 4
-rw------- 1 postgres postgres 0 Jun 13 00:00 postgresql-Wed.log
-rw------- 1 postgres postgres 0 Jun 14 00:00 postgresql-Thu.log
-rw------- 1 postgres postgres 1563 Jun 15 14:12 postgresql-Fri.log
-rw------- 1 postgres postgres 0 Jun 16 00:00 postgresql-Sat.log
-rw------- 1 postgres postgres 0 Jun 17 00:00 postgresql-Sun.log
-rw------- 1 postgres postgres 0 Jun 18 00:00 postgresql-Mon.log
-rw------- 1 postgres postgres 0 Jun 19 00:00 postgresql-Tue.log
--
*

Chan Pham
Database Administrator

Direct: 503.943.6773
Fax: 503.224.1581 // AIM: ioChanny // Twitter: iovation
www.iovation.com

<http://www.iovation.com/eml-msg/>
*


From: Viswanatham Kiran Kumar <viswanatham(dot)kirankumar(at)huawei(dot)com>
To: "'Chan Pham'" <chan(dot)pham(at)iovation(dot)com>, <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: log_truncate_on_rotation=on is not truncating
Date: 2012-06-21 12:56:36
Message-ID: 009901cd4fad$4b5ff9d0$e21fed70$@kirankumar@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Chan,

I think this is configuration issue.

This issue is happening because log files will always append when
server is restarted. What I can see from your configuration,
log_rotation_age=1d (1 day) which is greater than daily schedule
shutdown/restart time. So this is the reason why log files are growing.

You can change the log_rotation_age=8h (i.e. any time less than the
schedule shutdown/restart time) as log file name
Is postgreql-%a.log this is sufficient to preserve till last week's log, as
at the time of rotation after 8 hours if the day is not changed then log
file name will be same as previous then truncation will not happen.

 
Thanks & Regards,
Kiran Kumar
   

--------------------------
VISWANATHAM  KIRAN KUMAR
SENIOR TECHNICAL LEADER.
HUAWEI TECHNOLOGIES INDIA PVT. LTD.
EXTN: 7560; LEVEL-7 C & D TOWER
DIAMOND DISTRICT, OLD AIRPORT ROAD
BANGALORE-08
MOBILE: +91-9945115987
 

From: pgsql-bugs-owner(at)postgresql(dot)org
[mailto:pgsql-bugs-owner(at)postgresql(dot)org] On Behalf Of Chan Pham
Sent: Wednesday, June 20, 2012 11:54 PM
To: pgsql-bugs(at)postgresql(dot)org
Subject: [BUGS] log_truncate_on_rotation=on is not truncating

We discovered that on our servers where we do daily shutdown and startup for
a cold backup, the pg_log logs are not truncated when it switches to the new
day, when log_truncation_on_rotation=on. The shutdown/startup of postgresql
happens daily at 16:00 system time. Also observed that only when the
shutdown event happen at 16:00 pm, it is only when the log is rotated for
the current day, and the event message is appended to last week's
postgreql-%a.log. 
On our other servers where there is not a daily shutdown/startup, the
postgresql-%a.log is rotated to the new day at 00:00 hour and is truncated
of all last week's content. We first noticed this only our standby servers
where we do daily cold backup and restart postgresql, and no other
activities. Then I was able to reproduce this issue on a standalone server
by scheduling a daily shutdown/restart.

Here are settings and log details:

Postgresql is shutdown/startup via service:
   /sbin/service postgresql-9.1 stop/start  -- calls pg_ctl

OS versions:
 centos-release-notes-5.5-0
  and
 centos-release-6-2.el6.centos.7.x86_64

PostgreSQL version:
   PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6
20110731 (Red Hat 4.4.6-3), 64-bit
   and
   PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6
20110731 (Red Hat 4.4.6-3), 64-bit

Postgresql.conf settings for pg_logs related:
   logging_collector=on
   log_directory='pg_log'
   log_filename='postgresql-%a.log'
   log_truncate_on_rotation=on
   log_rotation_age=1d
   log_rotation_size=0

   timezone = 'UTC'

Service timezone:  
>date
Tue Jun 19 11:25:56 PDT 2012

Content of a postgresql-Tue.log in its entirety. Note this is a standby
slave so there are not much activities other than shutdown/restart performed
for a cold backup. The bold date is just to de-mark the current day where
the entries were appended. The entries of 2012-06-19 was an intentional
syntax error to get the an entry for the current day.

2012-06-12 16:00:02 PDT   23761 4fd67fc4.5cd1 LOG:  received fast shutdown
request
2012-06-12 16:00:02 PDT   23761 4fd67fc4.5cd1 LOG:  aborting any active
transactions
2012-06-12 16:00:02 PDT   23800 4fd67fc8.5cf8 FATAL:  terminating
walreceiver process due to administrator command
2012-06-12 16:00:02 PDT   23765 4fd67fc5.5cd5 LOG:  shutting down
2012-06-12 16:00:02 PDT   23765 4fd67fc5.5cd5 LOG:  database system is shut
down
2012-06-12 16:31:08 PDT   30458 4fd7d13c.76fa LOG:  database system was shut
down in recovery at 2012-06-12 16:00:02 PDT
2012-06-12 16:31:08 PDT   30458 4fd7d13c.76fa LOG:  entering standby mode
2012-06-12 16:31:08 PDT   30458 4fd7d13c.76fa LOG:  redo starts at
6B/D3E1EA70
2012-06-12 16:31:10 PDT   30458 4fd7d13c.76fa LOG:  consistent recovery
state reached at 6B/DB8FDFE8
2012-06-12 16:31:10 PDT   30458 4fd7d13c.76fa LOG:  unexpected pageaddr
69/D68FE000 in log file 107, segment 219, offset 9428992
2012-06-12 16:31:10 PDT   30454 4fd7d13a.76f6 LOG:  database system is ready
to accept read only connections
2012-06-12 16:31:11 PDT   30494 4fd7d13e.771e LOG:  streaming replication
successfully connected to primary
2012-06-19 11:22:04 PDT postgres [local] 30698 4fe0c34c.77ea ERROR:  syntax
error at or near ":" at character 17
2012-06-19 11:22:04 PDT postgres [local] 30698 4fe0c34c.77ea STATEMENT:
 select version():

Listing of pg_log directory. You can see here that the last entry of each
day (file timestamp) is when postgresql is restarted at 16:30. Except June
19, I just force a syntax error to create an entry for today, June 19:

-rw------- 1 postgres postgres 1403 Jun 13 16:30 postgresql-Wed.log
-rw------- 1 postgres postgres 2868 Jun 14 16:31 postgresql-Thu.log
-rw------- 1 postgres postgres 2418 Jun 15 16:31 postgresql-Fri.log
-rw------- 1 postgres postgres 2222 Jun 16 16:31 postgresql-Sat.log
-rw------- 1 postgres postgres 2208 Jun 17 16:30 postgresql-Sun.log
-rw------- 1 postgres postgres 2832 Jun 18 16:31 postgresql-Mon.log
-rw------- 1 postgres postgres 1355 Jun 19 11:22 postgresql-Tue.log

Here's normal behavior on a server that does NOT get restarted daily. The
logs are all rotated and truncated at 00:00 as expected:
$ pwd
/data/postgres/9.1/pg_log
ls -ltr
total 4
-rw------- 1 postgres postgres    0 Jun 13 00:00 postgresql-Wed.log
-rw------- 1 postgres postgres    0 Jun 14 00:00 postgresql-Thu.log
-rw------- 1 postgres postgres 1563 Jun 15 14:12 postgresql-Fri.log
-rw------- 1 postgres postgres    0 Jun 16 00:00 postgresql-Sat.log
-rw------- 1 postgres postgres    0 Jun 17 00:00 postgresql-Sun.log
-rw------- 1 postgres postgres    0 Jun 18 00:00 postgresql-Mon.log
-rw------- 1 postgres postgres    0 Jun 19 00:00 postgresql-Tue.log
--

Chan Pham
Database Administrator

Direct: 503.943.6773
Fax: 503.224.1581 // AIM: ioChanny // Twitter: iovation
www.iovation.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: viswanatham(dot)kirankumar(at)huawei(dot)com
Cc: "'Chan Pham'" <chan(dot)pham(at)iovation(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: log_truncate_on_rotation=on is not truncating
Date: 2012-07-16 04:53:37
Message-ID: 767.1342414417@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

[ just got around to looking at this bug report, sorry for the delay ]

Viswanatham Kiran Kumar <viswanatham(dot)kirankumar(at)huawei(dot)com> writes:
> Hi Chan,
> I think this is configuration issue.

> This issue is happening because log files will always append when
> server is restarted. What I can see from your configuration,
> log_rotation_age=1d (1 day) which is greater than daily schedule
> shutdown/restart time. So this is the reason why log files are growing.

Yeah, I think that's part of it, but it does still seem like a bug.
The relevant bit of the code is

/*
* The requirements here are to choose the next time > now that is a
* "multiple" of the log rotation interval. "Multiple" can be interpreted
* fairly loosely. In this version we align to log_timezone rather than
* GMT.
*/
rotinterval = Log_RotationAge * SECS_PER_MINUTE; /* convert to seconds */
now = (pg_time_t) time(NULL);
tm = pg_localtime(&now, log_timezone);
now += tm->tm_gmtoff;
now -= now % rotinterval;
now += rotinterval;
now -= tm->tm_gmtoff;
next_rotation_time = now;

Looking at this, it seems like it will choose a next_rotation_time that
is more than rotinterval beyond "now" if you have a log_timezone for
which advancing by gmtoff moves past the next natural rotinterval
multiple. So if the server gets shut down before that time elapses,
it never will think it should rotate the logfile.

Chan, could you say exactly what log_timezone setting is in use on the
servers that are acting funny?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: viswanatham(dot)kirankumar(at)huawei(dot)com
Cc: "'Chan Pham'" <chan(dot)pham(at)iovation(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: log_truncate_on_rotation=on is not truncating
Date: 2012-07-16 05:43:22
Message-ID: 2054.1342417402@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I wrote:
> The relevant bit of the code is

> /*
> * The requirements here are to choose the next time > now that is a
> * "multiple" of the log rotation interval. "Multiple" can be interpreted
> * fairly loosely. In this version we align to log_timezone rather than
> * GMT.
> */

Oh, no, scratch that --- there's nothing wrong with that calculation.
Actually I think the issue is with the logic that decides whether to
truncate or append during rotation:

* Decide whether to overwrite or append. We can overwrite if (a)
* Log_truncate_on_rotation is set, (b) the rotation was triggered by
* elapsed time and not something else, and (c) the computed file name is
* different from what we were previously logging into.
*
* Note: during the first rotation after forking off from the postmaster,
* last_file_name will be NULL. (We don't bother to set it in the
* postmaster because it ain't gonna work in the EXEC_BACKEND case.) So we
* will always append in that situation, even though truncating would
* usually be safe.

This dates from around 2004, and at the time we did not think it was
very important if the first rotation opportunity failed to truncate.
Of course, if you restart the postmaster so often that it never gets to
the second rotation opportunity, that theory falls down. But I guess
it's worth asking why you think it's a good plan to reboot so often.
Most people running Postgres are appalled by the idea that they might
have to have any downtime at all.

regards, tom lane


From: Chan Pham <chan(dot)pham(at)iovation(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: viswanatham(dot)kirankumar(at)huawei(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: log_truncate_on_rotation=on is not truncating
Date: 2012-07-17 18:53:55
Message-ID: CADnKLtYMeATuddVkogJAkHFyRuOGv7sCdH-kMb1peDAWwbAHEw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hello Tom and all,

Thank you for following up on this.

1. log_timezone:

admin_datamart=# show log_timezone;
log_timezone
--------------
US/Pacific
(1 row)

I have been playing around with changing log_rotation_age = 7h, +/- 1 hour
to see if I can get the elapse timing right. The shutdown/startup time is
still at 1600/18:30. At 8h, it was still appending. But recently, I've
change it to 7h. Some day the log will truncate and some day it didn't. I'm
guessing if its based on elapsed time since startup for truncate to happen
or not, then it make sense that truncate didn't always happen because
startup time can varies (some day tar runs longer).

2. To answer your question as why we are restarting daily is that these
servers are hot standby and we are taking cold backups from there. The plan
is to do hot backup (now that we are on 9.1 and can pause recovery) but we
just haven't gotten time to implement the hot backup yet.

Thanks,

Chan

On Sun, Jul 15, 2012 at 10:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > The relevant bit of the code is
>
> > /*
> > * The requirements here are to choose the next time > now that is a
> > * "multiple" of the log rotation interval. "Multiple" can be
> interpreted
> > * fairly loosely. In this version we align to log_timezone rather
> than
> > * GMT.
> > */
>
> Oh, no, scratch that --- there's nothing wrong with that calculation.
> Actually I think the issue is with the logic that decides whether to
> truncate or append during rotation:
>
> * Decide whether to overwrite or append. We can overwrite if (a)
> * Log_truncate_on_rotation is set, (b) the rotation was triggered by
> * elapsed time and not something else, and (c) the computed file name
> is
> * different from what we were previously logging into.
> *
> * Note: during the first rotation after forking off from the
> postmaster,
> * last_file_name will be NULL. (We don't bother to set it in the
> * postmaster because it ain't gonna work in the EXEC_BACKEND case.)
> So we
> * will always append in that situation, even though truncating would
> * usually be safe.
>
> This dates from around 2004, and at the time we did not think it was
> very important if the first rotation opportunity failed to truncate.
> Of course, if you restart the postmaster so often that it never gets to
> the second rotation opportunity, that theory falls down. But I guess
> it's worth asking why you think it's a good plan to reboot so often.
> Most people running Postgres are appalled by the idea that they might
> have to have any downtime at all.
>
> regards, tom lane
>

--
*

Chan Pham
Database Administrator

Direct: 503.943.6773
Fax: 503.224.1581 // AIM: ioChanny // Twitter: iovation
www.iovation.com

<http://www.iovation.com/eml-msg/>
*