Re: BUG #6619: Misleading output from slave when host is not running

Lists: pgsql-bugs
From: petteri(dot)raty(at)aalto(dot)fi
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #6619: Misleading output from slave when host is not running
Date: 2012-04-27 07:47:50
Message-ID: E1SNfuA-0001O7-0C@wrigleys.postgresql.org

The following bug has been logged on the website:

Bug reference: 6619
Logged by: Petteri Räty
Email address: petteri(dot)raty(at)aalto(dot)fi
PostgreSQL version: 9.1.3
Operating system: Gentoo Linux
Description:

I set up a hot standby master and slave following the instructions at:

http://michael.otacoo.com/postgresql-2/postgres-9-1-setup-a-synchronous-stand-by-server-in-5-minutes/

I left archive mode off.

When I started the slave without the master running I got the following
output:

$ postgres -D gsd-replica/
LOG: database system was interrupted while in recovery at log time 2012-04-25 12:01:33 UTC
HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
LOG: entering standby mode
WARNING: WAL was generated with wal_level=minimal, data may be missing
HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
FATAL: hot standby is not possible because wal_level was not set to "hot_standby" on the master server
HINT: Either set wal_level to "hot_standby" on the master, or turn off hot_standby here.
LOG: startup process (PID 28761) exited with exit code 1
LOG: aborting startup due to startup process failure

The error message above on the FATAL line is wrong (or at least misleading).
The real problem should be that it can't connect to the master. The
wal_level on the master is hot_standby (captured after I started it):

=# SHOW wal_level;
  wal_level
-------------
 hot_standby
(1 row)


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: petteri(dot)raty(at)aalto(dot)fi
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6619: Misleading output from slave when host is not running
Date: 2012-04-27 14:16:04
Message-ID: CA+U5nMKRXCmD3ihSk5YxUHAbBQQ4cJzuPwVQWGJKBfQtonFM=Q@mail.gmail.com

On Fri, Apr 27, 2012 at 8:47 AM, <petteri(dot)raty(at)aalto(dot)fi> wrote:

> LOG:  entering standby mode
> WARNING:  WAL was generated with wal_level=minimal, data may be missing
> HINT:  This happens if you temporarily set wal_level=minimal without taking
> a new base backup.
> FATAL:  hot standby is not possible because wal_level was not set to
> "hot_standby" on the master server
> HINT:  Either set wal_level to "hot_standby" on the master, or turn off
> hot_standby here.
> LOG:  startup process (PID 28761) exited with exit code 1
> LOG:  aborting startup due to startup process failure
>
> The error message above on the FATAL line is wrong (or at least misleading).
> The real problem should be that it can't connect to the master. The
> wal_level on the master is hot_standby (captured after I started it):

The HINT that we should simply set something on the master is a little
misleading with respect to timing. However, if the master and the
standby aren't even connected and you know that, how did you expect
there to be a causal link between the setting on the master and the
state of the standby?

What do you suggest the messages say?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: petteri(dot)raty(at)aalto(dot)fi
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6619: Misleading output from slave when host is not running
Date: 2012-04-27 18:29:03
Message-ID: CA+TgmoaGAxGBO3_kY-Jc_2VY84smCZj8G_FzsiVxW0pViwgwbw@mail.gmail.com

On Fri, Apr 27, 2012 at 3:47 AM, <petteri(dot)raty(at)aalto(dot)fi> wrote:
> When I started the slave without the master running I got the following
> output:
>
> $ postgres -D gsd-replica/
> LOG:  database system was interrupted while in recovery at log time
> 2012-04-25 12:01:33 UTC
> HINT:  If this has occurred more than once some data might be corrupted and
> you might need to choose an earlier recovery target.
> LOG:  entering standby mode
> WARNING:  WAL was generated with wal_level=minimal, data may be missing
> HINT:  This happens if you temporarily set wal_level=minimal without taking
> a new base backup.
> FATAL:  hot standby is not possible because wal_level was not set to
> "hot_standby" on the master server
> HINT:  Either set wal_level to "hot_standby" on the master, or turn off
> hot_standby here.
> LOG:  startup process (PID 28761) exited with exit code 1
> LOG:  aborting startup due to startup process failure
>
> The error message above on the FATAL line is wrong (or at least misleading).

I think it's trying to tell you that you had wal_level=minimal
configured on the master *at the time you took the base backup*.
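
A minimal sketch of that reading, assuming the standby decides from a wal_level value recorded in its own data directory (carried over with the base backup) rather than from a live query to the master. The helper names, the sample text, and the "wal_level setting:" label are assumptions for illustration; the exact label printed by pg_controldata differs between PostgreSQL versions.

```python
import re
from typing import Optional

# wal_level values in increasing order, as they existed in the 9.1 era.
WAL_LEVELS = ["minimal", "archive", "hot_standby"]


def recorded_wal_level(controldata_output: str) -> Optional[str]:
    """Return the wal_level found in pg_controldata-style text, or None.

    Assumption: the relevant line ends in "wal_level setting:" followed by
    the value. The label is not identical across PostgreSQL versions.
    """
    match = re.search(
        r"^.*wal_level setting:[ \t]*(\S+)", controldata_output, re.MULTILINE
    )
    return match.group(1) if match else None


def hot_standby_possible(level: Optional[str]) -> bool:
    """True only if the recorded level is at least hot_standby."""
    if level not in WAL_LEVELS:
        return False
    return WAL_LEVELS.index(level) >= WAL_LEVELS.index("hot_standby")


# Hypothetical sample text, not captured from the reporter's system.
SAMPLE = """\
Database cluster state:               in archive recovery
Current wal_level setting:            minimal
"""

level = recorded_wal_level(SAMPLE)
print(level, hot_standby_possible(level))
```

Under that assumption, a base backup taken while the master still had wal_level=minimal would keep producing the FATAL above whether or not the master is reachable, until newer WAL or a new base backup is available.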

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Petteri Räty <petteri(dot)raty(at)aalto(dot)fi>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6619: Misleading output from slave when host is not running
Date: 2012-04-28 22:45:37
Message-ID: 4F9C7311.1030103@aalto.fi

On 27.04.2012 17:16, Simon Riggs wrote:
> On Fri, Apr 27, 2012 at 8:47 AM, <petteri(dot)raty(at)aalto(dot)fi> wrote:
>
>> LOG: entering standby mode
>> WARNING: WAL was generated with wal_level=minimal, data may be missing
>> HINT: This happens if you temporarily set wal_level=minimal without taking
>> a new base backup.
>> FATAL: hot standby is not possible because wal_level was not set to
>> "hot_standby" on the master server
>> HINT: Either set wal_level to "hot_standby" on the master, or turn off
>> hot_standby here.
>> LOG: startup process (PID 28761) exited with exit code 1
>> LOG: aborting startup due to startup process failure
>>
>> The error message above on the FATAL line is wrong (or at least misleading).
>> The real problem should be that it can't connect to the master. The
>> wal_level on the master is hot_standby (captured after I started it):
>
> The HINT that we should simply set something on the master is a little
> misleading with respect to timing. However, if the master and the
> standby aren't even connected and you know that, how did you expect
> there to be a causal link between the setting on the master and the
> state of the standby?
>

I started investigating after seeing that it didn't start up, and found
that the master had a firewall blocking connections to the port
where I had set up postgres to listen.

>
> What do you suggest the messages say?
>

If the slave had no way to connect to the master then how can the slave
tell how "hot_standby" is configured there? I am expecting the message
to tell me that it can't connect to the master.
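
The connectivity problem described here can be checked separately from the standby's startup messages. The following is only a sketch: `master_reachable` is a hypothetical helper, and it tests nothing more than whether a TCP connection to the master's port can be opened, which is the kind of failure a firewall would cause. It says nothing about authentication or about wal_level.

```python
import socket


def master_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    A blocked or closed port shows up as a refused connection or a
    timeout, both of which are reported as OSError subclasses.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `master_reachable("master.example.com", 5432)` would return False from the standby host while the firewall is in place; the host name is a placeholder, and 5432 is PostgreSQL's default port.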

Regards,
Petteri