Re: Postgresql 9.3.4 Streaming Replication Standby invalid Page block

From: "Burgess, Freddie" <FBurgess(at)Radiantblue(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: "PostgreSQL Bugs [pgsql-bugs(at)postgresql(dot)org]" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Postgresql 9.3.4 Streaming Replication Standby invalid Page block
Date: 2014-07-02 20:04:27
Message-ID: 3BBE635F64E28D4C899377A61DAA9FE02E662601@NBSVR-MAIL01.radiantblue.local
Lists: pgsql-bugs

show data_checksums;
data_checksums
----------------
off

tabsdb=# select version();
version
----------------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit

On both Master/Standby
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
The standby replayed all of the outstanding WAL files overnight and has now caught up with the primary database; streaming replication is running fine.

The relation "pg_tblspc/16435/PG_9.3_201306121/16444/125127698" is in a partition tablespace holding data from the year 2007. I verified that the row counts match between the master and the standby for the tables that reside in that tablespace.
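
The row-count comparison described above could be scripted along these lines. This is only a minimal sketch, assuming the per-table counts have already been collected from each server (for example with `select count(*)`) into two mappings; the function name `compare_row_counts` and the table names in the usage are hypothetical. Matching counts do not by themselves prove that every page on the standby is intact.

```python
from typing import Dict, List, Optional, Tuple

Mismatch = Tuple[str, Optional[int], Optional[int]]


def compare_row_counts(master: Dict[str, int], standby: Dict[str, int]) -> List[Mismatch]:
    """Return (table, master_count, standby_count) for each table that differs.

    A table present on only one side is reported with None for the other side.
    """
    mismatches: List[Mismatch] = []
    for table in sorted(set(master) | set(standby)):
        m = master.get(table)
        s = standby.get(table)
        if m != s:
            mismatches.append((table, m, s))
    return mismatches


if __name__ == "__main__":
    # Hypothetical counts, gathered separately on each server.
    master_counts = {"events_2007_01": 1200, "events_2007_02": 980}
    standby_counts = {"events_2007_01": 1200, "events_2007_02": 975}
    for table, m, s in compare_row_counts(master_counts, standby_counts):
        print(f"{table}: master={m} standby={s}")
```

An empty result means the counts agree for every table listed on either side.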

Is there anything else I can do to verify the consistency on the standby?

thanks

________________________________________
From: Andres Freund [andres(at)2ndquadrant(dot)com]
Sent: Wednesday, July 02, 2014 7:09 AM
To: Heikki Linnakangas
Cc: Burgess, Freddie; "PostgreSQL Bugs [pgsql-bugs(at)postgresql(dot)org]"
Subject: Re: [BUGS] Postgresql 9.3.4 Streaming Replication Standby invalid Page block

On 2014-07-02 14:02:27 +0300, Heikki Linnakangas wrote:
> On 07/02/2014 02:03 AM, Burgess, Freddie wrote:
> > PostgreSQL version: 9.3.4
> > Operating system: rhel 6.4 linux
> > Action: stream replication Master/Slave
> > Description:
> >
> >Last entries in the PostgreSQL log file before the standby crashed; the primary seems unaffected:
> >
> >LOG: restored log file "0000000100001127000000cc" from archive
> >FATAL: invalid page in block 464698 of relation pg_tblspc/16435/PG_9.3_201306121/16444/125127698
> >CONTEXT: xlog redo vacuum: rel 16435/16444/125127698; blk 512019, lastBlockVacuumed 0
> >LOG: startup process (PID 27797) exited with exit code 1
> >LOG: terminating any other active server processes
> >
> >We restarted the database and the process of restoring the log files has continued beyond this point, but is our standby server corrupted?

Do you run with data checksums enabled?

> Sounds exactly like this bug:
>
> http://www.postgresql.org/message-id/flat/CAL_0b1s4QCkFy_55kk_8XWcJPs7wsgVWf8vn4=jXe6V4R7Hxmg(at)mail(dot)gmail(dot)com
>
> but that was fixed in 9.3.3 already. Are you sure you're running 9.3.4 in
> the standby too?

Hm - that bug was about uninitialized pages, not invalid ones. I don't
immediately see why it'd be legal to have an invalid page (as in
!PageIsVerified()) somewhere? At least not after reaching consistency.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
