Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages

From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>, Максим Панченко <Panchenko(at)gw(dot)tander(dot)ru>, Сизов Сергей Павлович <sizov_sp(at)gw(dot)tander(dot)ru>, Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>
Subject: Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages
Date: 2013-12-20 20:01:55
Message-ID: CAL_0b1vhCTWM2Le2+u4ycdCfTscuicdU=7HMss_cuokunQxh2Q@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Fri, Dec 20, 2013 at 11:34 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 12/20/2013 01:13 AM, Sergey Konoplev wrote:
>> On Thu, Dec 19, 2013 at 3:04 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
>> wrote:
>>> On 2013-12-19 14:37:04 -0800, Sergey Konoplev wrote:
>>>>
>>>> It was suffering from this problem on 9.2.4, mostly in the last couple
>>>> of weeks, when I had to rebuild the replica almost every 3 days, and I
>>>> hoped it would be fixed in 9.2.6, but it is not.
>>>
>>>
>>> This actually is a separate bug from the one fixed. Note that the CONTEXT
>>> message isn't talking about "visible" but about "vacuum".
>>
>>
>> Oh, I see. I have gotten completely lost in these bugs over the last
>> several months.
>>
>> Is there anything I can do to prevent it, other than waiting for a patch?
>
> I wonder if this might be the same bug Tomonari Katsumata reported
> yesterday:
> http://www.postgresql.org/message-id/E1VtTni-00082E-Jv@wrigleys.postgresql.org.
> Since this keeps happening to you, could you try the patch he posted and see
> if it helps?

To me this looks like something completely different from Tomonari's
bug. I do not have problems with restarting, and the failure always
has something to do with vacuum. The error message is the same every
time it breaks down:

2013-12-19 20:51:22 MSK 19938 @ from [vxid:1/0 txid:0] [] WARNING:
page 14833 of relation base/16436/3321003988 is uninitialized
2013-12-19 20:51:22 MSK 19938 @ from [vxid:1/0 txid:0] [] CONTEXT:
xlog redo vacuum: rel 1663/16436/3321003988; blk 38538,
lastBlockVacuumed 0
2013-12-19 20:51:22 MSK 19938 @ from [vxid:1/0 txid:0] [] PANIC: WAL
contains references to invalid pages
2013-12-19 20:51:22 MSK 19938 @ from [vxid:1/0 txid:0] [] CONTEXT:
xlog redo vacuum: rel 1663/16436/3321003988; blk 38538,
lastBlockVacuumed 0
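
For comparing several such failures side by side, the fields in these
entries can be pulled out mechanically. What follows is only a minimal
Python sketch, assuming the log format shown above; the function names
are hypothetical, and it assumes that "rel A/B/C" is tablespace OID /
database OID / relfilenode.

```python
import re

# Assumption: in "rel A/B/C" the numbers are tablespace OID, database OID,
# and relfilenode, matching the on-disk path base/<database>/<relfilenode>.
CONTEXT_RE = re.compile(
    r"xlog redo vacuum: rel (\d+)/(\d+)/(\d+); blk (\d+), lastBlockVacuumed (\d+)"
)
WARNING_RE = re.compile(r"page (\d+) of relation (\S+) is uninitialized")


def _unwrap(text):
    # Log entries are wrapped across lines in the mail, so collapse whitespace.
    return " ".join(text.split())


def parse_redo_context(text):
    """Return the numeric fields of the first redo vacuum CONTEXT entry, or None."""
    m = CONTEXT_RE.search(_unwrap(text))
    if m is None:
        return None
    keys = ("tablespace", "database", "relfilenode", "block", "last_block_vacuumed")
    return dict(zip(keys, (int(g) for g in m.groups())))


def parse_uninitialized_page(text):
    """Return (page, relation_path) from the first WARNING entry, or None."""
    m = WARNING_RE.search(_unwrap(text))
    if m is None:
        return None
    return int(m.group(1)), m.group(2)
```

Feeding it the excerpt above yields the page number from the WARNING
entry and the block number from the CONTEXT entry as separate values,
which makes it easy to see whether the two differ between incidents.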

Also, the issue is reproducible on the production site of a big
enterprise company, where negotiating and deploying even a small patch
is always a long and painful process. So I would like to ask you to
review this issue more carefully before I start the negotiation
process, please.

I have saved the buggy cluster directory and the core dump in case you
need them.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com
