Re: WAL format and API changes (9.5)

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: WAL format and API changes (9.5)
Date: 2014-07-02 07:23:50
Message-ID: CAB7nPqQzPgY_=Q2XQf8kKW7evwx_FAmAfFa9re7uQmaGPfjG5g@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 2, 2014 at 4:09 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:
>
> 3) I noticed a bug in gin redo code path when trying to use the WAL replay
> facility.
>

Looking at the backtrace, it seems that a GIN page is corrupted. The
buffer capture patch runs some sanity checks on the page layout when
masking a page, and one of them fails on a standby while replaying a
record through ginRedoUpdateMetapage:
    if (pd_lower > pd_upper || pd_special < pd_upper ||
        pd_lower < SizeOfPageHeaderData || pd_special > BLCKSZ)
    {
        elog(ERROR, "invalid page at %X/%08X\n",
             ((PageHeader) page)->pd_lsn.xlogid,
             ((PageHeader) page)->pd_lsn.xrecoff);
    }
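
For readers without the patch at hand, here is a minimal standalone sketch of the check above, followed by one plausible masking step. It is not the patch's code: SketchPageHeader, SKETCH_BLCKSZ, SKETCH_HEADER_SIZE and both sketch_* functions are hypothetical names, the header is reduced to the three offsets the check reads, and zeroing the hole between pd_lower and pd_upper is only an assumption about what a function named mask_unused_space might do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed block size for this sketch (PostgreSQL's default is 8192). */
#define SKETCH_BLCKSZ 8192

/* Hypothetical reduced page header: only the offsets the check reads. */
typedef struct SketchPageHeader
{
    uint16_t pd_lower;      /* offset to start of free space */
    uint16_t pd_upper;      /* offset to end of free space */
    uint16_t pd_special;    /* offset to start of special space */
} SketchPageHeader;

/* Stand-in for SizeOfPageHeaderData; the real header is larger. */
#define SKETCH_HEADER_SIZE sizeof(SketchPageHeader)

/*
 * Same four conditions as the check quoted above, returned as a boolean
 * instead of raising an error.
 */
bool
sketch_page_layout_is_valid(const SketchPageHeader *hdr)
{
    assert(hdr != NULL);

    return !(hdr->pd_lower > hdr->pd_upper ||
             hdr->pd_special < hdr->pd_upper ||
             hdr->pd_lower < SKETCH_HEADER_SIZE ||
             hdr->pd_special > SKETCH_BLCKSZ);
}

/*
 * Assumed masking step: zero the bytes in [pd_lower, pd_upper).
 * Returns false and leaves the page untouched if the layout is invalid.
 */
bool
sketch_mask_unused_space(char *page)
{
    SketchPageHeader hdr;

    assert(page != NULL);

    /* Copy the header out so no alignment of "page" is assumed. */
    memcpy(&hdr, page, sizeof(hdr));

    if (!sketch_page_layout_is_valid(&hdr))
        return false;

    memset(page + hdr.pd_lower, 0, (size_t) (hdr.pd_upper - hdr.pd_lower));
    return true;
}
```

Under this sketch, a header whose three offsets are all zero is rejected, because pd_lower is then smaller than the header size. The layout has to be validated before masking because the memset length is derived from pd_upper - pd_lower.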

frame #4: 0x000000010437dec5 postgres`CheckForBufferLeaks + 165 at bufmgr.c:1778
frame #5: 0x000000010437df1e postgres`AtProcExit_Buffers(code=1, arg=0) + 30 at bufmgr.c:1750
frame #6: 0x000000010438fded postgres`shmem_exit(code=1) + 301 at ipc.c:263
frame #7: 0x000000010438fc1c postgres`proc_exit_prepare(code=1) + 124 at ipc.c:187
frame #8: 0x000000010438fb63 postgres`proc_exit(code=1) + 19 at ipc.c:102
frame #9: 0x0000000104555b3c postgres`errfinish(dummy=0) + 1180 at elog.c:555
frame #10: 0x00000001045590de postgres`elog_finish(elevel=20, fmt=0x0000000104633d4f) + 830 at elog.c:1362
frame #11: 0x000000010437c1af postgres`mask_unused_space(page=0x00007fff5bc20a70) + 159 at bufcapt.c:78
frame #12: 0x000000010437b53d postgres`mask_heap_page(page=0x00007fff5bc20a70) + 29 at bufcapt.c:95
frame #13: 0x000000010437b1cd postgres`buffer_capture_write(newcontent=0x0000000104ab3980, blkno=0) + 205 at bufcapt.c:329
frame #14: 0x000000010437bc7d postgres`buffer_capture_forget(buffer=82) + 349 at bufcapt.c:433
frame #15: 0x00000001043801c9 postgres`LockBuffer(buffer=82, mode=0) + 233 at bufmgr.c:2773
frame #16: 0x00000001043800c8 postgres`UnlockReleaseBuffer(buffer=82) + 24 at bufmgr.c:2554
frame #17: 0x0000000103fefb03 postgres`ginRedoUpdateMetapage(lsn=335350144, record=0x00007fb1740382f0) + 1843 at ginxlog.c:580
frame #18: 0x0000000103fede96 postgres`gin_redo(lsn=335350144, record=0x00007fb1740382f0) + 534 at ginxlog.c:724
frame #19: 0x00000001040ad692 postgres`StartupXLOG + 8482 at xlog.c:6810
frame #20: 0x0000000104330e0e postgres`StartupProcessMain + 430 at startup.c:224
frame #21: 0x00000001040c64d9 postgres`AuxiliaryProcessMain(argc=2, argv=0x00007fff5bc231b0) + 1897 at bootstrap.c:416
frame #22: 0x000000010432b1a8 postgres`StartChildProcess(type=StartupProcess) + 328 at postmaster.c:5090
frame #23: 0x00000001043292b9 postgres`PostmasterMain(argc=3, argv=0x00007fb173c044e0) + 5401 at postmaster.c:1212
frame #24: 0x000000010426f995 postgres`main(argc=3, argv=0x00007fb173c044e0) + 773 at main.c:219

Note that I have never seen this with vanilla code, only with this patch
applied.

Regards,
--
Michael
