Quick Links

New WAL code dumps core trivially on replay of bad data

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	New WAL code dumps core trivially on replay of bad data
Date:	2012-08-18 01:46:55
Message-ID:	10185.1345254415@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

I just had HEAD fail to recover after a backend core dump, because the
startup process dumped core itself during replay:

LOG: all server processes terminated; reinitializing
LOG: database system was interrupted; last known up at 2012-08-17 20:47:37 EDT
LOG: database system was not properly shut down; automatic recovery in progress
LOG: startup process (PID 9418) was terminated by signal 11
LOG: aborting startup due to startup process failure

The startup process's stack trace is

#0 0x26fd1c in RecordIsValid (record=0x4008d7a0, recptr=80658424, emode=15)
at xlog.c:3713
3713 COMP_CRC32(crc, XLogRecGetData(record), len);
(gdb) bt
#0 0x26fd1c in RecordIsValid (record=0x4008d7a0, recptr=80658424, emode=15)
at xlog.c:3713
#1 0x270690 in ReadRecord (RecPtr=0x7b03bad0, emode=15,
fetching_ckpt=0 '\000') at xlog.c:4006
#2 0x2761e8 in StartupXLOG () at xlog.c:6550
#3 0x463d28 in StartupProcessMain () at startup.c:222
#4 0x2ade78 in AuxiliaryProcessMain (argc=2, argv=0x7b03b850)
at bootstrap.c:418
#5 0x4631e0 in StartChildProcess (type=StartupProcess) at postmaster.c:4564
#6 0x461ddc in PostmasterStateMachine () at postmaster.c:3167
#7 0x460e48 in reaper (postgres_signal_arg=1074321312) at postmaster.c:2609

The current WAL address is 80658424 == 0x04cebff8, that is just 8 bytes
short of a page boundary, and what RecordIsValid thinks it is dealing
with is

(gdb) p *record
$1 = {xl_tot_len = 0, xl_xid = 0, xl_len = 1074108016, xl_info = 0 '\000',
xl_rmid = 0 '\000', xl_prev = 412316860416, xl_crc = 64}
(gdb) x/32 record
0x4008d7a0: 0x00000000 0x00000000 0x40059670 0x00008009
0x4008d7b0: 0x00000060 0x00000000 0x00000040 0x10000000
0x4008d7c0: 0x00000000 0x04cebec0 0x620583d8 0x00000000
0x4008d7d0: 0x00000000 0x04cebf98 0x00000001 0x01000000
0x4008d7e0: 0x00000000 0x00001947 0x0000c000 0x00000001
0x4008d7f0: 0x00000000 0x0000029a 0x00000001 0x00000000
0x4008d800: 0x00000000 0x502ee5c0 0x00000000 0x00000000
0x4008d810: 0x00000000 0x00000000 0x00000000 0x00000000

so it merrily tries to compute a checksum on a gigabyte worth of data,
and soon falls off the end of memory.

In reality, inspection of the WAL file suggests that this is the end of
valid data and what should have happened is that replay just stopped.
The xl_len and so forth shown above are just garbage from off the end of
what was actually read from the file (everything beyond offset 0xcebff8
in file 4 is in fact zeroes).

I'm not sure whether this is just a matter of having failed to
sanity-check that xl_tot_len is at least SizeOfXLogRecord, or whether
there is a deeper problem with the new design of continuation records
that makes it impossible to validate records safely.

Certainly it's completely insane for RecordIsValid to be trusting xl_len
so implicitly as it does, and to be blithely CRC'ing backup blocks that
could extend off the known length of the record as well --- that is,
checking only at line 3738 to see if we've fallen off the end of memory
is a tad too late. I realize that that code was like that before, but
apparently the earlier logic protected it to a greater extent than it
does now.

regards, tom lane

Responses

Re: New WAL code dumps core trivially on replay of bad data at 2012-08-18 05:52:01 from Amit kapila

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jeff Janes	2012-08-18 02:18:52	"CLUSTER VERBOSE" tab completion
Previous Message	Euler Taveira	2012-08-17 23:08:49	Re: NOT NULL constraints in foreign tables