Re: Corrupt WAL production possible in gistxlog.c

Lists: pgsql-hackers
From: Yoichi Hirai <yh(at)is(dot)s(dot)u-tokyo(dot)ac(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Corrupt WAL production possible in gistxlog.c
Date: 2009-12-24 03:21:14
Message-ID: 87tyvhf45x.wl%yh@is.s.u-tokyo.ac.jp

Hello,

I was reading the GiST core code when I found an XLogInsert()
call that can produce a corrupt WAL record.

== Summary ==
There is an execution path that produces a WAL record whose
xl_info indicates XLOG_GIST_PAGE_UPDATE while the record
actually contains a gistxlogPageSplit structure.

== Details ==
(Line numbers are for HEAD as of Wed Dec 23 19:42:15 2009 +0000.)

The problematic XLogInsert() call is in gistxlog.c, line 770:
    recptr = XLogInsert(RM_GIST_ID, XLOG_GIST_PAGE_UPDATE, rdata);
where the last argument, rdata, is assigned a pointer either
on line 741 or on line 752.

When rdata comes from formSplitRdata() at line 741,
rdata contains a reference to a gistxlogPageSplit structure.
This is inconsistent with the second argument XLOG_GIST_PAGE_UPDATE.
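
The mismatch can be illustrated with a small, self-contained toy model.
This is only a sketch and not the attached patch or PostgreSQL code: every
name prefixed with toy_ is hypothetical, the enum values are arbitrary, and
the structures are much simpler than the real ones. It shows one way to keep
the record tag and the payload in agreement, by choosing both in the same
branch, plus a check that detects a record whose tag and payload disagree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the record tags; values are arbitrary. */
enum toy_info { TOY_PAGE_UPDATE = 1, TOY_PAGE_SPLIT = 2 };

/* Hypothetical payload kinds, tracked separately from the tag. */
enum toy_payload_kind { TOY_PAYLOAD_UPDATE = 1, TOY_PAYLOAD_SPLIT = 2 };

/* Simplified payloads; the real structures carry more fields. */
typedef struct { uint32_t blkno; uint16_t ntodelete; } toy_page_update;
typedef struct { uint32_t origblkno; uint16_t npage; } toy_page_split;

typedef struct {
    enum toy_info info;          /* what the record claims to contain */
    enum toy_payload_kind kind;  /* what the payload actually is */
    const void *payload;
    size_t len;
} toy_record;

/* Build a record, setting the tag in the same branch that picks the
 * payload, so the two cannot drift apart. */
toy_record toy_make_record(int is_split,
                           const toy_page_split *split,
                           const toy_page_update *update)
{
    toy_record rec;

    if (is_split) {
        rec.info = TOY_PAGE_SPLIT;
        rec.kind = TOY_PAYLOAD_SPLIT;
        rec.payload = split;
        rec.len = sizeof(*split);
    } else {
        rec.info = TOY_PAGE_UPDATE;
        rec.kind = TOY_PAYLOAD_UPDATE;
        rec.payload = update;
        rec.len = sizeof(*update);
    }
    return rec;
}

/* Return 1 if the tag matches the payload kind, 0 otherwise. */
int toy_record_is_consistent(const toy_record *rec)
{
    switch (rec->info) {
    case TOY_PAGE_SPLIT:
        return rec->kind == TOY_PAYLOAD_SPLIT;
    case TOY_PAGE_UPDATE:
        return rec->kind == TOY_PAYLOAD_UPDATE;
    default:
        return 0;
    }
}
```

In this model, a record built from the split branch but tagged
TOY_PAGE_UPDATE fails toy_record_is_consistent(), which corresponds to the
situation described above.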

== Importance ==
I think this could cause data loss after multiple consecutive crashes.

== Fix ==
I attach a simple patch (for HEAD as of the datetime above)
that, I suppose, prevents the corrupt WAL production.
I would be glad if you liked it.

Please note that the problematic execution path exists at
least in current HEAD, REL8_2_STABLE and the branches in between.

Sincerely,

--
Yoichi Hirai
Dept. of Computer Science, The University of Tokyo.

Attachment: gistxlog_fix_xlinfo.patch (text/x-patch, 1.5 KB)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yoichi Hirai <yh(at)is(dot)s(dot)u-tokyo(dot)ac(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Corrupt WAL production possible in gistxlog.c
Date: 2009-12-24 17:53:32
Message-ID: 18571.1261677212@sss.pgh.pa.us

Yoichi Hirai <yh(at)is(dot)s(dot)u-tokyo(dot)ac(dot)jp> writes:
> I was reading the GiST core code when I found an XLogInsert()
> call that can produce a corrupt WAL record.

Thanks for the report! (We didn't really need nine copies though ;-))
Applied back to 8.2.

regards, tom lane