Re: Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>
Subject: Re: Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?
Date: 2014-02-24 21:17:31
Message-ID: 20140224211731.GM6718@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2014-02-24 17:55:14 -0300, Alvaro Herrera wrote:
> Greg Stark wrote:
> > I have a database where a a couple rows don't appear in index scans
> > but do appear in sequential scans. It looks like the same problem as
> > Peter reported but this is a different database. I've extracted all
> > the xlogdump records and below are the ones I think are relevant. You
> > can see that lp 2 gets a few HOT updates and concurrently has someone
> > create a MultiXact NO KEY UPDATE lock while one of those HOT updates
> > is pending but not committed.

Per se the sequence of records doesn't look bad (even though I am not
happy that we log intermediate and final rows first, and only then the
start of the chain).

> > The net result seems to be that the ctid
> > update chain got broken. The index of course points to the head of the
> > HOT chain so it doesn't find the live tail whereas the sequential scan
> > picks it up.

Yea, that's the problem.

> > rmgr: Heap len (rec/tot): 291/ 323, tx: 5943849, lsn: FD/2F0ADFC0, prev FD/2F0ADF90, bkp: 0000, desc: hot_update: rel 1663/16385/212653; tid 13065/2 xmax 5943849 ; new tid 13065/3 xmax 0
> > rmgr: Heap2 len (rec/tot): 25/ 57, tx: 5943851, lsn: FD/2F0AE450, prev FD/2F0AE408, bkp: 0000, desc: lock updated: xmax 5943851 msk 000a; rel 1663/16385/212653; tid 13065/3
> > rmgr: MultiXact len (rec/tot): 28/ 60, tx: 5943851, lsn: FD/2F0AE490, prev FD/2F0AE450, bkp: 0000, desc: create mxid 728896 offset 1632045 nmembers 2: 5943849 (nokeyupd) 5943851 (keysh)
> > rmgr: Heap len (rec/tot): 25/ 57, tx: 5943851, lsn: FD/2F0AE4D0, prev FD/2F0AE490, bkp: 0000, desc: lock 728896: rel 1663/16385/212653; tid 13065/2 IS_MULTI EXCL_LOCK

> > lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff |
> > ----+--------+----------+--------+---------+---------+----------+------------+-------------+------------+--------+-
> > 2 | 3424 | 1 | 232 | 5943845 | 728896 | 0 | (13065,2) | 32 | 4419 | 32 |
> > 3 | 3152 | 1 | 272 | 5943849 | 5943879 | 0 | (13065,4) | 49184 | 9475 | 32 |
> > 4 | 2864 | 1 | 287 | 5943879 | 5943880 | 0 | (13065,7) | 49184 | 9475 | 32 |
> > 7 | 2576 | 1 | 287 | 5943880 | 0 | 0 | (13065,7) | 32800 | 10499 | 32 |

Those together explain the story. Note this bit:

static void
heap_xlog_lock(XLogRecPtr lsn, XLogRecord *record)
{
...
HeapTupleHeaderClearHotUpdated(htup);
HeapTupleHeaderSetXmax(htup, xlrec->locking_xid);
HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
/* Make sure there is no forward chain link in t_ctid */
htup->t_ctid = xlrec->target.tid;
...
}

So, the replay of FD/2F0AE4D0 breaks the ctid chain *and* unsets the
HOT_UPDATED flag.

Which means fkey locks have never properly worked across SR/crash
recovery.

Haven't thought about how to fix it yet, I hope won't have to (hint hint).

We somehow need to have a policy of testing changes to the WAL format
without full_page_writes. They hide bugs in replay far, far too often.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-02-24 21:31:03 Re: Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?
Previous Message Andrew Dunstan 2014-02-24 21:02:54 Re: jsonb and nested hstore