Re: Recursive ReceiveSharedInvalidMessages not safe

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Recursive ReceiveSharedInvalidMessages not safe
Date: 2014-05-05 19:28:57
Message-ID: 20140505192857.GG17909@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-05-05 14:15:58 -0400, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > While investigating an issue pointed out by valgrind around undefined
> > bytes in inval.c SHAREDINVALSMGR_ID processing I noticed that there's a
> > bug in ReceiveSharedInvalidMessages(). It tries to be safe against
> > recursion but it's not:
> > When it recurses into ReceiveSharedInvalidMessages() from it's main loop
> > from inside the inval callback while nextmsg = nummsgs it'll overwrite
> > the 'messages' array with new contents. But at this point the old
> > content of one entry in the messages array is still passed to
> > the LocalExecuteInvalidationMessage() that caused the recursion.
>
> Hm, yeah, so if the called inval function continues to use the message
> contents after doing something that could result in a recursive call,
> it might be looking at trashed data.

FWIW, with a bit of changes around it to make it more visible
(malloc/freeing the array) this sometimes triggers during make check. So
it's quite plausible that this is the caused the relfilenodemap
regression error we were a bit confused about.

I started to look at this code because valgrind was bleating at me
inside inval.c and for the heck of I couldn't understand wh. Since then
this lead me into quite a wild goose chase. Leading me to discover a
couple of things I am not perfectly happy about:

a) SICleanupQueue() sometimes releases and reacquires the write lock
held on the outside. That's pretty damn fragile, not to mention
ugly. Even slight reformulations of the code in SIInsertDataEntries()
can break this... Can we please extend the comment in the latter that
to mention the lock dropping explicitly?
b) we right/left shift -1 in a signed int by 16 in
CacheInvalidateSmgr/LocalExecuteInvalidationMessage(). IIRC that's
implementation defined behaviour.
c) The ProcessMessageList() calls access msg->rc.id to test for the type
of the existing message. That's not nice. The compiler is entirely
free to e.g. copy the entire struct to local memory causing
unitialized memory to be accessed. Entirely cosmetic, but ...

d)
The valgrind splats I was very occasionally getting were:
==21013== Conditional jump or move depends on uninitialised value(s)
==21013== at 0x8DD755: hash_search_with_hash_value (dynahash.c:885)
==21013== by 0x8DD60F: hash_search (dynahash.c:811)
==21013== by 0x799A58: smgrclosenode (smgr.c:357)
==21013== by 0x8B6ABF: LocalExecuteInvalidationMessage (inval.c:566)
==21013== by 0x77BBCF: ReceiveSharedInvalidMessages (sinval.c:127)
==21013== by 0x8B6C3F: AcceptInvalidationMessages (inval.c:640)
==21013== by 0x77FDCC: LockRelationOid (lmgr.c:107)
==21013== by 0x4A6F33: relation_open (heapam.c:1029)
==21013== by 0x4A71EB: heap_open (heapam.c:1195)
==21013== by 0x8B4228: SearchCatCache (catcache.c:1223)
==21013== by 0x8C61B1: SearchSysCache (syscache.c:909)
==21013== by 0x8C62CD: GetSysCacheOid (syscache.c:987)
==21013== Uninitialised value was created by a stack allocation
==21013== at 0x8B64C3: AddCatcacheInvalidationMessage (inval.c:328)

After far, far too much confused head scratching, code reading, random
elog()s et al I found out that this is just because of a deficiency in
valgrind's undefinedness tracking. The problem is that it doesn't really
understand shared memory sufficiently. When a backend stored a message
in the SI ringbuffer it sets a bitmask about initialized memory for a
specific position in the ringubffer. If the *same* backend reads from
the same position in the ringbuffer (4096 messages later) into which
another backend has stored a *different* type of message the bitmap of
initialized bytes will possibly be wrong if the padding is distributed
differently.

Unfortunately this cannot precisely be caught by valgrind's
suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg,
0) in AddCatcacheInvalidationMessage() et al. to suppress these
warnings. Imo we can just add them unconditionally, but if somebody else
prefers we can add #ifdef USE_VALGRIND around them.

I can do a), c), d), if others agree but I am not sure about b)?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-05-05 19:41:22 Re: Recursive ReceiveSharedInvalidMessages not safe
Previous Message Andres Freund 2014-05-05 19:12:15 Re: pg_shmem_allocations view