Re: crash with assertions and WAL_DEBUG

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: crash with assertions and WAL_DEBUG
Date: 2014-06-24 14:47:22
Message-ID: 20140624144722.GH5032@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> On 06/21/2014 01:58 PM, Heikki Linnakangas wrote:
> >It's a bit difficult to attach the mark to the palloc calls, as neither
> >the WAL_DEBUG nor the LWLOCK_STATS code is calling palloc directly, but
> >marking specific MemoryContexts as sanctioned ought to work. I'll take a
> >stab at that.
>
> I came up with the attached patch. It adds a function called
> MemoryContextAllowInCriticalSection(), which can be used to exempt
> specific memory contexts from the assertion. The following contexts
> are exempted:

There is a typo in the comment on that function: "This functions can be
used" (s/functions/function/).

Andres Freund wrote:

> > @@ -1258,6 +1259,25 @@ begin:;
> >      if (XLOG_DEBUG)
> >      {
> >          StringInfoData buf;
> > +        static MemoryContext walDebugCxt = NULL;
> > +        MemoryContext oldCxt;
> > +
> > +        /*
> > +         * Allocations within a critical section are normally not allowed,
> > +         * because allocation failure would lead to a PANIC. But this is just
> > +         * debugging code that no-one is going to enable in production, so we
> > +         * don't care. Use a memory context that's exempt from the rule.
> > +         */
> > +        if (walDebugCxt == NULL)
> > +        {
> > +            walDebugCxt = AllocSetContextCreate(TopMemoryContext,
> > +                                                "WAL Debug",
> > +                                                ALLOCSET_DEFAULT_MINSIZE,
> > +                                                ALLOCSET_DEFAULT_INITSIZE,
> > +                                                ALLOCSET_DEFAULT_MAXSIZE);
> > +            MemoryContextAllowInCriticalSection(walDebugCxt, true);
> > +        }
> > +        oldCxt = MemoryContextSwitchTo(walDebugCxt);
>
> This will only work though if the first XLogInsert() isn't called from a
> critical section. I'm not sure it's a good idea to rely on that.

Ah, true -- AllocSetContextCreate cannot be called from within a
critical section.
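
To make the failure mode concrete, here is a small standalone C sketch
(not PostgreSQL code). crit_section_count, MockContext,
mock_context_create and wal_debug_cxt_get are made-up stand-ins for
CritSectionCount, MemoryContextData, AllocSetContextCreate and the
create-on-first-use logic in the hunk. The mock simply refuses to create
a context inside a critical section; that is an assumption about how the
restriction would show up, not the real behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's CritSectionCount. */
int crit_section_count = 0;

/* Hypothetical, heavily simplified memory context. */
typedef struct MockContext
{
    const char *name;
    bool        allow_in_crit_section;
} MockContext;

/* Static storage, so the mock itself never needs to allocate. */
MockContext  wal_debug_storage;
MockContext *wal_debug_cxt = NULL;

/*
 * Mock of context creation. Returns NULL inside a critical section to model
 * that the real call is not allowed there (assumption: the real code would
 * fail an assertion rather than return NULL).
 */
MockContext *
mock_context_create(const char *name)
{
    if (crit_section_count > 0)
        return NULL;
    wal_debug_storage.name = name;
    wal_debug_storage.allow_in_crit_section = false;
    return &wal_debug_storage;
}

/*
 * Create-on-first-use, as in the quoted hunk. Calling this once during
 * startup, outside any critical section, turns it into eager creation.
 */
MockContext *
wal_debug_cxt_get(void)
{
    if (wal_debug_cxt == NULL)
    {
        wal_debug_cxt = mock_context_create("WAL Debug");
        if (wal_debug_cxt != NULL)
            wal_debug_cxt->allow_in_crit_section = true;
    }
    return wal_debug_cxt;
}

/* Mimics the proposed rule: in a critical section, only exempt contexts may allocate. */
bool
alloc_permitted(const MockContext *cxt)
{
    assert(cxt != NULL);
    return crit_section_count == 0 || cxt->allow_in_crit_section;
}
```

If the very first wal_debug_cxt_get() call happens while
crit_section_count > 0, it returns NULL. Calling it once beforehand
makes later calls inside a critical section return the exempt context.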

> > diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
> > index 3c1c81a..4264373 100644
> > --- a/src/backend/storage/smgr/md.c
> > +++ b/src/backend/storage/smgr/md.c
> > @@ -219,6 +219,16 @@ mdinit(void)
> >                                        &hash_ctl,
> >                                        HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
> >          pendingUnlinks = NIL;
> > +
> > +        /*
> > +         * XXX: The checkpointer needs to add entries to the pending ops
> > +         * table when absorbing fsync requests. That is done within a critical
> > +         * section. It means that there's a theoretical possibility that you
> > +         * run out of memory while absorbing fsync requests, which leads to
> > +         * a PANIC. Fortunately the hash table is small so that's unlikely to
> > +         * happen in practice.
> > +         */
> > +        MemoryContextAllowInCriticalSection(MdCxt, true);
> >      }
> >  }
>
> Isn't that allowing a bit too much? We e.g. shouldn't allow
> _fdvec_alloc() within a critical section. Might make sense to create a
> child context for it.

I agree.
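
A rough standalone sketch of the child-context idea (again not
PostgreSQL code): MockContext, crit_section_count, alloc_permitted, the
"Pending Ops Context" name and the two *_cxt() helpers are all
hypothetical stand-ins. It only illustrates the granularity: the
exemption is set on a child that holds the pending ops table, so
anything else allocating in the parent stays subject to the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's CritSectionCount. */
int crit_section_count = 0;

/* Hypothetical, heavily simplified memory context with a parent link. */
typedef struct MockContext
{
    const char         *name;
    struct MockContext *parent;
    bool                allow_in_crit_section;
} MockContext;

/* Parent context for md.c allocations in general: not exempt. */
MockContext md_cxt = {"MdCxt", NULL, false};

/* Child context used only for the pending ops table: exempt. */
MockContext pending_ops_cxt = {"Pending Ops Context", &md_cxt, true};

/* Mimics the proposed rule: in a critical section, only exempt contexts may allocate. */
bool
alloc_permitted(const MockContext *cxt)
{
    assert(cxt != NULL);
    return crit_section_count == 0 || cxt->allow_in_crit_section;
}

/* Where an fsync-request entry would be allocated in this sketch. */
const MockContext *
pending_ops_alloc_cxt(void)
{
    return &pending_ops_cxt;
}

/* Where something like _fdvec_alloc() would allocate in this sketch. */
const MockContext *
fdvec_alloc_cxt(void)
{
    return &md_cxt;
}
```

With crit_section_count > 0, alloc_permitted(pending_ops_alloc_cxt())
is true while alloc_permitted(fdvec_alloc_cxt()) is false, which is the
narrower exemption being asked for.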

Rahila Syed wrote:

> The patch gives the following error on compilation:
>
> mcxt.c: In function ‘MemoryContextAllowInCriticalSection’:
> mcxt.c:322: error: ‘struct MemoryContextData’ has no member named
> ‘allowInCriticalSection’
>
> The member in MemoryContextData is defined as 'allowInCritSection' while
> the MemoryContextAllowInCriticalSection accesses the field as
> 'context->allowInCriticalSection'.

It appears Heikki did a search-and-replace on "->allowInCritSection"
before submitting. Because the pattern includes the "->", it did not
match the struct declaration, which kept the old name.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
