Re: quickdie doing memory allocations (was atomic pin/unpin causing errors)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: quickdie doing memory allocations (was atomic pin/unpin causing errors)
Date: 2016-05-05 15:51:32
Message-ID: 20160505155132.rfmmyi3ppcyvt3gn@alap3.anarazel.de
Lists: pgsql-hackers

Hi Teodor,

Thanks for analyzing this.

On 2016-05-05 13:50:09 +0300, Teodor Sigaev wrote:
> > I'll try to get a coredump after SIGSEGV, but it could take some time.
>
> Got it!
>
> #0 0x00000008014321d7 in sbrk () from /lib/libc.so.7
> #1 0x0000000801431ddd in sbrk () from /lib/libc.so.7
> #2 0x000000080142e5bb in sbrk () from /lib/libc.so.7
> #3 0x000000080142e085 in sbrk () from /lib/libc.so.7
> #4 0x000000080142de28 in sbrk () from /lib/libc.so.7
> #5 0x000000080142e1cf in sbrk () from /lib/libc.so.7
> #6 0x0000000801439815 in free () from /lib/libc.so.7
> #7 0x000000080149e3d6 in nsdispatch () from /lib/libc.so.7
> #8 0x00000008014a41c6 in __cxa_finalize () from /lib/libc.so.7
> #9 0x000000080144525c in exit () from /lib/libc.so.7
> #10 0x00000000008e1bc2 in quickdie (postgres_signal_arg=3) at postgres.c:2623
> #11 <signal handler called>
> #12 0x0000000801431847 in sbrk () from /lib/libc.so.7
> #13 0x0000000801431522 in sbrk () from /lib/libc.so.7
> #14 0x000000080142d47f in sbrk () from /lib/libc.so.7
> #15 0x0000000801434628 in malloc () from /lib/libc.so.7
> #16 0x0000000000aca278 in AllocSetAlloc (context=0x801c0bb88, size=24) at aset.c:853
> #17 0x0000000000acca0e in MemoryContextAlloc (context=0x801c0bb88, size=24)
> at mcxt.c:764
> #18 0x0000000000aebdb8 in PushActiveSnapshot (snap=0xf4ae10) at snapmgr.c:652
> #19 0x00000000008e54bd in exec_bind_message (input_message=0x7fffffffdf60)
> at postgres.c:1602
> #20 0x00000000008e3957 in PostgresMain (argc=1, argv=0x801d3c968,
> dbname=0x801d3c948 "teodor", username=0x801d3c928 "teodor") at
> postgres.c:4105
> #21 0x0000000000839744 in BackendRun (port=0x801c991c0) at postmaster.c:4258
> #22 0x0000000000838d54 in BackendStartup (port=0x801c991c0) at postmaster.c:3932
> #23 0x0000000000835617 in ServerLoop () at postmaster.c:1690
> #24 0x0000000000832c69 in PostmasterMain (argc=4, argv=0x7fffffffe420) at
> postmaster.c:1298
> #25 0x000000000075f228 in main (argc=4, argv=0x7fffffffe420) at main.c:228
>
> It seems we have some memory corruption, but it could be either a separate
> problem or the same one.

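That looks like an independent issue, namely that we're triggering memory
allocations from a signal handler (see frames 12, 11, 10, 9): the signal
arrived while the backend was inside malloc(), and quickdie() then called
exit(), which ended up in free(). Presumably that's due to system-registered
atexit handlers. I suspect we should be using _exit() here? Tom?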

Greetings,

Andres Freund
