Re: dynamic shared memory

From: Jim Nasby <jim(at)nasby(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic shared memory
Date: 2013-09-06 19:40:46
Message-ID: 522A2FBE.9090104@nasby.net
Lists: pgsql-hackers

On 9/5/13 11:37 AM, Robert Haas wrote:
>> ISTM that at some point we'll want to look at putting top-level shared
>> memory into this system (ie: allowing dynamic resizing of GUCs that affect
>> shared memory size).
> A lot of people want that, but being able to resize the shared memory
> chunk itself is only the beginning of the problem. So I wouldn't hold
> my breath.

<starts breathing again>

>> Wouldn't it protect against a crash while writing the file? I realize the
>> odds of that are pretty remote, but AFAIK it wouldn't cost that much to
>> write a new file and do an atomic mv...
> If there's an OS-level crash, we don't need the state file; the shared
> memory will be gone anyway. And if it's a PostgreSQL-level failure,
> this game neither helps nor hurts.
>
>>> Sure. A messed-up backend can clobber the control segment just as it
>>> can clobber anything else in shared memory. There's really no way
>>> around that problem. If the control segment has been overwritten by a
>>> memory stomp, we can't use it to clean up. There's no way around that
>>> problem except to not use the control segment, which wouldn't be better.
>>
>> Are we trying to protect against "memory stomps" when we restart after a
>> backend dies? I thought we were just trying to ensure that all shared data
>> structures were correct and consistent. If that's the case, then I was
>> thinking that by using a pointer that can be updated in a CPU-atomic fashion
>> we know we'd never end up with a corrupted entry that was in use; the
>> partial write would be to a slot with nothing pointing at it so it could be
>> safely reused.
> When we restart after a backend dies, shared memory contents are
> completely reset, from scratch. This is true of both the fixed size
> shared memory segment and of the dynamic shared memory control
> segment. The only difference is that, with the dynamic shared memory
> control segment, we need to use the segment for cleanup before
> throwing it out and starting over. Extra caution is required because
> we're examining memory that could hypothetically have been stomped on;
> we must not let the postmaster do anything suicidal.

Not doing something suicidal is what I'm worried about (that and not cleaning up as well as possible).

The specific scenario I'm worried about is something like a PANIC in the middle of the snprintf call in dsm_write_state_file(). That would leave the file in a completely unknown state, so who knows what would happen on restart. ISTM that writing a temp file and then doing a filesystem mv (i.e. a rename) would eliminate that issue.

Or is it safe to assume that the snprintf call will be atomic since we're just spitting out a long?
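
For concreteness, here's roughly what I mean by temp file plus mv. This is a standalone POSIX sketch, not a patch against dsm_write_state_file(): the function name write_state_file_atomically() and the ".tmp" suffix are made up for illustration, and real backend code would presumably use the backend's own file and error-reporting routines rather than raw open()/write(). It also skips fsyncing the containing directory.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the real dsm_write_state_file()): store a single
 * long in "path" such that a reader sees either the previous contents or the
 * complete new contents, never a half-written file.
 * Returns 0 on success, -1 on failure.
 */
int
write_state_file_atomically(const char *path, long value)
{
    char    tmppath[1024];
    char    buf[64];
    int     len;
    int     fd;

    /* Keep the temp file next to the target so rename() stays on one filesystem. */
    len = snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);
    if (len < 0 || (size_t) len >= sizeof(tmppath))
        return -1;

    /* Format the value into a local buffer first; nothing is on disk yet. */
    len = snprintf(buf, sizeof(buf), "%ld\n", value);
    if (len < 0 || (size_t) len >= sizeof(buf))
        return -1;

    fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Write and flush the temp file; on any failure, discard it. */
    if (write(fd, buf, (size_t) len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        unlink(tmppath);
        return -1;
    }
    if (close(fd) != 0)
    {
        unlink(tmppath);
        return -1;
    }

    /* POSIX rename() replaces the target atomically. */
    if (rename(tmppath, path) != 0)
    {
        unlink(tmppath);
        return -1;
    }
    return 0;
}
```

If we die anywhere before the rename(), the worst case is a stray .tmp file; the real state file is either the old one or not there at all.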
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
