Re: configurability of OOM killer

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: configurability of OOM killer
Date: 2008-02-04 18:57:26
Message-ID: 1202151446.10057.759.camel@dogma.ljc.laika.com
Lists: pgsql-hackers

On Fri, 2008-02-01 at 19:08 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > This page
> > http://linux-mm.org/OOM_Killer
>
> Egad. Whoever thought *this* was a good idea should be taken out
> and shot:

+1

> /*
> * Processes which fork a lot of child processes are likely
> * a good choice. We add the vmsize of the childs if they
> * have an own mm. This prevents forking servers to flood the
> * machine with an endless amount of childs
> */
>
> In other words, server daemons are preferentially killed, and the parent
> will *always* get zapped in place of its child (since the child cannot
> have a higher score). No wonder we have to turn off OOM kill.
>

Technically, the child could have a higher score, because the parent's
score only counts half of the total vm size of the children. At first
glance it's not that bad an idea, except that it takes into account the
total vm size (including shared memory), not only memory that is
exclusive to the process in question.

It's pretty easy to see that badness() (the function that determines
which process is killed when the OOM killer is invoked) will count the
same byte of memory many times over when calculating the "badness" of a
process like the postgres daemon. If you have shared_buffers=1GB on a
4GB box, and 100 connections open, badness() apparently thinks
PostgreSQL is using about 50GB of memory. Oops. One would think a VM
hacker would know better.
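
To make the arithmetic concrete, here is a rough sketch of the double counting. It is a simplified model, not the kernel's actual badness() code: the function name naive_badness_bytes is made up for illustration, and it assumes every backend maps the whole shared segment and has negligible private memory. It also ignores the other inputs to badness(), such as how long the process has been alive.

```python
GB = 1024 ** 3

def naive_badness_bytes(own_vm, child_vms):
    """Simplified model: own total VM size plus half of each child's VM size.

    Hypothetical helper for illustration only; the real heuristic has
    further adjustments that are left out here.
    """
    return own_vm + sum(vm // 2 for vm in child_vms)

shared_buffers = 1 * GB
connections = 100

# Assumption: each process's VM size is just the shared segment,
# so the same 1GB of shared memory shows up in every process.
postmaster_vm = shared_buffers
backend_vms = [shared_buffers] * connections

score = naive_badness_bytes(postmaster_vm, backend_vms)
print(f"apparent usage of the parent: {score / GB:.0f} GB")
print(f"shared memory actually allocated: {shared_buffers / GB:.0f} GB")
```

Under these assumptions the parent is charged 1GB for itself plus 0.5GB for each of the 100 children, which is where the "about 50GB" figure comes from, even though only 1GB of shared memory exists.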

I tried bringing this up on LKML several times (Ron Mayer linked to one
of my posts: http://lkml.org/lkml/2007/2/9/275). If anyone has an inside
connection to the Linux developer community, I suggest that they raise
this issue.

If you want to experiment, start a postgres process with shared_buffers
set at 25% of the available memory, and then start about 100 idle
connections. Then, start a process that just slowly eats memory, such
that it will invoke the OOM killer after a couple minutes (badness()
takes into account the time the process has been alive, as well, so you
can't just eat memory in a tight loop).

The postgres process will always be killed first. The OOM killer will
then find that this didn't alleviate the memory pressure much, and only
then kill the runaway process.
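
The "process that just slowly eats memory" in the experiment above could be sketched roughly as follows. This is only one way to do it: the function name eat_memory, the chunk size and the interval are arbitrary choices, not anything from the kernel or PostgreSQL. Only run it on a disposable test machine.

```python
import mmap
import time

def eat_memory(chunk_bytes, interval_s, max_chunks=None):
    """Allocate chunk_bytes every interval_s seconds and keep it all alive.

    With max_chunks=None this loops until the process is killed.
    Returns the list of held buffers if max_chunks is reached.
    """
    hoard = []
    while max_chunks is None or len(hoard) < max_chunks:
        buf = bytearray(chunk_bytes)
        # Write to one byte per page so the memory is really backed,
        # not just reserved in the address space.
        for offset in range(0, chunk_bytes, mmap.PAGESIZE):
            buf[offset] = 1
        hoard.append(buf)
        time.sleep(interval_s)
    return hoard

# Example (not executed here): about 16MB per second, unbounded.
# eat_memory(16 * 1024 * 1024, 1.0)
```

Pick the chunk size and interval so that memory runs out after a couple of minutes rather than immediately, for the reason given above.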

Regards,
Jeff Davis
