Linux2.6 overcommit behaviour

Lists: pgsql-hackers
From: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Linux2.6 overcommit behaviour
Date: 2003-08-28 10:43:29
Message-ID: 200308281613.29020.shridhar_daithankar@persistent.co.in

Hi all,

The following is from Documentation/vm/overcommit-accounting:
-------------
2 - (NEW) strict overcommit. The total address space commit
for the system is not permitted to exceed swap + a
configurable percentage (default is 50) of physical RAM.
Depending on the percentage you use, in most situations
this means a process will not be killed while accessing
pages but will receive errors on memory allocation as
appropriate.
-------------

Looks like it's been taken care of once and for all.

Shridhar
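
As an illustration of the rule quoted above, the limit can be worked out by hand. The following is a minimal Python sketch, not kernel code: the function name commit_limit_kb and the sample sizes are hypothetical, and the kernel's real accounting has further details that are ignored here.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, ratio_percent: int = 50) -> int:
    """Approximate strict-overcommit limit: swap + ratio% of physical RAM.

    Hypothetical helper for illustration only; the kernel's own accounting
    includes details that this sketch leaves out.
    """
    if ram_kb < 0 or swap_kb < 0:
        raise ValueError("sizes must be non-negative")
    if ratio_percent < 0:
        raise ValueError("ratio must be non-negative")
    # Integer arithmetic: swap plus the configured percentage of RAM.
    return swap_kb + ram_kb * ratio_percent // 100


if __name__ == "__main__":
    # Example machine (made-up numbers): 1 GB RAM, 2 GB swap, default ratio.
    ram_kb = 1024 * 1024
    swap_kb = 2 * 1024 * 1024
    print(commit_limit_kb(ram_kb, swap_kb))
```

Under this rule, an allocation that would push the total committed address space past the computed figure is refused with an error, rather than succeeding and risking a kill later when the pages are touched.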


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-08-28 11:41:52
Message-ID: 3F4DEA80.60607@dunslane.net


Yes, in 2.6, which is not yet released. Even after it is released I
expect it to take some time to bed down and make its way into vendor
releases, if the history of 2.4 is anything to go by.

Incidentally, it looks to me like it is only in 2.6 if your kernel is
built with CONFIG_SECURITY, which I expect most will be.

andrew



From: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-08-28 11:52:36
Message-ID: 200308281722.36156.shridhar_daithankar@persistent.co.in

On Thursday 28 August 2003 17:11, Andrew Dunstan wrote:
> Yes, in 2.6, which is not yet released. Even after it is released I
> expect it to take some time to bed down and make its way into vendor
> releases, if the history of 2.4 is anything to go by.

Better late than never. I sincerely hope that it lives up to the expectations.

> Incidentally, it looks to me like it is only in 2.6 if your kernel is
> built with CONFIG_SECURITY, which I expect most will be.

No, it isn't. I just checked the config file and it isn't set. But I haven't
tried setting overcommit either.

BTW, what is the way of switching the disk I/O scheduler in 2.6? I could not
find any references to switching via sysctl. Andrew Morton's TODO list still
lists it as TODO.

So am I stuck with whatever disk scheduler it's bundled with?

Shridhar


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-08-28 11:59:04
Message-ID: 3F4DEE88.1080009@dunslane.net


I take that last remark back - it is there whether or not
CONFIG_SECURITY is defined. The code is in 2 places - ugh.

andrew



From: Neil Conway <neilc(at)samurai(dot)com>
To: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-08-29 02:10:12
Message-ID: 20030829021012.GG63737@home.samurai.com

On Thu, Aug 28, 2003 at 05:22:36PM +0530, Shridhar Daithankar wrote:
> BTW, what is the way of switching the disk I/O scheduler in 2.6? I could not
> find any references to switching via sysctl. Andrew Morton's TODO list still
> lists it as TODO.

Sorry, I was mistaken: you can switch I/O schedulers by specifying a
flag at boot-time:

http://marc.theaimsgroup.com/?l=linux-kernel&m=105743728122143&w=2

-Neil
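
The message does not spell out the flag. It appears to be the elevator= kernel command-line parameter of the 2.6 series (for example elevator=deadline), so treat that name as an assumption. Below is a small Python sketch for checking which scheduler a given command line requests; the function name is hypothetical, and on a running system the string would come from /proc/cmdline.

```python
from typing import Optional


def requested_elevator(cmdline: str) -> Optional[str]:
    """Return the value of the last elevator= option, or None if absent."""
    choice = None
    for token in cmdline.split():
        if token.startswith("elevator="):
            # Keep overwriting so that the last occurrence wins.
            choice = token[len("elevator="):]
    return choice


if __name__ == "__main__":
    sample = "ro root=/dev/hda1 elevator=deadline"  # made-up command line
    print(requested_elevator(sample))
```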


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-08-30 15:02:50
Message-ID: 200308301502.h7UF2oE03669@candle.pha.pa.us

Shridhar Daithankar wrote:
> Hi all,
>
> Following is from Documentation/vm/overcommit-accounting
> -------------
> 2 - (NEW) strict overcommit. The total address space commit
> for the system is not permitted to exceed swap + a
> configurable percentage (default is 50) of physical RAM.
> Depending on the percentage you use, in most situations
> this means a process will not be killed while accessing
> pages but will receive errors on memory allocation as
> appropriate.

It is strange to choose 50% of RAM plus swap (what if your spam is
small). I thought it would be 100% of RAM plus the swap that exceeds RAM
size.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-08-30 15:16:02
Message-ID: 3F50BFB2.3000701@dunslane.net


You need to allow some head room, I should think. Actually, the
equivalent of the previously discussed paranoid mode would be to set the
percentage to 0, i.e. ensure you can put every page in swap. If you say
50% then the chances of your running out of room are exceedingly small.

andrew
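
To make the head-room argument concrete, here is a rough Python comparison of the ceilings discussed in this thread: the documented rule with the default 50%, the same rule with the percentage set to 0, and the alternative Bruce expected (100% of RAM plus whatever swap exceeds RAM). The function names and the sample sizes are made up for illustration.

```python
def strict_limit(ram_mb: int, swap_mb: int, percent: int) -> int:
    """Documented strict-overcommit rule: swap + percent% of RAM (in MB)."""
    return swap_mb + ram_mb * percent // 100


def ram_plus_excess_swap(ram_mb: int, swap_mb: int) -> int:
    """Bruce's expectation: all of RAM plus the swap that exceeds RAM."""
    return ram_mb + max(0, swap_mb - ram_mb)


def compare(ram_mb: int, swap_mb: int) -> dict:
    """Collect each ceiling, plus the theoretical RAM + swap maximum."""
    return {
        "percent_50": strict_limit(ram_mb, swap_mb, 50),
        "percent_0": strict_limit(ram_mb, swap_mb, 0),
        "ram_plus_excess_swap": ram_plus_excess_swap(ram_mb, swap_mb),
        "ram_plus_swap": ram_mb + swap_mb,
    }


if __name__ == "__main__":
    # Two made-up machines: generous swap, then small swap.
    for ram_mb, swap_mb in [(1024, 2048), (1024, 256)]:
        print(ram_mb, swap_mb, compare(ram_mb, swap_mb))
```

Whenever swap is at least as large as RAM, Bruce's formula reduces to the swap size, which is the same ceiling as the percentage-0 setting. With small swap, the documented 50% rule gives the lower of the two ceilings.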



From: Manfred Spraul <manfred(at)colorfullife(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-08-30 20:18:16
Message-ID: 3F510688.1050709@colorfullife.com

Bruce Momjian wrote:

>Shridhar Daithankar wrote:
>
>
>>Hi all,
>>
>>Following is from Documentation/vm/overcommit-accounting
>>-------------
>>2 - (NEW) strict overcommit. The total address space commit
>> for the system is not permitted to exceed swap + a
>> configurable percentage (default is 50) of physical RAM.
>> Depending on the percentage you use, in most situations
>> this means a process will not be killed while accessing
>> pages but will receive errors on memory allocation as
>> appropriate.
>>
>>
>
>It is strange to choose 50% of RAM plus swap (what if your spam is
>small). I thought it would be 100% of RAM plus the swap that exceeds RAM
>size.
>
>
Linux doesn't release the swap file page when a page is read back: If a
page is only read by the user space app, then the swapped out page
remains valid, and thus the kernel can skip the write to disk on the
next swapout. Thus if you are paranoid, you must limit the total address
space to the size of your swap files.
If your swap space (you wrote "spam" - I assume a typo) is small, then
you'll run into problems. It's recommended that your swap space should
be 2*physical memory. I assume that many OOM killer reports are from
systems with too-small swap files, where an updatedb run then pushes the
system into OOM.

--
Manfred


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-08-31 14:02:08
Message-ID: 3F51FFE0.4080503@dunslane.net

Manfred Spraul wrote:

>>
>> It is strange to choose 50% of RAM plus swap (what if your spam is
>> small). I thought it would be 100% of RAM plus the swap that exceeds RAM
>> size.
>>
>>
> Linux doesn't release the swap file page when a page is read back: If
> a page is only read by the user space app, then the swapped out page
> remains valid, and thus the kernel can skip the write to disk on the
> next swapout. Thus if you are paranoid, you must limit the total
> address space to the size of your swap files.
> If your swap space (you wrote "spam" - I assume a typo) is small,
> then you'll run into problems. It's recommended that your swap space
> should be 2*physical memory. I assume that many OOM killer reports are
> from systems with too-small swap files, where an updatedb run then
> pushes the system into OOM.

I believe that the swap slot can be subsequently freed, though. In
theory your available virtual memory should be (almost) RAM+swap. In
practice, Linux can run too close to that limit (or way over it if you
turn the checks off). But restricting the maximum possible pages to
RAM/2 + swap should normally be fine. IANAKH, though.

Also note that the truly bad thing about the OOM killer is that it can
affect a process that is not making any new memory demands at all.

cheers

andrew


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-09-01 01:30:52
Message-ID: 200309010130.h811UqV24211@candle.pha.pa.us

Andrew Dunstan wrote:
> I believe that the swap slot can be subsequently freed, though. In
> theory your available virtual memory should be (almost) RAM+swap. In
> practice, Linux can run too close to that limit, (or way over it if you
> turn the checks off). But restricting the maximum possible pages to
> RAM/2 + swap should normally be fine. IANAKH, though.
>
> Also note that the truly bad thing about the OOM killer is that it can
> affect a process that is not making any new memory demands at all.

How does the OOM killer kill processes, kill -9 or kill -1 and wait?



From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-09-01 02:37:58
Message-ID: 3F52B106.8030600@dunslane.net

Bruce Momjian wrote:

>Andrew Dunstan wrote:
>
>
>>I believe that the swap slot can be subsequently freed, though. In
>>theory your available virtual memory should be (almost) RAM+swap. In
>>practice, Linux can run too close to that limit, (or way over it if you
>>turn the checks off). But restricting the maximum possible pages to
>>RAM/2 + swap should normally be fine. IANAKH, though.
>>
>>Also note that the truly bad thing about the OOM killer is that it can
>>affect a process that is not making any new memory demands at all.
>>
>>
>
>How does the OOM killer kill processes, kill -9 or kill -1 and wait?
>
>

It sends a SIGKILL (9) unless the process is doing raw I/O, in which case
it sends SIGTERM (15). It can't really wait - at this stage the kernel
is in trouble - it can either kill processes or panic. The whole idea of
strict accounting is not to let it get to this stage in the first place.

see mm/oom_kill.c

cheers

andrew
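
The signal choice described above can be paraphrased in a few lines of Python. This is only a sketch of the rule as stated in this message, not a transcription of mm/oom_kill.c; the function name and the boolean flag are hypothetical.

```python
import signal


def oom_signal(doing_raw_io: bool) -> signal.Signals:
    """Signal the OOM killer would send, per the description above."""
    # Processes doing raw I/O get SIGTERM (15); everything else gets
    # SIGKILL (9), which cannot be caught or blocked.
    return signal.SIGTERM if doing_raw_io else signal.SIGKILL


if __name__ == "__main__":
    print(int(oom_signal(False)), int(oom_signal(True)))
```

Because SIGKILL cannot be handled, a process hit this way gets no chance to clean up, which is why strict accounting aims to keep the kernel from reaching this stage at all.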


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Linux2.6 overcommit behaviour
Date: 2003-09-01 02:40:33
Message-ID: 200309010240.h812eXt29901@candle.pha.pa.us

Andrew Dunstan wrote:
> It sends a SIGKILL (9) unless the process is doing raw io, in which case
> it sends SIGTERM (15). It can't really wait - at this stage the kernel
> is in trouble - it can either kill processes or panic. The whole idea of
> strict accounting is not to let it get to this stage in the first place.
>
> see mm/oom_kill.c

So it brings down all the backends --- I see. :-(
