Big trouble with memory !!

Lists: pgsql-general
From: Hervé Piedvache <herve(at)elma(dot)fr>
To: pgsql-general(at)postgresql(dot)org
Subject: Big trouble with memory !!
Date: 2005-04-06 14:35:43
Message-ID: 200504061635.43863.herve@elma.fr

Hi,

We have switched from kernel 2.4.26 to kernel 2.6.11.6 ... since then we
have had many problems with PostgreSQL, and most of them seem to be
memory-related.

As far as we can see, the kernel kills the postmaster process when it begins to
use swap. You can see the output from dmesg at the bottom of this message.
The first thing I am not sure I understand: the kernel should kill
processes to reclaim memory only when both physical memory and swap are
exhausted, shouldn't it?
Second thing: it seems to be related to our kernel switch, as it did not happen
before that.

This can occur when queries or a vacuum require too much memory to run.
I have configured my kernel with these options:
# shared mem
kernel.shmmax= 641604096
# semaphore
kernel.sem = 250 32000 100 400
fs.file-max=65536
# overcommit
vm.overcommit_memory=2

Can anyone explain to me why I have this problem and how to resolve it?

This server is a dedicated PostgreSQL server with 4GB of RAM.

PostgreSQL 7.4.6 on i686-pc-linux-gnu, compiled by GCC 2.95.4

Linux kernel 2.6.11.6 #1 SMP

Last Vacuum :

VACUUM full VERBOSE ANALYZE prefs;
INFO: vacuuming "public.prefs"
INFO: "prefs": found 32 removable, 1010549 nonremovable row versions in 30847 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 165 to 505 bytes long.
There were 102930 unused item pointers.
Total free space (including removable row versions) is 39264528 bytes.
17 pages are or will become empty, including 0 at the end of the table.
23600 pages containing 38830876 free bytes are potential move destinations.
CPU 0.60s/0.32u sec elapsed 16.80 sec.
INFO: index "ix_anon_prefs" now contains 1010549 row versions in 7596 pages
DETAIL: 32 index row versions were removed.
123 index pages have been deleted, 123 are currently reusable.
CPU 0.04s/1.27u sec elapsed 1.32 sec.
INFO: index "ix_prefs_fromsite" now contains 1010549 row versions in 10472 pages
DETAIL: 32 index row versions were removed.
859 index pages have been deleted, 859 are currently reusable.
CPU 0.07s/0.86u sec elapsed 0.95 sec.
INFO: index "ix_datecrea_prefs" now contains 1010549 row versions in 5956 pages
DETAIL: 32 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/1.15u sec elapsed 1.23 sec.
INFO: index "prefs_pkey" now contains 1010549 row versions in 8360 pages
DETAIL: 32 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.15s/1.23u sec elapsed 1.67 sec.
INFO: "prefs": moved 52064 row versions, truncated 30847 to 26221 pages
DETAIL: CPU 2.60s/22.43u sec elapsed 136.48 sec.
INFO: index "ix_anon_prefs" now contains 1010549 row versions in 7639 pages
DETAIL: 52064 index row versions were removed.
123 index pages have been deleted, 123 are currently reusable.
CPU 0.01s/0.33u sec elapsed 0.39 sec.
INFO: index "ix_prefs_fromsite" now contains 1010549 row versions in 10472 pages
DETAIL: 52064 index row versions were removed.
834 index pages have been deleted, 834 are currently reusable.
CPU 0.13s/0.29u sec elapsed 1.94 sec.
INFO: index "ix_datecrea_prefs" now contains 1010549 row versions in 5956 pages
DETAIL: 52064 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.28u sec elapsed 0.28 sec.
INFO: index "prefs_pkey" now contains 1010549 row versions in 8360 pages
DETAIL: 52064 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.04s/0.34u sec elapsed 0.56 sec.
INFO: vacuuming "pg_toast.pg_toast_17230"
INFO: "pg_toast_17230": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 0 to 0 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 0 bytes.
0 pages are or will become empty, including 0 at the end of the table.
0 pages containing 0 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_toast_17230_index" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.35 sec.
INFO: analyzing "public.prefs"
INFO: "prefs": 26221 pages, 3000 rows sampled, 1015468 estimated total rows
ERROR: out of memory
DETAIL: Failed on request of size 24000.

dmesg :
=======
Free pages: 11996kB (4672kB HighMem)
Active:620816 inactive:114574 dirty:76 writeback:0 unstable:0 free:3031
slab:14426 mapped:592875 pagetables:202522
DMA free:3588kB min:68kB low:84kB high:100kB active:64kB inactive:0kB present:16384kB pages_scanned:103 all_unreclaimable? yes
lowmem_reserve[]: 0 880 3759
Normal free:3736kB min:3756kB low:4692kB high:5632kB active:312kB inactive:0kB present:901120kB pages_scanned:517 all_unreclaimable? yes
lowmem_reserve[]: 0 0 23039
HighMem free:5312kB min:512kB low:640kB high:768kB active:2482232kB inactive:458296kB present:2948992kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
Normal: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3736kB
HighMem: 1384*4kB 6*8kB 3*16kB 2*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 6208kB
Swap cache: add 107282, delete 91308, find 40542/48162, race 0+0
Free swap = 2020244kB
Total swap = 2097136kB
Out of Memory: Killed process 14286 (postmaster).
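
A few of the numbers in the dmesg output above can be put side by side with a short script. This is only a rough sketch: it assumes 4 kB pages (the usual size on i686), and the helper name `field` is made up for illustration. It shows that the page tables alone come to about 790 MB, that the Normal zone is below its `min` watermark, and that very little swap is in use. Whether the page tables are the real culprit depends on whether this kernel allocates them from low memory, which the thread does not tell us.

```python
import re

# Values copied from the dmesg output above (wrapped lines joined).
DMESG = """\
slab:14426 mapped:592875 pagetables:202522
Normal free:3736kB min:3756kB low:4692kB high:5632kB active:312kB inactive:0kB present:901120kB
Free swap = 2020244kB
Total swap = 2097136kB
"""

PAGE_KB = 4  # assumption: 4 kB pages, the usual size on i686


def field(name: str, text: str) -> int:
    """Return the integer following `name`, separated by ':' or '='."""
    match = re.search(re.escape(name) + r"\s*[:=]\s*(\d+)", text)
    if match is None:
        raise ValueError(f"{name!r} not found")
    return int(match.group(1))


pagetables_kb = field("pagetables", DMESG) * PAGE_KB
normal_free_kb = field("Normal free", DMESG)
normal_min_kb = field("min", DMESG)
normal_present_kb = field("present", DMESG)
swap_used_kb = field("Total swap", DMESG) - field("Free swap", DMESG)

print(f"page tables:       {pagetables_kb} kB")
print(f"Normal zone size:  {normal_present_kb} kB")
print(f"Normal below min:  {normal_free_kb < normal_min_kb}")
print(f"swap in use:       {swap_used_kb} kB")
```

Read this way, the kill happens while almost all of the 2GB of swap is still free, which matches the observation that the trouble starts as soon as swap begins to be used.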

postgresql.conf :
============
# - Memory -

shared_buffers = 60000 # min 16, at least max_connections*2, 8KB each
sort_mem = 32768 # min 64, size in KB
vacuum_mem = 32768 # min 1024, size in KB

# - Free Space Map -

max_fsm_pages = 600000 # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 5000 # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000 # min 25
#preload_libraries = ''
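
For a sense of scale, the memory settings above can be converted to bytes and compared with kernel.shmmax. This is a back-of-the-envelope sketch only: the unit sizes are the ones given in the config comments, the real shared memory segment holds more than buffers and the free space map, and the number of concurrent sorts (50) is a purely hypothetical figure chosen for illustration.

```python
# Values copied from the settings above; unit sizes come from the config comments.
SHARED_BUFFERS = 60000   # buffers, 8 KB each
SORT_MEM_KB = 32768      # per sort operation, in KB
MAX_FSM_PAGES = 600000   # 6 bytes each
SHMMAX = 641604096       # kernel.shmmax, in bytes

shared_buffers_bytes = SHARED_BUFFERS * 8 * 1024
fsm_pages_bytes = MAX_FSM_PAGES * 6

# Lower bound only: other shared structures are not counted here.
headroom_bytes = SHMMAX - shared_buffers_bytes - fsm_pages_bytes


def sort_memory_mb(concurrent_sorts: int) -> int:
    """Private (non-shared) memory used if that many sorts run at once."""
    return concurrent_sorts * SORT_MEM_KB // 1024


print(f"shared_buffers: {shared_buffers_bytes} bytes")
print(f"fsm pages:      {fsm_pages_bytes} bytes")
print(f"left in shmmax: {headroom_bytes} bytes")
# 50 concurrent sorts is a hypothetical workload, not a figure from this server.
print(f"50 sorts:       {sort_memory_mb(50)} MB")
```

The point of the last line is that sort_mem is a per-sort limit, so private memory can add up to much more than the shared segment on a busy server.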

Regards,
--
Hervé Piedvache

NOUVELLE ADRESSE - NEW ADDRESS :
Elma Ingénierie Informatique
3 rue d'Uzès
F-75002 - Paris - France
Pho. 33-144949901
Fax. 33-144882747


From: Richard Huxton <dev(at)archonet(dot)com>
To: Hervé Piedvache <herve(at)elma(dot)fr>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 15:03:18
Message-ID: 4253FA36.60406@archonet.com
Lists: pgsql-general

Hervé Piedvache wrote:
> Hi,
>
> We have switched to kernel 2.6.11.6 from kernel 2.4.26 ... since this date we
> have many troubles with PostgreSQL and most of them seems to be memory
> troubles.
>
> As far as we can see, kernel kills the postmaster process when it begins to
> use swap. You can see the output from dmesg at the bottom of the message.
> The first thing I am not sure to understand is that the kernel should kill
> processes to reallocate memory only when physical memory and swap memory are
> exhausted, shouldn't it ?
> Second thing: it seems to be related to our kernel switch as it did not happen
> before that.
>
> This can occur when queries / a vacuum require too much memory to run.
> I have configured my kernel with these options:
> # shared mem
> kernel.shmmax= 641604096
> # semaphore
> kernel.sem = 250 32000 100 400
> fs.file-max=65536
> # overcommit
> vm.overcommit_memory=2
>
> Does anyone can explain me why I have this problem and how to resolve it ?
>
> This server is a dedicated PostgreSQL server with 4Gb of RAM.

You might want to try vm.overcommit_memory=1. You don't appear to be the
only one suffering from an over-zealous oom-killer.

http://www.ussg.iu.edu/hypermail/linux/kernel/0501.2/1295.html
http://www.linuxquestions.org/questions/history/291119

--
Richard Huxton
Archonet Ltd


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Hervé Piedvache <herve(at)elma(dot)fr>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 15:08:31
Message-ID: 20050406150806.GD14589@svana.org
Lists: pgsql-general

On Wed, Apr 06, 2005 at 04:35:43PM +0200, Hervé Piedvache wrote:
> Hi,
>
> We have switched to kernel 2.6.11.6 from kernel 2.4.26 ... since this date we
> have many troubles with PostgreSQL and most of them seems to be memory
> troubles.
>
> As far as we can see, kernel kills the postmaster process when it begins to
> use swap. You can see the output from dmesg at the bottom of the message.
> The first thing I am not sure to understand is that the kernel should kill
> processes to reallocate memory only when physical memory and swap memory are
> exhausted, shouldn't it ?
> Second thing: it seems to be related to our kernel switch as it did not happen
> before that.

My guess is that your problem stems from this line:

> vm.overcommit_memory=2

The code was changed so this *really* would not let anything exceed the
available memory. Instead of all shared pages for each library being
counted in common, each copy gets counted individually because each
process could possibly demand their own copy by writing to it.

This link seems to imply it was done in the 2.5 series.
http://kerneltrap.org/node/326

What I don't understand is that with true strict overcommit, the kernel
should never need to kill your process since there is always in
principle enough room. That worked, see:

> ERROR: out of memory
> DETAIL: Failed on request of size 24000.

Except the kernel killed it anyway, very odd. Which means someone isn't
counting properly. Your shared_buffers are on the high side, but that
may be the appropriate setting for your system...

Hope this helps,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Hervé Piedvache <herve(at)elma(dot)fr>
To: pgsql-general(at)postgresql(dot)org
Cc: Richard Huxton <dev(at)archonet(dot)com>
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 16:25:50
Message-ID: 200504061825.51573.herve@elma.fr
Lists: pgsql-general

On Wednesday 06 April 2005 17:03, Richard Huxton wrote:
> Hervé Piedvache wrote:
> > Hi,
> >
> > We have switched to kernel 2.6.11.6 from kernel 2.4.26 ... since this
> > date we have many troubles with PostgreSQL and most of them seems to be
> > memory troubles.
> >
> > As far as we can see, kernel kills the postmaster process when it begins
> > to use swap. You can see the output from dmesg at the bottom of the
> > message. The first thing I am not sure to understand is that the kernel
> > should kill processes to reallocate memory only when physical memory and
> > swap memory are exhausted, shouldn't it ?
> > Second thing: it seems to be related to our kernel switch as it did not
> > happen before that.
> >
> > This can occur when queries / a vacuum require too much memory to run.
> > I have configured my kernel with these options:
> > # shared mem
> > kernel.shmmax= 641604096
> > # semaphore
> > kernel.sem = 250 32000 100 400
> > fs.file-max=65536
> > # overcommit
> > vm.overcommit_memory=2
> >
> > Does anyone can explain me why I have this problem and how to resolve it
> > ?
> >
> > This server is a dedicated PostgreSQL server with 4Gb of RAM.
>
> You might want to try vm.overcommit_memory=1. You don't appear to be the
> only one suffering from an over-zealous oom-killer.
>

But if you look at the PostgreSQL documentation:
-------------------------------------------------------------------------------------------------
16.5.3. Linux Memory Overcommit

In Linux 2.4 and later, the default virtual memory behavior is not optimal for
PostgreSQL. Because of the way that the kernel implements memory overcommit,
the kernel may terminate the PostgreSQL server (the postmaster process) if
the memory demands of another process cause the system to run out of virtual
memory.

If this happens, you will see a kernel message that looks like this (consult
your system documentation and configuration on where to look for such a
message):

Out of Memory: Killed process 12345 (postmaster).

This indicates that the postmaster process has been terminated due to memory
pressure. Although existing database connections will continue to function
normally, no new connections will be accepted. To recover, PostgreSQL will
need to be restarted.

One way to avoid this problem is to run PostgreSQL on a machine where you can
be sure that other processes will not run the machine out of memory.

On Linux 2.6 and later, a better solution is to modify the kernel's behavior
so that it will not "overcommit" memory. This is done by selecting strict
overcommit mode via sysctl:

sysctl -w vm.overcommit_memory=2

or placing an equivalent entry in /etc/sysctl.conf. You may also wish to
modify the related setting vm.overcommit_ratio. For details see the kernel
documentation file Documentation/vm/overcommit-accounting.

Some vendors' Linux 2.4 kernels are reported to have early versions of the 2.6
overcommit sysctl. However, setting vm.overcommit_memory to 2 on a kernel
that does not have the relevant code will make things worse not better. It is
recommended that you inspect the actual kernel source code (see the function
vm_enough_memory in the file mm/mmap.c) to verify what is supported in your
copy before you try this in a 2.4 installation. The presence of the
overcommit-accounting documentation file should not be taken as evidence that
the feature is there. If in any doubt, consult a kernel expert or your kernel
vendor.
-------------------------------------------------------------------------------------------------

So which is the right solution?
--
Hervé Piedvache



From: Richard Huxton <dev(at)archonet(dot)com>
To: Hervé Piedvache <herve(at)elma(dot)fr>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 16:47:58
Message-ID: 425412BE.5080709@archonet.com
Lists: pgsql-general

Hervé Piedvache wrote:
> On Wednesday 06 April 2005 17:03, Richard Huxton wrote:
>
>>Hervé Piedvache wrote:
>>
>>>Hi,
>>>
>>>We have switched to kernel 2.6.11.6 from kernel 2.4.26 ... since this
>>>date we have many troubles with PostgreSQL and most of them seems to be
>>>memory troubles.
>>>
>>>As far as we can see, kernel kills the postmaster process when it begins
>>>to use swap. You can see the output from dmesg at the bottom of the
>>>message. The first thing I am not sure to understand is that the kernel
>>>should kill processes to reallocate memory only when physical memory and
>>>swap memory are exhausted, shouldn't it ?
>>>Second thing: it seems to be related to our kernel switch as it did not
>>>happen before that.
>>>
>>>This can occur when queries / a vacuum require too much memory to run.
>>>I have configured my kernel with these options:
>>># shared mem
>>>kernel.shmmax= 641604096
>>># semaphore
>>>kernel.sem = 250 32000 100 400
>>>fs.file-max=65536
>>># overcommit
>>>vm.overcommit_memory=2
>>>
>>>Does anyone can explain me why I have this problem and how to resolve it
>>>?
>>>
>>>This server is a dedicated PostgreSQL server with 4Gb of RAM.
>>
>>You might want to try vm.overcommit_memory=1. You don't appear to be the
>>only one suffering from an over-zealous oom-killer.
>>
>
>
> But if you look at the documentation of PostgreSQL :
> -------------------------------------------------------------------------------------------------
> 16.5.3. Linux Memory Overcommit
>
> In Linux 2.4 and later, the default virtual memory behavior is not optimal for
> PostgreSQL. Because of the way that the kernel implements memory overcommit,
> the kernel may terminate the PostgreSQL server (the postmaster process) if
> the memory demands of another process cause the system to run out of virtual
> memory.
>
> If this happens, you will see a kernel message that looks like this (consult
> your system documentation and configuration on where to look for such a
> message):
>
> Out of Memory: Killed process 12345 (postmaster).
>
> This indicates that the postmaster process has been terminated due to memory
> pressure. Although existing database connections will continue to function
> normally, no new connections will be accepted. To recover, PostgreSQL will
> need to be restarted.
>
> One way to avoid this problem is to run PostgreSQL on a machine where you can
> be sure that other processes will not run the machine out of memory.
>
> On Linux 2.6 and later, a better solution is to modify the kernel's behavior
> so that it will not "overcommit" memory. This is done by selecting strict
> overcommit mode via sysctl:
>
> sysctl -w vm.overcommit_memory=2
>
> or placing an equivalent entry in /etc/sysctl.conf. You may also wish to
> modify the related setting vm.overcommit_ratio. For details see the kernel
> documentation file Documentation/vm/overcommit-accounting.
>
> Some vendors' Linux 2.4 kernels are reported to have early versions of the 2.6
> overcommit sysctl. However, setting vm.overcommit_memory to 2 on a kernel
> that does not have the relevant code will make things worse not better. It is
> recommended that you inspect the actual kernel source code (see the function
> vm_enough_memory in the file mm/mmap.c) to verify what is supported in your
> copy before you try this in a 2.4 installation. The presence of the
> overcommit-accounting documentation file should not be taken as evidence that
> the feature is there. If in any doubt, consult a kernel expert or your kernel
> vendor.
> -------------------------------------------------------------------------------------------------

Ah, but I believe there have been some issues with the Linux kernel
recently:
http://www.kerneltraffic.org/kernel-traffic/kt20050212_296.html#6

If the oom-killer is running riot on your system that would suggest the
bug is still there in some form.

--
Richard Huxton
Archonet Ltd


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Hervé Piedvache <herve(at)elma(dot)fr>, pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 16:52:55
Message-ID: 28176.1112806375@sss.pgh.pa.us
Lists: pgsql-general

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> What I don't understand is that with true strict overcommit, the kernel
> should never need to kill your process since there is always in
> principle enough room.

Indeed. Are you *sure* you have overcommit turned off? That should
disable the OOM killer altogether. You should probably go read the
kernel documentation rather than assume Postgres' documentation knows
what it's talking about ;-)

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Hervé Piedvache <herve(at)elma(dot)fr>, pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 16:57:18
Message-ID: 28243.1112806638@sss.pgh.pa.us
Lists: pgsql-general

Richard Huxton <dev(at)archonet(dot)com> writes:
> You might want to try vm.overcommit_memory=1. You don't appear to be the
> only one suffering from an over-zealous oom-killer.

> http://www.ussg.iu.edu/hypermail/linux/kernel/0501.2/1295.html

Hmm, in particular Andrea Arcangeli implies here
http://www.ussg.iu.edu/hypermail/linux/kernel/0501.2/1358.html
that there are some pretty serious bugs in this area in 2.6.10.

So the answer is you need a different kernel version.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hervé Piedvache <herve(at)elma(dot)fr>, pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 18:57:44
Message-ID: 20050406185744.GE14589@svana.org
Lists: pgsql-general

On Wed, Apr 06, 2005 at 12:52:55PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > What I don't understand is that with true strict overcommit, the kernel
> > should never need to kill your process since there is always in
> > principle enough room.
>
> Indeed. Are you *sure* you have overcommit turned off? That should
> disable the OOM killer altogether. You should probably go read the
> kernel documentation rather than assume Postgres' documentation knows
> what it's talking about ;-)

What I don't understand is the problem with overcommitting. Not
overcommitting is rather wasteful of memory and you'll start seeing
failed allocations long before "free" indicates there's a problem. See
the original poster of this thread, whose 24KB memory allocation was
rejected while there were still megabytes of "free" memory (not to
mention swap). The memory which is "overcommitted" is being used for
disk cache and buffers in the meantime.

Note the kernel message says "postmaster" was killed by the kernel, yet
that was most probably a child process. The postmaster, the stats
collector and any child process will all be referred to as "postmaster".
AFAIK no exec() is being done to change the name (see
/proc/<pid>/status).
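
To illustrate the /proc/<pid>/status point, here is a minimal sketch that reads the Name and PPid fields for a given process. The function names are made up for this example, and the sample text is hypothetical: only PID 14286 comes from the kernel message in this thread, the parent PID 14001 is invented. The idea is that a process named "postmaster" whose PPid is the PID of another "postmaster" is probably a child backend rather than the real postmaster, though that is a heuristic and not a guarantee.

```python
def parse_status(text: str) -> dict:
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def read_status(pid: int) -> dict:
    """Read and parse /proc/<pid>/status for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_status(handle.read())


# Hypothetical sample: PID 14286 is from the kernel message, PPid 14001 is invented.
SAMPLE = "Name:\tpostmaster\nState:\tS (sleeping)\nPid:\t14286\nPPid:\t14001\n"

info = parse_status(SAMPLE)
print(info["Name"], info["Pid"], info["PPid"])
```

On a live system, `read_status(pid)` would be called with a PID taken from `ps` before the process is killed, since the entry disappears once the process is gone.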

BTW, according to this page in the archive:
http://archives.postgresql.org/pgsql-patches/2003-11/msg00194.php

2 - (NEW) strict overcommit. The total address space commit
for the system is not permitted to exceed swap + a
configurable percentage (default is 50) of physical RAM.
Depending on the percentage you use, in most situations
this means a process will not be killed while accessing
pages but will receive errors on memory allocation as
appropriate.

So the original poster had 4GB of RAM and 2GB of swap, so by that rule
he's only allowed to allocate a total of 2GB (swap) + 50% of 4GB (RAM) =
4GB. In other words, he'd never use swap at all. Maybe fiddling with that
percentage would be a better idea. But overcommit still seems useless
to me.
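
The rule quoted above can be written out as a small calculation. This is a sketch of the formula exactly as stated in the quoted text (swap plus a percentage of physical RAM), not of the kernel's actual accounting code. The function name is made up, RAM is taken as exactly 4 GiB, and the swap size is the "Total swap" figure from the dmesg output.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int = 50) -> int:
    """Strict-overcommit ceiling: swap plus overcommit_ratio percent of RAM."""
    return swap_kb + ram_kb * overcommit_ratio // 100


RAM_KB = 4 * 1024 * 1024  # "4Gb of RAM", taken here as exactly 4 GiB
SWAP_KB = 2097136         # "Total swap" from the dmesg output

for ratio in (50, 75, 100):
    limit = commit_limit_kb(RAM_KB, SWAP_KB, ratio)
    print(f"overcommit_ratio={ratio}: limit {limit} kB ({limit / 1024 / 1024:.2f} GB)")
```

With the default of 50 the limit comes out at roughly the size of physical RAM, which is the situation described above; raising vm.overcommit_ratio moves the limit up accordingly.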

PS, I only know about Linux here.
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Hervé Piedvache <herve(at)elma(dot)fr>, pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 20:37:59
Message-ID: 5905.1112819879@sss.pgh.pa.us
Lists: pgsql-general

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> What I don't understand is the problem with overcommitting.

The problem with Linux overcommit is that when the kernel does run out
of memory, the process it chooses to kill isn't necessarily one that was
using an unreasonable amount of memory. The earlier versions were quite
willing to kill "init" ;-) ... I think they hacked it to prevent that
disaster, but it's still entirely capable of deciding to take out the
(real) postmaster, your mail daemon, or other processes you'd prefer not
to lose. As such, the feature is really too dangerous to enable on
machines being used for production purposes.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hervé Piedvache <herve(at)elma(dot)fr>, pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 22:15:10
Message-ID: 20050406221507.GF14589@svana.org
Lists: pgsql-general

On Wed, Apr 06, 2005 at 04:37:59PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > What I don't understand is the problem with overcommitting.
>
> The problem with Linux overcommit is that when the kernel does run out
> of memory, the process it chooses to kill isn't necessarily one that was
> using an unreasonable amount of memory. The earlier versions were quite
> willing to kill "init" ;-) ... I think they hacked it to prevent that
> disaster, but it's still entirely capable of deciding to take out the
> (real) postmaster, your mail daemon, or other processes you'd prefer not
> to lose. As such, the feature is really too dangerous to enable on
> machines being used for production purposes.

Ok, I think the point I'm trying to make is that "strict
overcommit" in its current state isn't really that strict and just
causes the problem to happen elsewhere. The guy had heaps of memory
available and the system is dying on him. Better to turn strict overcommit
off and let him use up the 2GB of swap before having processes killed.

Or rather, if you can't stop the killer anyway, take advantage of the
extra memory.

That formula they use is bizarre. I tend to allocate max 250MB swap
even on machines with a gig of memory. If I'm using a lot of swap I'm
doing something wrong. By the formula they use, strict overcommit would
start failing memory allocations with my memory only half used...
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Hervé Piedvache <herve(at)elma(dot)fr>, pgsql-general(at)postgresql(dot)org
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 22:27:01
Message-ID: 6507.1112826421@sss.pgh.pa.us
Lists: pgsql-general

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> Ok, I think the point I'm trying to make is that "strict
> overcommit" in its current state isn't really that strict and just
> causes the problem to happen elsewhere.

Right, but that is surely just a kernel bug, and one that's not been
around very long. Presumably it'll be fixed soon.

> That formula they use is bizarre.

I presume that the point of the formula is that you want some fraction
of physical RAM to be reserved for disk buffers ... 50% might be
unreasonably high, or not ...

regards, tom lane


From: Hervé Piedvache <herve(at)elma(dot)fr>
To: pgsql-general(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Huxton <dev(at)archonet(dot)com>
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 23:30:17
Message-ID: 200504070130.17487.herve@elma.fr
Lists: pgsql-general

On Wednesday 06 April 2005 18:57, Tom Lane wrote:
> Richard Huxton <dev(at)archonet(dot)com> writes:
> > You might want to try vm.overcommit_memory=1. You don't appear to be the
> > only one suffering from an over-zealous oom-killer.
> >
> > http://www.ussg.iu.edu/hypermail/linux/kernel/0501.2/1295.html
>
> Hmm, in particular Andrea Arcangeli implies here
> http://www.ussg.iu.edu/hypermail/linux/kernel/0501.2/1358.html
> that there are some pretty serious bugs in this area in 2.6.10.

OK, I am downgrading to 2.6.9 ...

Let's see!
Thanks!
--
Hervé Piedvache



From: Hervé Piedvache <herve(at)elma(dot)fr>
To: pgsql-general(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Big trouble with memory !!
Date: 2005-04-06 23:42:33
Message-ID: 200504070142.33921.herve@elma.fr
Lists: pgsql-general

On Wednesday 06 April 2005 18:52, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > What I don't understand is that with true strict overcommit, the kernel
> > should never need to kill your process since there is always in
> > principle enough room.
>
> Indeed. Are you *sure* you have overcommit turned off?

How can I check this? I mean, is there a reliable way to verify which
overcommit mode is actually in effect?
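
One way to check, sketched here under the assumption that the kernel exposes the usual /proc entries: read vm.overcommit_memory and vm.overcommit_ratio from /proc/sys/vm, and compare CommitLimit with Committed_AS in /proc/meminfo. In the kernel's convention a mode of 0 means heuristic overcommit, 1 means always overcommit, and 2 means strict accounting. The function names below are made up for this example.

```python
from pathlib import Path

PROC_FILES = {
    "overcommit_memory": Path("/proc/sys/vm/overcommit_memory"),
    "overcommit_ratio": Path("/proc/sys/vm/overcommit_ratio"),
    "meminfo": Path("/proc/meminfo"),
}


def parse_meminfo(text: str) -> dict:
    """Map /proc/meminfo keys to their integer values (kB for most entries)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def overcommit_report() -> str:
    """Summarise the overcommit mode, ratio and commit accounting."""
    mode = PROC_FILES["overcommit_memory"].read_text().strip()
    ratio = PROC_FILES["overcommit_ratio"].read_text().strip()
    mem = parse_meminfo(PROC_FILES["meminfo"].read_text())
    return (
        f"overcommit_memory={mode} overcommit_ratio={ratio} "
        f"CommitLimit={mem.get('CommitLimit')}kB "
        f"Committed_AS={mem.get('Committed_AS')}kB"
    )


# Only run the report where the /proc entries actually exist (Linux).
if __name__ == "__main__" and all(p.exists() for p in PROC_FILES.values()):
    print(overcommit_report())
```

If the reported mode is not 2, the setting from /etc/sysctl.conf was probably not applied at boot; if it is 2 and Committed_AS is close to CommitLimit, allocation failures such as the "out of memory" error from ANALYZE are the expected behaviour.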

> That should disable the OOM killer altogether. You should probably go read
> the kernel documentation rather than assume Postgres' documentation knows
> what it's talking about ;-)

Huh? I can't accept that ... the PostgreSQL documentation is my bible ... the
only good explanations are always in the PostgreSQL documentation ...

:o)

Maybe a correction for the next release?
--
Hervé Piedvache
