Re: PostgreSQL's vacuumdb fails to allocate memory for

Lists: pgsql-general
From: Sven Willenberger <sven(at)dmv(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: stable(at)freebsd(dot)org
Subject: PostgreSQL's vacuumdb fails to allocate memory for non-root users
Date: 2005-06-29 13:01:28
Message-ID: 1120050088.19603.7.camel@lanshark.dmv.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

FreeBSD 5.4-Release
PostgreSQL 8.0.3

I noticed that the nightly cron consisting of a vacuumdb was failing due
to "unable to allocate memory". I do have maintenance_mem set at 512MB,
and the /boot/loader.conf file sets the max datasize to 1GB (verified by
limit). The odd thing is that if I run the command (either vacuumdb from
the command line or vacuum verbose analyze from a psql session) as the
Unix user root (and any psql superuser) the vacuum runs fine. It is when
the unix user is non-root (e.g. su -l pgsql -c "vacuumdb -a -z") that
this memory error occurs. All users use the "default" class for
login.conf purposes which has not been modified from its installed
settings. Any ideas on how to a) troubleshoot this or b) fix this (if it
is something obvious that I just cannot see).

Thanks,

Sven


From: Douglas McNaught <doug(at)mcnaught(dot)org>
To: Sven Willenberger <sven(at)dmv(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, stable(at)freebsd(dot)org
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for non-root users
Date: 2005-06-29 13:43:23
Message-ID: m2slz12s90.fsf@Douglas-McNaughts-Powerbook.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Sven Willenberger <sven(at)dmv(dot)com> writes:

> FreeBSD 5.4-Release
> PostgreSQL 8.0.3
>
> I noticed that the nightly cron consisting of a vacuumdb was failing due
> to "unable to allocate memory". I do have maintenance_mem set at 512MB,
> and the /boot/loader.conf file sets the max datasize to 1GB (verified by
> limit). The odd thing is that if I run the command (either vacuumdb from
> the command line or vacuum verbose analyze from a psql session) as the
> Unix user root (and any psql superuser) the vacuum runs fine. It is when
> the unix user is non-root (e.g. su -l pgsql -c "vacuumdb -a -z") that
> this memory error occurs. All users use the "default" class for
> login.conf purposes which has not been modified from its installed
> settings. Any ideas on how to a) troubleshoot this or b) fix this (if it
> is something obvious that I just cannot see).

Is the out-of-memory condition occurring on the server or client side?
Is there anything in the Postgres logs?

You might put a 'ulimit -a' command in your cron script to make sure
your memory limit settings are propagating correctly...

-Doug


From: Sven Willenberger <sven(at)dmv(dot)com>
To: Douglas McNaught <doug(at)mcnaught(dot)org>
Cc: pgsql-general(at)postgresql(dot)org, stable(at)freebsd(dot)org
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for
Date: 2005-06-29 14:28:24
Message-ID: 1120055305.19603.25.camel@lanshark.dmv.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 2005-06-29 at 09:43 -0400, Douglas McNaught wrote:
> Sven Willenberger <sven(at)dmv(dot)com> writes:
>
> > FreeBSD 5.4-Release
> > PostgreSQL 8.0.3
> >
> > I noticed that the nightly cron consisting of a vacuumdb was failing due
> > to "unable to allocate memory". I do have maintenance_mem set at 512MB,
> > and the /boot/loader.conf file sets the max datasize to 1GB (verified by
> > limit). The odd thing is that if I run the command (either vacuumdb from
> > the command line or vacuum verbose analyze from a psql session) as the
> > Unix user root (and any psql superuser) the vacuum runs fine. It is when
> > the unix user is non-root (e.g. su -l pgsql -c "vacuumdb -a -z") that
> > this memory error occurs. All users use the "default" class for
> > login.conf purposes which has not been modified from its installed
> > settings. Any ideas on how to a) troubleshoot this or b) fix this (if it
> > is something obvious that I just cannot see).
>
> Is the out-of-memory condition occurring on the server or client side?
> Is there anything in the Postgres logs?

In this case they are one and the same machine ... i.e whether invoked
from the command-line as vacuumdb or invoked from psql (connecting to
localhost) as "vacuum analyze;" the memory error occurs. The logfile
reveals:
ERROR: out of memory
DETAIL: Failed on request of size 536870910.

> You might put a 'ulimit -a' command in your cron script to make sure
> your memory limit settings are propagating correctly...

I created a cron that consisted of just that command (ulimit -a) and the
output revealed nothing abnormal (i.e. max dataseg still 1G, etc). This
occurs outside of cron also, (it was just the failing cronjob that
brought it to my attention). Again, if I log in as myself and try to run
the command vaccumdb -a -z it fails; if I su to root and repeat it works
fine. I am trying to narrow this down to a PostgreSQL issue vs FreeBSD
issue.

Sven


From: Douglas McNaught <doug(at)mcnaught(dot)org>
To: Sven Willenberger <sven(at)dmv(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, stable(at)freebsd(dot)org
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for non-root users
Date: 2005-06-29 14:35:01
Message-ID: m2ekal2puy.fsf@Douglas-McNaughts-Powerbook.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Sven Willenberger <sven(at)dmv(dot)com> writes:

> Again, if I log in as myself and try to run
> the command vaccumdb -a -z it fails; if I su to root and repeat it works
> fine. I am trying to narrow this down to a PostgreSQL issue vs FreeBSD
> issue.

That's really weird, because in both cases the actual VACUUM work
happens in the backend, which should be running as 'postgres'. I
don't see how the client user could possibly make a difference.

-Doug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sven Willenberger <sven(at)dmv(dot)com>
Cc: Douglas McNaught <doug(at)mcnaught(dot)org>, pgsql-general(at)postgresql(dot)org, stable(at)freebsd(dot)org
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for
Date: 2005-06-29 15:21:41
Message-ID: 9731.1120058501@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Sven Willenberger <sven(at)dmv(dot)com> writes:
> ERROR: out of memory
> DETAIL: Failed on request of size 536870910.

That's a server-side failure ...

> Again, if I log in as myself and try to run
> the command vaccumdb -a -z it fails; if I su to root and repeat it works
> fine. I am trying to narrow this down to a PostgreSQL issue vs FreeBSD
> issue.

That's fairly hard to believe, assuming that you are talking to the same
server in both cases (I have seen trouble reports that turned out to be
because the complainant was accidentally using two different servers...)
The ulimit the backend is running under couldn't change depending on
where the client is su'd to.

Is it possible that you've got per-user configuration settings that
affect this, like a different maintenance_mem value for the root user?

regards, tom lane


From: Sven Willenberger <sven(at)dmv(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Douglas McNaught <doug(at)mcnaught(dot)org>, pgsql-general(at)postgresql(dot)org, stable(at)freebsd(dot)org
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for
Date: 2005-06-29 16:22:36
Message-ID: 1120062156.19598.49.camel@lanshark.dmv.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 2005-06-29 at 11:21 -0400, Tom Lane wrote:
> Sven Willenberger <sven(at)dmv(dot)com> writes:
> > ERROR: out of memory
> > DETAIL: Failed on request of size 536870910.
>
> That's a server-side failure ...
>
> > Again, if I log in as myself and try to run
> > the command vaccumdb -a -z it fails; if I su to root and repeat it works
> > fine. I am trying to narrow this down to a PostgreSQL issue vs FreeBSD
> > issue.
>
> That's fairly hard to believe, assuming that you are talking to the same
> server in both cases (I have seen trouble reports that turned out to be
> because the complainant was accidentally using two different servers...)
> The ulimit the backend is running under couldn't change depending on
> where the client is su'd to.
>
> Is it possible that you've got per-user configuration settings that
> affect this, like a different maintenance_mem value for the root user?
>
> regards, tom lane
>
I have done some more tests and tried to keep the results of vacuumdb
distinct from connecting to the backend (psql -U pgsql ...) and running
vaccum analyze. Apparently the hopping back and forth from both methods
interfered with my original interpretations of what appeared to be
happening. Anyway, here is what I see:
First test psql connection version:
psql then vacuum analyze => works fine whether the current unix user is
root or plain user. (ran this a couple times via new psql connections to
verify).
Then quit psql and move to command line
vacuumdb => whether running as su -l pgsql -c "vacuumdb -a -z" (or
specifying a dbname instead of all) or directly as a user the out of
memory error occurs.
If I then connect via psql to the backend and try to run vacuum analyze
I receive an out of memory error.

This last connection to psql after a failed vacuumdb was confabulating
my interpretations earlier of the error being based on unix user. top
shows:
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU
COMMAND
6754 pgsql 4 0 602M 88688K sbwait 0 0:03 0.00% 0.00%
postgres
until I disconnect the psql session. I can then psql again and the same
error happens (out of memory) and top shows the same again. At this
point I am not sure if it is a memory issue of vacuumdb, vacuum itself,
or the FreeBSD memory management system. Again, if enough time passes
(or some other events) since I last try vacuumdb, then running vacuum
[verbose][analyze] via a psql connection works fine.

Sven


From: Vivek Khera <vivek(at)khera(dot)org>
To: postgres general <pgsql-general(at)postgresql(dot)org>
Cc: stable(at)freebsd(dot)org
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for non-root users
Date: 2005-06-29 18:59:46
Message-ID: 58067976-00C6-4380-90DF-F448D9008C81@khera.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Jun 29, 2005, at 9:01 AM, Sven Willenberger wrote:

> Unix user root (and any psql superuser) the vacuum runs fine. It is
> when
> the unix user is non-root (e.g. su -l pgsql -c "vacuumdb -a -z") that
> this memory error occurs. All users use the "default" class for
> login.conf purposes which has not been modified from its installed
> settings. Any ideas on how to a) troubleshoot this or b) fix this
> (if it
> is something obvious that I just cannot see).

This doesn't make sense: the actual command is executed by the
backend postgres server, so the uid of the client program doens't
make a bit of difference.

You need to see exactly who is generating that error. It certainly
is not the Pg backend.


From: Sven Willenberger <sven(at)dmv(dot)com>
To: Vivek Khera <vivek(at)khera(dot)org>
Cc: postgres general <pgsql-general(at)postgresql(dot)org>, stable(at)freebsd(dot)org
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for
Date: 2005-06-29 20:12:26
Message-ID: 1120075946.19603.68.camel@lanshark.dmv.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 2005-06-29 at 14:59 -0400, Vivek Khera wrote:
> On Jun 29, 2005, at 9:01 AM, Sven Willenberger wrote:
>
> > Unix user root (and any psql superuser) the vacuum runs fine. It is
> > when
> > the unix user is non-root (e.g. su -l pgsql -c "vacuumdb -a -z") that
> > this memory error occurs. All users use the "default" class for
> > login.conf purposes which has not been modified from its installed
> > settings. Any ideas on how to a) troubleshoot this or b) fix this
> > (if it
> > is something obvious that I just cannot see).
>
> This doesn't make sense: the actual command is executed by the
> backend postgres server, so the uid of the client program doens't
> make a bit of difference.
>
> You need to see exactly who is generating that error. It certainly
> is not the Pg backend.
>
The issue being tied to a certain "login" user has been negated by
further testing (the illusion that it was based on user happened as a
result of the order in which I ran tests to try and find out what was
going on ) -- it does seem tied to invoking vacuumdb at this point. As a
point of clarification, when maxdsiz and dfldsiz are set, those values
are per "process" not per "user", correct? Something I have noticed,
when the memory error occurs during the psql session (after a failed
vacuumdb attempt) the memory stays at 600+MB in top (under size) until
the psql session is closed -- that may just be the way top reports it
though.

Sven

Sven


From: Charles Swiger <cswiger(at)mac(dot)com>
To: Sven Willenberger <sven(at)dmv(dot)com>
Cc: Vivek Khera <vivek(at)khera(dot)org>, stable(at)freebsd(dot)org, postgres general <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for non-root users
Date: 2005-06-29 20:40:52
Message-ID: 51C9ABA2-20B8-4DE8-B647-A26A168A78FC@mac.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Jun 29, 2005, at 4:12 PM, Sven Willenberger wrote:
[ ... ]
> Something I have noticed,
> when the memory error occurs during the psql session (after a failed
> vacuumdb attempt) the memory stays at 600+MB in top (under size) until
> the psql session is closed -- that may just be the way top reports it
> though.

Double-check your system limits via "ulimit -a" or "ulimit -aH". By
default, FreeBSD will probably restrict the maximum data size of the
process to 512MB, which may be what you are running into. You can
rebuild the kernel to permit a larger data size, or else tweak /boot/
loader.conf:

echo 'kern.maxdsiz="1024M"' >> /boot/loader.conf

--
-Chuck


From: Sven Willenberger <sven(at)dmv(dot)com>
To: Charles Swiger <cswiger(at)mac(dot)com>
Cc: Vivek Khera <vivek(at)khera(dot)org>, stable(at)freebsd(dot)org, postgres general <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for
Date: 2005-06-29 20:58:35
Message-ID: 1120078715.19598.79.camel@lanshark.dmv.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 2005-06-29 at 16:40 -0400, Charles Swiger wrote:
> On Jun 29, 2005, at 4:12 PM, Sven Willenberger wrote:
> [ ... ]
> > Something I have noticed,
> > when the memory error occurs during the psql session (after a failed
> > vacuumdb attempt) the memory stays at 600+MB in top (under size) until
> > the psql session is closed -- that may just be the way top reports it
> > though.
>
> Double-check your system limits via "ulimit -a" or "ulimit -aH". By
> default, FreeBSD will probably restrict the maximum data size of the
> process to 512MB, which may be what you are running into. You can
> rebuild the kernel to permit a larger data size, or else tweak /boot/
> loader.conf:
>
> echo 'kern.maxdsiz="1024M"' >> /boot/loader.conf
>

:>ulimit -a
cpu time (seconds, -t) unlimited
file size (512-blocks, -f) unlimited
data seg size (kbytes, -d) 1048576
stack size (kbytes, -s) 65536
core file size (512-blocks, -c) unlimited
max memory size (kbytes, -m) unlimited
locked memory (kbytes, -l) unlimited
max user processes (-u) 5547
open files (-n) 11095
virtual mem size (kbytes, -v) unlimited
sbsize (bytes, -b) unlimited
:> cat /boot/loader.conf
kern.maxdsiz="1073741824"
kern.dfldsiz="891289600"

and if I don't run vacuumdb at all, but rather connect to the backend
via psql and run vacuum, it works ok with full memory allocation. Still
testing RAM to see if the issue is physical.

Sven


From: Sven Willenberger <sven(at)dmv(dot)com>
To: Charles Swiger <cswiger(at)mac(dot)com>
Cc: Vivek Khera <vivek(at)khera(dot)org>, stable(at)freebsd(dot)org, postgres general <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for
Date: 2005-06-29 22:34:01
Message-ID: 1120084441.19614.87.camel@lanshark.dmv.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 2005-06-29 at 16:58 -0400, Sven Willenberger wrote:
> On Wed, 2005-06-29 at 16:40 -0400, Charles Swiger wrote:
> > On Jun 29, 2005, at 4:12 PM, Sven Willenberger wrote:
> > [ ... ]
> > > Something I have noticed,
> > > when the memory error occurs during the psql session (after a failed
> > > vacuumdb attempt) the memory stays at 600+MB in top (under size) until
> > > the psql session is closed -- that may just be the way top reports it
> > > though.
> >
> > Double-check your system limits via "ulimit -a" or "ulimit -aH". By
> > default, FreeBSD will probably restrict the maximum data size of the
> > process to 512MB, which may be what you are running into. You can
> > rebuild the kernel to permit a larger data size, or else tweak /boot/
> > loader.conf:
> >
> > echo 'kern.maxdsiz="1024M"' >> /boot/loader.conf
> >
>
> :>ulimit -a
> cpu time (seconds, -t) unlimited
> file size (512-blocks, -f) unlimited
> data seg size (kbytes, -d) 1048576
> stack size (kbytes, -s) 65536
> core file size (512-blocks, -c) unlimited
> max memory size (kbytes, -m) unlimited
> locked memory (kbytes, -l) unlimited
> max user processes (-u) 5547
> open files (-n) 11095
> virtual mem size (kbytes, -v) unlimited
> sbsize (bytes, -b) unlimited
> :> cat /boot/loader.conf
> kern.maxdsiz="1073741824"
> kern.dfldsiz="891289600"
>
> and if I don't run vacuumdb at all, but rather connect to the backend
> via psql and run vacuum, it works ok with full memory allocation. Still
> testing RAM to see if the issue is physical.
>
> Sven
>
>
I have found the answer/problem. On a hunch I increased maxdsiz to 1.5G
in the loader.conf file and rebooted. I ran vacuumdb and watched top as
the process proceeded. What I saw was SIZE sitting at 603MB (which was
512MB plus another 91MB which corresponded nicely to the value of RES
for the process. A bit into the process I saw SIZE jump to 1115 -- i.e.
another 512 MB of RAM was requested and this time allocated. At one
point SIZE dropped back to 603 and then back up to 1115. I suspect the
same type of issue was occuring in regular vacuum from the psql client
connecting to the backend, for some reason not as frequently. I am
gathering that maintenance work mem is either not being recognized as
having already been allocated and another malloc is made or the process
is thinking the memory was released and tried to grab a chunk of memory
again. This would correspond to the situation where I was size stuck at
603MB after a failed memory allocation (when maxdsiz was only 1G). Now I
am not sure if I will run into the situation where yet another 512MB
request would be made (when already 1115 appears in SIZE) but if so,
then the same problem will arise. I will keep an eye on it ...

Sven


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sven Willenberger <sven(at)dmv(dot)com>
Cc: Charles Swiger <cswiger(at)mac(dot)com>, Vivek Khera <vivek(at)khera(dot)org>, stable(at)freebsd(dot)org, postgres general <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for
Date: 2005-06-29 23:12:19
Message-ID: 165.1120086739@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Sven Willenberger <sven(at)dmv(dot)com> writes:
> I have found the answer/problem. On a hunch I increased maxdsiz to 1.5G
> in the loader.conf file and rebooted. I ran vacuumdb and watched top as
> the process proceeded. What I saw was SIZE sitting at 603MB (which was
> 512MB plus another 91MB which corresponded nicely to the value of RES
> for the process. A bit into the process I saw SIZE jump to 1115 -- i.e.
> another 512 MB of RAM was requested and this time allocated. At one
> point SIZE dropped back to 603 and then back up to 1115. I suspect the
> same type of issue was occuring in regular vacuum from the psql client
> connecting to the backend, for some reason not as frequently. I am
> gathering that maintenance work mem is either not being recognized as
> having already been allocated and another malloc is made or the process
> is thinking the memory was released and tried to grab a chunk of memory
> again.

Hmm. It's probably a fragmentation issue. VACUUM will allocate a
maintenance work mem-sized chunk during command startup, but that's
likely not all that gets allocated, and if any stuff allocated after
it is not freed at the same time, the process size won't go back down.
Which wouldn't be a killer in itself, but unless the next iteration
is able to fit that array in the same space, you'd see the above
behavior.

BTW, do you have any evidence that it's actually useful to set
maintenance work mem that high for VACUUM? A quick and dirty solution
would be to bound the dead-tuples array size at something more sane...

regards, tom lane


From: Sven Willenberger <sven(at)dmv(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Charles Swiger <cswiger(at)mac(dot)com>, Vivek Khera <vivek(at)khera(dot)org>, stable(at)freebsd(dot)org, postgres general <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL's vacuumdb fails to allocate memory for
Date: 2005-06-30 03:46:07
Message-ID: 42C36AFF.6000405@dmv.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane presumably uttered the following on 06/29/05 19:12:
> Sven Willenberger <sven(at)dmv(dot)com> writes:
>
>>I have found the answer/problem. On a hunch I increased maxdsiz to 1.5G
>>in the loader.conf file and rebooted. I ran vacuumdb and watched top as
>>the process proceeded. What I saw was SIZE sitting at 603MB (which was
>>512MB plus another 91MB which corresponded nicely to the value of RES
>>for the process. A bit into the process I saw SIZE jump to 1115 -- i.e.
>>another 512 MB of RAM was requested and this time allocated. At one
>>point SIZE dropped back to 603 and then back up to 1115. I suspect the
>>same type of issue was occuring in regular vacuum from the psql client
>>connecting to the backend, for some reason not as frequently. I am
>>gathering that maintenance work mem is either not being recognized as
>>having already been allocated and another malloc is made or the process
>>is thinking the memory was released and tried to grab a chunk of memory
>>again.
>
>
> Hmm. It's probably a fragmentation issue. VACUUM will allocate a
> maintenance work mem-sized chunk during command startup, but that's
> likely not all that gets allocated, and if any stuff allocated after
> it is not freed at the same time, the process size won't go back down.
> Which wouldn't be a killer in itself, but unless the next iteration
> is able to fit that array in the same space, you'd see the above
> behavior.
>
So maintenance work mem is not a measure of the max that can allocated
by a maintenance procedure but rather an increment of memory that is
requested by a maintenance process (which currently are vacuum and
index, no?), if my reading of the above is correct.

> BTW, do you have any evidence that it's actually useful to set
> maintenance work mem that high for VACUUM? A quick and dirty solution
> would be to bound the dead-tuples array size at something more sane...
>

I was under the assumption that on systems with RAM to spare, it was
beneficial to set main work mem high to make those processes more
efficient. Again my thinking was that the value you set for that
variable determined a *max* allocation by any given maintenance process,
not a memory allocation request size. If, as my tests would indicate,
the process can request and receive more memory than specified in
maintenance work mem, then to play it safe I imagine I could drop that
value to 256MB or so.

Sven