From: | Gurjeet Singh <gurjeet(at)singh(dot)im> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: /proc/self/oom_adj is deprecated in newer Linux kernels |
Date: | 2014-06-10 15:45:41 |
Message-ID: | CABwTF4W8YBEfgHm64CcZLvYHBj7yRYKM9ZcThW2aNsdDPeKixw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Jun 10, 2014 at 11:14 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> In my view, the root-owned startup script grants OOM exemption to
> the postmaster because it *knows* that the postmaster's children
> will drop the exemption. If that trust can be violated because some
> clueless DBA decided to frob a GUC setting, I'd be a lot more hesitant
> about giving the postmaster the exemption in the first place.
Even if the clueless DBA tries to set the oom_score_adj below that of
Postmaster, Linux kernel prevents that from being done. I demonstrate
that in the below screen share. I used GUC as well as plain command
line to try and set a child's badness (oom_score_adj) to be lower than
that of Postmaster's, and Linux disallows doing that, unless I use
root's powers.
So the argument that this GUC is a security concern, can be ignored.
Root user (one with control of start script) still controls the lowest
badness setting of all Postgres processes. If done at fork_process
time, the child process simply inherits parent's badness setting.
Best regards,
# Set postmaster badness: -1000
$ sudo echo -1000 > /proc/$$/oom_score_adj ; cat /proc/self/oom_score_adj
-1000
# Set backend badness the same as postmaster: -1000
$ perl -pi -e 's/linux_oom_score_adj = .*/linux_oom_score_adj =
-1000/' $PGDATA/postgresql.conf ; grep oom $PGDATA/postgresql.conf
linux_oom_score_adj = -1000
$ pgstop; pgstart
pg_ctl: server is running (PID: 65355)
/home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres "-D"
"/home/gurjeet/dev/pgdbuilds/oom_guc/db/data"
waiting for server to shut down.... done
server stopped
pg_ctl: no server running
waiting for server to start.... done
server started
Database cluster state: in production
# Backends and the postmaster have the same badness; lower than default 0
$ for p in $(echo $(pgserverPIDList)| cut -d , --output-delimiter=' '
-f 1-); do echo $p $(cat /proc/$p/oom_score_adj) $(cat
/proc/$p/cmdline); done
537 -1000 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
539 -1000 postgres: checkpointer process
540 -1000 postgres: writer process
541 -1000 postgres: wal writer process
542 -1000 postgres: autovacuum launcher process
543 -1000 postgres: stats collector process
# Set postmaster badness: -100
$ sudo echo -100 > /proc/$$/oom_score_adj ; cat /proc/self/oom_score_adj
-100
# Set backend badness the lower than postmaster: -500 vs. -100
$ perl -pi -e 's/linux_oom_score_adj = .*/linux_oom_score_adj = -500...
linux_oom_score_adj = -500
$ pgstop; pgstart
...
# Backends are capped to parent's badness.
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
716 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
718 -100 postgres: checkpointer process
719 -100 postgres: writer process
720 -100 postgres: wal writer process
721 -100 postgres: autovacuum launcher process
722 -100 postgres: stats collector process
# Set postmaster badness: -100
$ sudo echo -100 > /proc/$$/oom_score_adj ; cat /proc/self/oom_score_adj
-100
# Set backend badness the higher than postmaster: +500 vs. -100
$ perl -pi -e 's/linux_oom_score_adj = .*/linux_oom_score_adj = 500/'
$PGDATA/postgresql.conf ; grep oom $PGDATA/postgresql.conf
linux_oom_score_adj = 500
$ pgstop; pgstart
...
# Backends have higher badness, hence higher likelyhood to be killed
than the Postmaster.
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1602 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1604 500 postgres: checkpointer process
1605 500 postgres: writer process
1606 500 postgres: wal writer process
1607 500 postgres: autovacuum launcher process
1608 500 postgres: stats collector process
# Set postmaster badness: -100
$ sudo echo -100 > /proc/$$/oom_score_adj ; cat /proc/self/oom_score_adj
-100
# Reset backend badness to default: 0
$ perl -pi -e 's/linux_oom_score_adj = .*/linux_oom_score_adj = 0/'
$PGDATA/postgresql.conf ; grep oom $PGDATA/postgresql.conf
linux_oom_score_adj = 0
$ pgstop; pgstart
...
# Backends have higher badness, hence higher likelyhood to be killed
than the Postmaster.
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 0 postgres: checkpointer process
1838 0 postgres: writer process
1839 0 postgres: wal writer process
1840 0 postgres: autovacuum launcher process
1841 0 postgres: stats collector process
# Lower checkpointer's badness, but keep it higher than postmaster's
$ echo -1 > /proc/1837/oom_score_adj
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 -1 postgres: checkpointer process
...
# Lower checkpointer's badness, but keep it same as postmaster's
$ echo -100 > /proc/1837/oom_score_adj
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 -100 postgres: checkpointer process
...
# Lower checkpointer's badness, and try to lower it below postmaster's
$ echo -101 > /proc/1837/oom_score_adj
bash: echo: write error: Permission denied
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 -100 postgres: checkpointer process
...
# As root, force child process' badness to be lower than postnaster's
$ sudo echo -101 > /proc/1837/oom_score_adj
[sudo] password for gurjeet:
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 -101 postgres: checkpointer process
...
--
Gurjeet Singh http://gurjeet.singh.im/
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2014-06-10 15:46:22 | Re: /proc/self/oom_adj is deprecated in newer Linux kernels |
Previous Message | Andres Freund | 2014-06-10 15:45:36 | Re: /proc/self/oom_adj is deprecated in newer Linux kernels |