Re: One source of constant annoyance identified

From: "Markus Wollny" <Markus(dot)Wollny(at)computec(dot)de>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: One source of constant annoyance identified
Date: 2002-07-03 13:23:39
Message-ID: 2266D0630E43BB4290742247C8910575014CE2C8@dozer.computec.de
Lists: pgsql-general

Okay, now it's even more annoying...

We just upgraded the server from 1GB to 2GB of RAM because it kept
swapping out about 300MB.

I updated postgresql.conf accordingly, setting

max_connections = 190  (this is the minimum we need to satisfy the webservers and backend jobs)
shared_buffers = 60000 (that's 468.75 MB; I chose this value assuming a recommended 25% of RAM)
sort_mem = 80000       (that's 78.125 MB; the recommended value is 4% of RAM, i.e. about 82 MB)
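
For reference, here is the arithmetic behind those numbers as a quick shell check (shared_buffers is counted in 8KB pages, sort_mem in kilobytes - and if I read the docs right, sort_mem is allowed per sort operation in every backend rather than once globally, so the theoretical worst case is frightening):

echo "$((60000 * 8 / 1024)) MB"     # shared_buffers: 60000 x 8KB pages = 468 MB, allocated once and shared
echo "$((80000 / 1024)) MB"         # sort_mem: 80000 KB = 78 MB per sort operation
echo "$((190 * 80000 / 1024)) MB"   # one sort in each of 190 backends = ~14843 MB worst case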

Now the top output, sorted by memory usage, looks like this:

2:43pm up 1:34, 3 users, load average: 3.29, 2.25, 2.08
123 processes: 118 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 60.5% user, 3.1% system, 0.0% nice, 35.4% idle
CPU1 states: 47.3% user, 15.3% system, 0.0% nice, 36.4% idle
CPU2 states: 51.0% user, 17.1% system, 0.0% nice, 31.3% idle
CPU3 states: 75.4% user, 4.0% system, 0.0% nice, 20.0% idle
Mem: 2061560K av, 2054268K used, 7292K free, 0K shrd, 13924K buff
Swap: 2097136K av, 7788K used, 2089348K free, 1825104K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
1652 postgres 9 0 469M 469M 467M S 0.0 23.3 2:21 postmaster
2041 postgres 9 0 431M 431M 429M S 1.2 21.4 1:32 postmaster
1588 postgres 9 0 411M 411M 410M S 15.6 20.4 1:25 postmaster
1597 postgres 9 0 289M 289M 287M S 0.1 14.3 0:31 postmaster
1849 postgres 9 0 259M 259M 258M S 0.0 12.9 0:14 postmaster
2046 postgres 9 0 239M 239M 238M S 0.0 11.9 0:25 postmaster
1973 postgres 9 0 172M 172M 171M S 0.0 8.5 0:21 postmaster
2142 postgres 9 0 128M 128M 127M S 0.0 6.3 0:04 postmaster
2156 postgres 9 0 116M 116M 114M S 0.0 5.7 0:02 postmaster
1598 postgres 9 0 86548 84M 84752 S 1.7 4.1 2:36 postmaster
1608 postgres 9 0 60932 59M 59356 S 0.0 2.9 2:09 postmaster
1582 postgres 9 0 57624 56M 55444 S 0.0 2.7 1:58 postmaster
1609 postgres 9 0 56408 55M 55164 S 8.3 2.7 2:05 postmaster
1766 postgres 9 0 45248 44M 43824 S 0.0 2.1 1:09 postmaster
2139 postgres 9 0 45276 44M 43892 S 0.0 2.1 0:01 postmaster
2045 postgres 9 0 41500 40M 39820 S 1.7 2.0 2:43 postmaster
1610 postgres 9 0 41336 40M 40172 S 0.0 2.0 0:53 postmaster
2044 postgres 14 0 38328 37M 37056 R 25.0 1.8 0:33 postmaster
1881 postgres 9 0 34936 34M 33676 S 17.2 1.6 0:29 postmaster
2042 postgres 9 0 33144 32M 31920 S 3.7 1.6 1:13 postmaster
1679 postgres 9 0 32516 31M 31288 S 0.0 1.5 0:05 postmaster
1678 postgres 9 0 31996 31M 30812 S 0.0 1.5 0:19 postmaster
1653 postgres 9 0 29424 28M 28180 S 0.0 1.4 0:05 postmaster
2048 postgres 9 0 27772 27M 26556 S 0.0 1.3 0:09 postmaster
1802 postgres 9 0 26676 26M 25504 S 0.0 1.2 0:06 postmaster
2211 postgres 9 0 25940 25M 24592 S 0.0 1.2 0:02 postmaster
2047 postgres 14 0 25588 24M 23532 R 8.0 1.2 0:55 postmaster
2065 postgres 11 0 25584 24M 24404 S 3.9 1.2 0:04 postmaster
1980 postgres 11 0 24584 24M 22864 S 3.5 1.1 0:07 postmaster
1872 postgres 9 0 23908 23M 22800 S 0.0 1.1 0:04 postmaster
2068 postgres 9 0 21352 20M 20188 S 0.0 1.0 0:01 postmaster
2138 postgres 9 0 20928 20M 19644 S 0.0 1.0 0:06 postmaster
1983 postgres 9 0 20544 20M 19344 S 0.0 0.9 0:04 postmaster
2342 postgres 9 0 20352 19M 13772 S 0.0 0.9 0:23 postmaster
2357 postgres 15 0 20260 19M 18532 R 93.1 0.9 1:34 postmaster
2204 postgres 9 0 19816 19M 18244 S 0.0 0.9 0:01 postmaster
2199 postgres 9 0 16840 16M 15452 S 0.0 0.8 0:00 postmaster
2207 postgres 9 0 16784 16M 15512 S 0.0 0.8 0:00 postmaster
2050 postgres 9 0 15880 15M 14136 S 0.0 0.7 0:06 postmaster
2200 postgres 9 0 15568 15M 14080 S 0.0 0.7 0:00 postmaster
2301 postgres 9 0 15076 14M 13940 S 0.0 0.7 0:00 postmaster
2236 postgres 9 0 14132 13M 12824 S 0.0 0.6 0:00 postmaster
2346 postgres 9 0 14080 13M 12972 S 0.0 0.6 0:09 postmaster
2347 postgres 9 0 14064 13M 12960 S 0.0 0.6 0:09 postmaster
2205 postgres 9 0 13904 13M 12412 S 0.0 0.6 0:00 postmaster
2339 postgres 11 0 12660 12M 11448 S 36.1 0.6 0:27 postmaster

which is not at all good...
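
One thing I am not sure how to read in that listing: for the biggest backends the SHARE column is within a couple of MB of RSS. If top counts the shared buffer pool once in every process that has touched it, then the truly private memory per backend would be tiny:

# values taken from the top listing above:
# PID 1652: 469M RSS - 467M SHARE = ~2M private
# PID 2041: 431M RSS - 429M SHARE = ~2M private
# PID 1588: 411M RSS - 410M SHARE = ~1M private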

And still the main problem in finding the cause of all this is identifying the query that produces these huge backends. At the moment I suspect that there are several queries which do need a fair bit of RAM - though nowhere near this much - and that the backends fail to free the memory again once those queries have been processed. What puzzles me is that all these 400MB+ backend processes are marked as idle (in the ps output) and sleeping (in top), so they aren't actually doing much. But how come a sleeping process grabs 20% of 2GB of RAM and 15% of processing time?
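
The only handle I have found for matching a fat PID to a query is pg_stat_activity; this is a sketch of what I have been running (it assumes stats_command_string = true in postgresql.conf - without that, current_query stays empty):

psql -d template1 -c "SELECT datname, procpid, usename, current_query FROM pg_stat_activity;"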

Another fact that hints at backends just eating memory for no reason: when I stop the database, the processes keep lingering - a plain kill doesn't help, I have to kill -9 the last of the lot to get rid of them before restarting the database (more on a possible reason below the listing):

dunkles:/var/lib/pgsql/data/base # /etc/init.d/postgresql stop
Shutting down PostgreSQL
done
dunkles:/var/lib/pgsql/data/base # ps ax|grep post
  863 pts/1 S 0:00 login -- postgres
  885 pts/2 S 0:00 login -- postgres
 1552 pts/0 S 0:01 /opt/pgsql/bin/postmaster -i -D/var/lib/pgsql/data/base
 1553 pts/0 S 0:00 postgres: stats buffer process
 1554 pts/0 S 0:01 postgres: stats collector process
 1650 pts/0 S 0:00 postgres: postgres template1 212.123.109.25 idle
 1681 pts/0 S 0:00 postgres: postgres abo 212.123.109.25 idle
 1682 pts/0 S 0:00 postgres: postgres bluebox 212.123.109.25 idle
 1683 pts/0 S 0:00 postgres: postgres kidszone 212.123.109.25 idle
 1684 pts/0 S 0:00 postgres: postgres mcv 212.123.109.25 idle
 1685 pts/0 S 0:00 postgres: postgres mpo 212.123.109.25 idle
 1686 pts/0 S 0:00 postgres: postgres nzone 212.123.109.25 idle
 1687 pts/0 S 0:00 postgres: postgres pcaction 212.123.109.25 idle
 1688 pts/0 S 0:00 postgres: postgres pcgames 212.123.109.25 idle
 1689 pts/0 S 0:00 postgres: postgres phppgadmin 212.123.109.25 idle
 1690 pts/0 S 0:00 postgres: postgres pszone 212.123.109.25 idle
 1691 pts/0 S 0:00 postgres: postgres saturn 212.123.109.25 idle
 1693 pts/0 S 0:00 postgres: postgres template1 212.123.109.25 idle
 1780 pts/0 S 0:00 postgres: postgres template1 212.123.108.149 idle
 1781 pts/0 S 0:00 postgres: postgres template1 212.123.108.149 idle
 1784 pts/0 S 0:00 postgres: postgres pcgames 212.123.108.149 idle
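
As far as I can tell, the init script performs a "smart" shutdown, which waits for all clients to disconnect - and since the webservers' pooled connections never disconnect on their own, that would explain the lingering. If I understand the docs right, a "fast" shutdown should roll back open transactions and terminate the backends without any need for kill -9 (a sketch, with the binary and data directory paths taken from the ps output above):

su - postgres -c "/opt/pgsql/bin/pg_ctl -D /var/lib/pgsql/data/base stop -m fast"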

This has some sort of Windows look & feel to it - processes that simply don't respond any more.

Now I have set the values back to what they were before the RAM upgrade: shared_buffers = 32768 (256MB) and sort_mem = 51200 (50MB). The backend size now seems to top out at some 265-odd MB (about four to six of those around at any given time, as always), and top reports:

3:11pm up 2:03, 3 users, load average: 3.24, 1.65, 1.81
163 processes: 156 sleeping, 7 running, 0 zombie, 0 stopped
CPU0 states: 81.3% user, 15.5% system, 0.0% nice, 2.2% idle
CPU1 states: 67.1% user, 26.0% system, 0.0% nice, 6.3% idle
CPU2 states: 69.2% user, 21.0% system, 0.0% nice, 9.3% idle
CPU3 states: 65.4% user, 20.2% system, 0.0% nice, 13.3% idle
Mem: 2061560K av, 2025392K used, 36168K free, 0K shrd, 13108K buff
Swap: 2097136K av, 7732K used, 2089404K free, 1750556K cached

That isn't quite as bad any more - although I can't imagine it is the optimum performance for this machine.
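
To double-check where the memory really goes, I suppose these two commands should tell me whether the big per-backend numbers are mostly the single shared segment, and how much of the "used" RAM is merely the kernel's page cache (a sketch):

ipcs -m    # the postmaster's shared segment should be roughly 32768 x 8KB = 256MB plus overhead
free -m    # the "-/+ buffers/cache" line separates page cache from memory actually claimed by processes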

Now what on earth can I do to get this DB running - and when I say "running" I don't mean "sort of crawling uphill"? Are there any OS patches or environment settings I need to consider? Are there any known memory leaks? It just doesn't seem to matter which settings I put into postgresql.conf; the database eats up any available memory regardless. I cannot use it for production purposes this way. I am quite desperate right now - and I am definitely running out of time. Which bits of Linux (glibc, kernel, whatnot) does PostgreSQL 7.2.1 need, and in which versions, to run smoothly? As I mentioned before, we don't use any "fancy" features of the database like foreign keys or triggers (except one); it's just basic functionality that seems to be falling down on us...

I am willing to try almost anything - but I need to squeeze more performance out of this thing, and I need some hints on which tools to use to pin down the problem. And unfortunately I need all of this as soon as possible... Hints and help are very, very much appreciated. Thank you!
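
In the meantime, the best tool I could come up with myself is a crude watcher that logs the fattest backends every half minute, so that I can at least correlate growth spurts with what the websites are doing (just a sketch):

while true; do
    date
    ps -o pid,rss,vsz,cmd -U postgres --sort=-rss | head -6
    sleep 30
done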

Regards,

Markus
