From: | David Rees <drees76(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Occasional giant spikes in CPU load |
Date: | 2010-04-07 22:56:11 |
Message-ID: | y2r72dbd3151004071556j22244233l8fa1db295b7f9cd3@mail.gmail.com |
Lists: | pgsql-performance |
On Wed, Apr 7, 2010 at 2:37 PM, Craig James <craig_james(at)emolecules(dot)com> wrote:
> Most of the time Postgres runs nicely, but two or three times a day we get a
> huge spike in the CPU load that lasts just a short time -- it jumps to 10-20
> CPU loads. Today it hit 100 CPU loads. Sometimes days go by with no spike
> events. During these spikes, the system is completely unresponsive (you
> can't even login via ssh).
You need to find out what all those Postgres processes are doing. You
might try enabling update_process_title and then using ps to figure
out what each instance is doing. Otherwise, you might try enabling
logging of commands that take longer than a certain amount of time to
run (see log_min_duration_statement).
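
Once update_process_title is on, each backend's title in ps should look
roughly like "postgres: user database host activity". A rough sketch of
tallying those activities during a spike follows. The function name and
the sample lines are made up for illustration, and the title layout is
an assumption that may differ on your version:

```python
from collections import Counter

PREFIX = "postgres: "

def summarize_backends(ps_lines):
    """Tally backend activities from ps command-line text.

    Assumes titles of the form 'postgres: user database host activity'.
    Shorter titles (e.g. 'postgres: writer process') are lumped together
    as auxiliary processes; some longer auxiliary titles may be miscounted.
    """
    counts = Counter()
    for line in ps_lines:
        start = line.find(PREFIX)
        if start == -1:
            continue  # not a postgres process title
        fields = line[start + len(PREFIX):].split()
        if len(fields) < 4:
            counts["(auxiliary)"] += 1
            continue
        # Everything after user, database and host is the activity.
        counts[" ".join(fields[3:])] += 1
    return counts

# Hypothetical sample, not taken from the system in this thread.
sample = [
    "postgres: writer process",
    "postgres: webuser mydb 10.0.0.5(41234) SELECT",
    "postgres: webuser mydb 10.0.0.5(41240) SELECT",
    "postgres: webuser mydb 10.0.0.6(51822) idle in transaction",
    "postgres: webuser mydb 10.0.0.6(51830) idle",
]

for activity, n in summarize_backends(sample).most_common():
    print(n, activity)
```

You could feed it the output of something like `ps -u postgres -o args=`
captured every few seconds, so that a spike leaves a record of what the
backends were doing.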
> I managed to capture one such event using top(1) with the "batch" option as
> a background process. See output below - it shows 19 active postgres
> processes, but I think it missed the bulk of the spike.
Looks like it. The system doesn't appear to be overloaded at all at that point.
> 8 CPUs, 8 GB memory
> 8-disk RAID10 (10k SATA)
> Postgres 8.3.0
You should definitely update to the latest 8.3 release (8.3.10); 8.3.0
has a LOT of known bugs.
> Fedora 8, kernel is 2.6.24.4-64.fc8
Wow, that is very old, too.
> Diffs from original postgres.conf:
>
> max_connections = 1000
> shared_buffers = 2000MB
> work_mem = 256MB
work_mem is way too high for 1000 connections and 8 GB of RAM. You could
simply be starting up too many postgres processes and overwhelming the
machine. Significantly reduce either max_connections or work_mem.
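
To put rough numbers on that: work_mem is a limit per sort or hash
operation in each backend, so the worst case scales with the number of
connections. A back-of-the-envelope sketch, assuming every connection
runs one sort at the full limit at the same time (the function name is
made up for illustration):

```python
def worst_case_work_mem_mb(max_connections, work_mem_mb, ops_per_query=1):
    """Upper bound on memory used for sorts/hashes, in MB.

    Assumes every connection is active at once and each query runs
    ops_per_query sort or hash steps at the full work_mem limit.
    """
    return max_connections * work_mem_mb * ops_per_query

ram_mb = 8 * 1024          # 8 GB of RAM
shared_buffers_mb = 2000   # from the posted configuration

worst_mb = worst_case_work_mem_mb(max_connections=1000, work_mem_mb=256)
print(worst_mb / 1024)                            # 250.0 (GB)
print(worst_mb > ram_mb - shared_buffers_mb)      # True
```

Real workloads will not hit that bound, but even a few dozen backends
sorting at once could exceed what is left after shared_buffers.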
> max_fsm_pages = 16000000
> max_fsm_relations = 625000
> synchronous_commit = off
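Be careful here. With synchronous_commit off, a crash can lose the most
recent transactions that were already reported as committed. Unlike
turning fsync off, it does not risk corrupting the database, but you
should only leave it off if you can live with losing those commits.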
> top - 11:24:59 up 81 days, 20:27, 4 users, load average: 0.98, 0.83, 0.92
> Tasks: 366 total, 20 running, 346 sleeping, 0 stopped, 0 zombie
> Cpu(s): 30.6%us, 1.5%sy, 0.0%ni, 66.3%id, 1.5%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 8194800k total, 8118688k used, 76112k free, 36k buffers
> Swap: 2031608k total, 169348k used, 1862260k free, 7313232k cached
System load looks very much OK given that you have 8 CPUs.
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 18972 postgres 20 0 2514m 11m 8752 R 11 0.1 0:00.35 postmaster
> 10618 postgres 20 0 2514m 12m 9456 R 9 0.2 0:00.54 postmaster
> 10636 postgres 20 0 2514m 11m 9192 R 9 0.1 0:00.45 postmaster
> 25903 postgres 20 0 2514m 11m 8784 R 9 0.1 0:00.21 postmaster
> 10626 postgres 20 0 2514m 11m 8716 R 6 0.1 0:00.45 postmaster
> 10645 postgres 20 0 2514m 12m 9352 R 6 0.2 0:00.42 postmaster
> 10647 postgres 20 0 2514m 11m 9172 R 6 0.1 0:00.51 postmaster
> 18502 postgres 20 0 2514m 11m 9016 R 6 0.1 0:00.23 postmaster
> 10641 postgres 20 0 2514m 12m 9296 R 5 0.2 0:00.36 postmaster
> 10051 postgres 20 0 2514m 13m 10m R 4 0.2 0:00.70 postmaster
> 10622 postgres 20 0 2514m 12m 9216 R 4 0.2 0:00.39 postmaster
> 10640 postgres 20 0 2514m 11m 8592 R 4 0.1 0:00.52 postmaster
> 18497 postgres 20 0 2514m 11m 8804 R 4 0.1 0:00.25 postmaster
> 18498 postgres 20 0 2514m 11m 8804 R 4 0.1 0:00.22 postmaster
> 10341 postgres 20 0 2514m 13m 9m R 2 0.2 0:00.57 postmaster
> 10619 postgres 20 0 2514m 12m 9336 R 1 0.2 0:00.38 postmaster
> 15687 postgres 20 0 2321m 35m 35m R 0 0.4 8:36.12 postmaster
Judging by the amount of CPU time each postmaster has accumulated, they
are all fairly new processes. How many of the ~350 tasks currently
running are pg processes?
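
If you keep capturing top in batch mode, a small script can answer that
for each snapshot. This is only a sketch: it assumes the 12-column
layout shown above with TIME+ in minutes:seconds form, and the function
names are made up for illustration.

```python
def cpu_seconds(time_plus):
    """Convert a top TIME+ value such as '8:36.12' to seconds."""
    minutes, seconds = time_plus.split(":")
    return int(minutes) * 60 + float(seconds)

def count_postmasters(top_lines, young_threshold=1.0):
    """Return (total, young) counts of postmaster rows.

    'young' means accumulated CPU time below young_threshold seconds.
    Rows that do not end in 'postmaster' are ignored.
    """
    total = young = 0
    for line in top_lines:
        fields = line.split()
        if len(fields) < 12 or fields[-1] != "postmaster":
            continue
        total += 1
        if cpu_seconds(fields[-2]) < young_threshold:
            young += 1
    return total, young

# Three rows copied from the snapshot quoted above.
rows = [
    "18972 postgres 20 0 2514m 11m 8752 R 11 0.1 0:00.35 postmaster",
    "10051 postgres 20 0 2514m 13m 10m R 4 0.2 0:00.70 postmaster",
    "15687 postgres 20 0 2321m 35m 35m R 0 0.4 8:36.12 postmaster",
]
print(count_postmasters(rows))
```

A large total with nearly all of them young would suggest a burst of
new connections rather than a few long-running queries.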
-Dave