Re: Context switch storm

From: Andreas Kostyrka <andreas(at)kostyrka(dot)org>
To: Cosimo Streppone <cosimo(at)streppone(dot)it>
Cc: Richard Huxton <dev(at)archonet(dot)com>, Postgresql Performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Context switch storm
Date: 2006-11-14 10:13:11
Message-ID: 20061114101311.GN8410@andi-lap.la.revver.com
Lists: pgsql-performance

* Cosimo Streppone <cosimo(at)streppone(dot)it> [061114 10:52]:
> Richard Huxton wrote:
> >Cosimo Streppone wrote:
> >>Richard Huxton wrote:
> >>
> >>>>The average context switching for this server as vmstat shows is 1
> >>>>but when the problem occurs it goes to 250000.
> >>>
> >>I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server
> >upgrade from 8.0.1 - the most recent is 8.0.9 iirc
> >[...]
> >Are you seeing a jump in context-switching in top? You'll know when you do - it's a *large* jump. That's the key diagnosis. Otherwise it might simply be your configuration settings
> >aren't ideal for that workload.
>
> Sorry for the delay.
>
> I have logged vmstat results for the last 3 days.
> Max context switches figure is 20500.
>
> If I understand correctly, this does not mean a "storm",
Nope, 20500 is an order of magnitude too low compared to the storms we
were experiencing.
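
For anyone wanting to check their own vmstat logs the same way, here is a
minimal sketch in Python. It assumes the log is the plain text output of
Linux vmstat, including its column header line with a "cs" column. The
function names and the 100000 threshold are my own assumptions for
illustration; the only figures from this thread are roughly 250000
during a storm and 20500 for a server that is merely busy.

```python
# Assumed cutoff, not an official figure: storms in this thread were
# around 250000 cs, while 20500 was judged to be ordinary overload.
STORM_THRESHOLD = 100000


def max_context_switches(vmstat_text):
    """Return the largest value in the 'cs' column, or None if no data."""
    cs_index = None
    peak = None
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "cs" in fields:
            # Column header line; vmstat repeats it periodically.
            cs_index = fields.index("cs")
            continue
        if cs_index is None or len(fields) <= cs_index:
            # Banner line ("procs ---memory--- ...") or data before a header.
            continue
        try:
            value = int(fields[cs_index])
        except ValueError:
            continue  # not a numeric sample row
        if peak is None or value > peak:
            peak = value
    return peak


def looks_like_storm(peak, threshold=STORM_THRESHOLD):
    """True if the peak context-switch rate exceeds the assumed threshold."""
    return peak is not None and peak > threshold
```

Feed it the contents of the logged file, e.g.
looks_like_storm(max_context_switches(open("vmstat.log").read())),
where vmstat.log is a hypothetical file name.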

> but only that the 2 Xeons are overloaded.
> Probably, I can do a good thing switching off the HyperThreading.
> I get something like 12/15 *real* concurrent processes hitting
> the server.

Actually, for the storms we had, both the number of concurrent processes
AND the kind of workload matter:

many processes all doing different things => overloaded server
many processes all running the same queries => storm.

Basically, it seems that PostgreSQL's implementation of locking is on
quite unfriendly terms with the Xeon memory subsystem. Googling
around might provide more details.

>
> I must say I lowered "shared_buffers" to 8192, as it was before.
> I tried raising it to 16384, but I can't seem to find a relationship
> between shared_buffers and performance level for this server.
>
> >Well, the client I saw it with just bought a dual-opteron server and used their quad-Xeon for something else. However, I do remember that 8.1 seemed better than 7.4 before they
> >switched. Part of that might just have been better query-planning and other efficiencies though.
>
> An upgrade to 8.1 is definitely the way to go.
> Any 8.0 - 8.1 migration advice?
Simple, there are basically two ways:
a) you can take downtime: pg_dump + restore
b) you cannot take downtime: install Slony, install your new 8.1
server, replicate into it, then switch over to the new server.
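
Option a) can be sketched roughly as below, again in Python. This is only
an illustration under assumptions: the host names and database name are
hypothetical, the target database is assumed to exist already on the new
server, and the pg_dump binary found first in PATH is assumed to be the
8.1 one (dumping with the newer version's pg_dump is the usual
recommendation). Only the standard -h and -p options of pg_dump and
psql, plus psql's -d, are used.

```python
import subprocess


def build_migration_commands(old_host, new_host, dbname, port=5432):
    """Build argument lists for a plain-SQL dump piped into the new server."""
    dump_cmd = ["pg_dump", "-h", old_host, "-p", str(port), dbname]
    restore_cmd = ["psql", "-h", new_host, "-p", str(port), "-d", dbname]
    return dump_cmd, restore_cmd


def run_migration(dump_cmd, restore_cmd):
    """Run pg_dump | psql and report whether both processes exited cleanly."""
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    restore = subprocess.Popen(restore_cmd, stdin=dump.stdout)
    # Close our copy of the pipe so pg_dump sees SIGPIPE if psql exits early.
    dump.stdout.close()
    restore_rc = restore.wait()
    dump_rc = dump.wait()
    return dump_rc == 0 and restore_rc == 0
```

Usage would look like
run_migration(*build_migration_commands("old-db", "new-db", "mydb")),
with writes to the old server stopped for the duration.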

If you can get new hardware for the 8.1 box, you get several benefits:
a) order Opterons. That doesn't solve the overload problem as such,
but these pesky cs storms seem to have gone away this way.
(that was basically the "free" advice from an external consultant,
which luckily matched my own ideas about what the problem could be.
Cheap solution at $3k :) )
b) you can still use the older box as a read-only replica.
c) you've got a hot backup of your db.

Andreas
