Re: "soft lockup" in kernel

From: Dennis Jenkins <dennis(dot)jenkins(dot)75(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: "soft lockup" in kernel
Date: 2013-07-05 13:24:51
Message-ID: CAAEzAp-7hVXgJnZixBtgfOkU2QdXThxyuNuopxenC4CyO6g7ng@mail.gmail.com
Lists: pgsql-general

On Fri, Jul 5, 2013 at 7:00 AM, Stuart Ford <stuart(dot)ford(at)glide(dot)uk(dot)com> wrote:

> Dear community
>
> Twice today our PG 9.1 server has caused a "soft lockup", with a kernel
> message like this:
>
> [1813775.496127] BUG: soft lockup - CPU#3 stuck for 73s! [postgres:18723]
>
> Full dmesg output - http://pastebin.com/YdWSmNUp
>
> The incidents were approximately two hours apart and the server was
> momentarily unavailable before coming back again, with no restore actions
> required - it just carried on. It's never done this before, I've checked
> in the logs. The server is about 3 weeks old and runs Debian 7 under
> VMWare.
>
> Does anybody know what could cause this, and, if so, is it something to be
> concerned about and what can be done to stop it?
>
>
Before I looked at your pastebin, I was going to ask "What kind of storage
are the VMDKs on? If they are on NFS, iSCSI or FC, could the NAS/SAN be
experiencing a problem?"

But I see in the stack trace that the kernel thread hung in
"vmxnet3_xmit_frame" (sending an Ethernet frame on your virtual NIC).

Can you describe your VMware network topology?

Does guest traffic share a VLAN with the NFS or iSCSI traffic that VMware
uses for storing the VMDKs?

Are there any errors recorded on the Ethernet switch connected to your
VMWare servers?

What path does a packet take to get from a PostgreSQL server process in
your VM to a client?

What version of VMWare are you running?

If you are managing it with vCenter, are there any alarms or events in the
VMWare logs?
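
To see whether the same lockup shows up at other times in the kernel log, something like the following could pull the relevant lines out of saved dmesg output. This is only a rough sketch: the regular expression is based on the single line quoted above, and the names `SOFT_LOCKUP` and `find_soft_lockups` are made up for illustration. Other kernel versions may word the message differently.

```python
import re

# Pattern modelled on the one soft-lockup line quoted in this thread (an assumption:
# other kernels may format the message differently).
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(dmesg_text):
    """Return one dict per soft-lockup line found in the given dmesg text."""
    hits = []
    for line in dmesg_text.splitlines():
        m = SOFT_LOCKUP.search(line)
        if m:
            hits.append({
                "timestamp": float(m.group("ts")),  # seconds since boot
                "cpu": int(m.group("cpu")),
                "seconds": int(m.group("secs")),    # how long the CPU was stuck
                "comm": m.group("comm"),            # process name
                "pid": int(m.group("pid")),
            })
    return hits


if __name__ == "__main__":
    sample = "[1813775.496127] BUG: soft lockup - CPU#3 stuck for 73s! [postgres:18723]"
    for hit in find_soft_lockups(sample):
        print(hit)
```

Feeding it the full dmesg text (for example, the contents of the pastebin saved to a file) would list each incident with its CPU, duration and process, which makes it easier to line them up against switch or vCenter logs.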
