Re: Slow application response on lightly loaded server?

Lists: pgsql-performance
From: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Slow application response on lightly loaded server?
Date: 2012-07-17 16:27:23
Message-ID: CANPAkgtV-0r=ZqGGJm6mvi1CZNvBJSWdfFX2bjFOQJMebr035Q@mail.gmail.com

We're seeing slow application performance on a PostgreSQL 9.1 server which
appears to be relatively lightly loaded. Some graphs from pgstatview are
at http://www2.uptimeforce.com/pgstatview/e35ba4e7db0842a1b9cf2e10a4c03d91/
These cover approximately 40 minutes, during which there was some activity
from a web application and two bulk loads in process.

The machine running the bulk loads (perl scripts) is also running at about
70% idle with very little iowait. That seems to suggest network latency to
me.

Am I missing something in the server stats that would indicate a problem?
If not, where should I look next?

__________________________________________________________________________________
*Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley*
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
Mike(dot)Blackwell(at)rrd(dot)com
http://www.rrdonnelley.com



From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow application response on lightly loaded server?
Date: 2012-07-17 16:35:03
Message-ID: CAOR=d=3mbyZcK5RsJchpF4mOQ7g2V9BerXKiG92vyvhJuGdzBw@mail.gmail.com

On Tue, Jul 17, 2012 at 10:27 AM, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com> wrote:
> We're seeing slow application performance on a PostgreSQL 9.1 server which
> appears to be relatively lightly loaded. Some graphs from pgstatview are at
> http://www2.uptimeforce.com/pgstatview/e35ba4e7db0842a1b9cf2e10a4c03d91/
> These cover approximately 40 minutes, during which there was some activity
> from a web application and two bulk loads in process.
>
> The machine running the bulk loads (perl scripts) is also running at about
> 70% idle with very little iowait. That seems to suggest network latency to
> me.
>
> Am I missing something in the server stats that would indicate a problem?
> If not, where should I look next?

I'd run vmstat and look for high cs or int numbers (100k and above) to
see if you're maybe seeing an issue with that. A lot of times a
"slow" server is just too much process switching. But yeah, the
graphs you've posted don't seem overly bad.


From: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow application response on lightly loaded server?
Date: 2012-07-17 17:37:43
Message-ID: CANPAkguGHV21b2PA7mo4KqR7-Rzis=ESMW_fEXUDCm=RQMt0kw@mail.gmail.com

On Tue, Jul 17, 2012 at 11:35 AM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
wrote:

> I'd run vmstat and look for high cs or int numbers (100k and above) to
> see if you're maybe seeing an issue with that. A lot of times a
> "slow" server is just too much process switching. But yeah, the
> graphs you've posted don't seem overly bad.
>

Thanks for the tip. Here's a quick look at those numbers under that same
load. Watching it for a while longer didn't show any spikes. That doesn't
seem to be it, either.

$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo    in    cs us sy id wa
 3  0  11868  34500  16048 3931436    0    0     4     2     0     0  6  2 91  1
 2  0  11868  21964  16088 3931396    0    0     0   212  8667  8408 15  3 80  2
 0  0  11868  37772  16112 3932152    0    0     2   249  9109  8811 34  2 62  1
 2  0  11868  34068  16124 3932400    0    0     1   168  9142  9165 12  3 84  1
 1  0  11868  38036  16124 3932920    0    0     8   155  9995 10904 16  4 80  1
 1  0  11868  40212  16124 3933440    0    0     0   146  9586  9825 13  3 83  1
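
The check suggested above (look for cs or in values of 100k and above) can be sketched in Python against this sample. This is only an illustration: the function names parse_vmstat and busy_samples and the SAMPLE constant are made up for this example and are not from the thread. Note that the first vmstat row reports averages since boot rather than the current interval.

```python
def parse_vmstat(text):
    """Parse vmstat output into a list of dicts keyed by column name."""
    rows = []
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column-name header line: r b swpd free ... in cs us sy id wa
            columns = fields
        elif columns and len(fields) == len(columns) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(columns, map(int, fields))))
    return rows


def busy_samples(rows, threshold=100_000):
    """Return rows whose context switches (cs) or interrupts (in) reach the threshold."""
    return [r for r in rows if r["cs"] >= threshold or r["in"] >= threshold]


# The vmstat sample posted in this message.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo    in    cs us sy id wa
 3  0  11868  34500  16048 3931436    0    0     4     2     0     0  6  2 91  1
 2  0  11868  21964  16088 3931396    0    0     0   212  8667  8408 15  3 80  2
 0  0  11868  37772  16112 3932152    0    0     2   249  9109  8811 34  2 62  1
 2  0  11868  34068  16124 3932400    0    0     1   168  9142  9165 12  3 84  1
 1  0  11868  38036  16124 3932920    0    0     8   155  9995 10904 16  4 80  1
 1  0  11868  40212  16124 3933440    0    0     0   146  9586  9825 13  3 83  1
"""

rows = parse_vmstat(SAMPLE)
peak_cs = max(r["cs"] for r in rows)
peak_in = max(r["in"] for r in rows)
print("peak cs:", peak_cs, "peak in:", peak_in)
print("samples at or above 100k:", len(busy_samples(rows)))
```

For this sample the peak cs is 10904 and the peak in is 9995, so no interval comes close to the 100k mark.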



From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow application response on lightly loaded server?
Date: 2012-07-17 17:49:51
Message-ID: CAOR=d=3=TWY_SFNaOf_ZdeD+JSGD9zSRj_4XXuK60p0uKcv7uA@mail.gmail.com

On Tue, Jul 17, 2012 at 11:37 AM, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com> wrote:
>
> On Tue, Jul 17, 2012 at 11:35 AM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
> wrote:
>
>> I'd run vmstat and look for high cs or int numbers (100k and above) to
>> see if you're maybe seeing an issue with that. A lot of times a
>> "slow" server is just too much process switching. But yeah, the
>> graphs you've posted don't seem overly bad.
>
>
>
> Thanks for the tip. Here's a quick look at those numbers under that same
> load. Watching it for a while longer didn't show any spikes. That doesn't
> seem to be it, either.

Yep it all looks good to me. Are you sure you're not getting network
lag or something like that?


From: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow application response on lightly loaded server?
Date: 2012-07-17 18:10:31
Message-ID: CANPAkgsBcq_RHNmSU5-6aSndUnDOEnitViMHdYSVkrsU4j8LBw@mail.gmail.com

I'm wondering about that. However, the database server and the server
doing the bulk loads are on the same subnet. Traceroute shows only a
single hop. Traceroute and ping both show reply times of roughly 0.25 to
0.50 ms. Is that reasonable?
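
To put those reply times in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that a client sends one statement at a time over a single connection and waits for each reply before sending the next; the thread does not say how the bulk loads actually issue their statements. The function names and the one-million-statement figure are hypothetical.

```python
def max_round_trips_per_second(rtt_ms):
    """Upper bound on synchronous round trips per second on one connection."""
    return 1000.0 / rtt_ms


def seconds_waiting_on_network(statements, rtt_ms):
    """Time spent purely on round trips for a given number of statements."""
    return statements * rtt_ms / 1000.0


for rtt_ms in (0.25, 0.50):
    print(
        f"RTT {rtt_ms} ms: at most {max_round_trips_per_second(rtt_ms):.0f} round trips/s, "
        f"{seconds_waiting_on_network(1_000_000, rtt_ms):.0f} s of network wait "
        f"per million statements"
    )
```

Under that assumption, a 0.25 to 0.50 ms round trip caps a single connection at a few thousand statements per second, which may or may not matter depending on how chatty the application is.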




From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow application response on lightly loaded server?
Date: 2012-07-17 19:36:19
Message-ID: CAOR=d=2b+vzSpWkA2xCds=z3CEor6Ca_z-8XZSvVnAh479OuUg@mail.gmail.com

Yeah seems reasonable. The last thing I'd look at is something like
improperly configured dns service. Are you connecting by IP or by
host name?


--
To understand recursion, one must first understand recursion.


From: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow application response on lightly loaded server?
Date: 2012-07-17 19:48:26
Message-ID: CANPAkgu56WvovORpfxoP39ZnWUmFDuCXixJPPHsoh19p6kKtWA@mail.gmail.com

On Tue, Jul 17, 2012 at 2:36 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
wrote:

> Yeah seems reasonable. The last thing I'd look at is something like
> improperly configured dns service. Are you connecting by IP or by
> host name?
>
>
Interesting possibility. We're currently connecting by host name. I could
try temporarily using the IP from one of the servers to see if that helps.
I'm not familiar enough with DNS services to do any diagnostics other than
using dig to see where something points.

Thanks for your help, BTW!



From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow application response on lightly loaded server?
Date: 2012-07-17 21:00:30
Message-ID: CAOR=d=25UuJ+Xo-Twu41vFfw6j+SW3kd8SoFvMjT9DpWJdEH-w@mail.gmail.com

Well, if it suddenly gets faster when connecting by IP, you'll know
where your problem lies. DNS issues are more common in Windows
installs, due to Windows having more interesting ways to misconfigure
DNS, etc.
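
One way to compare the two cases without touching the application is to time name resolution directly. The following is a minimal Python sketch, not a definitive diagnostic: the function names are made up for this example, port 5432 is only the PostgreSQL default, and the host names in the usage note are placeholders rather than machines from this thread. It relies on the standard library call socket.getaddrinfo, which resolves a numeric IP address without a DNS query.

```python
import socket
import statistics
import time


def time_lookup(host, port=5432, attempts=5):
    """Return wall-clock times in milliseconds for resolving host, one per attempt."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(host, port=5432, attempts=5):
    """Return (median_ms, max_ms) of the resolution time for host."""
    timings = time_lookup(host, port, attempts)
    return statistics.median(timings), max(timings)
```

Calling summarize("db.example.com") and summarize("192.0.2.10") from the client machine (substituting the real host name and address) and comparing the results would show whether resolution by name is adding noticeable delay per connection. Local caching can hide a slow resolver after the first attempt, so the maximum is worth looking at as well as the median.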

