Re: Do you know the reason for increased max latency due to xlog scaling?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: MauMau <maumau307(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Do you know the reason for increased max latency due to xlog scaling?
Date: 2014-02-18 21:30:05
Message-ID: CAMkU=1xc3bbPgffP1BPGNv9HGSL-HZdmJLCXqgqh7_J7BQd7kw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 18, 2014 at 9:12 AM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> On 02/18/2014 06:27 PM, Jeff Janes wrote:
>
>> On Tue, Feb 18, 2014 at 3:49 AM, MauMau <maumau307(at)gmail(dot)com> wrote:
>>
>>> --- or in other words, greater variance in response times. With my simple
>>> understanding, that sounds like a problem for response-sensitive users.
>>>
>>
>> If you need the throughput provided by 9.4, then using 9.3 gets lower
>> variance simply by refusing to do 80% of the assigned work. If you don't
>> need the throughput provided by 9.4, then you probably have some natural
>> throttling in place.
>>
>> If you want a real-world-like test, you might try to crank up the -c and -j
>> to the limit in 9.3 in a vain effort to match 9.4's performance, and see
>> what that does to max latency. (After all, that is what a naive web app is
>> likely to do--continue to make more and more connections as requests come
>> in faster than they can finish.)
>>
>
> You're missing MauMau's point. In essence, he's comparing two systems with
> the same number of clients, issuing queries as fast as they can, and one
> can do 2000 TPS while the other one can do 10000 TPS. You would expect the
> lower-throughput system to have a *higher* average latency. Each query
> takes longer, that's why the throughput is lower. If you look at the
> avg_latency columns in the graphs
> (http://hlinnaka.iki.fi/xloginsert-scaling/padding/), that's exactly what
> you see.
>
> But what MauMau is pointing out is that the *max* latency is much higher
> in the system that can do 10000 TPS. So some queries are taking much
> longer, even though on average the latency is lower. In an ideal, totally
> fair system, each query would take the same amount of time to execute, and
> after it's saturated, increasing the number of clients just makes that
> constant latency higher.
>

I thought that this was the point I was making, not the point I was
missing. You have the same hard drives you had before, but now due to a
software improvement you are cramming 5 times more stuff through them.
Yeah, you will get bigger latency spikes. Why wouldn't you? You are now
beating the snot out of your hard drives, whereas before you were not.

If you need 10,000 TPS, then you need to upgrade to 9.4. If you need it
with low maximum latency as well, then you probably need to get better IO
hardware as well (maybe not--maybe more tuning could help). With 9.3 you
didn't need better IO hardware, because you weren't capable of maxing out
what you already had. With 9.4 you can max it out, and this is a good
thing.

If you need 10,000 TPS but only 2000 TPS are completing under 9.3, then
what is happening to the other 8000 TPS? Whatever is happening to them, it
must be worse than a latency spike.

On the other hand, if you don't need 10,000 TPS, then measuring max latency
at 10,000 TPS is the wrong thing to measure.
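
To put rough numbers on the argument above, here is a minimal sketch in
Python. It rests on assumptions that are not in this thread: a closed-loop
benchmark in which each client keeps exactly one query in flight (so Little's
law gives average latency = clients / throughput), a hypothetical client count
of 32, and hypothetical helper names. It says nothing about *max* latency; it
only shows why the average falls as throughput rises, and how quickly work
piles up when 10,000 TPS arrive but only 2000 TPS complete.

```python
def avg_latency_ms(clients: int, tps: float) -> float:
    """Average latency in a closed loop, from Little's law.

    Assumes every client always has exactly one query in flight.
    """
    return clients * 1000.0 / tps


def backlog_after(offered_tps: float, completed_tps: float, seconds: float) -> float:
    """Transactions left waiting after `seconds` of sustained overload."""
    return max(0.0, offered_tps - completed_tps) * seconds


CLIENTS = 32  # hypothetical; the thread does not state a client count

for tps in (2000, 10000):
    print(f"{tps} TPS -> {avg_latency_ms(CLIENTS, tps):.1f} ms average latency")

# Demand of 10,000 TPS against a system that completes only 2000 TPS:
print(f"backlog after 60 s: {backlog_after(10000, 2000, 60):.0f} transactions")
```

Under these assumptions the slower system has the higher average latency, and
the unmet demand grows by 8000 transactions every second, which has to show up
somewhere as queueing, rejected connections, or dropped requests.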

Cheers,

Jeff
