Re: [HACKERS] kqueue

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Rui DeSousa <rui(at)crazybean(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Matteo Beccati <php(at)beccati(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Marko Tiikkaja <marko(at)joh(dot)to>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: [HACKERS] kqueue
Date: 2020-01-24 22:29:11
Message-ID: CA+hUKGLDfs-tcEYdOG6+7cFkGnuWNmVTJxri0MB=CE93aQNP_Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 23, 2020 at 9:38 AM Rui DeSousa <rui(at)crazybean(dot)net> wrote:
> On Jan 22, 2020, at 2:19 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It's certainly possible that to see any benefit you need stress
>> levels above what I can manage on the small box I've got these
>> OSes on. Still, it'd be nice if a performance patch could show
>> some improved performance, before we take any portability risks
>> for it.

You might need more than one CPU socket, or at least lots more cores
so that you can create enough contention. That was needed to see the
regression caused by commit ac1d794 on Linux[1].

> Here are two charts comparing a patched and an unpatched system.
> These systems are very large and have just shy of a thousand
> connections each, with averages of 20 to 30 active queries running
> concurrently, at times including hundreds if not thousands of queries
> hitting the database in rapid succession. The effect is that the
> unpatched system generates a lot of system load just handling idle
> connections, whereas the patched version is not impacted by idle
> sessions or sessions that have already received data.

Thanks. I can reproduce something like this on an Azure 72-vCPU
system, using pgbench -S -c800 -j32. The point of those settings is
to have many backends that are all alternating between work and
sleep. That creates a stream of poll() syscalls, and system time goes
through the roof (all CPUs are pegged, and roughly half of that is
system time). Profiling the kernel with dtrace, I see that the most
common stack (by a long way) is in a poll-related lock, similar to a
profile Rui sent me off-list from his production system. Patched,
there is very little system time and the TPS number goes from 539k to
781k (roughly 45% higher).
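
As a rough illustration of why the two mechanisms behave differently
under this kind of load, here is a minimal Python sketch. It is not
PostgreSQL's WaitEventSet code, and the helper names (wait_poll,
make_kqueue, wait_kqueue, demo) are made up for this example. With
poll(), the whole descriptor set is handed to the kernel on every
wait; with kqueue, interest is registered once and each later wait
only collects events. The kqueue path runs only where Python exposes
select.kqueue (BSD-derived systems, including macOS).

```python
import os
import select


def wait_poll(fds, timeout_ms=0):
    """poll()-style wait: every call passes the full fd set to the kernel."""
    poller = select.poll()
    for fd in fds:
        poller.register(fd, select.POLLIN)
    # Each poll() call makes the kernel walk all registered descriptors.
    return sorted(fd for fd, _events in poller.poll(timeout_ms))


def make_kqueue(fds):
    """kqueue-style setup: register interest once, up front."""
    kq = select.kqueue()
    changes = [
        select.kevent(fd, select.KQ_FILTER_READ, select.KQ_EV_ADD)
        for fd in fds
    ]
    kq.control(changes, 0)  # apply registrations, collect no events yet
    return kq


def wait_kqueue(kq, timeout_ms=0, max_events=64):
    """kqueue-style wait: no fd set is passed in, only events come back."""
    events = kq.control(None, max_events, timeout_ms / 1000.0)
    return sorted(ev.ident for ev in events)


def demo():
    """Make one of two pipes readable and check both mechanisms agree."""
    r1, w1 = os.pipe()
    r2, w2 = os.pipe()
    try:
        os.write(w2, b"x")  # only the second pipe becomes readable
        via_poll = wait_poll([r1, r2])
        via_kqueue = None  # stays None where kqueue is unavailable
        if hasattr(select, "kqueue"):
            kq = make_kqueue([r1, r2])
            try:
                via_kqueue = wait_kqueue(kq)
            finally:
                kq.close()
        return via_poll == [r2] and via_kqueue in (None, [r2])
    finally:
        for fd in (r1, w1, r2, w2):
            os.close(fd)
```

Calling demo() exercises the poll() path everywhere and the kqueue
path where it exists. The sketch only shows the difference in calling
pattern; it says nothing about the kernel lock seen in the profile.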

[1] https://www.postgresql.org/message-id/flat/CAB-SwXZh44_2ybvS5Z67p_CDz%3DXFn4hNAD%3DCnMEF%2BQqkXwFrGg%40mail.gmail.com
