Re: NUMA packaging and patch

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: NUMA packaging and patch
Date: 2014-06-10 14:54:14
Message-ID: CA+TgmoYaUh0EYUUba7MMcPu7ZwyqG_fRDa6J=PMgyBf76dU7nw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 9, 2014 at 1:00 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> On 2014-06-09 08:59:03 -0700, Kevin Grittner wrote:
>>>> *) There is a lot of advice floating around (for example here:
>>>> http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html )
>>>> to instruct operators to disable zone_reclaim. Will your changes
>>>> invalidate any of that advice?
>>>
>>> I expect that it will make the need for that far less acute,
>>> although it is probably still best to disable zone_reclaim (based
>>> on the documented conditions under which disabling it makes sense).
>>
>> I think it'll still be important unless you're running an OLTP workload
>> (i.e. minimal per backend allocations) and your entire workload fits
>> into shared buffers. What zone_reclaim > 0 essentially does is to never
>> allocate memory from remote nodes. I.e. it will throw away all numa node
>> local OS cache to satisfy a memory allocation (including
>> pagefaults).
>
> I don't think that cpuset spreading of OS buffers and cache, and
> the patch to spread shared memory, will make too much difference
> unless the working set is fully cached. Where I have seen the
> biggest problems is when the active set > one memory node and <
> total machine RAM.

But that's precisely the scenario where vm.zone_reclaim_mode != 0 is a
disaster. You'll end up throwing away the cached pages and rereading
the data from disk, even though the memory *could* have been kept all
in cache.
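
For anyone who wants to check what a given box is doing, here is a minimal sketch. It assumes a Linux system exposing the sysctl through procfs at /proc/sys/vm/zone_reclaim_mode; the function name is made up for illustration and is not part of any patch in this thread.

```python
from pathlib import Path

# Standard Linux procfs location for the vm.zone_reclaim_mode sysctl.
ZONE_RECLAIM_PATH = "/proc/sys/vm/zone_reclaim_mode"


def zone_reclaim_enabled(path=ZONE_RECLAIM_PATH):
    """Return True if zone reclaim is on, False if off, None if unreadable.

    Hypothetical helper; `path` is a parameter only so it can be pointed
    at a test file instead of the real procfs entry.
    """
    try:
        value = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None  # not Linux, or the file is missing or unparseable
    return value != 0


if __name__ == "__main__":
    state = zone_reclaim_enabled()
    if state is None:
        print("could not read vm.zone_reclaim_mode")
    elif state:
        print("vm.zone_reclaim_mode is non-zero; "
              "'sysctl -w vm.zone_reclaim_mode=0' turns it off")
    else:
        print("vm.zone_reclaim_mode is 0")
```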

> I would agree that unless this patch is
> providing benefit for such a fully-cached load, it won't make any
> difference regarding the need for zone_reclaim_mode. Where the
> data is heavily cached, zone_reclaim > 0 might discard some cached
> pages to allow, say, a RAM sort to be done in faster memory (for
> the current process's core), so it might be a wash or even make
> zone_reclaim > 0 a win.

I will believe that when, and only when, I see benchmarks convincingly
demonstrating it. Setting zone_reclaim_mode can only be a win if the
performance benefit from using faster memory is greater than the
performance cost of any rereading-from-disk that happens. IME, that's
a highly unusual situation.
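
The inequality above can be sketched as back-of-the-envelope arithmetic. Everything here is an assumption for illustration: the function names are hypothetical and the latency figures are made-up round numbers, not measurements from any benchmark.

```python
def net_gain_ns(local_accesses, remote_ns, local_ns, rereads, disk_ns):
    """Time saved by local-node memory minus time lost to disk rereads.

    A positive result means zone reclaim came out ahead under this
    (very simplified) model; a negative result means it lost.
    """
    saved = local_accesses * (remote_ns - local_ns)
    lost = rereads * disk_ns
    return saved - lost


def break_even_accesses(remote_ns, local_ns, disk_ns):
    """Local accesses needed to pay for one page reread from disk."""
    return disk_ns / (remote_ns - local_ns)


if __name__ == "__main__":
    # Illustrative assumptions only: 100 ns local access, 150 ns remote
    # access, 5 ms (5,000,000 ns) to reread one page from disk.
    print(break_even_accesses(remote_ns=150, local_ns=100, disk_ns=5_000_000))
```

With those assumed figures, each reread from disk has to be repaid by 100,000 memory accesses that would otherwise have gone to a remote node, which is why the win looks unlikely unless rereads are very rare.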

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
