Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance

From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Dave Chinner <david(at)fromorbit(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joshua Drake <jd(at)commandprompt(dot)com>, Mel Gorman <mgorman(at)suse(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date: 2014-01-17 16:18:42
Message-ID: 52D957E2.4040701@2ndQuadrant.com
Lists: pgsql-hackers

On 01/17/2014 06:40 AM, Dave Chinner wrote:
> On Thu, Jan 16, 2014 at 08:48:24PM -0500, Robert Haas wrote:
>> On Thu, Jan 16, 2014 at 7:31 PM, Dave Chinner <david(at)fromorbit(dot)com> wrote:
>>> But there's something here that I'm not getting - you're talking
>>> about a data set that you want to keep cache resident that is at
>>> least an order of magnitude larger than the cyclic 5-15 minute WAL
>>> dataset that ongoing operations need to manage to avoid IO storms.
>>> Where do these temporary files fit into this picture, how fast do
>>> they grow and why do they need to be so large in comparison to
>>> the ongoing modifications being made to the database?
> [ snip ]
>
>> Temp files are something else again. If PostgreSQL needs to sort a
>> small amount of data, like a kilobyte, it'll use quicksort. But if it
>> needs to sort a large amount of data, like a terabyte, it'll use a
>> merge sort.[1]
> IOWs the temp files contain data that requires transformation as
> part of a query operation. So, temp file size is bound by the
> dataset,
Basically yes, though the size of the "dataset" can be orders of
magnitude bigger than the database for some queries.
> growth determined by data retrieval and transformation
> rate.
>
> IOWs, there are two very different IO and caching requirements in
> play here and tuning the kernel for one actively degrades the
> performance of the other. Right, got it now.
Yes. A step toward the right solution would be some way to tune this
on a per-device basis, but as a large part of this in Linux seems
to be driven from the keeping-VM-clean side, I guess it will
be far from simple.
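
For anyone who wants the temp file point from earlier in this mail made
concrete, here is a rough sketch in Python. It is not how PostgreSQL
implements sorting; sort_with_spill and max_in_memory are hypothetical
names, the memory budget is counted in items rather than bytes, and
Python's built-in sorted() stands in for the in-memory quicksort. It only
illustrates the shape of the behaviour: small inputs are sorted in memory
with no temp files, while large inputs are written out as sorted runs and
merged back, so temp file volume grows with the data being sorted.

```python
import heapq
import tempfile


def sort_with_spill(values, max_in_memory):
    """Sort integers, spilling sorted runs to temp files when the input is large.

    max_in_memory is a hypothetical memory budget, counted in items
    (an assumption made to keep the sketch short).
    Returns (sorted_list, number_of_temp_files_used).
    """
    runs = []   # temp files, each holding one sorted run
    batch = []
    for value in values:
        batch.append(value)
        if len(batch) >= max_in_memory:
            # Budget reached: sort this batch and spill it to disk.
            runs.append(_write_run(sorted(batch)))
            batch = []

    if not runs:
        # Small input: everything fit in memory, no temp files needed.
        return sorted(batch), 0

    if batch:
        runs.append(_write_run(sorted(batch)))

    # Large input: k-way merge of the sorted runs read back from disk.
    merged = list(heapq.merge(*(_read_run(f) for f in runs)))
    for f in runs:
        f.close()   # closing a TemporaryFile deletes it
    return merged, len(runs)


def _write_run(sorted_batch):
    """Write one sorted run to an anonymous temp file, one integer per line."""
    f = tempfile.TemporaryFile(mode="w+")
    f.writelines(f"{v}\n" for v in sorted_batch)
    f.seek(0)
    return f


def _read_run(f):
    """Lazily read integers back from a run file."""
    for line in f:
        yield int(line)
```

Calling sort_with_spill(data, max_in_memory=1000) on a few hundred items
uses no temp files at all; on millions of items it writes roughly
len(data) / 1000 run files, which is the kind of IO that competes with
the cache-resident working set discussed above.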
>
> Cheers,
>
> Dave.

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ
