Re: Why we are going to have to go DirectIO

From:	KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To:	Claudio Freire <klaussfreire(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Why we are going to have to go DirectIO
Date:	2013-12-05 08:35:31
Message-ID:	52A03AD3.6000606@lab.ntt.co.jp
Lists:	pgsql-hackers

(2013/12/04 16:39), Claudio Freire wrote:
> On Wed, Dec 4, 2013 at 4:28 AM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>>>> Can we avoid the Linux kernel problem by simply increasing our shared
>>>> buffer size, say up to 80% of memory?
>>> It will be swap more easier.
>>
>> Is that the case? If the system has not enough memory, the kernel
>> buffer will be used for other purpose, and the kernel cache will not
>> work very well anyway. In my understanding, the problem is, even if
>> there's enough memory, the kernel's cache does not work as expected.
>
> Problem is, Postgres relies on a working kernel cache for checkpoints.
> Checkpoint logic would have to be heavily reworked to account for an
> impaired kernel cache.
>
> Really, there's no difference between fixing the I/O problems in the
> kernel(s) vs in postgres. The only difference is, in the kernel(s),
> everyone profits, and you've got a huge head start.
Yes. And using DirectIO efficiently is more difficult than using BufferedIO.
If we just switch our write() calls to direct I/O in PostgreSQL, they will turn
into the ugliest kind of random I/O.

> Communicating more with the kernel (through posix_fadvise, fallocate,
> aio, iovec, etc...) would probably be good, but it does expose more
> kernel issues. posix_fadvise, for instance, is a double-edged sword
> ATM. I do believe, however, that exposing those issues and prompting a
> fix is far preferable than silently working around them.
Agreed. And I believe that well-controlled BufferedIO is faster and easier than
perfectly controlled DirectIO. In fact, Oracle Database uses BufferedIO to
access small datasets and DirectIO to access big datasets, because that makes
more efficient use of the OS file cache.
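
As a rough illustration of what "controlled BufferedIO" could mean, here is a
minimal sketch, not PostgreSQL code. It assumes a Unix system where Python's
os.posix_fadvise is available; the function name write_with_cache_hint and its
keep_cached parameter are made up for this example. The write stays buffered,
and the caller only hints to the kernel whether the pages are worth keeping.

```python
import os


def write_with_cache_hint(path, data, keep_cached):
    """Buffered write, followed by an optional hint to drop the cached pages.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        total = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while total < len(view):
            total += os.write(fd, view[total:])
        # Flush dirty pages first; DONTNEED only drops pages that are clean.
        os.fsync(fd)
        if not keep_cached and hasattr(os, "posix_fadvise"):
            try:
                # offset=0, len=0 covers the whole file.
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                # The call is advisory, so a failure here is not fatal.
                pass
        return total
    finally:
        os.close(fd)
```

A small, frequently read file would be written with keep_cached=True, and a
large one that should not push other data out of the OS file cache with
keep_cached=False. Whether the kernel acts on the hint is up to the kernel.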

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
