Re: to enable O_DIRECT within postgresql

Lists: pgsql-hackers
From: Daniel Ng <danielng1985(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: to enable O_DIRECT within postgresql
Date: 2010-06-11 06:31:41
Message-ID: AANLkTinOYAKY-g0l3aIvofjgM_TlHiUpzTx4eqwwCfzi@mail.gmail.com

Dear all,
I am trying to enable the direct IO for the disk-resident
hash partitions of hashjoin in postgresql. The basic postgres
environment settings are:
centos 5.5
kernel 2.6.18
ext3 fs
PostgreSQL 8.4.3

Previously I added the O_DIRECT flag to the "fileFlags"
parameter of open() within BasicOpenFile() (line 505 in
src/backend/storage/file/fd.c), but strangely I cannot even
start the server, with error:

PANIC: could not read from control file: Invalid argument
Aborted

So far what I did is to add the O_DIRECT flag to the
"fileFlags" parameter of PathNameOpenFile() (lines 992 & 1007 in
src/backend/storage/file/fd.c), which calls BasicOpenFile()
and passes the "fileFlags". This time, I can start the server,
but when I submit a hashjoin query from the client, it fails with

ERROR: could not write to hash-join temporary file: Invalid argument

Can anyone advise what's the reason and how to fix this?
Or what's the correct way to enable the direct disk IO within
postgres? I appreciate the suggestions and thanks very much!

Regards
Daniel


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Ng <danielng1985(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: to enable O_DIRECT within postgresql
Date: 2010-06-16 03:05:36
Message-ID: 24785.1276657536@sss.pgh.pa.us

Daniel Ng <danielng1985(at)gmail(dot)com> writes:
> I am trying to enable the direct IO for the disk-resident
> hash partitions of hashjoin in postgresql.

Why would you think that's a good idea?

> Can anyone advise what's the reason and how to fix this?

Per the open(2) man page:

The O_DIRECT flag may impose alignment restrictions on the length and
address of userspace buffers and the file offset of I/Os. In Linux
alignment restrictions vary by file system and kernel version and might
be absent entirely. However there is currently no file
system-independent interface for an application to discover these
restrictions for a given file or file system.

It's unlikely that the code you're hacking makes any attempt to align
the buffers it's using to read/write files.

regards, tom lane
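
To illustrate the alignment point above, here is a minimal sketch of what
an O_DIRECT write typically needs on Linux. It is not PostgreSQL code: the
names dio_request_is_aligned(), dio_alloc() and dio_write_block() are
hypothetical, and the 4096-byte DIO_ALIGN is an assumption, since (per the
man page) the real requirement varies by file system and kernel version.
A buffer from plain malloc() or on the stack usually fails the address
check, which is one way to end up with "Invalid argument" (EINVAL).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* no O_DIRECT here: sketch falls back to buffered I/O */
#endif

#define DIO_ALIGN 4096          /* ASSUMPTION: actual alignment depends on fs and kernel */

/*
 * Return 1 if the buffer address, the transfer length and the file offset
 * are all multiples of "align", 0 otherwise.
 */
int
dio_request_is_aligned(const void *buf, size_t len, off_t offset, size_t align)
{
    return offset >= 0 &&
        (uintptr_t) buf % align == 0 &&
        len % align == 0 &&
        (size_t) offset % align == 0;
}

/*
 * Allocate a zeroed buffer whose address is a multiple of "align".
 * Returns NULL on failure; release with free().
 */
void *
dio_alloc(size_t len, size_t align)
{
    void   *p = NULL;

    if (posix_memalign(&p, align, len) != 0)
        return NULL;
    memset(p, 0, len);
    return p;
}

/*
 * Write one aligned block at offset 0 of "path" using O_DIRECT.
 * Returns the number of bytes written, or -1 on error (errno is set).
 * open() itself may fail on file systems that do not support O_DIRECT.
 */
ssize_t
dio_write_block(const char *path, const void *buf, size_t len)
{
    int     fd;
    ssize_t n;

    /* Refuse requests that would likely be rejected with EINVAL. */
    assert(dio_request_is_aligned(buf, len, 0, DIO_ALIGN));

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd < 0)
        return -1;
    n = write(fd, buf, len);
    close(fd);
    return n;
}
```

A caller would obtain its buffer from dio_alloc(8192, DIO_ALIGN) and pass
it to dio_write_block(); every read and write on a descriptor opened with
O_DIRECT has to satisfy the same three conditions, not just the first one.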


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Daniel Ng <danielng1985(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: to enable O_DIRECT within postgresql
Date: 2010-06-17 18:02:33
Message-ID: 4C1A6339.9080300@2ndquadrant.com

Daniel Ng wrote:
> I am trying to enable the direct IO for the disk-resident
> hash partitions of hashjoin in postgresql.

As Tom already mentioned, this isn't working because of alignment
issues. I'm not sure what you expect to achieve though. You should be
warned that other than the WAL, every experiment I've ever seen that
tries to add more direct I/O to the database has failed to improve
anything; the result is either barely noticeable or a major
performance drop. This is particularly futile if you're doing your
research on Linux/ext3, where even if your code works and delivers a
speedup, no one will trust it enough to ever merge and deploy it, due to
the generally poor quality of that area of the kernel so far.

This particular area is magnetic for drawing developer attention as it
seems like there's a big win just under the surface if things were
improved a bit. There isn't. On operating systems like Solaris, where
it's possible to prototype here by using mount options to silently
convert parts of the database to direct I/O, experiments in that area
have all been disappointing. One of the presentations from Jignesh Shah
at Sun covered his experiments in this area; I can't seem to find it at
the moment, but I remember the results were not positive in any way.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us


From: Daniel Ng <danielng1985(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: to enable O_DIRECT within postgresql
Date: 2010-06-18 06:54:24
Message-ID: AANLkTilt3jl5V7-8QHEuryWNWoyfKA6VRRwDbyW3BYIB@mail.gmail.com

Greg: Thank you very much for your insightful comments on the performance of
direct I/O applied to postgres! That inspired me a lot.

Tom: thank you for the reference to the man page!
