Re: BUG #8167: false EINVAL -22 for opening a file

Lists: pgsql-bugspgsql-hackers
From: nelson(at)monkey(dot)org
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #8167: false EINVAL -22 for opening a file
Date: 2013-05-16 01:17:57
Message-ID: E1UcmpR-0004nn-2i@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

The following bug has been logged on the website:

Bug reference: 8167
Logged by: Nelson Minar
Email address: nelson(at)monkey(dot)org
PostgreSQL version: 9.2.4
Operating system: MacOS 10.8.3
Description:

This report and supporting files are available at
https://gist.github.com/NelsonMinar/5588719

I have a PostGIS insert/select operation that creates rows in a table.
Mostly these operations succeed, but every once in awhile one fails
with an error like
ERROR: could not open file "base/16384/24738_fsm": Invalid argument
Googling for this error shows it's occurred occasionally for other Mac
users, but I haven't seen any consistent explanation or solution.

I traced this down with the help of RhodiumToad and anders on IRC
#postgresql. Their guess was that the problem is that Postgres is
calling open() followed by free() before checking errno for open().
The call to free() subsequently calls madvise(), which throws EINVAL,
clobbering the ENOENT that open() set. When Postgres finally checks if
the open() worked it sees the EINVAL from the madvise() instead and
mistakenly reports a problem opening the file. It's unclear whether
the EINVAL from madvise() is a real error or just part of the normal
operation of free() on MacOS.

RhodiumToad asked me to report "pg in PathNameOpenFile / mdopen is
assuming that errno is preserved over calls to free() which is not
required by the spec" and "madvise is using MADV_FREE_REUSABLE".

I've attached dtruss output to this gist for both a successful query
and a failed query. Here's the essential part of the trace of a
failure:

open("base/16384/24738_fsm\0", 0x2, 0x180) = -1 Err#2
madvise(0x7FB983489000, 0x1000, 0x7) = -1 Err#22
sigprocmask(0x3, 0x7FFF580A0420, 0x0) = 0x0 0
sigreturn(0x0, 0x80000000, 0x0) = 0 0
write(0x2, "ERROR: could not open file \"base/16384/24738_fsm\": Invalid
argument\nSTATEMENT: insert into merged_rivers(gnis_id, name, strahler,
huc8, geometry)\n\t select\n\t
MAX(gnis_id) as gnis_id,\n\t MAX(name) as name", 0x204)
= 516 0
sendto(0xC, 0x7FB983841030, 0x65) = 101 0

Environment:

MacOS 10.8.3 Postgres 9.2.4 installed via Homebrew

I'm not positive if Postgres was built with gcc or clang. Here's the
compiler versions:

i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc.
build 5658) (LLVM build 2336.11.00)

Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn) Target:
x86_64-apple-darwin12.3.0 Thread model: posix


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: nelson(at)monkey(dot)org
Cc: pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: BUG #8167: false EINVAL -22 for opening a file
Date: 2013-05-16 15:55:10
Message-ID: 15677.1368719710@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

nelson(at)monkey(dot)org writes:
> RhodiumToad asked me to report "pg in PathNameOpenFile / mdopen is
> assuming that errno is preserved over calls to free() which is not
> required by the spec"

Good catch! The proximate problem appears to be this bit in
PathNameOpenFile:

vfdP->fd = BasicOpenFile(fileName, fileFlags, fileMode);

if (vfdP->fd < 0)
{
FreeVfd(file);
free(fnamecopy);
return -1;
}

The other code layers in this stack seem to be careful about preserving
errno where needed, but this bit isn't.

I've not really reviewed fd.c for other instances of the same bug,
but I do see some places where there are DO_DB(...) debug calls in
places where errno had better not change. I'm thinking maybe we should
improve that macro to include save/restore errno. Or else rip that
stuff out entirely --- I've sure never built this code with FDDEBUG
set, has anyone else?

regards, tom lane


From: Nelson Minar <nelson(at)monkey(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #8167: false EINVAL -22 for opening a file
Date: 2013-05-16 20:45:49
Message-ID: CACqBh6tZMuz5GeL3c2cva3BPsqqWWp+TE-afUp_CkJHCPCxCVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Thanks for the reply; glad it looks like a probable bug with a
straightforward fix.

FWIW, here's a discussion from last year that looks like people hitting the
same problem. Same mysterious EINVAL logged opening a file, on Macs. No
conclusion in that discussion.

http://comments.gmane.org/gmane.comp.db.postgresql.general/162465


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: nelson(at)monkey(dot)org, pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: BUG #8167: false EINVAL -22 for opening a file
Date: 2013-05-17 00:12:27
Message-ID: CA+TgmoZ4JPsb5tv31KLqkupyMiUeXHzqVyYueMsux7-R4EA5wg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Thu, May 16, 2013 at 11:55 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I've not really reviewed fd.c for other instances of the same bug,
> but I do see some places where there are DO_DB(...) debug calls in
> places where errno had better not change. I'm thinking maybe we should
> improve that macro to include save/restore errno. Or else rip that
> stuff out entirely --- I've sure never built this code with FDDEBUG
> set, has anyone else?

Not me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Nelson Minar <nelson(at)monkey(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #8167: false EINVAL -22 for opening a file
Date: 2013-05-19 23:07:18
Message-ID: CACqBh6vpomSM36yF40J+Yz0CxqpypcNUoOVU1NQwUbtM+KHQgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

I've tested Tom Lane's fix, on 9.2.4 on my Mac, and it seems to have solved
the problem. Thanks!

https://github.com/postgres/postgres/commit/6563fb2b45146852601e63828308fe04fb03b9e9