Re: pg_hba.conf && ident ...

Lists: pgsql-hackers
From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Cc: darcy(at)vex(dot)net
Subject: pg_hba.conf && ident ...
Date: 2000-05-10 13:41:31
Message-ID: Pine.BSF.4.21.0005101037020.777-100000@thelab.hub.org


has anyone played with/tested this in v7.0? I'm investigating the hanging
problem, and it just happened ... when I do an lsof on the process, it
shows these two:

postgres 4969 pgsql 5u IPv4 0xd4631500 0t0 TCP pgsql.tht.net:5432->smaug.vex.net:61189 (ESTABLISHED)
postgres 4969 pgsql 8u IPv4 0xd46300c0 0t0 TCP pgsql.tht.net:1046->smaug.vex.net:auth (ESTABLISHED)

it doesn't appear to lock it up every time though ... this time it
*eventually* came back again, but, afterwards, if you do another lsof,
there is one more line with that "can't read inpcb..." error on it ...

In pg_hba.conf, that host has:

host trends_acctng 216.126.72.30 255.255.255.255 ident sameuser

And it's the only place we have ident being used ...

Right now, it's the only theory I have to work with ...
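
For readers unfamiliar with the entry above: with "ident sameuser" the server asks the identd on the client host (the "auth" service in the lsof output, TCP port 113) which user owns the connection, and accepts it only if that name equals the requested database user name. The following is a minimal sketch of that check, assuming an RFC 1413 style reply line. The function names parse_ident_reply and ident_sameuser_ok are hypothetical and are not the code in hba.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from hba.c): parse an RFC 1413 reply like
 *   "61189 , 5432 : USERID : UNIX : db\r\n"
 * and copy the user name into 'user'.  Returns 1 on success, 0 for an
 * ERROR reply or anything malformed.
 */
int
parse_ident_reply(const char *reply, char *user, size_t userlen)
{
    char resp_type[16];
    char opsys[64];
    char name[64];

    /* The two port numbers are matched but not stored (%*u). */
    if (sscanf(reply, " %*u , %*u : %15[A-Za-z] : %63[^:] : %63[^\r\n]",
               resp_type, opsys, name) != 3)
        return 0;
    if (strcmp(resp_type, "USERID") != 0)
        return 0;
    snprintf(user, userlen, "%s", name);
    return 1;
}

/* "sameuser": the ident user name must equal the database user name. */
int
ident_sameuser_ok(const char *reply, const char *pg_user)
{
    char ident_user[64];

    if (!parse_ident_reply(reply, ident_user, sizeof ident_user))
        return 0;
    return strcmp(ident_user, pg_user) == 0;
}
```

The sketch only covers the comparison. Fetching the reply means a connect, a send and a recv against the remote identd, each of which can block.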

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: The Hermit Hacker <scrappy(at)hub(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org, darcy(at)vex(dot)net
Subject: Re: pg_hba.conf && ident ...
Date: 2000-05-10 14:27:13
Message-ID: 18050.957968833@sss.pgh.pa.us

The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> In pg_hba.conf, that host has:
> host trends_acctng 216.126.72.30 255.255.255.255 ident sameuser
> And it's the only place we have ident being used ...
> Right now, it's the only theory I have to work with ...

Bingo. All your cores show the thing waiting inside the ident code:

(gdb) bt
#0 0x18263890 in recvfrom () from /usr/lib/libc.so.4
#1 0x1825062b in recv () from /usr/lib/libc.so.4
#2 0x80ad4d0 in ident (remote_ip_addr={s_addr = 508067544}, local_ip_addr={
s_addr = 56131288}, remote_port=27631, local_port=14357,
ident_failed=0xbfbfeeef "\004\023 \b,\207\024\b\212\217(\030\223\203\204|\n\b\214+\0304P",
ident_username=0xbfbfeef0 "\004\023 \b,\207\024\b\212\217(\030\223\203\204|\n\b\214+\0304P") at hba.c:635
#3 0x80ad912 in authident (raddr=0x82011ac, laddr=0x8201140,
postgres_username=0x8201261 "db", auth_arg=0x8201304 "sameuser")
at hba.c:869
#4 0x80ac5b9 in be_recvauth (port=0x8201000) at auth.c:523
#5 0x80e0c4a in readStartupPacket (arg=0x8201000, len=292, pkt=0x820101c)
at postmaster.c:1214
#6 0x80aeb67 in PacketReceiveFragment (port=0x8201000) at pqpacket.c:102
#7 0x80e08ad in ServerLoop () at postmaster.c:982
#8 0x80e039a in PostmasterMain (argc=13, argv=0xbfbffbc4) at postmaster.c:723
#9 0x80aee43 in main (argc=13, argv=0xbfbffbc4) at main.c:93
#10 0x8063393 in _start ()
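
The integer arguments in frame #2 can be decoded by hand. This is a small sketch that assumes gdb printed network-byte-order fields on a little-endian host (i386 is an assumption here). The helper names decode_port and decode_addr are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Swap the two bytes of a 16-bit value that was read from a
 * network-order field as a little-endian integer.
 */
unsigned int
decode_port(uint16_t raw)
{
    return ((raw & 0xffu) << 8) | (unsigned int) (raw >> 8);
}

/*
 * Format a raw s_addr value (network-order bytes read as a little-endian
 * integer) as a dotted quad: the lowest byte is the first octet.
 */
void
decode_addr(uint32_t raw, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%u.%u.%u.%u",
             (unsigned int) (raw & 0xff),
             (unsigned int) ((raw >> 8) & 0xff),
             (unsigned int) ((raw >> 16) & 0xff),
             (unsigned int) (raw >> 24));
}
```

Under that assumption, decode_addr(508067544, ...) yields 216.126.72.30, the host named in the pg_hba.conf entry, and decode_port(14357) yields 5432, so the stuck call is the ident lookup for that host.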

Looking at the code, there doesn't seem to be any defense against a
broken ident server --- there is no timeout or anything being used here!
Ugh. Has it always been like this?

Anyway, I think the immediate fix for you is to stop using ident auth
for that host, at least till we can improve this code...

regards, tom lane


From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, darcy(at)vex(dot)net
Subject: Re: pg_hba.conf && ident ...
Date: 2000-05-10 14:34:02
Message-ID: Pine.BSF.4.21.0005101132330.777-100000@thelab.hub.org

On Wed, 10 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> > In pg_hba.conf, that host has:
> > host trends_acctng 216.126.72.30 255.255.255.255 ident sameuser
> > And it's the only place we have ident being used ...
> > Right now, it's the only theory I have to work with ...
>
> Bingo. All your cores show the thing waiting inside the ident code:
>
> (gdb) bt
> #0 0x18263890 in recvfrom () from /usr/lib/libc.so.4
> #1 0x1825062b in recv () from /usr/lib/libc.so.4
> #2 0x80ad4d0 in ident (remote_ip_addr={s_addr = 508067544}, local_ip_addr={
> s_addr = 56131288}, remote_port=27631, local_port=14357,
> ident_failed=0xbfbfeeef "\004\023 \b,\207\024\b\212\217(\030\223\203\204|\n\b\214+\0304P",
> ident_username=0xbfbfeef0 "\004\023 \b,\207\024\b\212\217(\030\223\203\204|\n\b\214+\0304P") at hba.c:635
> #3 0x80ad912 in authident (raddr=0x82011ac, laddr=0x8201140,
> postgres_username=0x8201261 "db", auth_arg=0x8201304 "sameuser")
> at hba.c:869
> #4 0x80ac5b9 in be_recvauth (port=0x8201000) at auth.c:523
> #5 0x80e0c4a in readStartupPacket (arg=0x8201000, len=292, pkt=0x820101c)
> at postmaster.c:1214
> #6 0x80aeb67 in PacketReceiveFragment (port=0x8201000) at pqpacket.c:102
> #7 0x80e08ad in ServerLoop () at postmaster.c:982
> #8 0x80e039a in PostmasterMain (argc=13, argv=0xbfbffbc4) at postmaster.c:723
> #9 0x80aee43 in main (argc=13, argv=0xbfbffbc4) at main.c:93
> #10 0x8063393 in _start ()
>
> Looking at the code, there doesn't seem to be any defense against a
> broken ident server --- there is no timeout or anything being used here!
> Ugh. Has it always been like this?
>
> Anyway, I think the immediate fix for you is to stop using ident auth
> for that host, at least till we can improve this code...

Once I started scanning with lsof and saw the auth stuff, I clued in and
we disabled the ident stuff ... looking at your backtrace above, I should
have clued in sooner, as I *saw* the ident in frame #2, but didn't *see* it
:(

Thanks ...

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org


From: Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: The Hermit Hacker <scrappy(at)hub(dot)org>, pgsql-hackers(at)postgresql(dot)org, darcy(at)vex(dot)net
Subject: Re: pg_hba.conf && ident ...
Date: 2000-05-10 15:51:35
Message-ID: 20000510165135.C8661@sable.ox.ac.uk

Tom Lane writes:
> The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> > In pg_hba.conf, that host has:
> > host trends_acctng 216.126.72.30 255.255.255.255 ident sameuser
> > And it's the only place we have ident being used ...
> > Right now, it's the only theory I have to work with ...
>
> Bingo. All your cores show the thing waiting inside the ident code:
[...]
> Looking at the code, there doesn't seem to be any defense against a
> broken ident server --- there is no timeout or anything being used here!
> Ugh. Has it always been like this?
>
> Anyway, I think the immediate fix for you is to stop using ident auth
> for that host, at least till we can improve this code...

I came across this problem a year and a half ago. In my case, the
problem was that the client was connecting more than the default limit
of 40 times per minute so inetd was suspending the auth/identd service.
I raised the limit by changing to "nowait.500" and that problem went
away. I'd thought that I'd fixed PostgreSQL itself too but looking
back in my mail logs I can only find my patch which fixes the problem
with sending ident requests from a server with an IP alias. I may have
forgotten to send in the patch (or even to write one) for the "ident
synchronous in postmaster" problem itself. Sorry. I'll look harder.

--Malcolm

--
Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk>
Unix Systems Programmer
Oxford University Computing Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk>
Cc: The Hermit Hacker <scrappy(at)hub(dot)org>, pgsql-hackers(at)postgresql(dot)org, darcy(at)vex(dot)net
Subject: Re: pg_hba.conf && ident ...
Date: 2000-05-10 16:09:29
Message-ID: 18576.957974969@sss.pgh.pa.us

Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk> writes:
> I'd thought that I'd fixed PostgreSQL itself too but looking
> back in my mail logs I can only find my patch which fixes the problem
> with sending ident requests from a server with an IP alias. I may have
> forgotten to send in the patch (or even to write one) for the "ident
> synchronous in postmaster" problem itself. Sorry. I'll look harder.

Yes, I see your alias patch in there, but that doesn't have anything to
do with the problem of a nonresponding ident server. I agree with Jan
that a really good fix would allow the postmaster to return to its outer
event loop while waiting for the ident response. It'd be a nontrivial
rewrite though... anyone use ident enough to want to tackle it?
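
To make the event-loop idea concrete: the pending ident socket would become one more descriptor in the set the postmaster waits on, instead of something it blocks on inside recv(). The following is a rough sketch using poll(). The helper wait_any_readable and the MAX_WATCHED limit are hypothetical, and this is not how ServerLoop is actually written.

```c
#include <poll.h>

#define MAX_WATCHED 16          /* arbitrary limit for this sketch */

/*
 * Hypothetical helper: wait up to timeout_ms for any of nfds descriptors
 * to become readable.  Returns the index of the first ready descriptor,
 * or -1 on timeout or error.
 */
int
wait_any_readable(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd pfds[MAX_WATCHED];
    int         i;

    if (nfds < 1 || nfds > MAX_WATCHED)
        return -1;
    for (i = 0; i < nfds; i++)
    {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t) nfds, timeout_ms) <= 0)
        return -1;              /* poll() failed or nothing became ready */
    for (i = 0; i < nfds; i++)
    {
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR))
            return i;
    }
    return -1;
}
```

With the listen socket and an in-progress ident socket both in fds, a slow identd delays only its own client. The hard part of the rewrite is keeping per-connection authentication state between wakeups, which this sketch does not attempt.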

regards, tom lane


From: Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: The Hermit Hacker <scrappy(at)hub(dot)org>, pgsql-hackers(at)postgresql(dot)org, darcy(at)vex(dot)net
Subject: Re: pg_hba.conf && ident ...
Date: 2000-05-10 16:28:52
Message-ID: 20000510172851.D8661@sable.ox.ac.uk

Tom Lane writes:
> Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk> writes:
> > I'd thought that I'd fixed PostgreSQL itself too but looking
> > back in my mail logs I can only find my patch which fixes the problem
> > with sending ident requests from a server with an IP alias. I may have
> > forgotten to send in the patch (or even to write one) for the "ident
> > synchronous in postmaster" problem itself. Sorry. I'll look harder.
>
> Yes, I see your alias patch in there, but that doesn't have anything to
> do with the problem of a nonresponding ident server. I agree with Jan
> that a really good fix would allow the postmaster to return to its outer
> event loop while waiting for the ident response. It'd be a nontrivial
> rewrite though... anyone use ident enough to want to tackle it?

It looks like the whole pg_hba thing isn't really designed to be
asynchronous or event-driven. A cheap and cheerful fix would be to
replace the blocking connect/send/recv in ident() in hba.c with
foo_timeout ones (for foo one of connect/send/recv). Basically, set
O_NONBLOCK on the socket with fcntl and have foo_timeout() do
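    ...
    FD_ZERO(&fds);
    FD_SET(ourfd, &fds);
    tv.tv_sec = TIMEOUT;
    tv.tv_usec = 0;
    foo(...);                /* nonblocking, so this may not complete yet */
    /* read set shown, as for recv; use the write set for connect/send */
    if (select(ourfd+1, &fds, 0, 0, &tv) <= 0)
        return -1;           /* select failed, or TIMEOUT expired */
    return foo(...);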
At least you then have an upper bound of about 3*TIMEOUT on how long
the postmaster is busy. It would still be susceptible to a denial of
service attack though. The other option would be an alarm() timeout
which could wrap the entire ident process but doing alarms portably
and safely is weird on some platforms depending on what else is going
on at the time.
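
As one concrete instance of the foo_timeout idea, here is a minimal sketch of a recv variant. The name recv_timeout and its signature are hypothetical. It differs from the outline above in that it waits with select() first and only then calls recv(), so it also works on a socket that has not been put in nonblocking mode.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>

/*
 * Hypothetical recv_timeout(): wait up to timeout_sec seconds for data on
 * sock, then read it.  Returns the recv() result, or -1 if select() failed
 * (including EINTR, which this sketch does not retry) or the timeout
 * expired with nothing to read.
 */
ssize_t
recv_timeout(int sock, void *buf, size_t len, int timeout_sec)
{
    fd_set      rfds;
    struct timeval tv;
    int         rc;

    FD_ZERO(&rfds);
    FD_SET(sock, &rfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    rc = select(sock + 1, &rfds, NULL, NULL, &tv);
    if (rc <= 0)
        return -1;              /* error, or timed out */
    return recv(sock, buf, len, 0);
}
```

A caller in ident() would treat -1 the same way as any other failed lookup. Connect and send would need the analogous treatment using the write set, which is where the rough 3*TIMEOUT bound comes from.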

--Malcolm

--
Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk>
Unix Systems Programmer
Oxford University Computing Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk>
Cc: The Hermit Hacker <scrappy(at)hub(dot)org>, pgsql-hackers(at)postgresql(dot)org, darcy(at)vex(dot)net
Subject: Re: pg_hba.conf && ident ...
Date: 2000-05-10 16:43:49
Message-ID: 18796.957977029@sss.pgh.pa.us

Malcolm Beattie <mbeattie(at)sable(dot)ox(dot)ac(dot)uk> writes:
> It looks like the whole pg_hba thing isn't really designed to be
> asynchronous or event-driven.

Nope, the module would need a pretty thorough rewrite ...

> A cheap and cheerful fix would be to
> replace the blocking connect/send/recv in ident() in hba.c with
> foo_timeout ones (for foo one of connect/send/recv).

That was what I was thinking too, unless we find a volunteer to do
the bigger job. I don't particularly care to spend that much time
on this problem myself.

regards, tom lane