DNS vs /etc/hosts

Lists: pgsql-general
From: Lowell(dot)Hought(at)faa(dot)gov
To: pgsql-general(at)postgresql(dot)org
Subject: DNS vs /etc/hosts
Date: 2005-08-04 15:13:43
Message-ID: OFAD2896F4.52DF6036-ON86257053.0052FE7E-86257053.00538D3B@faa.gov

I am changing from 7.2 to 8.0 and have both installed now on various Linux
machines. When I use the psql command line interface with a -h hostname,
the connection time from 7.2 is instant while the connection time from 8.0
is 15 seconds. My assumption is that 7.2 checks the /etc/hosts file first
and if unable to find the specified host it reverts to a DNS lookup, and
the 8.0 is just the opposite. Is this a correct assumption, and if so,
can I modify 8.0 to behave as 7.2 does?


From: Tino Wildenhain <tino(at)wildenhain(dot)de>
To: Lowell(dot)Hought(at)faa(dot)gov
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 15:56:22
Message-ID: 1123170982.15416.75.camel@sabrina.peacock.de

On Thursday, 04.08.2005, at 10:13 -0500,
Lowell(dot)Hought(at)faa(dot)gov wrote:
>
> I am changing from 7.2 to 8.0 and have both installed now on various
> Linux machines. When I use the psql command line interface with a -h
> hostname, the connection time from 7.2 is instant while the connection
> time from 8.0 is 15 seconds. My assumption is that 7.2 checks
> the /etc/hosts file first and if unable to find the specified host it
> reverts to a DNS lookup, and the 8.0 is just the opposite. Is this a
> correct assumption, and if so, can I modify 8.0 to behave as 7.2 does?

No, applications don't do lookups themselves.
The OS (or rather the resolver library) decides
how lookups work, and therefore both 7.2 and 8.0
will behave the same.

I think the two servers have different user policies in their
pg_hba.conf, and 8.0 might (by default) want to
check ident. If a firewall blocks that, it might
take a while to time out.

--
Tino Wildenhain <tino(at)wildenhain(dot)de>


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Lowell(dot)Hought(at)faa(dot)gov
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 15:58:05
Message-ID: 20050804155805.GA88558@winnie.fuhr.org

On Thu, Aug 04, 2005 at 10:13:43AM -0500, Lowell(dot)Hought(at)faa(dot)gov wrote:
> I am changing from 7.2 to 8.0 and have both installed now on various Linux
> machines. When I use the psql command line interface with a -h hostname,
> the connection time from 7.2 is instant while the connection time from 8.0
> is 15 seconds. My assumption is that 7.2 checks the /etc/hosts file first
> and if unable to find the specified host it reverts to a DNS lookup, and
> the 8.0 is just the opposite. Is this a correct assumption, and if so,
> can I modify 8.0 to behave as 7.2 does?

Have you determined whether the difference is in the client (psql),
in the server, or in both? What happens if you use a 7.2 client
to connect to an 8.0 server, and if you use an 8.0 client to connect
to a 7.2 server? Have you run a process trace or network sniffer
to test your hypothesis? Let's find out exactly what and where the
problem is before looking for a solution. But if DNS is the problem,
why not fix it instead of working around it?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/


From: Gregory Youngblood <pgcluster(at)netio(dot)org>
To: Lowell(dot)Hought(at)faa(dot)gov, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 16:18:03
Message-ID: EE9A7CA7-B5A7-450A-8658-F6CBDAE23884@netio.org

On Aug 4, 2005, at 8:13 AM, Lowell(dot)Hought(at)faa(dot)gov wrote:
>
> I am changing from 7.2 to 8.0 and have both installed now on
> various Linux machines. When I use the psql command line interface
> with a -h hostname, the connection time from 7.2 is instant while
> the connection time from 8.0 is 15 seconds. My assumption is that
> 7.2 checks the /etc/hosts file first and if unable to find the
> specified host it reverts to a DNS lookup, and the 8.0 is just the
> opposite. Is this a correct assumption, and if so, can I modify
> 8.0 to behave as 7.2 does?

Is this on the same machine, or have you changed machines when you
changed db versions?

(1) Lookups are usually handled by the system's resolver library rather
than by the application. Assuming you are on a Unix-type system, the
files /etc/host.conf and /etc/nsswitch.conf determine the order in which
lookups are performed. Almost every system I have seen comes with a
default configuration that uses the files first and DNS second. It might
be useful to make sure these are set correctly.

(2) Have you checked the 8.0 pg_hba.conf? It looks like ident is
used. I am not very familiar with ident, usually only seeing it used
for IRC chats, but I believe it looks to your client for the ident
information. Are you running an ident server, or do you possibly have
a firewall that just drops packets for blocked ports (assuming ident
is among the blocked ports)? I would guess that a simply dropped
packet would make it time out, while a rejected connection or no server
on the port would cause the ident connection to fail more quickly.

Just a couple of ideas.
Greg


From: Lowell(dot)Hought(at)faa(dot)gov
To: Tino Wildenhain <tino(at)wildenhain(dot)de>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-general-owner(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 17:04:27
Message-ID: OF525AB70A.A842E801-ON86257053.005953A3-86257053.005DB05B@faa.gov

Machine 1 is running version 8.0
Machine 2 is running version 7.2
Machine 3 has version 7.2 and version 8.0 installed, so both versions of
"psql" are available for testing.

From machine 3 to machine 2
Version 7.2 psql - /usr/bin/psql -d dbname -h machine2 ---- connection time instant
Version 8.0 psql - /usr/local/pgsql/bin/psql -d dbname -h machine2 ---- connection time 15 seconds
Version 8.0 psql - /usr/local/pgsql/bin/psql -d dbname -h ip.address ---- connection time instant

From machine 3 to machine 1
Version 7.2 psql - /usr/bin/psql -d dbname -h machine1 ---- connection time instant
Version 8.0 psql - /usr/local/pgsql/bin/psql -d dbname -h machine1 ---- connection time 15 seconds
Version 8.0 psql - /usr/local/pgsql/bin/psql -d dbname -h ip.address ---- connection time instant


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Lowell(dot)Hought(at)faa(dot)gov
Cc: Tino Wildenhain <tino(at)wildenhain(dot)de>, pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 19:29:56
Message-ID: 20050804192955.GA89372@winnie.fuhr.org

On Thu, Aug 04, 2005 at 12:04:27PM -0500, Lowell(dot)Hought(at)faa(dot)gov wrote:
> Version 7.2 psql - /usr/bin/psql -d dbname -h machine1 ----
> connection time instant
> Version 8.0 psql - /usr/local/pgsql/bin/psql -d dbname -h machine1 ----
> connection time 15 seconds
> Version 8.0 psql - /usr/local/pgsql/bin/psql -d dbname -h ip.address ----
> connection time instant

Do the 8.0 connections to a name take exactly 15 seconds every time,
or does the time vary?

Have you done process traces on 7.2 vs. 8.0 to see what they're
doing differently? You mentioned that you were using Linux, so
something like "strace -o filename -r psql ..." should work (the
-r option should add relative timestamps to the trace so you can
see where the slowness is happening). As others have mentioned,
name resolution is generally done by libraries that aren't part of
PostgreSQL, so if two versions of PostgreSQL behave differently in
that respect then we need to find out what's different about them.
Have you used ldd to see what libraries each version of psql is
linked against? Are there differences aside from libpq?

Have you used a tool like dig, host, or nslookup to test whether
DNS indeed has a problem? That wouldn't answer why different
versions of psql apparently behave differently, but it should at
least tell us whether DNS is really a problem.

Have you used a sniffer like tcpdump or ethereal to watch DNS queries
and PostgreSQL connections?

--
Michael Fuhr


From: Richard_D_Levine(at)raytheon(dot)com
To: pgsql-general(at)postgresql(dot)org
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 20:01:31
Message-ID: OF7331D2FD.E09EFA51-ON05257053.006D929C-05257053.006E00D2@ftw.us.ray.com

I'd start by comparing the /etc/nsswitch.conf files on the various
machines. If the second column contains "files" for passwd and hosts on
the fast machines, and "dns" on the slow machine, then change the slow
machine to "files" and see if it speeds up. That's an easy way to rule out
or condemn DNS.

If you change a machine to "files", make sure that /etc/passwd has at least
the user you intend to log in as, and that /etc/hosts has the hostnames.
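
For comparing machines quickly, the lookup order can be read straight off the
"hosts:" line. The following is only a sketch: the function name and the sample
text are made up for illustration, and it assumes the usual nsswitch.conf
layout of a database name followed by its sources.

```python
def hosts_lookup_order(nsswitch_text):
    """Return the sources listed on the 'hosts:' line, in order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            # Skip action items such as [NOTFOUND=return].
            return [tok for tok in line[len("hosts:"):].split()
                    if not tok.startswith("[")]
    return []


# Illustrative sample, not taken from any machine in this thread.
sample = """
passwd: files
hosts:  files dns   # consult /etc/hosts before DNS
"""
print(hosts_lookup_order(sample))
```

On a real machine the text would come from reading /etc/nsswitch.conf; a result
that starts with "files" means /etc/hosts is consulted before DNS.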

Rick


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Richard_D_Levine(at)raytheon(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 20:25:50
Message-ID: 20050804202550.GA89836@winnie.fuhr.org

On Thu, Aug 04, 2005 at 03:01:31PM -0500, Richard_D_Levine(at)raytheon(dot)com wrote:
> I'd start by comparing the /etc/nsswitch.conf files on the various
> machines. If the second column contains "files" for passwd and hosts on
> the fast machines, and "dns" on the slow machine, then change the slow
> machine to "files" and see if it speeds up. That's an easy way to rule out
> or condemn DNS.

The information we've been given suggests that the same version of
psql behaves the same on different machines, and that different
versions of psql behave differently on the same machine. If that's
the case, then such behavior isn't easily explained by differing
nsswitch.conf configurations. Even if mucking around with nsswitch.conf
did appear to fix things, we'd still have the mystery of why the two
versions of psql behave differently.

--
Michael Fuhr


From: Richard_D_Levine(at)raytheon(dot)com
To: pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 20:28:44
Message-ID: OF49DC12B5.E27BBB1A-ON05257053.0070539E-05257053.00707E78@ftw.us.ray.com

Sorry to re-reply, but I had a much simpler idea. From the client machine
that is slow to connect, run the command "nslookup hostname1" and see whether
it takes 15 seconds. If it does, DNS is the problem.

Rick


From: Lowell(dot)Hought(at)faa(dot)gov
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-general-owner(at)postgresql(dot)org, Richard_D_Levine(at)raytheon(dot)com
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 21:39:02
Message-ID: OFD1B551F4.083ED5E7-ON86257053.007677D7-86257053.0076D407@faa.gov

Your assessment is correct ... the same version of
psql behaves the same on different machines, and different
versions of psql behave differently on the same machine.

The difference must have to do with the functions that differ in the
different versions of psql. In looking through the code for version 8.0
in the file /interfaces/libpq/ip.c, the function that resolves the hostname is
"getaddrinfo". Is this the same function that was used in version 7.2,
and if not, how does it differ? Is there something on my machine that I
can configure?


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Lowell(dot)Hought(at)faa(dot)gov
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 22:30:52
Message-ID: 20050804223052.GA90539@winnie.fuhr.org

On Thu, Aug 04, 2005 at 04:01:43PM -0500, Lowell(dot)Hought(at)faa(dot)gov wrote:
> I also performed the trace you suggested. The results are the same until
> this point, where the time for
> version 8.0 totals 0.025960 and for
> version 7.2 totals 0.009481

Those differences probably don't matter, but what comes next does.

The 7.2 trace shows a DNS query to 10.32.104.5 for a name that
begins with zmpweb5.dms.ats.agl (the strace output is truncated
after that). The DNS server responds with a packet of 142 bytes,
after which the process makes a TCP connection to 10.32.104.110:5432,
which is presumably the database server.

The 8.0 trace is different: it appears to make the same DNS query
to 10.32.104.5, but the response it receives is only 98 bytes (was
it in fact the same query?). The process then makes a DNS query
to 10.32.104.5 for just zmpweb5, and that query times out after 5
seconds. Then the process sends a query for zmpweb5 to 172.17.46.46,
which refuses the connection, possibly because no DNS server is
running on that machine. We then see a query for zmpweb5 to
172.17.40.42, and that query times out after 6 seconds. Then another
query for zmpweb5 to 10.32.104.5 and a 5-second timeout, a query
for zmpweb5 to 172.17.46.46 and a refused connection, and a query
for zmpweb5 to 172.17.40.42 and a 6-second timeout. We then see
the process read /etc/hosts, but afterwards it makes another DNS
query to 10.32.104.5 for zmpweb5.dms.ats.agl.<truncated>, and this
time we see a 142-byte response, as 7.2 had received on its first
attempt. Finally we see a TCP connection to 10.32.104.110:5432.

So why does 8.0 receive a 98-byte response to its first DNS query
when 7.2 received a 142-byte response? We can tell a little something
about the responses by looking at the data in the strace output,
with the help of RFC 1035 Section 4.1.1. In octal, the DNS response
headers are:

7.2 \260\5\205\200\0\1\0\1\0\2\0\2
8.0 \30\310\205\200\0\1\0\0\0\1\0\0

The response to 7.2 has an ANCOUNT (number of records in the answer
section) of 1 and an NSCOUNT (number of records in the authority
section) of 2, whereas the response to 8.0 has an ANCOUNT of 0 and
an NSCOUNT of 1. That disparity is odd if the DNS queries were
indeed the same.
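
The counts above can be re-derived mechanically. This is a minimal sketch that
unpacks the two 12-byte headers quoted from the strace output according to the
layout in RFC 1035 Section 4.1.1; the function name is invented for
illustration, and only some of the flag bits are decoded.

```python
import struct


def parse_dns_header(data):
    """Decode the 12-byte DNS header (RFC 1035, section 4.1.1)."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return {
        "id": ident,
        "qr": flags >> 15,        # 1 = this is a response
        "aa": (flags >> 10) & 1,  # 1 = authoritative answer
        "rcode": flags & 0xF,     # 0 = NOERROR
        "qdcount": qd,
        "ancount": an,
        "nscount": ns,
        "arcount": ar,
    }


# Header bytes exactly as they appear (in octal) in the strace output.
resp_72 = b"\260\5\205\200\0\1\0\1\0\2\0\2"
resp_80 = b"\30\310\205\200\0\1\0\0\0\1\0\0"

for label, raw in (("7.2", resp_72), ("8.0", resp_80)):
    print(label, parse_dns_header(raw))
```

Both headers decode as authoritative responses with a response code of 0; they
differ only in the ID and in the answer, authority, and additional counts.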

A few DNS queries with dig might show what's happening, and some
sniffer output of the DNS queries that psql makes might also be
enlightening. Something like the following ought to do the trick:

tcpdump -s526 -n -vv udp and port 53

The -s526 option tells tcpdump to grab enough data for the largest
possible UDP DNS packet (512 octets) plus a bit extra for the layer 2
header. It might be interesting to see the tcpdump output for psql
7.2's DNS queries and then 8.0's DNS queries (or use ethereal/tethereal
or another sniffer if you prefer, as long as we can see as much of the
DNS packets as possible).

BTW, some resolver libraries can be configured not to attempt DNS
queries for just "hostname" when "hostname.subdomain.domain" fails.
I seldom find such queries useful and I do occasionally find them
problematic, so if my resolver has such an option then I usually
enable it (e.g., "options no_tld_query" in /etc/resolv.conf on
FreeBSD).

--
Michael Fuhr


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Lowell(dot)Hought(at)faa(dot)gov
Cc: pgsql-general(at)postgresql(dot)org, Richard_D_Levine(at)raytheon(dot)com
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-04 22:49:22
Message-ID: 20050804224922.GA90868@winnie.fuhr.org

On Thu, Aug 04, 2005 at 04:39:02PM -0500, Lowell(dot)Hought(at)faa(dot)gov wrote:
> The difference must have to do with the functions that differ in the
> different versions of psql. In looking through the code for version 8.0
> in the file /interfaces/libpq/ip.c, the function that resolves hostname is
> "getaddrinfo". Is this the same function that was used in version 7.2,
> and if not, how does it differ? Is there something on my machine that I
> can configure?

Good catch -- the use of getaddrinfo() appears to have been added
in 7.4. I see calls to inet_aton() and gethostbyname() in earlier
versions, so maybe that explains the difference. A simple test
program should be able to confirm or refute that hypothesis. The
tcpdump output I suggested in another message should show exactly
what queries are being made and what responses are being received.
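
One possible shape for such a test program, sketched in Python rather than C.
Note the assumptions: Python's socket.gethostbyname() and socket.getaddrinfo()
stand in for the C functions and are not the code paths libpq itself uses, and
the helper names timed() and compare_resolvers() are invented here.

```python
import socket
import time


def timed(fn, *args):
    """Call fn(*args); return (seconds elapsed, result or the exception)."""
    start = time.monotonic()
    try:
        result = fn(*args)
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        result = exc
    return time.monotonic() - start, result


def compare_resolvers(host, port=5432):
    """Time an IPv4-only lookup against an unspecified-family lookup."""
    return {
        "gethostbyname": timed(socket.gethostbyname, host),
        "getaddrinfo": timed(socket.getaddrinfo, host, port),
    }
```

Calling compare_resolvers() with the server's hostname on the slow client and
comparing the two elapsed times would show whether the delay follows the
lookup function rather than psql itself.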

Different systems have different resolver customizations; you'll
have to check your local documentation. I'd start with "man
resolv.conf". I'd especially look for options that control if and
when queries for the top-level domain "hostname" are made when
queries for "hostname.domain" fail. You might also want to examine
your domain search list.
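
To make the search-list behavior concrete, here is a rough model of the order
in which a stub resolver tries names, following the "search" and "ndots"
description in resolv.conf(5). It is a simplification, not the resolver's
actual code, and the search domain used below is an assumption inferred from
the strace output.

```python
def candidate_names(name, search, ndots=1):
    """Order in which a stub resolver would try fully qualified names."""
    if name.endswith("."):
        return [name]                   # already absolute, tried as-is only
    absolute = name + "."
    searched = [f"{name}.{domain.rstrip('.')}." for domain in search]
    if name.count(".") >= ndots:
        return [absolute] + searched    # enough dots: try the name as-is first
    return searched + [absolute]        # otherwise walk the search list first


# Assumed search domain; the real list lives in /etc/resolv.conf.
print(candidate_names("zmpweb5", ["dms.ats.agl.faa.gov"]))
```

With a single search domain, the unqualified top-level name is the last
candidate, which is the query that was observed to time out.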

--
Michael Fuhr


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Lowell(dot)Hought(at)faa(dot)gov
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-05 00:29:46
Message-ID: 20050805002946.GA91570@winnie.fuhr.org

On Thu, Aug 04, 2005 at 04:30:52PM -0600, Michael Fuhr wrote:
> The response to 7.2 has an ANCOUNT (number of records in the answer
> section) of 1 and an NSCOUNT (number of records in the authority
> section) of 2, whereas the response to 8.0 has an ANCOUNT of 0 and
> an NSCOUNT of 1. That disparity is odd if the DNS queries were
> indeed the same.

I wonder if the use of getaddrinfo() in 8.0 is causing the first
DNS query to be for an AAAA record instead of for an A record. The
connectDBStart() function in src/interfaces/libpq/fe-connect.c sets
hint.ai_family = AF_UNSPEC, which on some systems might cause the
resolver to try an AAAA query first. That would explain the above
disparity: the response to the AAAA query would return a response
code of NOERROR, no records in the answer section, and the zone's
SOA record in the authority section (at least that's how BIND 9
responds). The resolver then makes AAAA queries for the unqualified
name (i.e., the name as a top-level domain) and those queries time
out; finally it makes A queries for the fully-qualified name and
we get success. This is exactly what the strace output appears to
show. A packet sniff should be able to confirm or refute.
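
The effect of the address-family hint can be tried without touching libpq.
The sketch below uses Python's socket.getaddrinfo(), whose family argument
plays the role of hint.ai_family; the resolve() helper is invented for
illustration, and whether AF_UNSPEC actually triggers an AAAA query first
depends on the local resolver.

```python
import socket


def resolve(host, port, family=socket.AF_UNSPEC):
    """Return the distinct (family name, address) pairs getaddrinfo yields."""
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    pairs = []
    for fam, _type, _proto, _canon, sockaddr in infos:
        pair = (socket.AddressFamily(fam).name, sockaddr[0])
        if pair not in pairs:
            pairs.append(pair)
    return pairs


# A numeric address needs no DNS query, so this line runs offline.
print(resolve("127.0.0.1", 5432, socket.AF_INET))
```

Passing socket.AF_INET restricts the lookup to IPv4 addresses, which is the
same restriction the AF_UNSPEC to AF_INET change would impose in the C code.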

Anybody know if AAAA queries can be disabled in Linux? Lowell, if
nobody answers here then you might need to seek help in a different
forum. Or you could just hack the code and change AF_UNSPEC to
AF_INET ;-)

--
Michael Fuhr


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Lowell(dot)Hought(at)faa(dot)gov
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-05 00:43:14
Message-ID: 20050805004314.GA91684@winnie.fuhr.org

On Thu, Aug 04, 2005 at 06:29:46PM -0600, Michael Fuhr wrote:
> Anybody know if AAAA queries can be disabled in Linux? Lowell, if
> nobody answers here then you might need to seek help in a different
> forum. Or you could just hack the code and change AF_UNSPEC to
> AF_INET ;-)

Lowell, aside from trying to disable AAAA queries altogether, you
might want to investigate why those top-level domain queries are
timing out. Those queries should fail fairly quickly -- is your
connectivity to the root DNS servers poor or non-existent? But
that's getting off-topic for this list....

--
Michael Fuhr


From: Thomas Pundt <mlists(at)rp-online(dot)de>
To: pgsql-general(at)postgresql(dot)org
Cc: Lowell(dot)Hought(at)faa(dot)gov
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-05 12:19:27
Message-ID: 200508051419.27653.mlists@rp-online.de

Hi,

On Thursday 04 August 2005 17:13, Lowell(dot)Hought(at)faa(dot)gov wrote:
| I am changing from 7.2 to 8.0 and have both installed now on various Linux
| machines. When I use the psql command line interface with a -h hostname,
| the connection time from 7.2 is instant while the connection time from 8.0
| is 15 seconds. My assumption is that 7.2 checks the /etc/hosts file first
| and if unable to find the specified host it reverts to a DNS lookup, and
| the 8.0 is just the opposite. Is this a correct assumption, and if so,
| can I modify 8.0 to behave as 7.2 does?

I once saw name service and connection delays caused by improperly
configured IPv6 support on some Linux machines. Removing the responsible
modules from the kernel fixed it. Just another guess though :-)

Ciao,
Thomas

--
Dr. Thomas Pundt <thomas(dot)pundt(at)rp-online(dot)de> ---- http://rp-online.de/ ----


From: Lowell(dot)Hought(at)faa(dot)gov
To: Thomas Pundt <mlists(at)rp-online(dot)de>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-general-owner(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-05 14:21:49
Message-ID: OF5562B954.9F386C69-ON86257054.004EB1DE-86257054.004ECAF2@faa.gov

How might I check for that? And if it is determined to be a problem, how
would I remove the guilty modules?


From: Lowell(dot)Hought(at)faa(dot)gov
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-general-owner(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-05 15:01:13
Message-ID: OF7ADF9091.49FAA497-ON86257054.0051F9D5-86257054.00526683@faa.gov

You are correct that 8.0 is doing an AAAA request first. I am running
Red Hat version 8.0. The difference in the way 7.2 and 8.0 resolve the
host option has to be because of the change from gethostbyname to
getaddrinfo. Is there some way I can force my machine to do an A search
before an AAAA search?

Here is the output from the tcpdump you suggested for 7.2:
--------------------------------------------------------------------------------------------------------------------
14:50:37.679429 10.32.104.97.32777 > 10.32.104.5.domain: [udp sum ok] 9750+ A? zmpweb5.dms.ats.agl.faa.gov. [|domain] (DF) (ttl 64, id 23879, len 73)
14:50:37.680131 10.32.104.5.domain > 10.32.104.97.32777: [udp sum ok] 9750* q: A? zmpweb5.dms.ats.agl.faa.gov. 1/2/2 zmpweb5.dms.ats.agl.faa.gov. A 10.32.104.110 ns: dms.ats.agl.faa.gov. NS agldmszmps1.dms.ats.agl.faa.gov., dms.ats.agl.faa.gov. NS agldmss3.dms.ats.agl.faa.gov. ar: agldmss3.dms.ats.agl.faa.gov. A 10.32.104.3, agldmszmps1.dms.ats.agl.faa.gov. A 10.32.104.5 (142) (ttl 128, id 33877, len 170)
--------------------------------------------------------------------------------------------------------------------

Here is the output from 8.0:
--------------------------------------------------------------------------------------------------------------------
14:50:03.736903 10.32.104.97.32777 > 10.32.104.5.domain: [udp sum ok] 18412+ AAAA? zmpweb5.dms.ats.agl.faa.gov. [|domain] (DF) (ttl 64, id 6499, len 73)
14:50:03.737652 10.32.104.5.domain > 10.32.104.97.32777: [udp sum ok] 18412* q: AAAA? zmpweb5.dms.ats.agl.faa.gov. 0/1/0 ns: dms.ats.agl.faa.gov. SOA agldmszmps1.dms.ats.agl.faa.gov. root.dms.ats.agl.faa.gov. 2001145122 10800 3600 43200 7200 (98) (ttl 128, id 44115, len 126)
14:50:03.737822 10.32.104.97.32777 > 10.32.104.5.domain: [udp sum ok] 18413+ AAAA? zmpweb5. [|domain] (DF) (ttl 64, id 6500, len 53)
14:50:08.738756 10.32.104.97.32777 > 10.32.104.5.domain: [udp sum ok] 18413+ AAAA? zmpweb5. [|domain] (DF) (ttl 64, id 6501, len 53)
14:50:10.686497 10.32.104.5.domain > 10.32.104.97.32777: [udp sum ok] 21278 ServFail q: AAAA? zmpweb5. 0/0/0 (25) (ttl 128, id 7764, len 53)
14:50:10.686617 10.32.104.5.domain > 10.32.104.97.32777: [udp sum ok] 21278 ServFail q: AAAA? zmpweb5. 0/0/0 (25) (ttl 128, id 8020, len 53)
14:50:10.686622 10.32.104.5.domain > 10.32.104.97.32777: [udp sum ok] 18413 ServFail q: AAAA? zmpweb5. 0/0/0 (25) (ttl 128, id 8276, len 53)
14:50:10.686676 10.32.104.5.domain > 10.32.104.97.32777: [udp sum ok] 18413 ServFail q: AAAA? zmpweb5. 0/0/0 (25) (ttl 128, id 8532, len 53)
14:50:10.687162 10.32.104.97.32777 > 10.32.104.5.domain: [udp sum ok] 18414+ A? zmpweb5.dms.ats.agl.faa.gov. [|domain] (DF) (ttl 64, id 10058, len 73)
14:50:10.688109 10.32.104.5.domain > 10.32.104.97.32777: [udp sum ok] 18414* q: A? zmpweb5.dms.ats.agl.faa.gov. 1/2/2 zmpweb5.dms.ats.agl.faa.gov. A 10.32.104.110 ns: dms.ats.agl.faa.gov. NS agldmss3.dms.ats.agl.faa.gov., dms.ats.agl.faa.gov. NS agldmszmps1.dms.ats.agl.faa.gov. ar: agldmss3.dms.ats.agl.faa.gov. A 10.32.104.3, agldmszmps1.dms.ats.agl.faa.gov. A 10.32.104.5 (142) (ttl 128, id 8788, len 170)
-----------------------------------------------------------------------------------------------------------------------
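
The lengths tcpdump reports above can be sanity-checked by hand. The
following is a minimal Python sketch, not part of the original exchange:
build_query is a hypothetical helper that builds a plain RFC 1035 query,
and the 28 octets of overhead assume an IPv4 header without options plus
a UDP header, with no EDNS record in the query.

```python
import struct

QTYPE_A = 1       # RFC 1035
QTYPE_AAAA = 28   # RFC 3596
QCLASS_IN = 1


def build_query(name: str, qtype: int, query_id: int = 0) -> bytes:
    """Build a minimal single-question DNS query (hypothetical helper)."""
    # Header: ID, flags with only RD set (tcpdump shows this as "+"),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!6H", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero octet ends the name.
    labels = [label.encode("ascii") for label in name.rstrip(".").split(".")]
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack("!2H", qtype, QCLASS_IN)


IP_UDP_OVERHEAD = 20 + 8  # assumed: IPv4 header without options + UDP header

fqdn_query = build_query("zmpweb5.dms.ats.agl.faa.gov", QTYPE_AAAA)
tld_query = build_query("zmpweb5", QTYPE_AAAA)

print(len(fqdn_query) + IP_UDP_OVERHEAD)  # 73, the "len 73" in the capture
print(len(tld_query) + IP_UDP_OVERHEAD)   # 53, the "len 53" in the capture
```

The A and AAAA queries for the same name have the same size, so packet
length alone cannot distinguish them; only the query type field differs.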

Michael Fuhr <mike(at)fuhr(dot)org>
Sent by: pgsql-general-owner(at)postgresql(dot)org
08/04/2005 05:30 PM

To: Lowell Hought/AGL/FAA(at)FAA
cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] DNS vs /etc/hosts

On Thu, Aug 04, 2005 at 04:01:43PM -0500, Lowell(dot)Hought(at)faa(dot)gov wrote:
> I also performed the trace you suggested. The results are the same until
> this point, where the time for
> version 8.0 totals 0.025960 and for
> version 7.2 totals 0.009481

Those differences probably don't matter, but what comes next does.

The 7.2 trace shows a DNS query to 10.32.104.5 for a name that
begins with zmpweb5.dms.ats.agl (the strace output is truncated
after that). The DNS server responds with a packet of 142 bytes,
after which the process makes a TCP connection to 10.32.104.110:5432,
which is presumably the database server.

The 8.0 trace is different: it appears to make the same DNS query
to 10.32.104.5, but the response it receives is only 98 bytes (was
it in fact the same query?). The process then makes a DNS query
to 10.32.104.5 for just zmpweb5, and that query times out after 5
seconds. Then the process sends a query for zmpweb5 to 172.17.46.46,
which refuses the connection, possibly because no DNS server is
running on that machine. We then see a query for zmpweb5 to
172.17.40.42, and that query times out after 6 seconds. Then another
query for zmpweb5 to 10.32.104.5 and a 5-second timeout, a query
for zmpweb5 to 172.17.46.46 and a refused connection, and a query
for zmpweb5 to 172.17.40.42 and a 6-second timeout. We then see
the process read /etc/hosts, but afterwards it makes another DNS
query to 10.32.104.5 for zmpweb5.dms.ats.agl.<truncated>, and this
time we see a 142-byte response, as 7.2 had received on its first
attempt. Finally we see a TCP connection to 10.32.104.110:5432.

So why does 8.0 receive a 98-byte response to its first DNS query
when 7.2 received a 142-byte response? We can tell a little something
about the responses by looking at the data in the strace output,
with the help of RFC 1035 Section 4.1.1. In octal, the DNS response
headers are:

7.2 \260\5\205\200\0\1\0\1\0\2\0\2
8.0 \30\310\205\200\0\1\0\0\0\1\0\0

The response to 7.2 has an ANCOUNT (number of records in the answer
section) of 1 and an NSCOUNT (number of records in the authority
section) of 2, whereas the response to 8.0 has an ANCOUNT of 0 and
an NSCOUNT of 1. That disparity is odd if the DNS queries were
indeed the same.
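
The two headers can also be decoded mechanically. This is a minimal
Python sketch under the RFC 1035 Section 4.1.1 layout; parse_dns_header
is a hypothetical helper, and the byte strings are the octal escapes
from the strace output copied verbatim.

```python
import struct


def parse_dns_header(raw: bytes) -> dict:
    """Decode the fixed 12-octet DNS header (hypothetical helper)."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", raw[:12]
    )
    return {
        "id": ident,
        "qr": flags >> 15,           # 1 = response
        "aa": (flags >> 10) & 1,     # authoritative answer
        "rd": (flags >> 8) & 1,      # recursion desired
        "ra": (flags >> 7) & 1,      # recursion available
        "rcode": flags & 0xF,        # 0 = no error
        "qdcount": qdcount,
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
    }


# Octal escapes exactly as they appear in the strace output.
HDR_72 = b"\260\5\205\200\0\1\0\1\0\2\0\2"
HDR_80 = b"\30\310\205\200\0\1\0\0\0\1\0\0"

for label, raw in (("7.2", HDR_72), ("8.0", HDR_80)):
    hdr = parse_dns_header(raw)
    print(label, hdr["rcode"], hdr["ancount"], hdr["nscount"], hdr["arcount"])
```

Both responses carry an RCODE of 0 with the AA bit set. An RCODE of 0
together with an empty answer section is what a server returns when the
name exists but has no records of the requested type, so the differing
counts are at least consistent with the two clients having asked for
different record types.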

A few DNS queries with dig might show what's happening, and some
sniffer output of the DNS queries that psql makes might also be
enlightening. Something like the following ought to do the trick:

tcpdump -s526 -n -vv udp and port 53

The -s526 option tells tcpdump to grab enough data for the largest
possible UDP DNS packet (512 octets) plus a bit extra for the layer 2
header. It might be interesting to see the tcpdump output for psql
7.2's DNS queries and then 8.0's DNS queries (or use ethereal/tethereal
or another sniffer if you prefer, as long as we can see as much of the
DNS packets as possible).

BTW, some resolver libraries can be configured not to attempt DNS
queries for just "hostname" when "hostname.subdomain.domain" fails.
I seldom find such queries useful and I do occasionally find them
problematic, so if my resolver has such an option then I usually
enable it (e.g., "options no_tld_query" in /etc/resolv.conf on
FreeBSD).

--
Michael Fuhr



From: Thomas Pundt <mlists(at)rp-online(dot)de>
To: pgsql-general(at)postgresql(dot)org, Lowell(dot)Hought(at)faa(dot)gov
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-05 19:06:22
Message-ID: 200508052106.23000.mlists@rp-online.de
Lists: pgsql-general

Hi,

On Friday 05 August 2005 16:21 Lowell(dot)Hought(at)faa(dot)gov wrote:
| How might I check for that?

If it's a standard distribution kernel, try "lsmod | grep ipv6" - this will
show you whether the IPv6 module is loaded; try to remove the module
by issuing "rmmod ipv6". If that fails, you probably have to edit
your /etc/modprobe.conf or /etc/modprobe.d/aliases file, comment
out the ipv6 module entry and reboot the machine.

Then repeat your test.

But again, this is just a wild guess.

Ciao,
Thomas

--
Thomas Pundt ------- http://www.pundt.de/
E-Mail: thomas(at)pundt(dot)de


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Lowell(dot)Hought(at)faa(dot)gov
Cc: Michael Fuhr <mike(at)fuhr(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-06 04:38:50
Message-ID: 15258.1123303130@sss.pgh.pa.us
Lists: pgsql-general

Lowell(dot)Hought(at)faa(dot)gov writes:
> You are correct in that 8.0 is doing a AAAA request first. I am running
> Red Hat version 8.0. The difference in the way 7.2 and 8.0 resolve the
> host option has to be because of the change from gethostbyname to
> getaddrinfo. Is there some way I can force my machine to do an A search
> before a AAAA search?

On a recent RH system, "man 5 resolver" suggests that putting "options
inet6" into /etc/resolv.conf is what makes this happen ... if there is
such an entry on your system, try removing it. RH 8.0 is a good ways
back though, so read the local version of that man page before doing
anything with that config file.
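
Checking for such an entry can be scripted. This is a minimal Python
sketch, assuming the usual "keyword value..." layout of resolv.conf with
"#" or ";" starting a comment line; parse_resolv_conf is a hypothetical
helper, and the sample text is made up for illustration (the nameserver
addresses are the ones seen in the traces, the search domain and the
options line are assumptions, not the poster's actual file).

```python
def parse_resolv_conf(text: str) -> dict:
    """Collect nameserver, search and options entries (hypothetical helper)."""
    conf = {"nameserver": [], "search": [], "options": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            conf["nameserver"].append(values[0])
        elif keyword == "search":
            conf["search"] = values  # a later search line replaces an earlier one
        elif keyword == "options":
            conf["options"].extend(values)
    return conf


# Illustrative sample only; not the real file from the affected machine.
SAMPLE = """\
# hypothetical resolv.conf
search dms.ats.agl.faa.gov
nameserver 10.32.104.5
nameserver 172.17.46.46
nameserver 172.17.40.42
options inet6
"""

conf = parse_resolv_conf(SAMPLE)
print("inet6" in conf["options"])
```

On a real machine the text would come from reading /etc/resolv.conf
instead of the sample string.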

I concur with Michael's previous suggestion that the best answer
is to fix the clearly-broken DNS environment you're dealing with.
It is no longer acceptable for anyone to be running nameservers
that have not heard of IPv6 --- unless it's for a network that
only contains clients that have not heard of IPv6, which yours
evidently is not. Have a word with your local network admin.

regards, tom lane


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Lowell(dot)Hought(at)faa(dot)gov, pgsql-general(at)postgresql(dot)org
Subject: Re: DNS vs /etc/hosts
Date: 2005-08-06 15:45:41
Message-ID: 20050806154541.GA33721@winnie.fuhr.org
Lists: pgsql-general

On Sat, Aug 06, 2005 at 12:38:50AM -0400, Tom Lane wrote:
> Lowell(dot)Hought(at)faa(dot)gov writes:
> > You are correct in that 8.0 is doing a AAAA request first. I am running
> > Red Hat version 8.0. The difference in the way 7.2 and 8.0 resolve the
> > host option has to be because of the change from gethostbyname to
> > getaddrinfo. Is there some way I can force my machine to do an A search
> > before a AAAA search?
>
> On a recent RH system, "man 5 resolver" suggests that putting "options
> inet6" into /etc/resolv.conf is what makes this happen ... if there is
> such an entry on your system, try removing it. RH 8.0 is a good ways
> back though, so read the local version of that man page before doing
> anything with that config file.

Hmmm...I have unprivileged access to a RH 7.3 box and I see the
"inet6" option in its resolver(5) manual page, but /etc/resolv.conf
doesn't have that option. Yet a test program that calls getaddrinfo()
with hints.ai_family = AF_UNSPEC nevertheless tries AAAA queries
first (I can't run a sniffer on that box, so I tweaked the test
program's _res structure to send DNS queries to a server that I can
sniff). The resolver algorithm for an unqualified hostname is:

1. AAAA query for hostname.domain (for each domain in the search list).
2. AAAA query for hostname (i.e., as a top-level domain).
3. A query for hostname.domain.
4. A query for hostname.

Lowell's sniffer output shows this algorithm in action. The (1)
query returns zero answers, so we proceed to the (2) query. Here we
see a retry due to a timeout and eventually the DNS server responds
with SERVFAIL (see later comments on this). Then we proceed to (3)
and finally get an answer.
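
The four steps can be modelled in a few lines. This is a minimal Python
sketch of the order observed on this resolver, not glibc's actual code;
query_order and queries_sent are hypothetical helpers, and the search
list is assumed to contain only dms.ats.agl.faa.gov.

```python
def query_order(hostname: str, search_domains: list) -> list:
    """Return (qtype, name) pairs in the order described in steps 1 to 4."""
    order = []
    for qtype in ("AAAA", "A"):
        for domain in search_domains:
            order.append((qtype, f"{hostname}.{domain}."))  # steps 1 and 3
        order.append((qtype, f"{hostname}."))               # steps 2 and 4
    return order


def queries_sent(order: list, answerable: set) -> list:
    """Walk the order and stop at the first query that yields an answer."""
    sent = []
    for query in order:
        sent.append(query)
        if query in answerable:
            break
    return sent


order = query_order("zmpweb5", ["dms.ats.agl.faa.gov"])
# Only the A record for the fully qualified name exists in this scenario.
answerable = {("A", "zmpweb5.dms.ats.agl.faa.gov.")}
for qtype, name in queries_sent(order, answerable):
    print(qtype, name)
```

With only the A record present, the model sends the two AAAA queries and
then the A query for the qualified name, which is the sequence in the
8.0 capture (retransmissions aside).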

Thomas Pundt suggested running "lsmod | grep ipv6" and disabling
the ipv6 module if it's not needed. On the RH 7.3 box I have access
to, lsmod shows nothing like "ipv6", "ip6", "inet6", etc.

So, /etc/resolv.conf doesn't have an "inet6" option and the kernel
doesn't appear to have an IPv6 module, and yet getaddrinfo() still
makes AAAA queries. Does anybody know if this behavior can be
disabled on Linux if the box doesn't use IPv6?
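
For a program under one's own control there is a per-call alternative:
ask getaddrinfo() for IPv4 results only. This is a minimal Python
sketch, assuming that restricting the family to AF_INET should leave
the resolver no reason to request AAAA records; resolve_ipv4_only is a
hypothetical helper. It does not change what psql or libpq do, and it
is not a system-wide setting.

```python
import socket


def resolve_ipv4_only(host: str, port: int) -> list:
    """Resolve host to IPv4 addresses only (hypothetical helper)."""
    infos = socket.getaddrinfo(
        host,
        port,
        family=socket.AF_INET,    # instead of AF_UNSPEC
        type=socket.SOCK_STREAM,  # one entry per address rather than per socket type
    )
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


# A numeric address needs no DNS lookup, so this runs anywhere.
print(resolve_ipv4_only("127.0.0.1", 5432))
```

Passing a hostname instead of the numeric address goes through the
normal resolver path, where a sniffer can confirm which query types are
actually sent.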

The (2) and (4) queries above (the queries for the hostname as a
top-level domain) are also a nuisance. On FreeBSD those can be
disabled with the "no_tld_query" option in /etc/resolv.conf, but a
glance through Linux's resolver(5) manual page doesn't show any
such option. Can these queries be disabled on Linux?

(This is becoming a Linux configuration thread, so these questions
might need to be asked elsewhere.)

> I concur with Michael's previous suggestion that the best answer
> is to fix the clearly-broken DNS environment you're dealing with.
> It is no longer acceptable for anyone to be running nameservers
> that have not heard of IPv6 --- unless it's for a network that
> only contains clients that have not heard of IPv6, which yours
> evidently is not. Have a word with your local network admin.

Something Wrong does appear to be happening with this site's DNS.
The top-level domain AAAA queries should fail fairly quickly with
NXDOMAIN after the query goes to a root DNS server that responds
with "sorry, ain't no such name," yet the DNS server takes several
seconds to respond at all, and when it does it responds with SERVFAIL.
That's why I was wondering about connectivity problems to the roots.

In summary, several things would be desirable:

1. Disable AAAA queries if the box doesn't use IPv6.

2. Disable top-level domain queries in the resolver search
algorithm when looking up an unqualified hostname.

3. Fix the DNS servers so that if top-level domain queries for
hostnames are made, responses are made quickly instead of taking
so long and failing with SERVFAIL.

Lowell, you'll probably have to look elsewhere for solutions to
these problems, as they're not PostgreSQL-specific.

--
Michael Fuhr