Mysterious delay establishing any TCP/IP connection

 
 
Carlos Moreno
      01-05-2004, 02:36 PM

Hi,

I wonder if you guys could shed some light on this
mystery (to me, it's a mystery, at least)

We are having a mysterious delay of approximately
10 seconds to establish *any* connection to any
port. The following commands:

telnet localhost 22
telnet localhost 5555

freeze for about 10 seconds, then respond (the
first one responds with the OpenSSH prompt; the
second one responds "connection refused").

If I put telnet 127.0.0.1, then both commands
respond instantly. Same thing if I put the
"actual" IP (the IP as seen from the "outside
world").

We first suspected a DNS problem, and indeed
our hosting company told us that they had to
reboot one of their DNS servers. But they
assure us that everything is fine on their
side now. The thing is, I manually added
the IPs to /etc/hosts (as a temporary
solution), and the problem is still the same.

To make it more mysterious, if I run the commands
dig, or host, the system responds instantly!!!
(dig localhost, or dig the-domain-name, or
dig -x 127.0.0.1, or dig -x the.actual.IP)

I'm clueless as to why connections would behave
in that strange way (e.g., telnet slow) without any
other apparent reason.

Can you think of something that could be causing
this?

The system is a dual Athlon with 1GB of RAM,
sufficient disk space, running RedHat 7.3 with
most of the upgrades -- kernel, glibc, compiler,
etc.

Thanks!

Carlos
--

 
Leon.
      01-05-2004, 03:23 PM
> Can you think of something that could be causing
> this?



Only the DNS system is to blame.

One thing: /etc/hosts is only used when /etc/host.conf tells the resolver
to use it, and to use it first, like this:

order hosts,bind

It could be the local and remote IP address lookups, forward and reverse.

The hosts file takes care of both forward and reverse, though.



 
Neil Horman
      01-05-2004, 04:24 PM
This could be any number of things. Have you tried taking a network
trace with tcpdump of an attempted connection? Does it take 10 seconds
before your initial packet for a connection request is sent, or does it
take 10 seconds between the TCP SYN packet and the first ACK, or is the
TCP connection negotiated and a later packet delayed? This trace will
give you your first clues as to what's going on here.
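
Without a packet trace, a rough first cut at the question above is to time name resolution and the TCP connect separately. The sketch below is only an illustration: `time_connect` is a hypothetical helper, it uses Python's standard `socket` module, and it restricts lookups to IPv4, which may differ from what telnet does on a given system.

```python
import socket
import time


def time_connect(host, port, timeout=15.0):
    """Return (resolve_seconds, connect_seconds, outcome) for one TCP attempt."""
    t0 = time.monotonic()
    try:
        # Phase 1: name resolution only (hosts file, DNS, whatever the
        # system resolver is configured to consult). IPv4 only: an assumption.
        infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return time.monotonic() - t0, 0.0, f"resolve failed: {exc}"
    t1 = time.monotonic()

    # Phase 2: the TCP handshake to the first address returned.
    family, socktype, proto, _canonname, sockaddr = infos[0]
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        sock.connect(sockaddr)
        outcome = "connected"
    except OSError as exc:
        outcome = f"connect failed: {exc}"
    finally:
        sock.close()
    t2 = time.monotonic()
    return t1 - t0, t2 - t1, outcome


if __name__ == "__main__":
    # Port 5555 is the unused port from the original report.
    for host in ("127.0.0.1", "localhost"):
        resolve_s, connect_s, outcome = time_connect(host, 5555)
        print(f"{host}: resolve {resolve_s:.3f}s, connect {connect_s:.3f}s, {outcome}")
```

If nearly all of the delay lands in the resolve phase for `localhost` but not for `127.0.0.1`, the handshake itself is probably not the problem, and a trace of the DNS traffic is the next thing to look at.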
Neil

--
Neil Horman
Red Hat, Inc., http://people.redhat.com/nhorman
gpg keyid: 1024D / 0x92A74FA1, http://www.keyserver.net

 
Carlos Moreno
      01-06-2004, 01:58 AM
Neil Horman wrote:

> This could be any number of things. Have you tried taking a network
> trace with tcpdump of an attempted connection? Does it take 10 seconds
> before your initial packet for a connection request is sent, or does it
> take 10 seconds between the TCP SYN packet and the first ACK, or is the
> TCP connection negotiated and a later packet delayed? This trace will
> give you your first clues as to what's going on here.


Hi Neil,

I'm not too skilled with this tcpdump thing (I have used it in
the past, trying to debug a problem in the postgres
frontend-to-backend communications).

Maybe you could verify if what I did makes sense: I just ran
the following command:

tcpdump -s1500 -xX port 53

Then, at a different shell on the same machine, I ran the
command:

telnet localhost 5555

(it should respond "Connection refused", since there is no
service running on that port, but there is no stealth firewall
dropping the packets either)

So, after some 3 or 5 seconds, I see a packet that seems to
be a request to resolve the name "localhost.blahblah.com"

(blahblah.com is the domain name associated with the machine;
the command hostname outputs "www.blahblah.com" -- I mean,
blahblah.com is not the actual name, but you understand
what I'm trying to say... I hope :-))

Then, another 3 or 5 seconds later, a second request that
I suspect is now a request for the name localhost. Some
3 seconds later, the telnet command finally responds with
"Connection refused".

An even funnier thing is that if I run the command:

telnet www.blahblah.com

Then the packet looks like a request to resolve the name:

www.blahblah.com.blahblah.com

(yes, the domain name is repeated).

Do I have some serious screw-up in my network setup? :-(

Thanks,

Carlos
--


 
Pierre Tranié
      01-06-2004, 08:43 AM
Hi,

It looks like BIND is misconfigured. You should check your zone
files. In the blahblah.com.zone file, you should find something like:

@ IN NS ns.blahblah.com.

ns IN A xxx.yyy.zzz.ttt
www IN A xxx.yyy.zzz.uuu

And in the zzz.yyy.xxx.in-addr.arpa.zone file, something like:

@ IN NS ns.blahblah.com.

ttt IN PTR ns.blahblah.com.
uuu IN PTR www.blahblah.com.

Don't forget the trailing dot at the end of a fully qualified host name.

Hope this helps (although this is not the BIND newsgroup).

Pierre


 
Neil Horman
      01-06-2004, 01:08 PM
Ok, your tcpdump command is a little off (filtering on port 53 prevents
you from seeing lots of useful traffic), but that's probably ok, since
what you did capture probably points to the problem. If it doesn't, we'll
revisit your tcpdump options, but for now, it looks like the problem is
in your name resolution setup. You should never have to query a DNS server
for a localhost name. Go into your /etc directory and do the following:

1) Edit the file hosts. Make sure this line appears in it:
127.0.0.1 localhost.localdomain localhost

2) Edit the file resolv.conf. Ensure that the directive:
search blahblahblah.com
is in the file, where blahblahblah.com is your domain name.
Also ensure it correctly lists the IP addresses of your domain name
servers.

3) Edit the nsswitch.conf file. Ensure that on the hosts: line, the
files directive appears before the dns directive.
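
The three checks above can be sketched as small text parsers. This is an illustration under assumptions, not a tool from the thread: the function names are hypothetical, each function takes the file's contents as a string (so nothing here reads /etc directly), and the parsing is deliberately simplified.

```python
def _fields(line):
    """Split a config line into fields, ignoring comments."""
    if line.lstrip().startswith(";"):
        return []
    return line.split("#", 1)[0].split()


def hosts_maps_localhost(hosts_text):
    """True if some line maps 127.0.0.1 to the name 'localhost'."""
    for line in hosts_text.splitlines():
        fields = _fields(line)
        if len(fields) >= 2 and fields[0] == "127.0.0.1" and "localhost" in fields[1:]:
            return True
    return False


def resolv_conf_summary(resolv_text):
    """Return (nameservers, search_domains) found in resolv.conf text."""
    nameservers, search = [], []
    for line in resolv_text.splitlines():
        fields = _fields(line)
        if len(fields) >= 2 and fields[0] == "nameserver":
            nameservers.append(fields[1])
        elif len(fields) >= 2 and fields[0] == "search":
            search = fields[1:]  # a later search line replaces an earlier one
    return nameservers, search


def files_before_dns(nsswitch_text):
    """True if the hosts: line lists 'files' ahead of 'dns' (or without 'dns')."""
    for line in nsswitch_text.splitlines():
        fields = _fields(line)
        if fields and fields[0] == "hosts:":
            # Drop action items such as [NOTFOUND=return]; keep source names.
            sources = [f for f in fields[1:] if not f.startswith("[")]
            if "files" not in sources:
                return False
            return "dns" not in sources or sources.index("files") < sources.index("dns")
    return False
```

To run the checks against a real system, pass in the contents of /etc/hosts, /etc/resolv.conf and /etc/nsswitch.conf, for example via `pathlib.Path(...).read_text()`.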

HTH
Neil

--
Neil Horman
Red Hat, Inc., http://people.redhat.com/nhorman
gpg keyid: 1024D / 0x92A74FA1, http://www.keyserver.net

 
Baho Utot
      01-06-2004, 11:01 PM

What you're seeing is the DNS services working as designed. Seeing names such as
www.blahblah.com.blahblah.com is the way host resolution works. I refer
you to any good book on DNS.
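
For readers without a DNS book to hand, the behaviour in question is the resolver's search list. A minimal sketch of the ordering rules described in the resolv.conf(5) manual page follows; `candidate_names` is a hypothetical helper, `ndots=1` is the documented default, and real resolvers have further options that are ignored here.

```python
def candidate_names(name, search, ndots=1):
    """Names a stub resolver would try, in order, under the resolv.conf(5) rules."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search list is applied.
        return [name]
    with_search = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [name] + with_search
    # Too few dots: try the search list first, then the name as given.
    return with_search + [name]
```

With a search list of `["blahblah.com"]`, this gives `localhost.blahblah.com` and then `localhost` for the name `localhost`, and `www.blahblah.com` and then `www.blahblah.com.blahblah.com` for the name `www.blahblah.com`. That is consistent with the queries seen in the tcpdump output, where each extra name is only tried because the previous lookup did not succeed.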


 
Carlos Moreno
      01-06-2004, 11:02 PM
> Ok, your tcpdump command is a little off (filtering on port 53 prevents
> you from seeing lots of useful traffic), but that's probably ok, since
> what you did capture probably points to the problem. If it doesn't, we'll
> revisit your tcpdump options, but for now, it looks like the problem is
> in your name resolution setup. You should never have to query a DNS server
> for a localhost name. Go into your /etc directory and do the following:
>
> 1) Edit the file hosts. Make sure this line appears in it:
> 127.0.0.1 localhost.localdomain localhost
>
> 2) Edit the file resolv.conf. Ensure that the directive:
> search blahblahblah.com
> is in the file, where blahblahblah.com is your domain name.
> Also ensure it correctly lists the IP addresses of your domain name
> servers.
>
> 3) Edit the nsswitch.conf file. Ensure that on the hosts: line, the
> files directive appears before the dns directive.


I don't see your message through my newsreader, but I just saw it
on groups.google.com, so I'm replying here...

The *very strange* thing is that everything you mention here checks
fine. The file /etc/hosts has always contained the line

127.0.0.1 localhost.localdomain localhost

(I had added extra entries, extra aliases for 127.0.0.1, but then
removed them yesterday thinking they might be the cause of the
problem; nothing changed after removing the additional entries.)

The file /etc/resolv.conf, however, does not contain the directive
search. It only contains two lines, each starting with the
keyword nameserver and followed by the IP of each of the DNS
servers of our hoster (our provider). But that has always been
like that, to the best of my knowledge, and things were running
fine in the past (including recent past days).

And yes, the file /etc/nsswitch.conf contains the following line:

hosts: files nisplus dns

(I didn't know about this file -- on RedHat 9, I thought that
was controlled by the file /etc/host.conf, which in our case,
has always contained one line: order hosts, bind )
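
On the missing search directive: according to the resolv.conf(5) manual page, when the file names no search list or domain, the resolver takes the domain from the local hostname (everything after the first dot). The sketch below illustrates that fallback; the function names are hypothetical, comment handling is omitted, and the 192.0.2.x addresses in the usage note are documentation placeholders, not the real nameservers.

```python
def default_search_domain(hostname):
    """Domain part the resolver would assume when resolv.conf names none."""
    _host, dot, domain = hostname.partition(".")
    # No dot in the hostname: the root domain is assumed, shown here as "".
    return domain if dot else ""


def effective_search(resolv_text, hostname):
    """Search list from resolv.conf text, falling back to the hostname's domain."""
    search = []
    for line in resolv_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "search":
            search = fields[1:]   # several domains allowed
        elif len(fields) >= 2 and fields[0] == "domain":
            search = fields[1:2]  # a single domain
    if search:
        return search
    domain = default_search_domain(hostname)
    return [domain] if domain else []
```

For a file holding only `nameserver 192.0.2.1` and `nameserver 192.0.2.2` on a host named `www.blahblah.com`, `effective_search` returns `["blahblah.com"]`. That would account for lookups of `localhost.blahblah.com` even though no search line was ever written. It does not by itself explain why the hosts file was not answering for `localhost`.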


So, given all this, it totally beats me how on earth our server
was taking 5 seconds trying to resolve localhost!! And I say
*was* because this morning, the problem had automagically
disappeared -- this kind of supports the theory that it was
a DNS misconfiguration or temporary malfunction (our hosters
may have fixed it or rebooted their servers... Though they
rent Linux dedicated servers, I wouldn't be surprised that
they were so incompetent as to use Windows machines as their
DNS servers *sigh*)

But regardless of the problem being solved, I'm curious!!
I have no explanation or even speculative ideas as to why
or how *could* a machine with the right setup take 5 seconds
on a hopeless attempt to resolve localhost via DNS server.

The only thing I could add is that this happened the same
day that I upgraded several RPMs (notably the kernel -- I
upgraded to RedHat's patch 2.4.20-27 for the kernel, and
glibc 2.2.5-44). The problem did not necessarily appear
right after the upgrades -- we noticed the problem about
12 hours later, and have no way to know if it had been
occurring before (even before our upgrades, maybe?)

Thanks!

Carlos
--

 
Carlos Moreno
      01-07-2004, 01:30 AM
Baho Utot wrote:

> What you're seeing is the DNS services working as designed. Seeing names such as
> www.blahblah.com.blahblah.com is the way host resolution works. I refer
> you to any good book on DNS.


I don't want to sound rude or ungrateful, but this does
not help at all!

I mean, I have a host that is called www.blahblah.com,
and that domain name is official (i.e., it is registered
with an official domain name registrar) and recognized
worldwide.

Well, something *is* wrong if I type telnet localhost and
the command freezes for some time until the DNS resolver
times out looking for localhost.blahblah.com. Or if I
type telnet www.blahblah.com and the system attempts to
resolve www.blahblah.com.blahblah.com

I'm not saying that the DNS design is fundamentally
flawed. I'm not saying that Linux is flawed. I'm not
even remotely claiming that I understand all the details
(not even *most* of the details). What I'm saying is
that things *should* work (and they were working up
until a couple days ago). Something *is* wrong (or was,
since the problem seems to have "automagically" been
solved), and my lack of knowledge and understanding make
me unable to figure out what. Was it my configuration
that was wrong, causing the system to correctly and
expectedly attempt to resolve
www.blahblah.com.blahblah.com? Who knows. Maybe.
But if that was the case, I would have liked to know
why my setup was wrong, and what I would have to do
to make it right. (that was, in essence, my question)

I know I may be sounding a bit rude. I know you guys
are not paid to help me, and I don't *expect* that
someone will have to answer every or any question that
I post. I'm super grateful for the generosity of
everyone that takes the time to respond and help me
figure things out (in this or any other newsgroup).
It's just that your response sounded a bit "unfairly"
negative, so I'm trying to clarify things.

Cheers,

Carlos
--

 
Baho Utot
      01-07-2004, 11:01 PM

Sorry.

I didn't mean to be negative; DNS works that way. If you want to learn how,
have a look at a DNS book.
 