Flakey Server can't initiate a network connection.

 
 
Paul Rogers

 
      10-15-2003, 07:12 AM
Dear All

This is driving me scatty; can anyone help?

I have recently set up a new server running RH Linux 7.3, which will,
if it ever works, be a web server.

The problem is that I can't connect to anything when the box initiates
the request. For example if I ping localhost or the server's own IP
address I get a reply, but anything else, including other boxes on the
same subnet and the GW just give host unreachable.

I can connect to the box with SSH no problem and view web pages on
Apache which is running on it (tho' as soon as the box uses a page
that connects to another server ie mail etc it fails). I can also ping
the box from any other PC and receive a reply. I can even connect via
a windows box to the samba server running on the box.

I'm sure the problem is with the network config, but can't see where
the issue lies. I've included the outputs from ifconfig and netstat
-rn below. Below those are the outputs from tcpdump when pinging
between two servers. The first is where the ping was initiated by a
server that is working fine and the second by the box that is
"faulty". When the faulty box initiates the connection I only get
repeated arp who-has messages.

Can anyone suggest either what I can do to fix this or alternatively
what can be done to further investigate this problem?

Many thanks

Paul

ifconfig shows

eth0 Link encap:Ethernet HWaddr 00:06:5B:F6:63:73
inet addr:10.80.18.207 Bcast:10.80.23.255
Mask:255.255.248.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:979979 errors:0 dropped:0 overruns:0 frame:0
TX packets:9753 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:116652031 (111.2 Mb) TX bytes:1651268 (1.5 Mb)
Interrupt:17 Memory:feb60000-feb80000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:603 errors:0 dropped:0 overruns:0 frame:0
TX packets:603 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:66400 (64.8 Kb) TX bytes:66400 (64.8 Kb)

and netstat -rn shows

Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
10.80.16.0 0.0.0.0 255.255.248.0 U 40 0 0 eth0
127.0.0.0 0.0.0.0 255.0.0.0 U 40 0 0 lo
0.0.0.0 10.80.16.1 0.0.0.0 UG 40 0 0 eth0

-----------------------------------------------------------------------------
tcpdump -e on ping from good machine to faulty machine

this repeated over and over

14:55:10.355955 0:50:4:67:ab:9b 0:6:5b:f6:63:73 ip 98: edm_bfhxx_fp002 > 10.80.18.207: icmp: echo request (DF)

14:55:10.355955 0:6:5b:f6:63:73 0:50:4:67:ab:9b ip 98: 10.80.18.207 > edm_bfhxx_fp002: icmp: echo reply


------------------------------------------------------------------------------
tcpdump -e on ping from faulty machine to good machine

this repeated over and over

13:39:32.000864 0:6:5b:f6:63:73 Broadcast arp 42: arp who-has 10.80.18.27 tell edm_bfhxx_wb005
13:39:33.000793 0:6:5b:f6:63:73 Broadcast arp 42: arp who-has 10.80.18.27 tell edm_bfhxx_wb005
 
 
 
 
 
Peter T. Breuer

 
      10-15-2003, 08:03 AM
In comp.os.linux.setup Paul Rogers <(E-Mail Removed)> wrote:
> The problem is that I cant connect to anything when the box initiates
> the request. For example if I ping localhost or the server's own IP
> address I get a reply, but anything else, including other boxes on the
> same subnet and the GW just give host unreachable.


Then you don't have a route to them, or the NIC does not work.

> eth0 Link encap:Ethernet HWaddr 00:06:5B:F6:63:73
> inet addr:10.80.18.207 Bcast:10.80.23.255
> Mask:255.255.248.0
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:979979 errors:0 dropped:0 overruns:0 frame:0
> TX packets:9753 errors:0 dropped:0 overruns:0 carrier:0


The NIC works.

> collisions:0 txqueuelen:100
> RX bytes:116652031 (111.2 Mb) TX bytes:1651268 (1.5 Mb)
> Interrupt:17 Memory:feb60000-feb80000
>


> and netstat -rn shows


Please use route -n.

> 10.80.16.0 0.0.0.0 255.255.248.0 U 40 0 0 eth0


There you are. No route. Learn the difference between 16 and 18.
And what's that mask for?

> 127.0.0.0 0.0.0.0 255.0.0.0 U 40 0 0 lo
> 0.0.0.0 10.80.16.1 0.0.0.0 UG 40 0 0 eth0


Donnnnng.

Peter
 
 
Nico Kadel-Garcia

 
      10-15-2003, 12:10 PM
Paul Rogers wrote:

> Dear All
>
> This is driving me scatty, can't any one help?
>
> I have recently setup a new server running RH linux 7.3, which will,
> if it ever works be web server.
>
> The problem is that I cant connect to anything when the box initiates
> the request. For example if I ping localhost or the server's own IP
> address I get a reply, but anything else, including other boxes on the
> same subnet and the GW just give host unreachable.
>
> I can connect to the box with SSH no problem and view web pages on
> Apache which is running on it (tho' as soon as the box uses a page
> that connects to another server ie mail etc it fails). I can also ping
> the box from any other PC and receive a reply. I can even connect via
> a windows box to the samba server running on the box.


Sounds like your gateway or netmask is wrong. Take a look in
/etc/sysconfig/network to see what these are, or in
/etc/sysconfig/network-scripts/ifcfg-eth* for multiple network ports.


> I'm sure the problem is with the network config, but can't see where
> the issue lies. I've included the outputs from ifconfig and netstat
> -rn below. Below those are the outputs from tcpdump when pinging
> between two servers. The first is where the ping was initiated by a
> server that is working fine and the second by the box that is
> "faulty". When the faulty box initiates the connection I only get
> repeated arp who-has messages.
>
> Can anyone suggest either what I can do to fix this or alternatively
> what can be done to further investigate this problem?
>
> Many thanks
>
> Paul
>
> ifconfig shows
>
> eth0 Link encap:Ethernet HWaddr 00:06:5B:F6:63:73
> inet addr:10.80.18.207 Bcast:10.80.23.255
> Mask:255.255.248.0


This part looks OK. The IP address, netmask, and broadcast address are
all consistent with each other.
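
As a quick check of that arithmetic, here is a minimal Python sketch using the standard ipaddress module. The addresses are copied from the ifconfig and netstat output quoted in this thread, and 10.80.18.27 is the ping target from the tcpdump trace. It is only an illustration, not something anyone in the thread actually ran.

```python
import ipaddress

# Values copied from the ifconfig / netstat output in this thread.
iface = ipaddress.ip_interface("10.80.18.207/255.255.248.0")
gateway = ipaddress.ip_address("10.80.16.1")
target = ipaddress.ip_address("10.80.18.27")  # ping target in the tcpdump trace

net = iface.network               # network derived from the IP and the mask
print(net)                        # 10.80.16.0/21
print(net.broadcast_address)      # 10.80.23.255
print(gateway in net)             # True
print(target in net)              # True
```

With these values the connected route 10.80.16.0/255.255.248.0 covers both the gateway and the ping target, so the routing table entry is consistent with the interface settings.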

> Kernel IP routing table
> Destination Gateway Genmask Flags MSS Window
> irtt Iface
> 10.80.16.0 0.0.0.0 255.255.248.0 U 40 0 0 eth0
> 127.0.0.0 0.0.0.0 255.0.0.0 U 40 0 0 lo
> 0.0.0.0 10.80.16.1 0.0.0.0 UG 40 0 0 eth0


This... looks a little weird. Are you sure your gateway is 10.80.16.1?
And is your upstream switch or router correctly configured to route the
10.80.16.0/255.255.248.0 subnet for you?


 
 
Paul Rogers

 
      10-16-2003, 07:36 AM
Nico Kadel-Garcia <(E-Mail Removed)> wrote in message news:<vXKdncUbXKerpBCiU-(E-Mail Removed)>...
> This.... looks a little weird. Are you sure your gateway is 10.80.16.1?
> And is your upstream switch or router correctly configured to route the
> 10.80.16.0/255.255.248.0 subnet for you?


Peter/Nico

Many thanks for the replies. It's a subnet of the class A 10.0.0.0
network that extends from 10.80.16.0 to 10.80.23.255, hence the
255.255.248.0 mask. The router is indeed on 10.80.16.1 (there are no
typos in there). The same setup (with different individual IP
addresses, obviously) works fine on every other machine on the network.
tcpdump's output suggests that for some reason the faulty machine
cannot resolve the MAC address of any machine when it initiates the
connection.

Can you suggest why this might be or how to investigate further?

Thanks

Paul
 
 
Nico Kadel-Garcia

 
      10-16-2003, 12:44 PM
Paul Rogers wrote:

> Nico Kadel-Garcia <(E-Mail Removed)> wrote in message news:<vXKdncUbXKerpBCiU-(E-Mail Removed)>...
>
>>This.... looks a little weird. Are you sure your gateway is 10.80.16.1?
>>And is your upstream switch or router correctly configured to route the
>>10.80.16.0/255.255.248.0 subnet for you?

>
>
> Peter/Nico
>
> Many thanks for the replies. It's a class A network that extends from
> 10.80.16.0 to 10.80.23.255 hence the 255.255.248.0 mask. The router
> is indeed on 10.80.16.1 (THere are no typos in there). The same set
> up (with different individual IP addresses obviously) works fine on
> every other machine on the network. TCPDump's output suggests that for
> some reason the faulty machine cannot resolve the mac address of any
> machine when it initiates the connection.
>
> Can you suggest why this might be or how to investigate further?
>
> Thanks


I don't have one off-hand. Can you try the machine on another port on
the upstream router, one that is known to work and be configured correctly?

 
 
Peter T. Breuer

 
      10-16-2003, 01:36 PM
In comp.os.linux.setup Paul Rogers <(E-Mail Removed)> wrote:
> Nico Kadel-Garcia <(E-Mail Removed)> wrote in message news:<vXKdncUbXKerpBCiU-(E-Mail Removed)>...
> > This.... looks a little weird. Are you sure your gateway is 10.80.16.1?
> > And is your upstream switch or router correctly configured to route the
> > 10.80.16.0/255.255.248.0 subnet for you?

>
> Many thanks for the replies. It's a class A network that extends from
> 10.80.16.0 to 10.80.23.255 hence the 255.255.248.0 mask. The router
> is indeed on 10.80.16.1 (THere are no typos in there). The same set
> up (with different individual IP addresses obviously) works fine on
> every other machine on the network. TCPDump's output suggests that for
> some reason the faulty machine cannot resolve the mac address of any
> machine when it initiates the connection.


Swap its NIC and see if it behaves better. Ditto cable. Physically move
it to somewhere else and see if it gets better. Change its hub/switch
[port].

When you find out what changes its behavior, you'll know what it was.

If the machine calls who-has and gets no reply, then the problem is that
those calls are not getting out to the net. If it gets replies and does
not notice them, then IT has the problem in its brain, and I'd be
checking its kernel.
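
That distinction can be sketched as a small Python helper that reads tcpdump text captured on the faulty box. The function name and the returned strings are made up for illustration, and the "arp reply ... is-at" pattern is an assumption about how this tcpdump version prints replies, since the thread only ever shows who-has lines.

```python
import re

def classify_arp_capture(lines, target_ip):
    """Roughly classify tcpdump text captured on the host that sends the pings."""
    # Usenet wrapping splits one packet over two lines, so join before matching.
    text = " ".join(line.strip() for line in lines)
    ip = re.escape(target_ip)
    requests = len(re.findall(r"arp who-has\s+" + ip + r"\b", text))
    # Assumed reply format; the thread itself never shows an ARP reply.
    replies = len(re.findall(r"arp reply\s+" + ip + r"\s+is-at", text))
    if requests == 0:
        return "no ARP requests seen for this address"
    if replies == 0:
        return "requests sent but no reply seen: suspect NIC, cable or switch port"
    return "replies seen: if pings still fail, suspect the host's own ARP handling"

# Lines copied from the tcpdump output posted at the start of the thread.
capture = [
    "13:39:32.000864 0:6:5b:f6:63:73 Broadcast arp 42: arp who-has",
    "10.80.18.27 tell edm_bfhxx_wb005",
    "13:39:33.000793 0:6:5b:f6:63:73 Broadcast arp 42: arp who-has",
    "10.80.18.27 tell edm_bfhxx_wb005",
]
print(classify_arp_capture(capture, "10.80.18.27"))
```

A capture taken on the target machine at the same time would show whether the who-has broadcasts actually arrive there, which is what separates "not getting out" from "not getting back".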

Peter
 
 
Paul Rogers

 
      10-17-2003, 06:23 AM
"Peter T. Breuer" <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...

> If the machine calls who-has and gets no reply, then the problem that
> those calls are not getting out to the net. If it gets replies and does
> not notice them, then IT has the problem in its brain, and I'd be
> checking its kernel.
>
> Peter


Peter/Nico

Again, many thanks for the replies. I checked what traffic the
receiver of the pings was getting while all this was going on and it
was zilch, so as Peter suggests I think the faulty box isn't even
broadcasting the requests onto the network. The box has two NICs,
one of which is disabled; I've tried both with the same results.
I've also tried reinstalling the OS. No joy.

If I check the ARP table after the ping, the IP address shows as
unresolved. If I then ping the faulty box from the good box, the ARP
table is updated with the MAC address, and the faulty box can then
initiate pings, until it is rebooted.
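
For anyone wanting to repeat that ARP table check in a script, here is a minimal Python sketch. It assumes the usual /proc/net/arp column layout on Linux (IP address, HW type, Flags, HW address, Mask, Device) and that flag bit 0x2 marks a completed entry. The sample table, the function name and the gateway MAC address are invented for illustration.

```python
def unresolved_entries(arp_table_text):
    """Return the IPs whose entry is incomplete in /proc/net/arp-style text."""
    unresolved = []
    for line in arp_table_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore malformed or truncated rows
        ip, _hw_type, flags, hw_addr = fields[:4]
        complete = (int(flags, 16) & 0x2) != 0  # assumed: 0x2 means resolved
        if not complete or hw_addr == "00:00:00:00:00:00":
            unresolved.append(ip)
    return unresolved

# Invented sample resembling the state after a failed ping from the faulty box.
sample = """
IP address       HW type     Flags       HW address            Mask     Device
10.80.18.27      0x1         0x0         00:00:00:00:00:00     *        eth0
10.80.16.1       0x1         0x2         00:11:22:33:44:55     *        eth0
"""
print(unresolved_entries(sample))
```

On a live box the same function could be fed the contents of /proc/net/arp before and after pinging from the good machine, to see the entry go from unresolved to resolved.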

I'll do what Nico suggests and try moving the box to another port on
the switch, but I suspect it might be the NIC driver or the firmware
that is faulty.

I'll let you know how I get on

Many thanks

Paul
 