Hi,
Today, with no apparent outside influence, our main server (A) stopped
responding on the IP address we use for most services. We have a
backup server (B) running Nagios which stepped in to supply some
services on finding it couldn't ping A, including taking over that IP
address.
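To make the takeover behaviour concrete, here is a rough sketch of the logic B follows. This is not our actual Nagios configuration: the function names, the ping count and timeout, and the use of 'ip addr add' to claim the address are all assumptions for illustration.

```python
import subprocess


def peer_alive(host, count=3, timeout_s=2):
    """Return True if `host` answers at least one ICMP echo request.

    Assumes a Linux iputils-style `ping` that accepts -c (count)
    and -W (per-reply timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def takeover_commands(alive, address, device):
    """Return the commands the backup would run (hypothetical helper).

    While the main server answers, nothing is done; otherwise the
    backup adds the shared address to its own interface.
    """
    if alive:
        return []
    return [["ip", "addr", "add", address, "dev", device]]
```

For example, takeover_commands(peer_alive("192.0.2.5"), "192.0.2.1/24", "eth0") would yield the single 'ip addr add' command only when the ping check fails (the 192.0.2.x addresses are placeholders, not ours).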
On trying to return service to server A, we can't get it to respond on
the main IP address to anything but requests from server B. If B is
responding to the same IP address, everything seems fine (as regards
networking). I don't believe any sort of selective packet blocking is
preventing communication.
The architecture is simple: the two machines are connected to a switch
and from there to the wider net, and are also on a private network
through separate NICs.
Running the command 'ip addr show' on the two machines gives the
following relevant response:
Server A
---------
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
inet nnn.qqq.1.5/24 brd nnn.qqq.1.255 scope global eth0
inet nnn.qqq.1.1/24 brd nnn.qqq.1.255 scope global secondary eth0:0
inet nnn.qqq.1.2/24 brd nnn.qqq.1.255 scope global secondary eth0:1
inet nnn.qqq.1.3/24 brd nnn.qqq.1.255 scope global secondary eth0:2
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/24 brd 10.0.0.255 scope global eth1
Server B
---------
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
inet nnn.qqq.1.4/24 brd nnn.qqq.1.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
inet 10.0.0.2/24 brd 10.0.0.255 scope global eth1
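To double-check that the two machines are not holding the same address at once, the two listings can be compared mechanically. A minimal sketch, assuming the 'ip addr show' output from each host has been saved as text; the function names are made up for illustration:

```python
import re

# Matches IPv4 "inet <addr>/<prefix>" lines; "inet6" lines do not match
# because whitespace is required directly after "inet".
INET_RE = re.compile(r"^\s*inet\s+(\S+)")


def inet_addresses(ip_addr_output):
    """Collect the IPv4 address/prefix strings from `ip addr show` text."""
    found = set()
    for line in ip_addr_output.splitlines():
        match = INET_RE.match(line)
        if match:
            found.add(match.group(1))
    return found


def shared_addresses(output_a, output_b):
    """Return the addresses that appear in both listings."""
    return inet_addresses(output_a) & inet_addresses(output_b)
```

An empty result from shared_addresses() means neither listing shows an address that the other also claims.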
Any hints as to what might be going on?
TIA,
- simon