We had a weird problem here a few days ago. Please tell me if you have
any idea how it could have happened.
publicnetwork---bigswitch---NATfirewall---smallswitch---internalnetwork
                    |
                    |
                    47
We have a machine called '47' attached to the bigswitch. (IP visible
from the world)
At a certain moment it stopped responding to pings (and to ssh, http,
everything) from computers inside internalnetwork.
However it was still responding to pings from computers outside the NAT.
We restarted all switches and the NAT many times, then we changed
Ethernet ports on the bigswitch... No good.
I looked at the routing table of 47: there were a few extra entries (a
few destination IPs routed to loopback) to blacklist some IPs from which
we were receiving SSH attacks. This is a dynamic filter we have
installed. However I removed those entries manually from the routing
table and the problem persisted.
I looked at iptables: it was empty, as it was supposed to be.
Then I started wireshark on the 47 and I could see the ping requests
incoming from the internalnetwork machines, and the outgoing ping
replies to such pings, going to the NAT.
So the replies were actually generated, but somehow they were not
reaching the internalnetwork.
I didn't know what to do anymore, so I restarted the 47.
To my surprise the pings started working again!!
I immediately checked the route table and the iptables tables: they were
exactly like before the reboot.
Unfortunately I forgot to look at the arp cache of 47 before and after
the restart.
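If it happens again, something like the sketch below could record the ARP/neighbour cache so the state before and after a reboot can be compared. This is only a rough idea, not something that was run on 47: it assumes Python 3.7+ and that `ip neigh show` (iproute2) is available on the machine, and the function names are made up for illustration.

```python
import subprocess


def parse_neigh(text):
    """Parse `ip neigh show` output into {ip: (mac_or_None, state)}."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        ip = fields[0]
        # Entries in FAILED/INCOMPLETE state have no "lladdr <mac>" pair.
        mac = fields[fields.index("lladdr") + 1] if "lladdr" in fields else None
        state = fields[-1]  # e.g. REACHABLE, STALE, FAILED
        table[ip] = (mac, state)
    return table


def diff_neigh(before, after):
    """Return {ip: (old_entry, new_entry)} for entries that differ."""
    changed = {}
    for ip in sorted(set(before) | set(after)):
        if before.get(ip) != after.get(ip):
            changed[ip] = (before.get(ip), after.get(ip))
    return changed


def snapshot():
    """Take a snapshot of the live neighbour cache (assumes iproute2)."""
    out = subprocess.run(
        ["ip", "neigh", "show"], capture_output=True, text=True, check=True
    ).stdout
    return parse_neigh(out)
```

The idea would be to save the result of `snapshot()` to a file while the problem is happening, take another one after the reboot, and pass both to `diff_neigh()`. A changed or missing MAC address for the NAT's IP would point at the ARP cache.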
Any idea of what could have happened!?!?
Thanks in advance