routing woes

/dev/null
11-09-2004, 05:47 PM
I've got a 2.4 kernel box set up as a router, with the hope of eventually
setting it up as a firewall to protect a couple of servers.

The servers are at x.x.x.91 - x.x.x.100 and sit on an x.x.x.0/25 network.
We put the router at x.101, switched on proxy_arp for both interfaces, and
did `route add -host x.x.x.91` and so on, one host route for each server
that was going to sit "inside" the router's protected network.
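
A minimal sketch of the setup described above, not the poster's actual commands. It only prints what a 2.4-era box would typically need: IP forwarding, proxy ARP on both interfaces, and one host route per inside server. The 192.0.2 prefix (standing in for the redacted x.x.x), eth0/eth1 as outside/inside interfaces, and the helper name print_proxy_arp_setup are all assumptions for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them, so it is
# safe to try without root. Prefix and interface names are assumptions.

# print_proxy_arp_setup PREFIX FIRST LAST OUTSIDE_IF INSIDE_IF
print_proxy_arp_setup() {
    prefix=$1 first=$2 last=$3 outside=$4 inside=$5

    # Let the kernel forward packets between the two interfaces.
    echo "echo 1 > /proc/sys/net/ipv4/ip_forward"

    # Answer ARP requests on each side on behalf of hosts on the other side.
    for ifc in "$outside" "$inside"; do
        echo "echo 1 > /proc/sys/net/ipv4/conf/$ifc/proxy_arp"
    done

    # One host route per protected server, pointing at the inside interface.
    host=$first
    while [ "$host" -le "$last" ]; do
        echo "route add -host $prefix.$host dev $inside"
        host=$((host + 1))
    done
}

print_proxy_arp_setup 192.0.2 91 93 eth0 eth1
```

Printing the commands rather than executing them makes it easy to review the full list before anything touches a live router.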

Everything runs fine with a few exceptions. There are just a few systems on
the x.0/25 network that are choking when trying to reach the systems behind
the router.

.91 - .93 currently serve about 80 websites, including all their
email/ftp/etc. When we moved .91 behind the router and reset the switch
they were plugged into (to clear the switch's MAC tables), everything
seemed to go fine. We watched a normal flow of traffic come in from the web
to the .91 server and saw those requests satisfied without a problem.

It looks like 3 of the other routers on the x.0/25 network had problems
routing over to .91 sitting behind the new linux router during this period.
Pings, connections to ports 110/25/80, etc. all failed from any machine
routing through one of these 3 routers, while at the same time other systems
on the .0/25 had no problems and everything coming in from the Internet
seemed to be OK.

One thing I noted: on the linux router we ran tcpdump and could see ping
requests coming in for .91 and the responses going back out, but for some
reason the other end never saw the responses and showed 100% packet loss.

When we stick .91 back up on the switch by itself (instead of behind the
linux router) and reset the switch everything runs fine.

I'm hoping some of you guys have some good ideas on what to look for.

Thanks!


 
HisNameWasRobertPaulson
11-09-2004, 07:19 PM
I'm assuming the echo replies are at least going out the correct interface?
If so, what's the next hop, if any? If not, does arp report the correct
address? Even if it is hopping a router, is that next hop pingable, and does
it have the correct MAC address?

Also, what network is this .91 on? And the router is .101, on the same
network? And what is the IP of the router on the /25 net?

This is definitely a routing issue; perhaps you should post a watered-down
route -n and arp

/dev/null
11-09-2004, 08:11 PM
> I'm assuming the echo-replys are at least going out the correct interface?

As far as I could tell, yes. We've now moved the .91 machine back "out" of
the linux router's protection, so it will be a little harder to test with it,
but I do have a couple of other machines still "in" the protection.

> If so, what's the next hop, if any? If not, does arp report the correct
> address? Even if it is hopping a router, is that next hop pingable /
> correct mac address?


When I was pinging from .91 to .80 (one of the trouble routers), I saw the
request go across the linux router and the reply come back to the linux
router. The next hop to the .91 box from the linux router was across a little
netgear switch/hub. I'm sure the hub/switch works fine because I've used it
for some time now without incident.

> Also, what network is this .91 on? And the router is .101, same network?
> and what is the ip of the router on the /25 net?


This is all on a public network, x.x.x.0/25 (.1 - .127).

The gateway out of this network is x.x.x.1. The linux router is .101 and
uses proxy ARP and IP forwarding, with host route entries for .91 - .100.

One of the routers that was failing was .80, another was .11. In all cases
the goal is getting them to talk to .91, as that hosts the web sites and
email. Last night .11 was failing intermittently and then suddenly cleared
up. This morning .80 started the same thing. That's when I pinged from
.91 => .80 and saw both the request and the reply crossing the linux router
as expected, but the .91 box didn't seem to see the replies (and it didn't
have tcpdump on it for me to double check). I ran out of time to check this
out and had to move .91 back outside the protected zone. After moving it
out, ping worked fine. While the ping was failing with .80, the rest of the
web seemed fine: HTTP requests were coming across the linux router and
getting answered without any problems. Same with the other services running
on .91, except for requests coming from .80.

> This is definitely a routing issue, perhaps you should post a watered-down
> route -n and arp


# arp -n
Address      HWtype  HWaddress          Flags Mask  Iface
x.x.x.91     ether   00:10CF:6C:48      C           eth0
x.x.x.94     ether   00:0C:299:BA:A8    C           eth1
x.x.x.1      ether   00:04:27:4C:BA:E1  C           eth0
x.x.x.100    ether   00:11:2F:15:23:76  C           eth1
x.x.x.96     ether   00:0C:299:BA:A8    C           eth1

As you can see, .91 is now on eth0 (the "public" interface).

# route -n
Kernel IP routing table
Destination  Gateway  Genmask          Flags Metric Ref Use Iface
x.x.x.95     0.0.0.0  255.255.255.255  UH    0      0   0   eth1
x.x.x.94     0.0.0.0  255.255.255.255  UH    0      0   0   eth1
x.x.x.98     0.0.0.0  255.255.255.255  UH    0      0   0   eth1
x.x.x.99     0.0.0.0  255.255.255.255  UH    0      0   0   eth1
x.x.x.96     0.0.0.0  255.255.255.255  UH    0      0   0   eth1
x.x.x.97     0.0.0.0  255.255.255.255  UH    0      0   0   eth1
x.x.x.100    0.0.0.0  255.255.255.255  UH    0      0   0   eth1
x.x.x.0      0.0.0.0  255.255.255.128  U     0      0   0   eth0
x.x.x.0      0.0.0.0  255.255.255.128  U     0      0   0   eth1
127.0.0.0    0.0.0.0  255.0.0.0        U     0      0   0   lo
0.0.0.0      x.x.x.1  0.0.0.0          UG    1      0   0   eth0

That's the route table as it currently stands. About the only abnormality
I can see is the .0/25 network being on both interfaces. Should the eth1
interface's entry for this net be deleted, since technically the whole
network (minus .94 - .100) is actually on eth0?
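
The duplicate-route question can be checked mechanically. A rough sketch, assuming `route -n` output in the eight-column layout posted here (destination in column 1, netmask in column 3, interface in column 8); the function name find_duplicate_routes is made up for illustration.

```shell
#!/bin/sh
# Reads `route -n` output on stdin and reports every destination/netmask
# pair that is routed out more than one interface.
find_duplicate_routes() {
    awk '
        # Data rows have 8 fields and a destination that starts with a
        # digit or the x placeholder; this skips the two header lines.
        NF == 8 && $1 ~ /^[0-9x]/ {
            key = $1 "/" $3
            if (!(key in ifaces)) {
                ifaces[key] = $8
            } else if (index(" " ifaces[key] " ", " " $8 " ") == 0) {
                ifaces[key] = ifaces[key] " " $8
                dup[key] = 1
            }
        }
        END {
            for (key in dup)
                print key " is routed via: " ifaces[key]
        }
    '
}

# Example with three rows in the same layout as the posted table.
printf '%s\n' \
    'x.x.x.94 0.0.0.0 255.255.255.255 UH 0 0 0 eth1' \
    'x.x.x.0 0.0.0.0 255.255.255.128 U 0 0 0 eth0' \
    'x.x.x.0 0.0.0.0 255.255.255.128 U 0 0 0 eth1' |
    find_duplicate_routes
```

On a live box the same function could be fed with `route -n | find_duplicate_routes`. For the rows above it flags only the /25 network, since the host route appears on a single interface.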

Thanks for your help!


 
HisNameWasRobertPaulson
11-09-2004, 09:00 PM
Ya, that's what I am getting at. If you are not seeing those icmp replies
(and after looking at your routing table), I think maybe packets are getting
confused as to which interface to exit for the return trip. Is the /25
network out eth0, or is it out eth1? It can't be both, that much is
certain.

I'm not positive on this, but I have seen occasions where two (static)
routes to the same network (same metric) produced very strange results, i.e.
sometimes it worked, sometimes it did not. It all depends. Because of the
nature of arp, the arp cache is periodically flushed, or at least arp
records expire. After that, arp will broadcast for an ip (assuming that
machine is on the same network), but which interface will it use?

Why don't you try flushing your arp cache, then pinging one of those trouble
boxes while running tcpdump? It may take a couple of tries, but after a
couple of flushes you may discover the arp broadcast going out the wrong
interface, and thus the icmp fails. This could possibly explain the
intermittent nature of the problem. But in that case the routing table is
not to blame; the interfaces' ip setup is.

But again, I don't have a really clear picture of your network, so I could
be totally wrong. I maintain, though, that duplicate routes going out two
separate interfaces will produce some weird results, unless you are running
routing protocols such as rip or ospf to keep track of metrics/routes...

Also, just making sure here... your eth1 and eth0 are on different subnets,
right? (It's hard to tell with x.x.x.# notation.)
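
The flush-and-capture test suggested above could look roughly like this. It is a sketch, not a verified procedure for this network: 192.0.2.80 stands in for the redacted address of the .80 router, the interface names are the ones used in the thread, and the helper names run and arp_flush_test are made up. Starting `tcpdump -n -e -i eth0 arp or icmp` (and the same for eth1) in two other terminals beforehand would show which interface the ARP request actually leaves on.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0
# and run as root to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# arp_flush_test TARGET_IP IFACE...
# Clears learned ARP entries on each interface, then pings TARGET_IP so
# the kernel has to send a fresh ARP request.
arp_flush_test() {
    target=$1
    shift
    for ifc in "$@"; do
        # Drop learned neighbour (ARP) entries on this interface.
        run ip neigh flush dev "$ifc"
    done
    run ping -c 3 "$target"
}

arp_flush_test 192.0.2.80 eth0 eth1
```

Repeating the test a few times matters here, since the suspicion is that the outgoing interface for the ARP request is not always the same.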

-mike

/dev/null
11-09-2004, 09:37 PM
> Ya, that's what I am getting at. If you are not seeing those icmp replies,
> (and after looking at your routing table) I think maybe packets are getting
> confused as to which interface to exit for the return trip. Is the /25
> network out eth0, or is it out eth1 - it can't be both, that much is
> certain.

Right.

> Why don't you try flushing your arp cache, then pinging one of those trouble
> boxes while running tcpdump. It may take a couple times, but after a couple
> flushes, you may discover the arp broadcast going out the wrong interface,
> and thus the icmp fails. This could possibly explain the intermittent nature
> of the problem. But in that case, the routing table is not to blame, but the
> interfaces' ip setup to begin with.


I'll have to try that out the next time I can find a window to slip that .91
back behind my linux router.

> But again, I don't have a real clear picture of your network, so I could be
> totally wrong. But I maintain the duplicate routes going out two separate
> interfaces will produce some weird results, unless running routing
> protocols, such as rip or ospf to keep track of metrics/routes...
> Also, just making sure here... your eth1 and eth0 are on different subnets,
> right? (it's hard to tell with x.x.x.# notation)


No, all the x.x.x values are the same numbers; both interfaces are on the
same subnet.


 
HisNameWasRobertPaulson
11-10-2004, 05:39 PM
So you have two interfaces that are on the same subnet?
I don't recommend ever doing that. It plays havoc with routing and would
definitely cause what you are experiencing.
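
If the duplicate subnet route is the culprit, one commonly used arrangement with proxy ARP is to keep the subnet route only on the outside interface and reach the inside hosts purely through their host routes. A dry-run sketch under that assumption, which only prints the commands: 192.0.2.0/25 stands in for the redacted x.x.x.0/25, eth1 as the inside interface is taken from the posted route table, and print_dedupe_route is a made-up helper name.

```shell
#!/bin/sh
# Dry-run sketch: prints commands rather than running them.

# print_dedupe_route NETWORK NETMASK INSIDE_IF PROBE_IP
print_dedupe_route() {
    net=$1 mask=$2 inside=$3 probe=$4

    # Remove the subnet-wide route on the inside interface; the host
    # routes for the protected servers are left untouched.
    echo "route del -net $net netmask $mask dev $inside"

    # Ask the kernel which interface it would now use for an outside host.
    echo "ip route get $probe"
}

print_dedupe_route 192.0.2.0 255.255.255.128 eth1 192.0.2.80
```

The kernel adds the subnet route back whenever the address on the inside interface is configured again, so the deletion would have to be repeated after an interface restart.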

mike
