Network Load Balancing, Fail Over, NIC Teaming with Radio Bridges.

 
 
Davide DG

 
      01-08-2005, 02:09 PM
Hello everybody!


I need to implement a load-balancing, fail-over connection between
2 sites over two 54-Mbps radio bridges, with a 3-homed Linux box at
each site connected to both bridges.

The network topology should be as follows:
(please use a fixed-width font to display correctly)

                  eth1                    eth1
                   /--[( radio bridge 1 )]--\
             eth0 /                          \ eth0
LAN_A -----LINUX_A                            LINUX_B-----LAN_B
                  \                          /
                   \--[( radio bridge 2 )]--/
                  eth2                    eth2

or as follows:

                 eth1                                    eth1
                 |--|      /--( radio bridge 1 )--\      |--|
      eth0       |  |     /                        \     |  |      eth0
LAN_A -----LINUX_A switch                            switch LINUX_B-----LAN_B
                 |  |     \                        /     |  |
                 |--|      \--( radio bridge 2 )--/      |--|
                 eth2                                    eth2


The goals of the project are:
A) If both links are working, load-balance the network traffic.
B) If one link fails (for example, the antennas move), use the other
link (fail-over) until the failed link is restored.
C) If possible, do not lose the ESTABLISHED connections when one link
fails.

The initial solution (suggested by our chief) was to use 2 NICs with
Load Balancing features built-in: 2 3Com 3C996B-T with the BASP driver.

I succeeded in building both the bcm5700 and basp drivers, but I don't
know how to configure the basp driver properly. I don't know what value
to use for the TEAM_TYPE field ... I tried 0 (SLB), but it seems to
"elect" the eth1 devices as "primary" on both machines, so that the eth2
devices are used *only* if both eth1 links are down (so no load-balancing
features!). [I tested with a flood ping from both sides]

This is a problem, because goal A would not be achieved.

Moreover, radio bridges with a broken or misaligned antenna do not
drop the Ethernet link on the Linux side, which instead remains up.

----------------

A colleague suggested using the OSPF protocol. With this approach,
OSPF would detect a broken link (no response from the neighbour), and
equal-cost multipath routing would provide the load-balancing
feature.
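
A rough sketch of the multipath idea, for illustration only: the helper name `pick_link`, the interface names and the CRC-based hash are assumptions, and this is not how the Linux kernel actually selects a next hop. Hashing on the flow endpoints keeps all packets of one connection on one link, and dropping a dead link from the list moves its flows to the surviving link.

```python
import zlib


def pick_link(src, dst, live_links):
    """Pick one link for a flow by hashing its endpoints (illustrative only)."""
    if not live_links:
        raise RuntimeError("no usable link between the sites")
    key = f"{src}->{dst}".encode()
    # Same endpoints always hash to the same index while the list is unchanged.
    return live_links[zlib.crc32(key) % len(live_links)]


# Both bridges healthy: flows are spread over eth1 and eth2.
chosen = pick_link("10.0.1.5:40000", "10.0.2.9:22", ["eth1", "eth2"])
# Bridge 1 declared dead by the routing protocol: everything uses eth2.
fallback = pick_link("10.0.1.5:40000", "10.0.2.9:22", ["eth2"])
```

Because the Linux boxes route rather than translate addresses, moving a flow to the other link should not by itself break an established TCP connection, which is what goal C asks for; that expectation would still need testing on the real bridges.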


Given these goals, which solution (possibly different from the two
proposed) should be implemented? Could someone point me to
some *specific* documentation and/or how-to?

Thank you all in advance for any hints ;P

Good bye!


--
Davide DG.
keep the word spam to answer via email.
 
 
 
 
 
prg

 
      01-09-2005, 03:48 PM
Davide DG wrote:
> Hello everybody!
>
>
> I have the need to implement a load-balancing fail-over connection
> between 2 sites with 2 54-Mbps RADIO Bridges, each one connected to a
> 3-homed linux boxes.
>
> The network topology should be as follows:
> (please use a fixed-width font to display correctly)
>
>                   eth1                    eth1
>                    /--[( radio bridge 1 )]--\
>              eth0 /                          \ eth0
> LAN_A -----LINUX_A                            LINUX_B-----LAN_B
>                   \                          /
>                    \--[( radio bridge 2 )]--/
>                   eth2                    eth2
>
> or as follows:
>
>                  eth1                                    eth1
>                  |--|      /--( radio bridge 1 )--\      |--|
>       eth0       |  |     /                        \     |  |      eth0
> LAN_A -----LINUX_A switch                            switch LINUX_B-----LAN_B
>                  |  |     \                        /     |  |
>                  |--|      \--( radio bridge 2 )--/      |--|
>                  eth2                                    eth2


This second setup is a bit unclear to me. Is this to be a high end
switch to perform "pre-linux" trunking? Connected only to linux?
Would be more usual to have the switch behind linux, wouldn't it?

> The goals of the project are:
> A) If both links are working, to load-balance the network traffic.
> B) If one link fails (for example the antennas move), to use the other
> link (fail-over) until the restoration of the failed link.
> C) If possible, not lose the ESTABLISHED connections when one link
> fails.
>
> The initial solution (suggested by our chief) was to use 2 NICs with
> Load Balancing features built-in: 2 3Com 3C996B-T with the BASP driver.
>
> I succeeded in building both the bcm5700 and basp driver, but I don't
> know how to configure the basp driver properly. I don't know what value
> for the TEAM_TYPE field ... I tried with 0 (SLB), but it seems to
> "elect" eth1 devices as "primary" on both machines, so that eth2 are
> used *only* if both eth1 links are down (so no load-balancing
> features!). [I tried with a flooding ping on both sides]


These are Broadcom-chip-based NICs, and the readme from BASP indicates
that _only_ NetXtreme (Broadcom) NICs support what you want. Is that
why you inserted the switch in the second layout -- to provide such support?
See the Readme quote below.

> This is a problem, because goal A would not be achieved.
>
> Moreover, radio bridges with a broken or a misaligned antenna, do not
> signal the ethernet link (at the linux sides), which instead remains up.


That's ugly ;-(

> ----------------
>
> Talking with a colleague, he suggested me the possibility to use the
> OSPF protocol. With this approach, the ospf protocol would give the
> information on a broken link (no response) and a Multi-Path Equal-Cost
> Routing would give the Load Balancing feature.


Haven't read the docs closely -- this _may_ work.

> Based on the goals to achieve, which solution (of course even different
> from the two proposed) should be implemented? May someone point me to
> some *specific* documentation and-or how-to?
>
> Thank you all in advance for any hints ;P
>
> Good bye!


Not sure where you got your drivers but Broadcom posted the latest
7/29/04 and you should have gotten the documentation re: configure
_scripts_.

Readme is a bit terse, but seems to read that a nic can be designated
as Primary (load balancing) _or_ Hot-standby for failover (not both).

[quote]
BASP is a kernel module designed for 2.4.x and 2.6.x kernels and
provides load-balancing, fault-tolerance, and VLAN features. These
features are provided by creating teams that consist of multiple NIC
interfaces. A team can consist of 1 to 8 NIC interfaces and each
interface can be designated primary, ***or*** hot-standby (SLB team
only). All ***primary*** NIC interfaces in a team will participate in
***Load-balancing*** operations by sending and receiving a portion of
the total traffic. ***Hot-standby*** interfaces will ***take over*** in
the event that ***all primary interfaces*** have lost their links.
[end quote]

In effect, if you're load balancing with two primaries and _one_ of
them goes down then the remaining one will carry the whole load.
Hot-standby is for _both_ primaries going down -- for your setup can't
see that HSB will get you anything.

The configure script examples document the options available. See eg.,

basplnx-6.2.9/scripts/team-sample found in linux_basp_ia32-6.2.9.zip

# Configurable parameters:
#   TEAM_ID:            this number uniquely identifies a team.
#   TEAM_TYPE:          0 = SLB, 1 = Generic Trunking/GEC/FEC, 2 = 802.3ad
#                       3 = SLB (Auto-Fallback Disable)
#   TEAM_NAME:          ascii name of the team
#   TEAM_PAx_NAME:      ascii name of the physical interface x,
#                       where x can be 0 to 7.
#   TEAM_PAx_ROLE:      role of the physical interface x
#                       0 = Primary, 1 = Hot-standby. This field
#                       must be 0 for Generic Trunking/GEC/FEC
#                       and IEEE 802.3ad team.
#   TEAM_VAx_NAME:      ascii name of the virtual interface x,
#                       where x can be 0 to 63
#   TEAM_VAx_VLAN:      802.1p VLAN ID of the virtual interface x.
#                       For untagged virtual interface, i.e. without
#                       VLAN enable, set it to 0. The valid VLAN ID
#                       can be 0 to 4094.
#   TEAM_VAx_IP:        IP address of the virtual interface x. The
#                       format should be aa.bb.cc.dd.
#   TEAM_VAx_NETMASK:   Subnet mask of the virtual interface x.
#                       The format should mm.nn.oo.pp.
#   TEAM_VAx_BROADCAST: Optional broadcast address of the virtual
#                       interface x. The format should qq.rr.ss.tt.
#   TEAM_VAx_GW:        Optional default gateway. The format should
#                       be ww.xx.yy.zz. Usually one default gateway
#                       is specified for the system and it should
#                       be reacheable from one network interface.
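
To make the constraints in those comments concrete, here is a small sanity checker. It is only a sketch: `check_team` is a hypothetical helper, not part of BASP, and it encodes nothing beyond the rules quoted above (team type 0 to 3, 1 to 8 physical interfaces, role 0 or 1, role forced to 0 for trunking and 802.3ad teams, VLAN ID 0 to 4094).

```python
def check_team(team_type, roles, vlan_ids=(0,)):
    """Return a list of problems found in a BASP team description.

    team_type: 0 = SLB, 1 = Generic Trunking/GEC/FEC, 2 = 802.3ad,
               3 = SLB (Auto-Fallback Disable)
    roles:     one entry per physical interface, 0 = primary, 1 = hot-standby
    vlan_ids:  one VLAN ID per virtual interface (0 means untagged)
    """
    problems = []
    if team_type not in (0, 1, 2, 3):
        problems.append("TEAM_TYPE must be 0, 1, 2 or 3")
    if not 1 <= len(roles) <= 8:
        problems.append("a team consists of 1 to 8 physical interfaces")
    if any(role not in (0, 1) for role in roles):
        problems.append("TEAM_PAx_ROLE must be 0 or 1")
    if team_type in (1, 2) and any(role != 0 for role in roles):
        problems.append("TEAM_PAx_ROLE must be 0 for trunking and 802.3ad teams")
    if any(not 0 <= vlan <= 4094 for vlan in vlan_ids):
        problems.append("TEAM_VAx_VLAN must be between 0 and 4094")
    return problems


# Two primaries in an SLB team, as load balancing over eth1 and eth2 would need.
print(check_team(team_type=0, roles=[0, 0]))
```

For the setup in this thread, an SLB team (type 0) with both interfaces set to role 0 is the combination the readme describes as load balancing; whether it behaves well over radio bridges is a separate question.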

The download is here:

http://network.free-driver-download....ASP(i386).html
(has code for 3com and intel nics also).
hth,
prg
email above disabled

 
 
Davide DG

 
      01-09-2005, 11:40 PM
"prg" <(E-Mail Removed)> wrote:

> This second setup is a bit unclear to me. Is this to be a high end
> switch to perform "pre-linux" trunking? Connected only to linux?


This second setup was proposed by one of our network colleagues; I am
unsure of its purpose. As far as I understood, he said that those
managed (high-end) switches should be configured with spanning tree
disabled and are there so that eth1 and eth2 find the same MAC
address when speaking to the radio bridges.

> Would be more usual to have the switch behind linux, wouldn't it?

Of course there will be one, to connect the 2 LANs (the 2 Linux boxes
will be the default gateways for their respective LANs).

> These are Broadcom chip based nics and the readme from BASP indicates
> that _only_ NetXtreme (Broadcom) nics support what you want. Is that
> why you inserted switch in second layout -- to provide such support?
> See Readme quote below.


> That's ugly ;-(


I know. My only hope is that this information is just an assumption,
inferred from the behaviour of a similar Cisco device that we have in
the lab. The radio bridges should be available for real testing soon.

The 2 3Com nics (3C996B-T) have that broadcom chip and are actually
recognized by the latest bcm5700 driver, with NICE extensions enabled.

> Not sure where you got your drivers but Broadcom posted the latest
> 7/29/04 and you should have gotten the documentation re: configure
> _scripts_.


Got latest drivers for the Broadcom chip on board of the 3Com cards.
http://broadcom.com/drivers/downloaddrivers.php

bcm5700 7.3.5
http://broadcom.com/drivers/driver-s...ver=570x-Linux
basp 6.2.9
http://broadcom.com/drivers/driver-s...LinuxBASP-i386

> Readme is a bit terse, but seems to read that a nic can be designated
> as Primary (load balancing) _or_ Hot-standby for failover (not both).


I have already read the basp docs, but I lack some networking knowledge
about the protocols that can be selected via the TEAM_TYPE value.

Specifically, in my setup scenario, which of the four values should be
chosen?

And, most important, I am still unsure whether to do this at layer 2
(with basp or bonding or balance or what?) or at layer 3 (ospf, iproute2
and a qdisc? eql device?)

The fact is that, before reading the LARTC site thoroughly, I would
like a hint on which approach is best.

I will make some more tests on Mon 10.

Meanwhile, thanks for the answer

Bye.

--
Davide DG.
keep the spam for yourself to reply to me via email.
 
 
prg

 
      01-10-2005, 05:17 AM
Davide DG wrote:
> "prg" <(E-Mail Removed)> wrote:
>
> > This second setup is a bit unclear to me. Is this to be a high end
> > switch to perform "pre-linux" trunking? Connected only to linux?
>
> This second setup was proposed by one of our network colleagues, I am
> unsure of its purpose. As per what I understood, he said that those
> managed (high-end) switches should be configured with spanning-tree
> disabled and should be there to let eth1 and eth2 find the same mac
> address when speaking to the radio bridges.


This makes sense to me offhand -- too late in the day after too
complete a dinner to think it through ;-)

> > Would be more usual to have the switch behind linux, wouldn't it?
>
> Of course there will be, to connect the 2 LANs (The 2 linux will be
> default gateway for respective LANs).
>
> > These are Broadcom chip based nics and the readme from BASP indicates
> > that _only_ NetXtreme (Broadcom) nics support what you want. Is that
> > why you inserted switch in second layout -- to provide such support?
> > See Readme quote below.
>
> > That's ugly ;-(
>
> I know. Only hope is that this information is "supposed", from the
> behaviour of a similar Cisco device that we have in lab. The radio
> bridges should be available for real testing soon.
>
> The 2 3Com nics (3C996B-T) have that broadcom chip and are actually
> recognized by the latest bcm5700 driver, with NICE extensions enabled.

That was my thought too -- Cisco can make this so much easier, sometimes

Oops... see below. Google pays off again

[snip]

> I already read the basp docs, but I miss some networking knowledge about
> the protocols that can be implemented via the TEAM_TYPE value.
>
> Specifically, in my setup scenario, which of the four values should be
> chosen?
>
> And, most important, I am still unsure if doing this thing at layer 2
> (with basp or bonding or balance or what?) or at layer 3 (ospf, iproute2
> and a qdisc? eql device?)
>
> The fact is that, before reading thoroughly the LARTC site, I would
> like to be hinted on which way is best.
>
> I will make some more tests on Mon 10.


Wasn't too lazy to try just one more google attempt -- brain cells
working better than I thought.

These are Dell docs related to Broadcom NetXtreme cards but provide
some _useful_ background. Probably no need for me to interpret them
for you -- they actually make sense, which is good given the TOD

Check out these links:

http://support.ap.dell.com/docs/netw...#BASP_overview
Linux specific info at bottom of page.
http://support.ap.dell.com/docs/netw...n/protocol.htm
http://support.ap.dell.com/docs/netw...G/en/index.htm

And if my quick read of them is correct, running OSPF will not be
needed to detect dead link pathways -- it's built into the driver. It
operates at the same link layer level as vlan tagging (more or less).

My only remaining "doubt" is the interrupt handling on the Linux boxes
-- I have personally seen problems, and helped others in the ng with
them -- because of the sheer volume of interrupts that GigE cards can
put onto the system. It generally requires fiber cable (no
re-transmissions, please, or 0 window size) and nics with on-board
processing/memory support. But my experience was several years ago --
things must be better by now

Links to tests I had lying about from past GigE ventures:
http://www.cs.uni.edu/~gray/gig-over...T%3A%7Coutline
http://www.digit-life.com/articles2/...2004-p2-2.html

hth,
prg
email above

PS. Would be nice if you could drop a line on how it works out as the
ng doesn't have much recent data/experience for those searching the
group via google.

 
 
prg

 
      01-10-2005, 05:28 AM

Davide DG wrote:
> "prg" <(E-Mail Removed)> wrote:


[snip]

Shamelessly makes excuses -- brain cells not as sharp as I thought.

Forgot to add the "real" documentation link:

http://support.ap.dell.com/docs/netw..._1.1_final.doc

It's listed as "Engineering Brief" here:
http://support.ap.dell.com/docs/network/r35278/

Note that it is a .doc file (Win2K) -- opens just fine in OOo Writer.
regards and good luck,
prg
email above disabled

 
 
Davide DG

 
      01-10-2005, 09:04 PM
"prg" <(E-Mail Removed)> wrote:

> Forgot to add the "real" documentation link:
>
> http://support.ap.dell.com/docs/netw...0nic%20teaming_1.1_final.doc


Thx for the useful link.
Actually, I read almost all of it and discovered that, in this
particular network scenario (radio bridges), *all* of the basp teaming
features are useless.

This is because a misaligned antenna will not be detected by the basp
driver. This is clearly stated in the "Engineering Brief" .doc file:

<quote>
section 2.2.1.1 - Smart Load Balancing and Failover

The BASP intermediate driver continually monitors the physical ports in
a team for **link loss**. In the event of **link loss** on any port,
traffic is automatically diverted to other ports in the team.
The SLB teaming mode supports switch fault tolerance by allowing
teaming across different switches- provided the switches are on the
same physical network or broadcast domain.

section 4.1 - Teaming across switches

SLB teaming can be configured across switches. However, the switches
must be interconnected together. Generic Trunking and Link Aggregation
do not work across switches because each of these implementations
requires that all physical adapters in a team share the same Ethernet
MAC address.

** It is important to note that SLB can only detect the loss of link
between the ports in the team and their immediate link partner.

***** SLB has no way of reacting to other hardware failures in the
switches and cannot detect loss of link on other ports.

</quote>

Given that eth1 and eth2 will be connected either directly or through
switch(es) to their respective radio bridges, whenever one of the
bridges fails (e.g. its antenna), the LINK status on the 2 NICs WILL
REMAIN UP.
Unfortunately, this will fool the basp driver into believing that the
(actually unusable) interface is still available for load-balancing.
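
Since link status cannot be trusted here, one alternative is to judge each path by end-to-end probes, for example pings to the far Linux box sent out through one specific interface. The sketch below shows only the bookkeeping, under stated assumptions: `PathMonitor` and its thresholds are hypothetical, and the caller is expected to supply the probe results; the class sends no packets itself.

```python
class PathMonitor:
    """Track path health from probe results rather than NIC link status."""

    def __init__(self, fail_after=3, recover_after=5):
        self.fail_after = fail_after        # consecutive losses before "down"
        self.recover_after = recover_after  # consecutive replies before "up"
        self.up = True
        self._streak = 0                    # run of results contradicting self.up

    def record(self, reply_received):
        """Feed one probe result and return the (possibly updated) state."""
        if reply_received == self.up:
            self._streak = 0                # result agrees with current state
            return self.up
        self._streak += 1
        limit = self.fail_after if self.up else self.recover_after
        if self._streak >= limit:
            self.up = not self.up           # enough evidence: flip the state
            self._streak = 0
        return self.up


bridge1 = PathMonitor(fail_after=3, recover_after=2)
for reply in (False, False, False):
    state = bridge1.record(reply)           # third loss marks the path down
```

Requiring several consecutive losses avoids flapping on a single dropped probe, which seems likely on a radio link; the right thresholds would have to come from testing.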

-----------

I am resorting to the other solution: equal-cost multipath routing with,
if needed, OSPF as the dynamic routing protocol to "probe" the health
of the 2 routes.
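
As a sketch of what the multipath part could look like with iproute2, the helper below builds an `ip route replace` command from whichever next hops are currently considered alive. The function name, addresses and interface names are made up for illustration, and nothing is executed; a static multipath route on its own would not notice a dead bridge, so something (OSPF or a probe) still has to decide which next hops are alive.

```python
def multipath_route(dest, nexthops):
    """Build an iproute2 command for the given (gateway, device) pairs."""
    if not nexthops:
        raise ValueError("no live next hop for " + dest)
    cmd = ["ip", "route", "replace", dest]
    if len(nexthops) == 1:
        # Only one bridge usable: plain single-path route.
        gateway, device = nexthops[0]
        return cmd + ["via", gateway, "dev", device]
    for gateway, device in nexthops:
        # Equal weights ask for an even split across the links.
        cmd += ["nexthop", "via", gateway, "dev", device, "weight", "1"]
    return cmd


# Hypothetical addressing: LAN_B is 10.0.2.0/24, LINUX_B answers on both bridges.
both = multipath_route("10.0.2.0/24",
                       [("192.168.1.2", "eth1"), ("192.168.2.2", "eth2")])
print(" ".join(both))
```

Running the resulting command needs root on the Linux box, and the same kind of route is needed in the opposite direction on the other site.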

I saw interesting and promising docs and tools on the LARTC and the Zebra
projects:
http://lartc.org
http://zebra.org

If I manage to configure everything successfully, I'll post
a short report.

Thanks again for the useful help!!

Bye.

--
Davide DG.
keep the spam away to send email
 