bonding, link aggregation, and switch config

Discussion in 'Linux Networking' started by Guest, Dec 10, 2004.

  1. Guest

    Hi,

    I have a few questions regarding Linux bonding for high availability.
    I'm trying to set up bonding with a 2.4.20 kernel.

    Scenario 1: Creating a bond between two servers (wired back-to-back)
    using dual gigabit ports on each side. (Example 1 in the bonding.txt file)
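
    (For reference, a minimal sketch of that Example-1 style, round-robin
    setup on a 2.4-era kernel; the interface names and addresses below are
    only examples, and the exact module option syntax can vary between
    bonding driver versions:

        # /etc/modules.conf
        alias bond0 bonding
        options bond0 mode=0 miimon=100

        # bring up the bond and enslave both gigabit NICs
        ifconfig bond0 192.168.0.1 netmask 255.255.255.0 up
        ifenslave bond0 eth0 eth1

    mode=0 is round-robin, and miimon=100 checks link state every 100 ms.)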

    I set it up just as advised in the bonding.txt file, i.e. round-robin
    mode, which should give twice the speed plus failover. There are a few problems:

    1. Link aggregation does not work. I dumped a huge ISO across the bonded
    pipe using scp and it took about 30 seconds. I took down the bond
    and dumped the same file over the individual NICs, and it also took about
    30 seconds. I ran the same tests again with ttcp over a couple-minute
    interval (instead of dumping the ISO) and the speeds were similar to
    dumping the ISO. And that speed is around what it should be for a single
    gigabit port (~900 Mbit/s).

    2. Failover using MII. To simulate a fault, I first ifdown the NIC, and
    the sending server still sends packets out in round-robin fashion. That
    is understandable, because even though the interface is downed, mii-tool
    still shows the link as ok. When I simulate a fault by pulling the cable,
    failover works fine. So is miimon the right monitoring mode to use for
    failover when an interface is taken down with ifdown?

    3. Link aggregation with arp_interval: I ran test #1 again using ARP
    monitoring instead of miimon, and link aggregation still didn't work
    (see the option sketch after this list).

    4. If I understand correctly from the bonding.txt file and basic networking
    knowledge, if a host is already in the ARP table, the node should not ARP
    for it again. So I did a ping test across the bond and watched the traffic.
    ARP request/reply messages were still flying by even though the ARP table
    still lists the host I'm pinging. How do you explain that?

    5. Using arp_interval with round-robin doesn't seem to work. I sniffed
    both physical interfaces and the packets are only sent out on one
    interface, even though I'm using mode 0 (the default, round-robin).

    6. If I want to run the dual GigE ports up to one or two Cisco switch(es)
    for failover and/or link aggregation, would you use miimon or
    arp_interval?

    a) According to bonding.txt, if both NICs on the host go to a single switch,
    I should set up a trunk on both switch ports for link aggregation. Is that
    right? How does the switch handle the same MAC on both ports? If it blocks
    one of the ports with STP, you would not get "twice the speed", would you?
    Should I set up a {Fast,Giga}EtherChannel on the switch instead of
    a trunk?

    7. Does this kernel support 802.3ad or LACP by default? Is it preferable
    when connecting to a Cisco switch, if the switch supports it? (See also
    the sketch after this list.)
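
    (For points 2, 3, 5 and 7 above, the knobs involved are the bonding
    module options. A rough sketch of the three variants, assuming the
    2.4-era numeric mode values; the arp_ip_target address is just an
    example, and 802.3ad (mode=4) only exists in bonding drivers newer than
    the 2.4.20 one, as far as I recall:

        # MII link monitoring: check carrier every 100 ms
        options bond0 mode=0 miimon=100

        # ARP monitoring instead: probe a peer every 100 ms;
        # arp_ip_target must be a host reachable through the bond
        options bond0 mode=0 arp_interval=100 arp_ip_target=192.168.0.254

        # 802.3ad/LACP; only useful against a switch configured for LACP
        options bond0 mode=4 miimon=100

    Note that the ARP monitor sends its own ARP probes at every
    arp_interval, independent of the normal ARP cache, which may be related
    to the traffic seen in point 4.)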

    Scenario 2: Connecting the NICs to more than one switch to eliminate a
    single point of failure. The bonding.txt file says "This mode is more
    problematic..."; have you tried it successfully? If so, what's your config?
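
    (A sketch of one common approach to spanning two switches when the goal
    is pure failover rather than aggregation: active-backup mode with ARP
    monitoring, which needs no special switch config. The target address is
    just an example:

        # only one slave is active at a time; probe a host that is
        # reachable through either switch so a dead path is detected
        options bond0 mode=1 arp_interval=200 arp_ip_target=10.0.0.1

    mode=1 is active-backup in the 2.4 bonding driver.)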

    Please advise. Thanks.

    Henry
     
    Guest, Dec 10, 2004
    #1

  2. In comp.os.linux.networking :
    This is a pretty old/buggy kernel; please use a recent one from
    kernel.org to get an updated bonding driver, etc.

    [..]
    What kind of PCI bus are those NICs on? Are your disks fast
    enough to read the data, and is the receiving system fast enough to
    handle all of this?

    [..]
    Yep, works like a charm, no special config needed for this. The only
    important thing is that most $$ switches need to be configured to make
    this work, or unexpected things will happen. Ask your network
    guys and/or check the switch documentation on how to go about it.

    Anyway, update the kernel before anything else.

    Good luck
     
    Michael Heiming, Dec 10, 2004
    #2

  3. Juha Laiho

    There's a terminology confusion here.

    Trunking is Sun terminology for Cisco EtherChannel, and thus also for
    Linux bonding. With Cisco, however, trunking means carrying all (or a
    subset of) the VLANs in a switch over a single port. So, as you suspected,
    it's an EtherChannel config you need on the switches, not a trunk.
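
    To make the distinction concrete, on the switch side it looks roughly
    like this (IOS-style syntax; exact commands vary by platform and
    software version, and the port numbers are only examples):

        ! EtherChannel: bundle two physical ports into one logical link
        interface GigabitEthernet0/1
         channel-group 1 mode on
        interface GigabitEthernet0/2
         channel-group 1 mode on

        ! Trunking in the Cisco sense: carry several VLANs over one port
        interface GigabitEthernet0/3
         switchport mode trunk

    "channel-group ... mode on" is a static bundle; LACP would use
    "mode active" instead, where the switch supports it.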
     
    Juha Laiho, Dec 10, 2004
    #3
  4. Wolf


    He's right, you know. It is not trunking. And you will not get
    the combined throughput across both lines with one node. Now throw
    10 at them and that's a different story.
     
    Wolf, Dec 12, 2004
    #4
  5. Guest

    Michael,

    I'm in a controlled environment where the "latest" kernel is 2.4.21. I've
    looked through the kernel release notes and found some seemingly minor
    fixes for bonding in 2.4.21. However, in 2.4.22 and 2.4.26 there are quite
    a few changes. If my goal is to get failover (with arp_interval and/or
    miimon) and link aggregation to work, do I _absolutely_ have to have
    something >= 2.4.22? I'm building an argument for a kernel upgrade.

    Hardware-wise, they are all very fast (dual 2.5 GHz Pentiums with a couple
    gigs of RAM).
    Did you mean the switch would need to be configured for EtherChannel?

    Also, thanks Juha and Wolf for clarification on trunking vs EtherChannel.

    Henry
     
    Guest, Dec 13, 2004
    #5
  6. In comp.os.linux.networking :
    You wrote *2.4.20* in your OP; it makes trying to help pretty hard if
    your kernel changes with every post.
    [..]
    Ah, "very fast", great detailed info. The problem I was thinking
    about is that two 32-bit PCI 1 Gbit NICs could outperform a 32-bit
    PCI bus (IIRC), without having calculated it exactly.

    [..]
    Probably; Cisco calls it ISL, other manufacturers might have other
    names for it.

    [..]
     
    Michael Heiming, Dec 13, 2004
    #6
  7. Guest

    Michael,

    Sorry for the confusion. I was testing on a 2.4.20 kernel, but 2.4.21 is
    the latest I can go to.
    I see. These are in-house boxes, so the hardware guys already took care
    of the hardware compatibility issue. I hope ;)

    Henry
     
    Guest, Dec 13, 2004
    #7
  8. Juha Laiho

    It's not a compatibility issue. It's an issue of bandwidth (or it may be,
    but so far you've declined requests to provide enough information to check
    whether it is or not).

    So, a single 32-bit/33 MHz PCI bus is capable of transferring rather
    exactly 1 Gbit/s. Still, there might be multiple card slots on the
    same bus. You can connect two 1 Gbit PCI NICs to a single PCI bus, but
    because of the total bandwidth of the bus, the aggregate bandwidth
    of the two adapters cannot exceed 1 Gbit/s. But as said, because
    you're keeping this information from us, we can only speculate. It
    could be that you have some faster variant of the PCI bus in your
    machines, in which case this speculation is moot.
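
    For reference, the arithmetic: 32 bits x 33 MHz = 1056 Mbit/s, i.e.
    roughly 132 MB/s, and that is a theoretical peak shared by every device
    on the bus (and by both traffic directions, since plain PCI is a shared,
    half-duplex bus). Two full-duplex gigabit NICs running flat out would
    want up to ~4 Gbit/s, so even one of them can saturate such a bus.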
     
    Juha Laiho, Dec 15, 2004
    #8
  9. Exactly, even a single 1 Gbit NIC might outperform a standard
    32-bit/33 MHz PCI bus, since those NICs are full duplex, allowing 1
    Gbit/s in each direction at the same time.

    But then, without any info, it's kind of hard to help...
     
    Michael Heiming, Dec 16, 2004
    #9
