Very slow SMB performance on one interface of a multi-homed server

 
 
usenet@pkmotorsports.com

 
      04-07-2007, 03:43 AM
General problem:
Seven multi-homed hosts are connected to a fast ethernet office LAN on
one interface and a private gigabit network on the other. SMB
communication (tested with Windows Explorer) on the private network is
extremely slow for three of the machines.

Details:
Of the seven hosts, four are Windows 2000 server and three are XP.
The issue is present on two of the Win2k machines and one of the XP
machines. Each of the machines exhibiting the problem is identical in
terms of hardware, O/S build, and even installed programs to at least
one of the other machines which does not exhibit the problem.

Each machine has a static address on the office LAN (10.20.30.x) and a
static address on the private network (172.20.30.x). Four of the
machines have a third adapter with an address of 192.168.30.x which I
will discuss later.

The office LAN is configured with two domains in the DNS search order,
WINS servers, etc. As might be expected, the gateway is 10.20.30.1.

The private network has no name server and no gateway, nothing except
the 7 hosts attached. There is no attempt to route between the two
networks, and at this time no attempt to get name resolution working
on the private network.

The NIC on the private network is bound first on all hosts.

Some information I have found on the web indicates that setups similar
to the above just won't work. However, the cluster used to have only
four hosts and we had a parallel setup that worked flawlessly. The
original four had Qlogic fiber channel HBAs, and the Qlogic adapters
were configured as an IP interface. The private fiber channel network
was configured exactly as the private gigabit network is now and
worked perfectly with no adjustments to anything. Users could simply
point Windows Explorer to \\192.168.30.y\share and immediately get a
near-gigabit speed connection. The fiber channel "NIC" was bound
before the one connected to the office LAN (and is now bound in
between the gigabit network and the office LAN).

Buying another fiber channel switch and more HBAs is not a practical
solution to this problem, although everything I have seen so far
indicates that would actually solve my issue - at great cost. I
naively assumed that it would be similarly easy to use a gigabit
ethernet network to accomplish what was possible and in fact quite
easy over the fiber channel equipment.

The fiber channel setup still exists on four of the hosts. If a host
is affected, connections to all of the other 6 are affected, but the
problem is not reciprocal. Host x may exhibit the slow link when
connecting to \\172.20.30.y, but host y will not exhibit the slow link
connecting to \\172.20.30.x. Connections using the office LAN and,
for four of the hosts, the fiber channel network are fine.
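Since the slowdown follows the connecting host rather than the target, one way to narrow it down might be to check whether plain TCP traffic on the 172.20.30.x interface stalls in the same way, or whether only SMB does. The following is a minimal sketch of such a test, not something from the original setup: the function names, the chunk size, and the loopback demo are all assumptions for illustration. On real hosts the receiver would run on host y and the sender on host x, pointed at y's 172.20.30.x address.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # assumed chunk size, chosen arbitrarily for this sketch


def serve_once(listener: socket.socket, result: dict) -> None:
    """Accept one connection and count bytes until the peer closes it."""
    conn, _ = listener.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    result["bytes"] = total


def send_bytes(host: str, port: int, nbytes: int) -> float:
    """Send nbytes of zeros to host:port; return the sender-side elapsed seconds."""
    payload = bytes(CHUNK)
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as s:
        while sent < nbytes:
            n = min(CHUNK, nbytes - sent)
            s.sendall(payload[:n])
            sent += n
    return time.perf_counter() - start


def loopback_demo(nbytes: int = 4 * 1024 * 1024):
    """Run sender and receiver in one process over 127.0.0.1 (demo only)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}
    t = threading.Thread(target=serve_once, args=(listener, result))
    t.start()
    elapsed = send_bytes("127.0.0.1", port, nbytes)
    t.join()
    listener.close()
    return result["bytes"], elapsed


if __name__ == "__main__":
    received, elapsed = loopback_demo()
    print(f"received {received} bytes, sender took {elapsed:.3f}s")
```

If a transfer like this runs at full speed between two hosts where the Explorer copy stalls, that would point at the SMB or authentication layer rather than at the NIC, driver, or switch. If it stalls too, the problem is probably lower in the stack.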

Troubleshooting steps tried so far:
1. Verified that the switch and all cables are 100% functional. Ran
all NICs through diagnostics (all passed).
2. Verified that there are no errors written to the event log and no
TCP/IP errors (there aren't).
3. Compared the network configurations between the hosts (all have
the same settings).
4. Profiled the problem connections with Network Monitor (results
described below).
5. Tried to compare the registry settings related to the fiber
channel NICs and the gigabit NICs (inconclusive).
6. Added entries to the HOSTS file (did not help).
7. Tried disabling the adapter connected to the office LAN on the
problem hosts. When I do this the gigabit network and the fiber
channel network are effectively crippled. Any attempt to connect to
another host results in (after 10 or 15 seconds) the error message:
"there are currently no logon servers available to service the logon
request". I know this should be a clue to me, but I do not know
enough to interpret it.

Network Monitor results:
Profiling a copy of a large file over the gigabit network from one of
the problem hosts reveals that the following is happening: A session will
be negotiated as normal, then a few dozen packets will be exchanged.
After a 5-6 second pause, a few hundred more packets and acks will
travel over the network, then another 5-6 second pause will begin.
This packets-pause cycle repeats until the copy is finished. It
appears as though there is some internal timeout which keeps getting
invoked, and every time it does the pauses occur. When the connection
is not in a "paused" state, the packets are moving at the expected
speed and there are no errors. There is also no unusual activity on
the other adapters while all of this is taking place.
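
The packets-pause cycle described above can be located mechanically once the capture timestamps are exported. The sketch below is only an illustration: it assumes the timestamps have already been pulled out of the capture as seconds, and the function name, the 1-second threshold, and the sample values are made up, not taken from the actual trace.

```python
from typing import List, Tuple


def find_pauses(timestamps: List[float], threshold: float = 1.0) -> List[Tuple[float, float]]:
    """Return (time_of_last_packet_before_gap, gap_seconds) for each gap above threshold."""
    pauses = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > threshold:
            pauses.append((prev, gap))
    return pauses


# Hypothetical timestamps (seconds) imitating bursts separated by ~5.5 s stalls.
sample = [0.00, 0.01, 0.02, 5.52, 5.53, 5.54, 11.04, 11.05]

if __name__ == "__main__":
    for start, gap in find_pauses(sample):
        print(f"pause of {gap:.2f}s after packet at t={start:.2f}s")
```

Looking at the last frame before each gap and the first frame after it (which side sent it, and what kind of request it was) may show what the connection is waiting on during the 5-6 seconds.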


Any suggestions would be greatly appreciated. Thank you for your time
- PK

 
 
 
 
 
Arkady Frenkel

 
      04-10-2007, 06:04 AM
Use a sniffer (Ethereal, Netmon...) to check the send and receive
times of the packets. Maybe that will show something.
Arkady



 
 
usenet@pkmotorsports.com

 
      04-15-2007, 11:06 PM
I did check it with netmon. This is where I see the long pauses.

I have completely disabled NetBIOS on all connections, figuring that I
will try to get it working in the simplest case before trying more
complicated stuff, but this doesn't really affect the behavior I am
seeing.

I have also tried a couple of other things since then, without success.

I tried bridging the gigabit and LAN connections (thinking that I
could then force the machines to use the gigabit connection to each
other), but that did not work - XP's STA sees a network loop and
disables the gigabit connection.

I also tried configuring routing between the gigabit network and the
LAN (on the premise that the problem has to do with a split path
fouling up the authentication - see my note above about "there are
currently no logon servers available to service the logon request").
After doing this though, I didn't know what routes to add.
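
On the question of which routes to add: as far as I understand it, a host picks the most specific matching route (longest prefix), and each directly attached subnet already has its own on-link route. The sketch below illustrates that selection only. The /24 masks, the interface labels, and the example destination addresses are assumptions, since the thread gives only the 10.20.30.x, 172.20.30.x, and 192.168.30.x patterns.

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[str, str]  # (destination network in CIDR form, interface label)


def pick_route(dest: str, routes: List[Route]) -> Optional[Route]:
    """Return the matching route with the longest prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


# Assumed table for one host; /24 masks are a guess, not stated in the thread.
routes = [
    ("0.0.0.0/0", "office LAN via 10.20.30.1"),
    ("10.20.30.0/24", "office LAN"),
    ("172.20.30.0/24", "private gigabit"),
    ("192.168.30.0/24", "fiber channel"),
]

if __name__ == "__main__":
    for dest in ("172.20.30.5", "10.20.30.7", "192.168.30.9", "203.0.113.10"):
        print(dest, "->", pick_route(dest, routes)[1])
```

Under these assumptions, traffic addressed to 172.20.30.y already leaves on the gigabit NIC without any added routes, so the data path itself is probably not what routing would fix. Traffic to a domain controller on the office LAN is a separate matter, and the table above says nothing about that.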

Throughout all of this, the IP-over-fiber channel network continues to
work, with no routing, bridging, or anything else. The only reason I
can't just use it is that I don't have more fiber channel HBAs (not to
mention more open ports on my Silkworm).

I'm getting pretty discouraged and may end up buying another gigabit
switch, then going to a more "traditional" topology in which all the
adapters connect to the gigabit switch first (and then a trunk line
leads to the office LAN). I'm really loath to do this because it
takes away a lot of fault tolerance (since there will now be only one
line connecting seven workstations with the main switch) and will have
performance ramifications as well (the trunk line would only be
100baseT). Also, it seems dumb to give up without understanding why I
never had any trouble using the IP-over-FC setup.

 
 
 
 