Very interesting. I'm on SP2 as well. I'm having a problem similar to yours in
that Wireshark shows tons of SMB traffic when the XP clients hang, as if
the clients aren't getting a response back from the server. "net files"
shows almost 1,200 open files, so I don't think a 200-file
limit would be causing a problem unless you have some lower-end hardware
or another resource issue.
I notice our problem during our backups in the evening or if I perform a
defrag, or if I fail the cluster over -- basically anything that hits the
processor for a little bit. Our cluster is virtualized with dual procs and 4
GB of memory, so it's somewhat hard for me to believe that a defrag or a
background backup would cause it to hang so badly that it can't respond to
any network requests. During normal daytime usage everything on the server
runs fine, even with some fairly high CPU usage (30% range).
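In case it helps anyone compare notes during a hang, here is a rough sketch for tallying TCP connection states on the SMB port from saved "netstat -an" output. It assumes the usual four-column layout for TCP lines (Proto, Local Address, Foreign Address, State) and numeric ports (445 rather than "microsoft-ds"); the function name tally_smb_states is made up for illustration.

```python
from collections import Counter


def tally_smb_states(netstat_text, port=445):
    """Count TCP states for lines whose local port matches `port`.

    Assumes `netstat -an` style lines with four whitespace-separated
    fields: Proto, Local Address, Foreign Address, State.
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, UDP lines (no State column), and anything unexpected.
        if len(fields) != 4 or fields[0].upper() != "TCP":
            continue
        # The local port is whatever follows the last colon in the address.
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port == str(port):
            counts[fields[3]] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: netstat -an > out.txt, then: python tally.py out.txt
    with open(sys.argv[1]) as handle:
        for state, count in tally_smb_states(handle.read()).most_common():
            print(state, count)
```

Running this every minute or so from a scheduled task and keeping the output would at least show whether anything other than ESTABLISHED piles up when the clients lock.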
Also, I don't have that registry key you mentioned, so it must be some
default value that's built in.
Thanks for all the information. I'll keep hunting for solutions. If I find
anything out I'll post it.
Rob
"P. Lindberg" wrote:
> I upgraded to SP2, but I don't know if that fixed the problem. In the past
> it has only happened about every 3 months, so it's too early to tell.
>
> I'm not monitoring the number of connections available, only the number in
> use. The max number of TCP connections is controlled by this key:
> HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpNumConnections
> Which has a default value of 0x00fffffe (16777214). That's really high, so
> it would be hard to hit that limit.
>
> However, the limit on the number of SMB connections is probably much lower,
> although I don't know what it is. I'm pretty sure it's higher than 200. My
> theory is that the limit being hit is the number of files open via SMB --
> viewable with the command "net files".
>
>
> "weisrc" wrote:
>
> > I'm having a similar problem and was wondering if any resolution had been
> > reached? Also, how are you querying for available TCP pool connections? Is
> > there a way to monitor this (perfmon)?
> >
> > Rob
> >
> >
> > "P. Lindberg" wrote:
> >
> > >
> > > Yes. The server responded to ICMP pings, and was able to connect to other
> > > file servers as an smb client.
> > >
> > > Network utilization was very low -- less than 1%.
> > >
> > > Netstat showed all connections in the "established" state, and none in a
> > > "waiting" state. Also, the number of tcp connections was not nearly high
> > > enough to exhaust the pool of available connections. From what I remember,
> > > there were around 100 connections.
> > >
> > >
> > > "Ashish" wrote:
> > >
> > > > > Does this mean only SMB traffic stopped responding and the other network
> > > > > requests are being accepted by the server?
> > > > >
> > > > > How much load (network traffic) is on the server?
> > > > >
> > > > > Is the TCP connection pool highly utilized, with more connections in a
> > > > > waiting state (netstat)?
> > > >
> > > > Ashish
> > > >
> > > > "P. Lindberg" wrote:
> > > >
> > > > > Our main file server runs Win 2003 Enterprise with SP1. Yesterday, it
> > > > > stopped responding to all SMB traffic. Clients with connections to the
> > > > > server -- pretty much any computer logged into the domain -- locked up
> > > > > waiting for a response from the server.
> > > > >
> > > > > The server itself was still responsive at the console, and netstat showed
> > > > > > many established TCP connections on "microsoft-ds" to clients, but a packet
> > > > > capture with ethereal showed many unanswered "SMB negotiate" packets from the
> > > > > clients.
> > > > >
> > > > > Restarting the "Server service" was unsuccessful, and the server had to be
> > > > > cold booted. After rebooting, it worked normally for about 10 minutes, and
> > > > > then the same thing happened again. This time, I issued the "net stop
> > > > > server" command, and accidentally answered 'no' to the second "are you sure?"
> > > > > > question. This apparently brought the Server service back to life. . .
> > > > > somehow.
> > > > >
> > > > > No evidence of why this happened was recorded. I did manage to capture the
> > > > > output from "openfiles.exe" to a text file, and the total was rather high.
> > > > > No errors were recorded in event logs, though.
> > > > >
> > > > > > Do I need to manually tune the lanmanserver parameters? Is a resource
> > > > > limitation being reached?