Can anyone suggest what might be causing this perplexing network
problem?
Short version:
A machine on my home network (gigabit Ethernet) drops lots of packets
when receiving at a rate of about 30 MB/s or higher. CPU utilization
remains low. Typical output from ifconfig after a netcat session:
eth0      Link encap:Ethernet  HWaddr 00:08:54:35:3A:91
          inet addr:192.168.254.77  Bcast:192.168.254.254  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:11585485 errors:0 dropped:206622 overruns:0 frame:0
          TX packets:2567897 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:747011954 (712.4 MiB)  TX bytes:1482613312 (1.3 GiB)
          Interrupt:12 Base address:0xf00
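To put a number on "lots of packets", here is a rough sketch that pulls the counters out of the RX line above and computes the drop fraction. It assumes the "dropped" count can simply be divided by the "packets" count; whether dropped frames are also included in "packets" depends on the driver, so treat the result (about 1.8% here) as approximate. The function name parse_counters is just a helper made up for this example.

```python
import re

# RX line copied from the ifconfig output in this post.
RX_LINE = "RX packets:11585485 errors:0 dropped:206622 overruns:0 frame:0"


def parse_counters(line):
    """Return the name:value counters on an ifconfig RX/TX line as a dict of ints."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


counters = parse_counters(RX_LINE)

# Assumption: dividing by 'packets' is a fair approximation of the drop rate.
drop_fraction = counters["dropped"] / counters["packets"]
print(f"dropped {counters['dropped']} of {counters['packets']} "
      f"({drop_fraction:.2%})")
```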
The problem is completely unidirectional. I have managed to send data
FROM this host via TCP at over 45 MB/s with no dropped packets.
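One way to check the receive-only behaviour per transfer is to read the drop counters from /proc/net/dev before and after a netcat run and compare. The sketch below is only an illustration: parse_drops and read_drops are hypothetical helper names, and the sample text is rebuilt from the ifconfig counters in this post, with 0 as a placeholder in the columns ifconfig does not show.

```python
# Sample in /proc/net/dev layout. The eth0 counters are copied from the
# ifconfig output in this post; columns ifconfig does not report
# (compressed, multicast) are filled with 0 as placeholders.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0:747011954 11585485    0 206622    0     0          0         0 1482613312 2567897    0    0    0     0       0          0
"""


def parse_drops(text, iface):
    """Return (rx_dropped, tx_dropped) for iface from /proc/net/dev-style text."""
    for line in text.splitlines()[2:]:          # the first two lines are headers
        name, sep, data = line.partition(":")
        if sep and name.strip() == iface:
            fields = data.split()
            # Column 3 is the receive 'drop' count, column 11 the transmit one.
            return int(fields[3]), int(fields[11])
    raise KeyError(iface)


def read_drops(iface, path="/proc/net/dev"):
    """Read the live counters on a Linux host (not called in this sketch)."""
    with open(path) as f:
        return parse_drops(f.read(), iface)


rx_dropped, tx_dropped = parse_drops(SAMPLE, "eth0")
print(f"rx dropped: {rx_dropped}, tx dropped: {tx_dropped}")
```

On the affected machine, calling read_drops("eth0") before and after a transfer and subtracting the two results would show how many packets that one transfer lost in each direction.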
Details:
The machine is an Athlon 64 3400+, which should be able to receive at
these rates. The NIC is based on the RTL8169 chip (using the r8169
driver). I am running kernel 2.6.11.
I have tried many things to diagnose/fix the problem, with absolutely
no effect.
1) Cable?
The cable works fine when run between another host and my hub.
2) CABLE?
Replacing *all* cables with brand new CAT6 patch cords has no effect.
3) Hub?
I get zero dropped packets when transmitting at high speed between two
different hosts on the same hub. Switching around the ports didn't
help.
4) Gigabit?
I don't have another gigabit hub, but when I run the connection through
a 10/100 hub there are no dropped packets. However, the speed is lower
in that case, and I also see no drops on the gigabit hub when I keep
the transfer rate low.
5) Send host?
I have tried sending data from two different hosts to the "bad" one,
with the same results.
6) Driver?
I recompiled the kernel with RX polling turned on and turned off. Both
give the same result. I don't have a gigabit NIC based on another
chipset to test.
7) TCP only?
Hitting this host with a lot of UDP packets has approximately the same
effect (tested with netcat).
8) IRQ?
I have a second NIC on a different IRQ, and the problem happens with
both of them.
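Since the drops seem to depend on the receive rate (items 4 and 7), a rate-limited UDP sender could help narrow down where they start. What follows is a minimal sketch, not the netcat test I actually ran: send_udp_at_rate, the 1400-byte payload, and port 5000 in the comment are all assumptions for illustration, and a Python sender may not reach the higher rates on every machine.

```python
import socket
import time


def send_udp_at_rate(host, port, rate_bytes_per_s, duration_s, payload_size=1400):
    """Send UDP datagrams to host:port at roughly rate_bytes_per_s.

    Returns the number of datagrams sent. payload_size is kept below the
    1500-byte MTU so that each datagram fits in one Ethernet frame.
    """
    payload = b"\0" * payload_size
    interval = payload_size / rate_bytes_per_s   # seconds between datagrams
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.monotonic()
        next_send = start
        while True:
            now = time.monotonic()
            if now - start >= duration_s:
                break
            if now >= next_send:
                sock.sendto(payload, (host, port))
                sent += 1
                next_send += interval
            # Otherwise busy-wait: sleeping is too coarse at these packet rates.
    return sent


# Hypothetical usage from another host, stepping the rate up between runs:
#   send_udp_at_rate("192.168.254.77", 5000, 30_000_000, 10)
```

Stepping the rate from, say, 10 MB/s upward and checking the RX dropped counter on the receiver after each run should show roughly where the losses begin.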
Thanks in advance.
-David