eth network seems way too slow

 
 
Rahul

 
      09-09-2008, 12:56 AM
I just ran a file-transfer speed test on my newly configured LAN. I used
ssh for a quick and dirty test before I move to stronger ammunition
including netperf etc.

For a 4 GB file transfer I have a time of approx. 86 sec. Using a rough
factor of 1 GByte=10 Gbits this translates to an effective transmit rate of
0.46 Gbps. Even accounting for the protocol overheads etc. this rate seems
far too low for what I was expecting from my twin-eth-port bonded machines.

Do other people have any benchmark numbers that I can compare against?

I ought to mention that the CPUs on both machines were almost unloaded. As a
rough indicator of disk I/O, I see 40 secs. for a file copy from disk to
disk on any single machine. So I _think_ the network is my bottleneck, but
correct me if I am wrong. I'd be up for trying any more "realistic" /
"sophisticated" tests that people might suggest instead of my primitive
approach.
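
For anyone checking the arithmetic above, here is a minimal Python sketch. It assumes "4 GB" means 4 x 10^9 bytes, and the helper name gbps is made up for illustration. It shows both the rough 10-bits-per-byte figure from the post and the plain 8-bits-per-byte payload rate, plus the rate implied by the 40 second local copy.

```python
def gbps(size_gbytes, seconds, bits_per_byte=8):
    """Effective rate in Gbit/s for a transfer of size_gbytes in seconds."""
    return size_gbytes * bits_per_byte / seconds

# Rule of thumb from the post: 10 bits on the wire per byte of payload.
rough = gbps(4, 86, bits_per_byte=10)
# Payload only: 8 bits per byte, no allowance for protocol overhead.
payload = gbps(4, 86)
# Local disk-to-disk copy: 4 GB in 40 s, expressed in MB/s (read plus write).
disk_mb_per_sec = 4 * 1000 / 40

print(f"rough wire rate: {rough:.3f} Gbps")
print(f"payload rate:    {payload:.3f} Gbps")
print(f"local disk copy: {disk_mb_per_sec:.0f} MB/s")
```

Either way the result is well under the nominal 1 Gbps of even a single gigabit link.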


--
Rahul
 
 
 
 
 
Rick Jones

 
      09-09-2008, 05:38 PM
In comp.os.linux.networking Rahul <(E-Mail Removed)> wrote:
> I just ran a file-transfer speed test on my newly configured LAN. I
> used ssh for a quick and dirty test before I move to stronger
> ammunition including netperf etc.


> For a 4 GB file transfer I have a time of approx. 86 sec. Using a
> rough factor of 1 GByte=10 Gbits this translates to an effective
> transmit of 0.46 Gbps. Even accounting for the protocol overheads
> etc. this seems a rate way too low for what I was expecting for my
> twin-eth-ports bonded machines.


> Do other people have any benchmark numbers that I can compare against?


What was the CPU utilization of all the CPUs on each side? Was one of
them pegged?

ssh/scp implies crypto. crypto implies CPU overhead.

also, there may be a question of the TCP window size being used for
the transfer. ssh may set an explicit SO_*BUF size which will disable
Linux's much-vaunted socket buffer autotuning and may subject ssh/scp
to a rather lower socket buffer size limit than can be achieved (by
default) with the autotuning.
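
As an illustration of what "setting an explicit SO_*BUF size" means, here is a minimal Python sketch. The function name and the 64 KiB request are assumptions chosen for the example, and it says nothing about what any particular ssh build actually does. It reads the default send buffer of a fresh TCP socket, sets SO_SNDBUF explicitly, and reads the value back.

```python
import socket

def sndbuf_default_and_explicit(requested=64 * 1024):
    """Return (default, after_explicit_set) SO_SNDBUF sizes for a fresh TCP socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        # An explicit setsockopt() like this is the kind of call that fixes
        # the buffer size instead of leaving it to the stack's autotuning.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
        explicit = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    finally:
        sock.close()
    return default, explicit

default, explicit = sndbuf_default_and_explicit()
print(f"default SO_SNDBUF: {default} bytes, after explicit set: {explicit} bytes")
```

On Linux the value read back is typically double the request (the kernel reserves room for bookkeeping) and is capped by net.core.wmem_max, so the printed number shows whether the request was clipped.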

> I ought to mention that cpu's on both machines were almost
> unloaded. As a rough indicator of Disk I/O I see 40 secs. for a file
> copy from disk to disk on any single machine. So I _think_ the
> network is my bottleneck. But correct me if I am wrong. I'd be up
> for trying any more "realistic" / "sophisticated" tests that people
> might suggest instead of my primitive approach.


You want to consider both getting even more primitive and running
netperf TCP_STREAM and also taking packet traces of your scp
transfer(s).

rick jones
--
firebug n, the idiot who tosses a lit cigarette out his car window
these opinions are mine, all mine; HP might not want them anyway...
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
 
 
Rahul

 
      09-09-2008, 09:42 PM
Rick Jones <(E-Mail Removed)> wrote in news:ga6cam$uj8$4
@usenet01.boi.hp.com:

> What was the CPU utilization of all the CPUs on each side? Was one of
> them pegged?


I might have jumped the gun. I'm not so sure about my earlier statement
that the CPU was not the limiting factor. I'm not exactly sure how to
measure the CPU utilization accurately on a multicore machine. Other than
"top", what are the recommended options?

> ssh/scp implies crypto. crypto implies CPU overhead.
>


Is there any way to force a disable-crypto mode on ssh? Also, here is a funny
result: I tried to benchmark how long scp takes to do a local disk-to-disk
copy, presuming that this would allow a comparison against a simple cp and
thus let me figure out the encryption overhead.

But cp and scp took the same time! Does scp figure out that it is a local
copy and then disable encryption?




--
Rahul
 
 
Jean-David Beyer

 
      09-09-2008, 10:07 PM
Rahul wrote:
> Rick Jones <(E-Mail Removed)> wrote in news:ga6cam$uj8$4
> @usenet01.boi.hp.com:
>
>> What was the CPU utilization of all the CPUs on each side? Was one of
>> them pegged?

>
> I might have jumped the gun. I'm not so sure about my earlier statement
> that the cpu was not the limiting factor. I'm not exactly sure how to
> measure the CPU utilization accurately on a multicore machine. Other than
> "top" what are the recommended options?


I do not know about "recommended" options, but one I like is xosview. It
gives a bar chart showing CPU use (divided as to user, nice, system, idle,
wait, hardware interrupt, software interrupt), IO usage, network usage, and
many other things. I have mine set to show things at one second intervals,
but I believe you can make them go up to 1/10 second intervals if you want.

Another useful one is vmstat, and still another is iostat. But I do not know
if these will help you or not. You can always try them.

--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 18:00:01 up 34 days, 6 min, 4 users, load average: 4.10, 4.11, 4.16
 
 
Rick Jones

 
      09-09-2008, 11:30 PM
In comp.os.linux.networking Rahul <(E-Mail Removed)> wrote:
> Rick Jones <(E-Mail Removed)> wrote in news:ga6cam$uj8$4
> @usenet01.boi.hp.com:


> > What was the CPU utilization of all the CPUs on each side? Was
> > one of them pegged?


> I might have jumped the gun. I'm not so sure about my earlier
> statement that the cpu was not the limiting factor. I'm not exactly
> sure how to measure the CPU utilization accurately on a multicore
> machine. Other than "top" what are the recommended options?


I would go with top and then hit "1" to get it to show per-CPU
(core/whatever) utilization.
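
If you would rather log the per-core numbers than watch them, the same information can be derived from two snapshots of /proc/stat. The following is a rough Python sketch, not a replacement for top; the function names are made up for illustration, and it counts idle plus iowait time as "not busy".

```python
def parse_proc_stat(text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each per-core line."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines start with "cpu0", "cpu1", ...; plain "cpu" is the aggregate.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        values = [int(v) for v in fields[1:9]]
        not_busy = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        stats[fields[0]] = (total - not_busy, total)
    return stats

def per_cpu_busy_percent(before, after):
    """Busy percentage per core between two /proc/stat snapshots (as text)."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    result = {}
    for cpu, (busy0, total0) in first.items():
        if cpu not in second:
            continue
        busy1, total1 = second[cpu]
        elapsed = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result
```

To use it, read /proc/stat once, wait a second or so while the transfer is running, read it again, and pass both strings to per_cpu_busy_percent(). A single core sitting near 100% while the others idle is the "pegged" case.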

I'm sure there are other tools out there with wizzy displays and
charts and graphs and dials and whatnot, but for what you are doing
IMO just numbers are fine.

rick jones
--
Process shall set you free from the need for rational thought.
these opinions are mine, all mine; HP might not want them anyway...
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
 
 
Rahul

 
      09-10-2008, 12:25 AM
Rick Jones <(E-Mail Removed)> wrote in news:ga6cam$uj8$4
@usenet01.boi.hp.com:

> You want to consider both getting even more primitive and running
> netperf TCP_STREAM and also taking packet traces of your scp
> transfer(s).
>
>


I did run netperf. Here are some output snippets. To me it seems performance
has been insensitive to the choice of bonding mode so far. It does not even
matter whether I have one ethernet card or two. I cannot rationalize this.
Any leads?

Or am I running the wrong netperf tests? IMO the only explanation is that
netperf has not generated enough traffic to saturate both my links.

--
Rahul


*********************************************************
mode=4

[root@node01 scratch]# /opt/netperf2/bin/netperf -t TCP_RR -H 10.0.0.100
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
10.0.0.100 (10.0.0.100) port 0 AF_INET
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    7023.38
16384  87380

*********************************************************
mode=6

[root@node04 ~]# /opt/netperf2/bin/netperf -t TCP_RR -H 10.0.0.100
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
10.0.0.100 (10.0.0.100) port 0 AF_INET
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    7600.67
16384  87380

*********************************************************
only 1 eth up. Other disabled via switch simulating a card failure.

[root@node05 ~]# /opt/netperf2/bin/netperf -t TCP_RR -H 10.0.0.100
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
10.0.0.100 (10.0.0.100) port 0 AF_INET
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    6952.47
16384  87380


 
 
David Schwartz

 
      09-10-2008, 03:19 AM
On Sep 9, 2:42 pm, Rahul <nos...@nospam.invalid> wrote:

> Any way to force a disable-crypto mode on ssh? Also funny is the following
> result. I tried to benchmark how long it takes scp to do a local disk-to-
> disk copy. Presuming that this would allow me a comparison versus a simple
> cp and thus figure out the encryption overhead.
>
> But cp and scp took the same time! Does scp figure out that it is a local
> copy and then disable encryption?


It depends on the command line you pass to scp. If you do
'@localhost:' it will actually connect to the local machine's 'ssh'
server.

DS
 
 
Robert Riches

 
      09-10-2008, 04:08 AM
On 2008-09-09, Rick Jones <(E-Mail Removed)> wrote:
> In comp.os.linux.networking Rahul <(E-Mail Removed)> wrote:
>> I just ran a file-transfer speed test on my newly configured LAN. I
>> used ssh for a quick and dirty test before I move to stronger
>> ammunition including netperf etc.

>
>> For a 4 GB file transfer I have a time of approx. 86 sec. Using a
>> rough factor of 1 GByte=10 Gbits this translates to an effective
>> transmit of 0.46 Gbps. Even accounting for the protocol overheads
>> etc. this seems a rate way too low for what I was expecting for my
>> twin-eth-ports bonded machines.

>
>> Do other people have any benchmark numbers that I can compare against?

>
> What was the CPU utilization of all the CPUs on each side? Was one of
> them pegged?
>
> ssh/scp implies crypto. crypto implies CPU overhead.
>
> also, there may be a question of the TCP window size being used for
> the transfer. ssh may set an explicit SO_*BUF size which will disable
> Linux's much-vaunted socket buffer autotuning and may subject ssh/scp
> to a rather lower socket buffer size limit than can be achieved (by
> default) with the autotuning.
>
>> I ought to mention that cpu's on both machines were almost
>> unloaded. As a rough indicator of Disk I/O I see 40 secs. for a file
>> copy from disk to disk on any single machine. So I _think_ the
>> network is my bottleneck. But correct me if I am wrong. I'd be up
>> for trying any more "realistic" / "sophisticated" tests that people
>> might suggest instead of my primitive approach.

>
> You want to consider both getting even more primitive and running
> netperf TCP_STREAM and also taking packet traces of your scp
> transfer(s).


How about trying ftp to see if that can achieve higher
throughput than scp?

Another option for testing (and improving) network transfer
speed is to use a pair of simple TCP sockets to do the
transfer. I got fed up with the CPU overhead slowing down
my SCP transfers, so I wrote a quick Java program that
essentially copies between a file and a TCP socket. On my
100Mbit home LAN, it achieves full network potential with
almost no CPU overhead.
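
For illustration only, here is a minimal Python sketch of the same idea: copy between a file and a connected stream socket with no encryption. This is not the Java program mentioned above; the function names and the 64 KiB chunk size are assumptions made for the example.

```python
import socket

CHUNK = 64 * 1024  # bytes per read/write; an arbitrary choice for this sketch

def send_file(sock, path):
    """Stream the file at path into a connected socket; return bytes sent."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            sock.sendall(chunk)
            sent += len(chunk)
    # Half-close so the receiver sees end-of-stream.
    sock.shutdown(socket.SHUT_WR)
    return sent

def recv_file(sock, path):
    """Write everything read from a connected socket to path; return bytes received."""
    received = 0
    with open(path, "wb") as f:
        while True:
            chunk = sock.recv(CHUNK)
            if not chunk:
                break
            f.write(chunk)
            received += len(chunk)
    return received
```

On the receiving machine you would accept a connection (for example with socket.create_server(("", port)).accept(), Python 3.8 or later, with a port of your choosing) and call recv_file(); on the sending machine, socket.create_connection((host, port)) followed by send_file(). Timing the call on either side gives a crypto-free number to compare against scp.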

HTH

--
Robert Riches
(E-Mail Removed)
(Yes, that is one of my email addresses.)
 
 
Rahul

 
      09-10-2008, 05:12 PM
Rick Jones <(E-Mail Removed)> wrote in news:ga70u6$bhg$1
@usenet01.boi.hp.com:

> I would go with top and then hit "1" to get it to show per-CPU
> (core/whatever) utilization.


Perfect, Rick. The simpler the better. I did just that.

Processor load is around 65%, and only on a single core. So the processor is
not my bottleneck now, right? It could only be the network or the disk.


--
Rahul
 
 
Rick Jones

 
      09-10-2008, 07:32 PM
In comp.os.linux.networking Rahul <(E-Mail Removed)> wrote:
> Rick Jones <(E-Mail Removed)> wrote in news:ga6cam$uj8$4
> @usenet01.boi.hp.com:


> > You want to consider both getting even more primitive and running
> > netperf TCP_STREAM and also taking packet traces of your scp
> > transfer(s).


> I did run netperf. Here are some output snippets. To me it seems performance
> has been insensitive to choice of mode so far. It is not even relevant if
> I have one ethernet card or two. I cannot rationalize this. Any leads?


The netperf TCP_RR test is a "ping-pong" test rather like the ping
utility with no think/pause time at all. It is measuring latency - or
rather the inverse in transactions per second.
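
To make "the inverse" concrete, here is a small Python sketch that converts the transaction rates reported in this thread into an average time per request/response round trip. The helper name is made up for illustration.

```python
def round_trip_usec(transactions_per_sec):
    """Average time per request/response transaction, in microseconds."""
    return 1_000_000 / transactions_per_sec

# Transaction rates taken from the three TCP_RR runs posted in this thread.
rates = {"mode=4": 7023.38, "mode=6": 7600.67, "one link only": 6952.47}
for label, rate in rates.items():
    print(f"{label}: {round_trip_usec(rate):.1f} usec per transaction")
```

All three come out somewhere around 130 to 145 microseconds. With one 1-byte transaction in flight at a time, a second link has nothing to carry, which is why the bonding mode makes so little difference to this particular test.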

> Or am I running the wrong netperf tests? IMO the only explanation is that
> netperf has not generated enough traffic to saturate both my links.


I think you should run TCP_STREAM tests:

netperf -t TCP_STREAM -H 10.0.0.100 -c -C -- -s 1M -S 1M -m 64K

is my current favorite bulk throughput test. The -c/-C will include
CPU util, the -s/-S will set large socket buffers (and will perhaps be
clipped by the stack, which will become clear in the output) and then
push 64KB of data into the socket at one time.
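
For readers without netperf at hand, the shape of a bulk throughput test can be sketched in a few lines of Python: one side pushes fixed-size writes from memory for a set time, the other side reads and discards, and the byte count over elapsed time gives the rate. This is a crude stand-in under stated assumptions, not netperf's implementation; the function names and defaults are invented for the example.

```python
import socket
import time

def send_for(sock, seconds, msg_size=64 * 1024):
    """Push msg_size-byte writes into sock for about `seconds`; return bytes sent."""
    payload = b"\0" * msg_size
    sent = 0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        sock.sendall(payload)
        sent += msg_size
    # Half-close so the receiver sees end-of-stream.
    sock.shutdown(socket.SHUT_WR)
    return sent

def drain(sock, bufsize=64 * 1024):
    """Read and discard until the peer closes; return (bytes, elapsed_seconds)."""
    received = 0
    start = time.monotonic()
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break
        received += len(chunk)
    return received, time.monotonic() - start

def mbits_per_sec(nbytes, seconds):
    """Convert a byte count over a duration to 10^6 bits per second."""
    return nbytes * 8 / seconds / 1e6
```

Because the data never touches a disk and is never encrypted, a result from something like this (or, better, from netperf itself) isolates the network path from the other suspects in the scp test.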


> [root@node01 scratch]# /opt/netperf2/bin/netperf -t TCP_RR -H 10.0.0.100
> TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 10.0.0.100 (10.0.0.100) port 0 AF_INET
> Local /Remote
> Socket Size   Request  Resp.   Elapsed  Trans.
> Send   Recv   Size     Size    Time     Rate
> bytes  Bytes  bytes    bytes   secs.    per sec


> *********************************************************
> mode=4
> 16384  87380  1        1       10.00    7023.38
> 16384  87380


> *********************************************************
> mode=6


> 16384  87380  1        1       10.00    7600.67
> 16384  87380
> *********************************************************
> only 1 eth up. Other disabled via switch simulating a card failure.


> 16384  87380  1        1       10.00    6952.47
> 16384  87380


Looks like you have NICs and drivers which favor bulk throughput over
minimizing latency:

ftp://ftp.cup.hp.com/dist/networking...cy_vs_tput.txt

rick jones
--
firebug n, the idiot who tosses a lit cigarette out his car window
these opinions are mine, all mine; HP might not want them anyway...
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
 