General Schvantzkoph <(E-Mail Removed)> wrote:
> What I want to do is be able to generate traffic at wire speed. I'm
> designing a piece of hardware that has a couple of 10G ports on it
> and we need to be able to test it under maximum load.
Assuming your piece of hardware has sufficient CPU "oomph" (I'm
assuming it runs Linux given the news group) then netperf should be
able to generate traffic at wire speed.
Some, but probably not all, of the factors involved:
*) CPU "oomph"
*) NIC driver model efficiency/driver path length
*) Stateless offloads in the NIC - both on transmit and receive side
*) Width of path to the NIC (PCIe Gen1/Gen2 x4/x8 etc)
*) Use of a 9000 byte (JumboFrame) vs 1500 byte MTU
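To put a rough number on the MTU item above, here is a back-of-the-envelope sketch of how many frames per second the host has to push to fill a 10G link. The function name and the 38 bytes of per-frame overhead (8 preamble/SFD + 14 Ethernet header + 4 FCS + 12 inter-frame gap, no VLAN tag) are my assumptions for illustration, not something netperf reports.

```python
# Rough frames-per-second needed to saturate an Ethernet link.
# Assumption: untagged frames, so each frame costs 38 bytes on the wire
# beyond its payload (preamble/SFD 8 + header 14 + FCS 4 + inter-frame gap 12).
PER_FRAME_OVERHEAD_BYTES = 8 + 14 + 4 + 12


def frames_per_second(payload_bytes, link_bps=10e9):
    """Frames per second at wire speed for a given L2 payload size (e.g. the MTU)."""
    wire_bits = (payload_bytes + PER_FRAME_OVERHEAD_BYTES) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {frames_per_second(mtu):,.0f} frames/s")
```

With those assumptions a 1500 byte MTU works out to roughly 813 thousand frames per second and a 9000 byte MTU to roughly 138 thousand, so JumboFrames cut the per-frame work by about a factor of six.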
When you get to the point of running, joining
netperf-(E-Mail Removed)
might be a good idea.
rick jones
Some boiler-plate I trot out from time to time:
Looking from the standpoint of the defining IEEE specifications, there
is nothing in 10 Gigabit Ethernet that makes data transmission any
easier on the host than it was for 1 Gigabit Ethernet, or for that
matter 100 or 10 Megabit Ethernet. It takes just as many CPU cycles
to send a frame through the interface for all of those.
Now, as time has passed, NIC vendors have learned, or been taught by
system vendors, how to do things *beyond* the IEEE specifications.
In the time of 100 Megabit Ethernet, it became possible to have fewer
than one interrupt per packet.
In the time of 1 Gigabit Ethernet, various interrupt coalescing
schemes took things farther than they went with 100 Megabit Ethernet.
Also, the mass-market NICs started to support ChecKsum Offload (CKO -
something first done by the then major systems vendors with their FDDI
NICs in the early 1990's) and some started supporting maximum frame
sizes of 9000 bytes - what Alteon dubbed "Jumbo Frames" - a name that
has stuck to this day.
Any 10 Gigabit Ethernet NIC worth its silicon will have all those
features, plus support for directing interrupts to multiple cores and
using multiple packet queues. This can spread the work across
multiple cores - but only when there are multiple "flows" (e.g. TCP
connections). The 10 Gigabit Ethernet NICs also provide support for
TCP Segmentation Offload (TSO) and newer ones also include Large
Receive Offload (LRO - sometimes called Transparent Packet
Aggregation).
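A toy sketch of why multiple queues only help when there are multiple flows: the NIC hashes the flow identifiers to pick a queue, so every packet of one connection lands on the same queue (and core). The function name, the addresses and ports, and the use of zlib.crc32 are all stand-ins I picked for illustration; real NICs use their own hash (commonly a Toeplitz hash) over the address/port tuple.

```python
import zlib


def queue_for_flow(src_ip, src_port, dst_ip, dst_port, num_queues):
    """Pick a receive queue from the flow's addresses and ports.

    zlib.crc32 is only a stand-in for whatever hash the NIC really uses.
    """
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


if __name__ == "__main__":
    num_queues = 4
    # One TCP connection: every packet maps to the same queue.
    single = {queue_for_flow("10.0.0.1", 5000, "10.0.0.2", 12865, num_queues)
              for _ in range(1000)}
    # Many connections (different source ports): work spreads across queues.
    many = {queue_for_flow("10.0.0.1", 5000 + i, "10.0.0.2", 12865, num_queues)
            for i in range(64)}
    print("queues used by one flow:", len(single))
    print("queues used by 64 flows:", len(many))
```

So a single netperf stream exercises one queue no matter how many the NIC offers; running several concurrent streams is what lets the other queues and cores contribute.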
Those stateless offloads can very dramatically lower the CPU overhead
of data transfer. However...
Stateless offloads such as CKO, TSO and LRO really only come into play
when the traffic is a "bulk transfer" - when the application(s)
involved send rather more than an MSS's worth of data at one time.
There are many applications which do so, but not all applications do.
CKO, TSO, and LRO do little or nothing for applications making
discrete, small sends - those applications are basically back to 10
Megabit Ethernet days when it comes to how much CPU will be consumed
sending/receiving their traffic through the NIC.
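A crude model of that last point, counting how many times the host has to walk the transmit path for one send() call. Everything here is an illustrative assumption: the function name, an MSS of 1448 bytes, and a 64 KB upper bound on what gets handed to the NIC in one go with TSO. It also ignores any coalescing of small sends by the stack.

```python
import math


def host_transmit_passes(send_bytes, mss=1448, tso=False, tso_max=65536):
    """Approximate trips through the host transmit path for one send.

    Without TSO the host builds one packet per MSS of data; with TSO it
    hands the NIC chunks of up to tso_max bytes and the NIC segments them.
    The mss and tso_max defaults are illustrative, not measured values.
    """
    chunk = tso_max if tso else mss
    return max(1, math.ceil(send_bytes / chunk))


if __name__ == "__main__":
    for size in (128, 1448, 65536):
        print(size,
              "without TSO:", host_transmit_passes(size),
              "with TSO:", host_transmit_passes(size, tso=True))
```

Under those assumptions a 64 KB send drops from 46 passes to 1 with TSO, while a 128 byte send costs one pass either way, which is the "back to 10 Megabit days" situation.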
--
denial, anger, bargaining, depression, acceptance, rebirth...
where do you want to be today?
these opinions are mine, all mine; HP might not want them anyway...

feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...