TCP sends bigger data faster than small data

Discussion in 'Linux Networking' started by Stephan Absmeier, Sep 29, 2003.

  1. Hi,

    I did:


    typedef long long s64;

    // read the CPU's time stamp counter (a cycle count) into a 64-bit value
    inline s64 getRealTime() {
        s64 result;
        __asm__ __volatile__ ("rdtsc" : "=A" (result));
        return result;
    }

    s64 start64, end64;
    tcpSender testSender2 = tcpSender(argv[1], line, length);
    testSender2.init(option);          // SO_LINGER, bind, connect
    start64 = getRealTime();
    testSender2.work();                // send()
    closeid = testSender2.end();       // close(socket)
    end64 = getRealTime();

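    Converting the two samples to seconds needs the CPU clock rate, since rdtsc
    counts cycles; a minimal sketch, with CPU_HZ as a placeholder for the real
    frequency:

    const double CPU_HZ = 1.0e9;                 // placeholder: the actual CPU clock in Hz
    s64 cycles = end64 - start64;                // elapsed cycles
    double seconds = (double)cycles / CPU_HZ;    // cycles -> seconds
    cout << "elapsed: " << seconds << " s" << endl;
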

    I measured the time; I did 10 rounds to get the average value. I use Red
    Hat 9.1, g++ 2.96, the data is randomly generated, 100 Mbit Ethernet,
    switched. I used SO_LINGER to wait until the queue has been sent completely.

    I send 8000, 10240 and 80000 bytes over the network.
    I did this several times because I couldn't believe it, but it was the
    same every time:
    it sent 8000 bytes in 32 ms, 10240 bytes in 40.8 ms and
    80000 bytes in 10.5 ms.

    Why is it faster to send 80000 bytes than fewer bytes? Who can explain
    this? Any idea is welcome.


    Thanks
    Stephan
     
    Stephan Absmeier, Sep 29, 2003
    #1

  2. I think your timing method is flawed. Are you calculating and
    printing the 64-bit TSC diff correctly, or maybe rdtsc wraps?
    Some libs use %lld and some use %Ld to print a signed 64-bit
    value. Why not use clock() instead? Plenty of precision for what
    you are measuring.

    --gv
     
    Gisle Vanem, Sep 29, 2003
    #2

  3. I changed s64 to clock_t and getRealTime() to clock(), and it always
    returned 0, and one time 1000.
    Did I do something wrong?
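    (Note: clock() measures CPU time used by the process, not wall-clock time,
    so a sender that spends most of its time blocked in send()/close() reports
    almost nothing. A wall-clock measurement with gettimeofday would look
    roughly like this minimal sketch:)

    timeval t0, t1;
    gettimeofday(&t0, NULL);                    // wall-clock start
    /* ... send the data ... */
    gettimeofday(&t1, NULL);                    // wall-clock end
    double secs = (t1.tv_sec - t0.tv_sec) + 1.e-6 * (t1.tv_usec - t0.tv_usec);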
     
    Stephan Absmeier, Sep 29, 2003
    #3
  4. I tried gettimeofday and again: sending 8000 bytes needs 36 to 38
    ms, 10240 bytes needs 37 to 38 ms and 80000 bytes needs 8 to 47 ms, 14.9
    ms on average. But 800000 bytes needs 68.6 to 85 ms.

    double after;
    timeval end;
    gettimeofday(&end,NULL);                  // wall-clock time stamp
    after=end.tv_sec+1.e-6*end.tv_usec;       // seconds as a double


    The result is like the result of getRealTime:
    8 out of 10 runs transferring 80000 bytes are faster than 10 ms and
    therefore faster than all the runs sending 8000 or 10240 bytes.

    Does anybody have an idea?
    Stephan
     
    Stephan Absmeier, Sep 30, 2003
    #4
  5. I did some more tests; the times are in seconds:

    Bytes fastest run average run slowest run
    8000 0.031323 0.0315113 0.03211
    10240 0.040979 0.0410652 0.041146
    20000 0.040621 0.0407288 0.040818
    30000 0.040002 0.0403257 0.040455
    35000 0.003506 0.0035568 0.003608
    40000 0.003849 0.0289824 0.039732
    50000 0.004638 0.0046604 0.004724
    80000 0.007648 0.0077132 0.007806
    100000 0.008806 0.0097092 0.016487
    200000 0.01764 0.0208155 0.048285
    300000 0.02585 0.029469 0.058012
    350000 0.030264 0.0304887 0.030698
    400000 0.034348 0.041239 0.067274
    500000 0.042902 0.0431682 0.043544
    800000 0.068526 0.0735361 0.106607

    It starts being faster than smaller amounts somewhere between 30000 and
    35000 bytes and stops at about 350000 bytes. At 100 Mbit/s, 35000 bytes
    need a minimum time of 0.0028 seconds (35000 bytes * 8 bits / 100e6 bit/s,
    without overhead).
     
    Stephan Absmeier, Sep 30, 2003
    #5
  6. That's good. Can you do some more graphing, please (and no tabs!!!
    Fixing).
    Yes - this sort of accords with a vague idea I have that over a small
    number of fragments the NIC will wait to see if there are any more
    incoming before telling the system it has something.
    What I don't get is how you can send 800 KB in 0.07 seconds. That's
    11.2 MB/s. Oh well, 100BT. Yes, some NICs probably do the wait-and-see
    trick.

    Peter
     
    Peter T. Breuer, Sep 30, 2003
    #6
  7. That's one of my problems. That's not how TCP should act, because of
    slow start.
    This was all done with the default settings after installation. Turning off
    the Nagle algorithm (TCP_NODELAY, see the sketch below) speeds it up a
    little more. The fastest was 0.06831 seconds for 800000 bytes. That means
    93.6 Mbit/s!! If I ping the other host, it takes 0.00014 to 0.00015 seconds.
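    A minimal sketch of switching Nagle off on the connected socket (mySocket
    as in my code):

    #include <sys/socket.h>     /* setsockopt */
    #include <netinet/in.h>     /* IPPROTO_TCP */
    #include <netinet/tcp.h>    /* TCP_NODELAY */
    #include <cstdio>           /* perror */

    int nodelay = 1;            /* 1 = turn the Nagle algorithm off */
    if (setsockopt(mySocket, IPPROTO_TCP, TCP_NODELAY,
                   (char*)&nodelay, sizeof(nodelay)) < 0)
        perror("setsockopt(TCP_NODELAY)");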

    I'm going nuts over this.
    Stephan
     
    Stephan Absmeier, Sep 30, 2003
    #7
  8. Can you please do that? The graphing, I mean. We need more data points.
    And indicate how you are doing these tests, so that the results become
    reproducible (and meaningful!).
    Eh? What I suggest is just due to the way some 100BT NICs behave.
    What I suggest has nothing to do with the OS.

    Please perform the further experiments to confirm or deny. Detail your
    measurement mode. To make valid measurements you will need to let the
    NIC quiesce after each "packet". At least 1s delay.

    You should then remeasure on a continuous stream, taking the average
    speed on the stream. If I am right, all should converge there.
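    Something like this minimal sender would do for the streaming measurement
    (a sketch only: the port and the chunk size are placeholders, and the
    receiver just has to read and discard everything):

    // stream.cpp - connect once, send the same buffer many times, report the
    // average rate over the whole stream
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/time.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstring>

    int main(int argc, char *argv[])
    {
        if (argc != 2) { fprintf(stderr, "usage: stream <server-ip>\n"); return 1; }

        int sock = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in srv;
        memset(&srv, 0, sizeof(srv));
        srv.sin_family = AF_INET;
        srv.sin_port = htons(5001);                 /* placeholder port */
        srv.sin_addr.s_addr = inet_addr(argv[1]);
        if (connect(sock, (sockaddr *)&srv, sizeof(srv)) < 0) { perror("connect"); return 1; }

        static char buf[32768];                     /* arbitrary chunk size */
        const int rounds = 1000;                    /* ~32 MB in total */

        timeval t0, t1;
        gettimeofday(&t0, NULL);
        for (int i = 0; i < rounds; i++)
            if (send(sock, buf, sizeof(buf), 0) < 0) { perror("send"); return 1; }
        gettimeofday(&t1, NULL);
        close(sock);

        double secs = (t1.tv_sec - t0.tv_sec) + 1.e-6 * (t1.tv_usec - t0.tv_usec);
        double mbit = 8.0 * sizeof(buf) * rounds / secs / 1.e6;
        printf("%.0f bytes in %f s = %.1f Mbit/s\n", (double)sizeof(buf) * rounds, secs, mbit);
        return 0;
    }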


    Peter
     
    Peter T. Breuer, Sep 30, 2003
    #8
  9. Also, don't always start with the smaller files and work your way up
    to bigger ones - there are probably some startup costs associated with
    the first transfer. ARP comes to mind - are you sure that both hosts
    have the other's MAC addrs in their ARP caches before you start
    testing?

    --KW :cool:
     
    Keith Wansbrough, Sep 30, 2003
    #9
  10. I don't have root rights. How should I quiesce the card?
    There is no packet delay module installed.
    Hi,

    this time upside down: I started with 800000 bytes and went down to
    8000 bytes:

    Bytes fastest run average run slowest run
    800000 0.068346 0.0688138 0.069734
    500000 0.042749 0.046447 0.073162
    400000 0.034251 0.0346192 0.035127
    350000 0.030081 0.0334658 0.062391
    300000 0.026015 0.0261624 0.026631
    200000 0.017581 0.0177638 0.017923
    100000 0.009186 0.0092719 0.009383
    80000 0.007103 0.0109007 0.0411
    50000 0.0048 0.0048566 0.005025
    40000 0.003765 0.0258131 0.040525
    35000 0.003641 0.0036926 0.003859
    30000 0.039624 0.039726 0.039826
    20000 0.038596 0.0392895 0.039463
    10240 0.037367 0.0401245 0.049999
    8000 0.03831 0.0386388 0.039061

    Nearly the same values as before, but exactly the same picture.

    It is done in a loop. There is a 1 second break between the
    runs and a 5 second break between the different sizes. The
    socket has been closed after each run, because of the time
    measurement method.

    ARP can be a factor on the first run, but not later: I had a
    look at /proc/net/arp, it wasn't there, I pinged the other
    host, it was there; three minutes later the MAC address was
    still there.

    Stephan
     
    Stephan Absmeier, Sep 30, 2003
    #10
  11. Eh? Just sleep for a while. 1s between sends.
    Eh? How are you performing your measurements if you cannot control
    when your packet is sent? I presumed you were just doing a write to a
    socket, and timing how long it takes!
    Useless, unless you tell us what your experimental procedure is!

    But we (I) asked for more details. What you show is not fine grained
    enough to give us an idea of what could be happening.
    Runs? What is a "run"?
    Well, I would be happier if you did not close it, but it makes no
    difference at the packet level.

    Now confirm that streaming continuously gives you the same speeds for
    all, and you have the answer. The NIC is waiting for more on
    the small stuff.


    Peter
     
    Peter T. Breuer, Sep 30, 2003
    #11
  12. See the code at the end; should I repost my first post, or
    what else are you interested in?
    The data is sent in one go.
    It's also so that it starts again with slow start.
    How big should this file be? 1 MB, 10 MB or more?
    This is the version using gettimeofday; I deleted all the
    getRealTime stuff to reduce the length and tried to
    translate it to English. Hopefully you can see what I did. All
    the tests were made with option 0 so far. The receiver puts
    the data in a buffer and dumps it, so far.

    Hope it's not too long to post.

    Thanks
    Stephan


    #include "abs_net.h" //constants like TCP_SERVER_PORT, KB...
    #include <unistd.h>
    #include <sys/time.h>
    #include <netinet/tcp.h>
    #include <sstream>
    #include <stdlib.h>
    #include <iostream>
    using namespace std;


    class tcpSender
    {
    private:
    int length;
    char *line;
    int mySocket, protokollAddr;
    struct sockaddr_in serverAddr, clientAddr;
    struct hostent *host;
    int i,paketlength;
    int window_size, mss, priority;

    public:
    tcpSender(const char *server,char *lineIn,int lengthIn)
    {
    length=lengthIn;
    line=lineIn;
    /* get server IP address (no check whether the input is an IP
    address or a DNS name) */
    host = gethostbyname(server);

    if(host==NULL)
    {
    cout<<"tcpSender: unknown host "<<server<<endl;
    exit(1);
    }

    }

    void init(int option)
    {
    serverAddr.sin_family = host->h_addrtype;
    memcpy((char *)
    &serverAddr.sin_addr.s_addr,host->h_addr_list[0],
    host->h_length);
    serverAddr.sin_port = htons(TCP_SERVER_PORT);


    /* socket creation */
    mySocket = socket(AF_INET,SOCK_STREAM,0);
    if(mySocket<0)
    {
    cout<<"tcpSender: cannot open socket"<<endl;
    exit(1);
    }

    /* wait for all data being sent; needed for the time measurement */
    linger lingerwert;
    lingerwert.l_onoff=1;
    lingerwert.l_linger=32767;

    setsockopt(mySocket,SOL_SOCKET,SO_LINGER,(char*)&lingerwert,sizeof(lingerwert));

    /*
    TCP options (bit flags, added together):
    8 buffer
    4 priority
    2 nagle off
    1 mss
    0 none
    15 for all
    */
    if (option>=8)
    {
    /* set SO_RCVBUF and SO_SNDBUF to 128*1024 Bytes */
    window_size=KB*1024;
    setsockopt(mySocket,SOL_SOCKET,SO_RCVBUF,(char*)&window_size,sizeof(window_size));
    setsockopt(mySocket,SOL_SOCKET,SO_SNDBUF,(char*)&window_size,sizeof(window_size));
    option -=8;
    }
    if (option >=4)
    {
    /* Priority of the Queue */
    priority=PRIORITAET;
    setsockopt(mySocket,SOL_SOCKET,SO_PRIORITY,(char*)&priority,sizeof(priority));
    option -= 4;
    }
    if (option >= 2)
    {
    /* Nagle algorithm off (needs a pointer to an int flag) */
    int nodelay = 1;
    setsockopt(mySocket, SOL_TCP,
    TCP_NODELAY, (char*)&nodelay, sizeof(nodelay));
    option -= 2;
    }
    if (option >= 1)
    {
    /* try another MSS */
    mss=MSS;
    setsockopt(mySocket, SOL_TCP,
    TCP_MAXSEG,(char*)&mss,sizeof(mss));
    option -= 1;
    }

    /* bind any port */
    clientAddr.sin_family = AF_INET;
    clientAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    clientAddr.sin_port = htons(0);

    protokollAddr = bind(mySocket, (struct sockaddr *)
    &clientAddr, sizeof(clientAddr));
    if(protokollAddr<0)
    {
    cout<<"tcpSender: cannot bind TCP port
    "<<clientAddr.sin_port<<endl;
    exit(1);
    }

    /* connect to server */
    protokollAddr = connect(mySocket, (struct sockaddr *)
    &serverAddr, sizeof(serverAddr));
    if(protokollAddr<0)
    {
    cout<<"tcpSender: cannot connect to Socket"<<endl;
    exit(1);
    }

    }

    void work()
    {
    protokollAddr = send(mySocket, line, length, 0);
    if(protokollAddr<0)
    {
    cout<<"tcpSender: cannot send data"<<endl;
    close(mySocket);
    exit(1);
    }

    }

    int end()
    {
    return close(mySocket);
    }

    };

    int main(int argc, char *argv[])
    {
    int length, numberOfRuns,option, closeid;
    timeval start, end;
    /* check command line args */
    if(argc!=4)
    {
    cout<<"usage: tcpSender <server> number_of_runs
    option"<<endl;
    cout<<"example: tcpSender blowfish 10 15"<<endl;
    cout<<"options :"<<endl;
    cout<<"8 sendingbuffer "<<KB<<"kb"<<endl;
    cout<<"4 priority "<<PRIORITAET<<endl;
    cout<<"2 Nagle algorithm off"<<endl;
    cout<<"1 mss "<<MSS<<endl;
    cout<<"0 default"<<endl;
    exit(1);
    }

    numberOfRuns=atoi(argv[2]);
    double before,after,duration[numberOfRuns],complete,min,max;
    option=atoi(argv[3]);
    char line[800000];
    // fill the array with random letters from 'a' to 'p'
    for (int aa=0;aa<800000;aa++)
    {
    line[aa] = 'a' + (rand() % 16);
    }
    // try different lengths, from the largest down to the smallest
    const int lengths[15] = {800000, 500000, 400000, 350000, 300000,
    200000, 100000, 80000, 50000, 40000,
    35000, 30000, 20000, 10240, 8000};
    for(int testrun=0;testrun<15;testrun++)
    {
    length = lengths[testrun];
    complete=min=max=0.0;
    cout<<"option: "<<option<<" data size: "<<length<<endl;
    cout<<endl;
    //send data, measure time
    for (int test=0;test<numberOfRuns;)
    {
    tcpSender testSender=tcpSender(argv[1],line,length);
    testSender.init(option);
    gettimeofday(&start,NULL);
    testSender.work();
    closeid=testSender.end();
    gettimeofday(&end,NULL);

    before=start.tv_sec+1.e-6*start.tv_usec;
    after=end.tv_sec+1.e-6*end.tv_usec;
    duration[test]=after-before;
    cout<<"time for run "<<test<<" : "<<duration[test]<<"
    sec"<<endl;
    complete +=duration[test];
    if (min==0)
    min=duration[test];
    else
    if (duration[test]<min)
    min=duration[test];
    if (max<duration[test])
    max=duration[test];

    test++;
    //break before next run, same length
    sleep(1);
    }

    cout<<endl;
    cout<<"average time: "<<complete/numberOfRuns<<" sec "<<endl;
    cout<<"fastest run: "<<min<<" sec"<<endl;
    cout<<"slowest run: "<<max<<" sec"<<endl;
    cout<<endl;
    cout<<endl;
    //break before new length
    sleep(5);
    }
    return 0;


    }
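    For reference, it is compiled and run roughly like this (assuming abs_net.h
    provides TCP_SERVER_PORT, KB, PRIORITAET and MSS, and a matching receiver
    is already listening on the server):

    g++ -o tcpSender tcpSender.cpp
    ./tcpSender blowfish 10 0    # 10 runs per size, option 0 = default settings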
     
    Stephan Absmeier, Sep 30, 2003
    #12
  13. Some NIC/driver combos (particularly Gigabit, but perhaps some 100BT)
    may not have the "interrupt avoidance" or "coalescing" parms set
    terribly well for anything other than large bulk transfers. Such
    situations can often be uncovered with a single-byte netperf TCP_RR
    test:

    $ netperf -t TCP_RR -H <remote> -l <time> -- -r 1

    rick jones
     
    Rick Jones, Sep 30, 2003
    #13
  14. I have no idea what this is telling me. Can anybody help
    and explain it to me? Thanks.
    To me it looks like it is OK. I did the first one with -r
    80000 and the Trans Rate per sec was 56.37. I did the second
    (-l 10) with -r 80000. Result: 71.34



    ./netperf -t TCP_RR -H 192.169.1.110 -l 1 -- -r 1

    TCP REQUEST/RESPONSE TEST to 192.169.1.110
    Local /Remote
    Socket Size Request Resp. Elapsed Trans.
    Send Recv Size Size Time Rate
    bytes Bytes bytes bytes secs. per sec

    16384 87380 1 1 0.99 9974.64
    16384 87380



    ./netperf -t TCP_RR -H 192.169.1.110 -l 10 -- -r 1

    TCP REQUEST/RESPONSE TEST to 192.169.1.110
    Local /Remote
    Socket Size Request Resp. Elapsed Trans.
    Send Recv Size Size Time Rate
    bytes Bytes bytes bytes secs. per sec

    16384 87380 1 1 9.99 9881.40
    16384 87380



    ./netperf -t TCP_RR -H 192.169.1.110 -l 100 -- -r 1

    TCP REQUEST/RESPONSE TEST to 192.169.1.110
    Local /Remote
    Socket Size Request Resp. Elapsed Trans.
    Send Recv Size Size Time Rate
    bytes Bytes bytes bytes secs. per sec

    16384 87380 1 1 100.00 9899.73
    16384 87380



    ./netperf -t TCP_STREAM -H 192.169.1.110 -l 1 -- -r 1

    TCP STREAM TEST to 192.169.1.110
    Recv Send Send
    Socket Socket Message Elapsed
    Size Size Size Time Throughput
    bytes bytes bytes secs. 10^6bits/sec

    87380 16384 16384 1.00 93.76



    ./netperf -t TCP_STREAM -H 192.169.1.110 -- -r 1

    TCP STREAM TEST to 192.169.1.110
    Recv Send Send
    Socket Socket Message Elapsed
    Size Size Size Time Throughput
    bytes bytes bytes secs. 10^6bits/sec

    87380 16384 16384 10.00 93.85
     
    Stephan Absmeier, Sep 30, 2003
    #14
  15. Is there a possibility to change these settings? If so, how?
     
    Stephan Absmeier, Oct 1, 2003
    #15
  16. I did some more tests again; see at the bottom.
    "average" means the average out of ten runs. It's all in seconds.
    I found the threshold is 31856. What's up with that number?
    The MSS of the network is 1450 bytes, but there are 1448
    bytes of payload in each TCP packet. That's what tcpdump shows me. So
    the threshold is 22 TCP packets (22 * 1448 = 31856).
    Who knows anything about that? Peter thought it has to do
    with the 100BT card. I did the same test with UDP and a
    reliable layer. It was about 0.019 seconds for 8000 bytes.
    Is it the card? Or does it have to do with the TCP stack or the TCP
    implementation of Red Hat Linux 9?

    Any idea will be good.

    Thanks
    Stephan


    data fastest average slowest
    ================================================
    1000 0.031766 0.0323698 0.037186
    2500 0.031419 0.0315112 0.031607
    5000 0.031157 0.0312168 0.031319
    8000 0.040823 0.0409172 0.041005
    10000 0.040437 0.0405977 0.040693
    10240 0.040190 0.0402849 0.040364
    15000 0.039587 0.0399397 0.040098
    20000 0.039550 0.0396485 0.039733
    25000 0.039024 0.0393186 0.039426
    30000 0.038928 0.0390251 0.0391
    31000 0.037754 0.0386034 0.038794
    31854 0.038263 0.0383824 0.03847
    31855 0.038017 0.0380871 0.038178
    31856 0.003377 0.003417 0.003431
    31857 0.003370 0.0067964 0.03724
    32000 0.003378 0.0034376 0.003461
    33000 0.003410 0.0067806 0.036802
    34000 0.003437 0.0034616 0.003476
    35000 0.003419 0.0034721 0.003499
    40000 0.004366 0.0107605 0.036063
    45000 0.004483 0.0045104 0.00455
    50000 0.035283 0.0353607 0.03545
    55000 0.005510 0.005675 0.005775
    60000 0.005591 0.0057249 0.005954
    65000 0.005879 0.0100547 0.044415
    70000 0.006354 0.0067697 0.006989
    75000 0.006704 0.0105618 0.043935
    80000 0.007581 0.0149989 0.043544
    85000 0.007643 0.0078219 0.008091
    90000 0.008611 0.0124246 0.042909
    95000 0.008654 0.0087931 0.009042
    100000 0.008868 0.0091854 0.009645
    150000 0.013124 0.0176344 0.052137
    200000 0.017541 0.0179289 0.018181
    250000 0.021830 0.026077 0.061378
    300000 0.025925 0.0262731 0.026594
    350000 0.030249 0.0366534 0.060875
    400000 0.034372 0.0346935 0.035909
    500000 0.042992 0.0438456 0.048419
    800000 0.068332 0.0692489 0.074687
     
    Stephan Absmeier, Oct 2, 2003
    #16
  17. What is a "run"? Please be precise. We can't tell you what your figures
    mean unless you tell us that.
    It's twice 15928.
    11 * 1448 = 15928

    Well, looks as though you buffer 22 packets/fragments on the card
    before coughing.
    That's max speed for 100BT.

    It looks to me as though it all depends on your experimental procedure.
    I asked you to measure the streaming speeds too, to give us a
    comparison, but since you don't say in what way your procedure differs
    from streaming, nobody can tell you much!

    Peter
     
    Peter T. Breuer, Oct 2, 2003
    #17
  18. I send every data size, 8000 bytes for example, 10
    times, so I have 10 runs. Do you see what I mean by "run"?
    In short: one run is opening the socket, sending for example
    8000 bytes, and closing the socket, including the time
    measurement. Repeating this 10 times gives me 10
    runs. So I get a fastest time, an average time and a slowest
    time.

    What's special about this number?
    I see. Can I get this from the hardware manual?
    Yes, I saw it. I also tried:

    ./netperf -t TCP_STREAM -H 192.169.1.110 -l 1 -- -r 1

    TCP STREAM TEST to 192.169.1.110
    Recv Send Send
    Socket Socket Message Elapsed
    Size Size Size Time Throughput
    bytes bytes bytes secs. 10^6bits/sec

    87380 16384 16384 1.00 93.76

    I don't quite get your point. I posted this 2 days ago and thought
    that's what you need:

    ./netperf -t TCP_STREAM -H 192.169.1.110 -l 1 -- -r 1

    TCP STREAM TEST to 192.169.1.110
    Recv Send Send
    Socket Socket Message Elapsed
    Size Size Size Time Throughput
    bytes bytes bytes secs. 10^6bits/sec

    87380 16384 16384 1.00 93.76



    ./netperf -t TCP_STREAM -H 192.169.1.110 -- -r 1

    TCP STREAM TEST to 192.169.1.110
    Recv Send Send
    Socket Socket Message Elapsed
    Size Size Size Time Throughput
    bytes bytes bytes secs. 10^6bits/sec

    87380 16384 16384 10.00 93.85


    But if not, please tell me what you need exactly. Thanks

    mySocket = socket(AF_INET,SOCK_STREAM,0);
    linger lingerwert;
    lingerwert.l_onoff=1;
    lingerwert.l_linger=32767;
    setsockopt(mySocket,SOL_SOCKET,SO_LINGER,
    (char*)&lingerwert,sizeof(lingerwert));
    clientAddr.sin_family = AF_INET;
    clientAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    clientAddr.sin_port = htons(0);
    bind(mySocket, (struct sockaddr *) &clientAddr,
    sizeof(clientAddr));
    connect(mySocket, (struct sockaddr *) &serverAddr,
    sizeof(serverAddr));
    gettimeofday(&start,NULL);
    // line is a char[length], filled with chars using random
    send(mySocket, line, length, 0);
    close(mySocket);
    gettimeofday(&end,NULL);

    I did it in a class using the public functions in main, but
    it is shorter this way. SO_LINGER is used for the time measurement.
    It's TCP streaming, isn't it?
    This is without the stuff for time measurement and for generating the
    array. If you want the program, I can mail it to you. I'd have
    to translate the names to English, but that's no problem. It's
    about 9.5 KB at the moment; I think that's too long to post,
    isn't it?

    I also did:

    ./netperf -t TCP_RR -H 192.169.1.110 -l 1 -- -r 1

    TCP REQUEST/RESPONSE TEST to 192.169.1.110
    Local /Remote
    Socket Size Request Resp. Elapsed Trans.
    Send Recv Size Size Time Rate
    bytes Bytes bytes bytes secs. per sec

    16384 87380 1 1 0.99 9974.64
    16384 87380


    I hope you can use this stuff.


    Thanks for help and spending your time.

    Stephan
     
    Stephan Absmeier, Oct 2, 2003
    #18
  19. Hi all,

    nobody answered my last posting, so I thought I'd try it
    one more time.

    the situation (in pseudo code):

    open_socket
    setoption(SO_LINGER)
    bind_socket
    generate_a_char[],filled with letters
    take_the_starting_time
    send_data
    close_socket
    take_the_ending_time
    calculate_duration_of_sending

    I called this procedure a "run" in my former postings. I
    repeated this procedure 10 times for each packet size, to
    average the duration.

    I used a 100 Mbit BaseT network, switched by a 3Com
    SuperStack 4400. Hosts A and B use Intel 82801BD PRO/100VE
    chipsets (e100 driver) running Red Hat Linux 9.0, and hosts C
    and D use 3Com 3C905C chipsets (3c59x driver) running Red
    Hat 7.3. I used tcpdump to figure out that the MSS is 1460
    bytes, but every TCP packet had a payload of 1448 bytes of
    data. I tested 4 combinations of communication for sending:
    1 : A -> D
    2 : A -> B
    3 : C -> D
    4 : C -> B

    This is what I have figured out so far:

    1) sending 3*1448, 4*1448, 5*1448, 6*1448, 7*1448,
    8*1448, 9*1448, 11*1448, 13*1448, 14*1448, 15*1448 and
    16*1448 bytes is done in 3 to 4 milliseconds (ms) for each
    sending, but when sending only 1 byte less or more, the duration
    is about 30 to 40 ms.
    2) sending 12*1448 bytes was faster than 2 ms for combinations
    1, 2 and 4
    3) starting with 17*1448 for combinations 1 and 3: values
    that are equal or bigger (up to 33305 bytes, where I
    stopped for this test) are sent faster than 8 ms
    4) for combinations 2 and 4 the threshold is 22*1448, so
    values that are equal or bigger are sent faster than 8 ms
    5) sending 10*1448 was really bad for all combinations
    (1, 2, 3, 4): 32 to 37 ms.

    I hope you can give me some hints. Is there a possibility to
    flush the buffer? Maybe by changing the driver? In the
    application I have to modify, all the data is less than 10
    KB, but there are different network interfaces.

    You asked for a streaming test. I think this is it:

    netperf -t TCP_STREAM -H 192.169.1.110 -- -r 1

    TCP STREAM TEST to 192.169.1.110
    Recv Send Send
    Socket Socket Message Elapsed
    Size Size Size Time Throughput
    bytes bytes bytes secs. 10^6bits/sec

    87380 16384 16384 10.00 93.85


    I did one more test. I changed the maximum segment size
    using setsockopt and TCP_MAXSEG to 1280 bytes. But 10240
    bytes (= 8 * 1280) isn't faster and the other sizes get worse.

    I can use any hint, any idea.


    please :-((((((((

    Stephan
     
    Stephan Absmeier, Oct 13, 2003
    #19
  20. Stephan Absmeier () wrote:
    : [snip]

    AFAIK, if the data you send is not an integral multiple of the segment
    size, then TCP is supposed to send the last segment only after
    getting acknowledgement for all previous segments. This may explain
    a little delay, but by itself it cannot explain the large delays you see.

    I suspect that some other activity takes place on one of your
    machines. If your program can do all its work in one turn, then you get
    a low time; if you give up the processor for even a short time, then you
    have to wait to regain control. You may try to put a busy loop for a
    few ms between send and close -- it may matter.
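    A busy loop of that kind could look like this (a sketch; it spins on
    gettimeofday instead of sleeping, so the process never yields the CPU):

    #include <sys/time.h>

    /* spin for roughly 'ms' milliseconds without giving up the CPU */
    static void busy_wait_ms(int ms)
    {
        timeval t0, now;
        gettimeofday(&t0, NULL);
        do {
            gettimeofday(&now, NULL);
        } while ((now.tv_sec - t0.tv_sec) * 1000.0
                 + (now.tv_usec - t0.tv_usec) / 1000.0 < ms);
    }

    /* usage: send(mySocket, line, length, 0); busy_wait_ms(5); close(mySocket); */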

    I wonder why you are not using UDP. TCP was designed for throughput,
    not low latency.
     
    Waldek Hebisch, Oct 28, 2003
    #20
