Real-Time UDP non-blocking sockets in Linux

Discussion in 'Linux Networking' started by Michael Drew, Feb 7, 2004.

  1. Michael Drew

    Michael Drew Guest

    I am working on a research project which is investigating control
    algorithms over wireless networks. I have created an ad-hoc network
    consisting of 2 PCs. PC #1 must send UDP data packets to PC #2 at a
    regular interval (sample rate). These data packets are very small, so
    there is never a chance of over-running the max buffer length.

    I am not a programmer by trade, but I have successfully figured out
    how to implement this arrangement in Gentoo Linux with some basic UDP
    socket programming. The problem is this: Whenever the signal strength
    between the 2 computers starts to degrade the send() function seems to
    block for several milliseconds - though not consistently. This is made
    obvious by disconnecting the wireless card on PC #2 during an
    experiment.

    For example, using a sample period of 5ms, I'll get consistent
    performance with very little jitter (variation between when I want to
    send a packet and when it is actually sent) until the wireless card on
    PC #2 is disconnected. At that point I'll measure latencies of several
    milliseconds. I've traced this latency to the send() function.

    I've declared the socket as non-blocking, and used the MSG_DONTWAIT
    flag in the send() function, but to no avail. I've also tried to use
    the select() function before the send() function to make sure the
    socket is writeable. (This works well for receiving packets without
    blocking.) This doesn't help either. What I need is something that
    will try to send a packet regardless of the channel quality (even if
    the target machine is unreachable).

    I am wondering if the problem may lie in the hardware level. Maybe the
    wireless card causes latencies during re-transmissions. I thought that
    UDP didn't do any retransmission or error correction, but perhaps
    there is some at the hardware level.

    Any thoughts on this would be much appreciated. I am fairly new to
    this stuff, and like I said, I am not a programmer so any response may
    have to be dumbed-down for me. Either way, thanks for taking the time
    to read my rambling post!

    -md
     
    Michael Drew, Feb 7, 2004
    #1
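
Editorial sketch of the setup described in post #1: a periodic, non-blocking UDP sender that times each send call. It is not the poster's code, and the function name, default payload, and port are hypothetical. MSG_DONTWAIT and non-blocking mode only stop the call from sleeping when the socket's send buffer is full; any time the call spends further down the stack (for example in the driver) would still show up in the measured latency.

```python
import socket
import time


def send_periodic(dest, period_s=0.005, count=20, payload=b"sample"):
    """Send `count` small UDP datagrams at a fixed period.

    Returns (latencies, dropped): the time spent inside each sendto()
    call in seconds, and how many datagrams could not be handed over.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)  # same effect as setting O_NONBLOCK
    latencies = []
    dropped = 0
    next_deadline = time.monotonic()
    try:
        for _ in range(count):
            # Sleep until the next sample instant; absolute deadlines
            # keep the period from drifting.
            delay = next_deadline - time.monotonic()
            if delay > 0:
                time.sleep(delay)
            start = time.perf_counter()
            try:
                sock.sendto(payload, socket.MSG_DONTWAIT, dest)
            except BlockingIOError:
                dropped += 1  # send buffer full: skip this sample
            except OSError:
                dropped += 1  # e.g. network unreachable
            latencies.append(time.perf_counter() - start)
            next_deadline += period_s
    finally:
        sock.close()
    return latencies, dropped
```

Calling send_periodic(("10.10.10.50", 5000)) and inspecting max(latencies) with the link up and then down would reproduce the kind of measurement described above; the port number is an assumption.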

  2. Possibly the delay is simply waiting for ARP requests (the destination is
    on the same subnet as the source) to time out. A network sniffer
    (ethereal works well) would be useful for diagnosing this.
     
    Owen Jacobson, Feb 7, 2004
    #2

  3. Michael Drew

    Michael Drew Guest

    I'm not sure what ARP requests are. I assign PC #1 an IP of, say
    10.10.10.51, and PC #2 10.10.10.50. Both have a subnet mask of
    255.255.255.0. This is completely arbitrary though because the system
    is completely stand-alone. Is there something I might try changing in
    this configuration that could identify and/or fix the problem?

    Thanks for your response.
     
    Michael Drew, Feb 7, 2004
    #3
  4. Like the man said, first use a network sniffer. Second, you'll
    need to read up on Ethernet and IP.
     
    Grant Edwards, Feb 7, 2004
    #4
  5. Michael Drew

    Michael Drew Guest

    Thanks Owen, I'll check into the network sniffer. I guess what I was
    trying to get at in my previous response is, what settings in an
    ad-hoc configuration could be causing delay? That is, assuming I see
    something from the network sniffer, what changes can be made at the
    software level that would conceivably affect socket sending latencies?

    Remember, I don't care about the delay in packet transport - just the
    latency/blocking of the send() function at the sending PC. In my
    understanding, the send() function should proceed without delay
    regardless of whether the packet makes it to its destination. The
    delay of the packet across the channel is expected to increase as the
    channel degrades. I expect this - in fact, it is part of the research
    project.

    As for needing to read up on ethernet and IP, thanks for the tip
    Grant. I guess I was presumptuous in assuming someone on this list
    might be able to teach me something about it. :)
     
    Michael Drew, Feb 9, 2004
    #5
  6. Well, if it *is* ARP timeouts that are causing your problem, there are
    ways to set up static ARP mapping. I'm not conversant with the tools for
    that, as I prefer to let as much happen automagically as possible, but the
    arp(8) man page should be a good place to start.
    The gods help those who help themselves. Especially on subjects as
    complex as IP and ethernet.

    A quick summary of ARP:

    Host A (10.1) wants to send data to host B (10.2) on the same network.
    It's never talked to host B, or the last time A talked to B was a long
    time ago. So host A sends an ARP request to host B's IP (10.2) with the
    broadcast ethernet address (ff:ff:ff:ff:ff:ff). If host B is present it
    sends an ARP reply with its ethernet address.
     
    Owen Jacobson, Feb 10, 2004
    #6
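
Editorial sketch of the ARP request summarized in post #6, built as raw bytes so the fields are visible. The addresses reuse the ones from post #3; the source MAC address and the function name are hypothetical. The request goes to the Ethernet broadcast address and leaves the target hardware address empty, because that is the value being asked for.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6  # ff:ff:ff:ff:ff:ff


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Return an Ethernet frame carrying an ARP request for `target_ip`."""
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    eth_header = struct.pack("!6s6sH", BROADCAST_MAC, src_mac, 0x0806)
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # opcode 1 = request (2 = reply)
        src_mac,                      # sender hardware address
        socket.inet_aton(src_ip),     # sender protocol address
        b"\x00" * 6,                  # target hardware address: unknown
        socket.inet_aton(target_ip),  # target protocol address
    )
    return eth_header + arp_body


# Hypothetical MAC address for PC #1.
frame = build_arp_request(bytes.fromhex("020000000001"),
                          "10.10.10.51", "10.10.10.50")
```

A sniffer shows a frame like this as a "who has 10.10.10.50" query, which is what to look for when checking whether ARP is involved in the delay.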
  7. Michael Drew

    Michael Drew Guest

    I see. Thanks for the mini tutorial. I'll definitely look into that
    further. So this process could be causing send() to block? It seems
    like the send function should execute without waiting for any response
    or indication of success from the lower-level processes involved in
    actually sending the packet.

    Anyway. I'll get ethereal this week and check it out. Thanks again.

    -mike
     
    Michael Drew, Feb 10, 2004
    #7
  8. Rick Jones

    Rick Jones Guest

    Typing boldly beyond my knowledge base...

    Speaking in broad generalities and not necessarily linux-specific (or
    even correct perhaps), when you call send(), that will call-into
    socket code (in the kernel), which will call into UDP code, which will
    call into IP code, which will call into the driver.

    If the driver notices that the card does not have a full DMA queue -
    and in this case, perhaps it will find the queue empty? - it will
    start to "tell" the NIC about the packet. That may involve updating
    various control structures on the card, and I suppose it is
    conceivable that if the card is otherwise busy trying to deal with a
    low signal strength issue, it may not respond to say a PIO read with
    as good quickness. I'm not sure that such a card is especially "good"
    but perhaps it is unavoidable.

    Your application would be more or less "stuck" for that length of time.

    I would have thought that PCI/whatever I/O bus timeouts would be
    rather low there - but it would take more knowledge than I have and
    perhaps a bus analysis to see if that were the case.

    I would have thought that if one went into ARP on the way down, and
    found that there was not an IP->MAC mapping for the destination IP,
    that the ARP request would be generated, and _that_ would be sent to
    the driver, perhaps encountering delays as above, but if not, the
    actual datagram being sent would be queued in ARP, and the "stack"
    would unwind back to the caller, his datagram being sent once the ARP
    reply arrived.

    Or I could be entirely out to left-field.

    rick jones
     
    Rick Jones, Feb 12, 2004
    #8
  9. Ah, but sendto() doesn't return until the packet has actually been sent
    (or definitely not sent), barring non-blocking sockets. The ARP
    explanation was an off-the-cuff guess, but some other parts of your
    explanation seem more plausible to me.
     
    Owen Jacobson, Feb 12, 2004
    #9
  10. Rick Jones

    Rick Jones Guest

    Eew, you mean that linux will not have sendto() block for however long
    it might take to perform ARP resolution? Icky. For lack of a more
    technical term. Just what are the ARP timeouts? I would think on the
    order of small numbers of seconds.

    Are you _really_ sure that is the behaviour of sendto() under Linux?
    If that is indeed correct, ignoring the ARP stuff, it would suggest
    that an application calling sendto() could never cause packets to be
    queued in the driver - and if there were other apps running, the
    sendto() could take however long it takes to drain the transmit
    queue... - unless the definition of "sent" is really "has made it to
    the driver" which would be somewhat different.

    rick jones
     
    Rick Jones, Feb 12, 2004
    #10
  11. No, just the reverse. I mean sendto() *should* and I believe *does* block
    until ARP resolution completes or fails. IANA kernel developer, though.
     
    Owen Jacobson, Feb 13, 2004
    #11
  12. Rick Jones

    Rick Jones Guest

    Sigh, silly typo on my part.

    I'm not sure I agree that a sendto() should block waiting for ARP
    resolution, but then I'm not completely one with the tao of linux :)
    Why do you believe that a sendto() should block for the duration of
    ARP resolution?

    rick jones
     
    Rick Jones, Feb 13, 2004
    #12
  13. By analogy with send() on a TCP socket, which blocks until (the sent
    part of) the message was acknowledged by the receiving host. That by
    definition includes any resolution delays for intermediate protocols. It
    would be consistent for sendto() to behave as close to the same as
    possible.
     
    Owen Jacobson, Feb 13, 2004
    #13
  14. A TCP socket send() does not block until an ack has been received. send() will
    only block until the data can be copied to an internal buffer.
    Except send() does not behave that way...
     
    Phil Frisbie, Jr., Feb 13, 2004
    #14
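
Editorial sketch of the buffering behaviour described in post #14, run over loopback. The peer application never calls recv(), yet send() keeps accepting data until the kernel's buffers are full. The function name and sizes are hypothetical, and the number of bytes accepted depends on the system's buffer settings.

```python
import socket


def bytes_accepted_without_reader(chunk_size=4096, limit=64 * 1024 * 1024):
    """Count how many bytes send() accepts while the peer never reads."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick one
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()    # accepted, but never read from
    client.setblocking(False)
    chunk = b"x" * chunk_size
    total = 0
    try:
        while total < limit:         # safety cap for the sketch
            try:
                total += client.send(chunk)
            except BlockingIOError:
                break                # kernel buffers are full
    finally:
        client.close()
        server.close()
        listener.close()
    return total


accepted = bytes_accepted_without_reader()
```

On a blocking socket the final send() would wait at the point where this version raises BlockingIOError, that is, once there is no buffer space left to copy into.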
  15. Michael Drew

    Michael Drew Guest

    I just installed ethereal and ran my experiment.

    I can now look at a log of the UDP packets sent and received from PC
    #1. As long as the signal is strong I get expected results. A packet
    is sent from PC1 every 0.005 seconds and received from PC2 roughly
    0.0015 seconds afterwards.

    Unplugging the wireless card on PC2 does not produce any ARP requests.
    By looking at the packet checksums, it seems that packets are
    retransmitted several times and this seems to be causing the blocking
    behaviour of the send() function.

    Anyone know a way of turning off retransmission on an Orinoco silver
    wireless card? Or, is there a way to make the send() function truly
    non-blocking?

    Thanks to everyone for their responses so far.
     
    Michael Drew, Feb 19, 2004
    #15
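
Editorial sketch of one possible workaround for the question in post #15. Nobody in the thread proposed it, so treat it as an assumption rather than a tested fix. The send call moves to a background thread fed by a small bounded queue, so the control loop never waits on the network. Any stall in the send path then delays only the worker, and samples that arrive while the queue is full are dropped instead of delaying the loop. The class name and queue depth are hypothetical.

```python
import queue
import socket
import threading


class BackgroundSender:
    """Send UDP datagrams from a worker thread, dropping when busy."""

    def __init__(self, dest, depth=1):
        self._dest = dest
        self._queue = queue.Queue(maxsize=depth)
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            payload = self._queue.get()
            if payload is None:      # sentinel: shut down
                break
            try:
                self._sock.sendto(payload, self._dest)
            except OSError:
                pass                 # best effort: a failed send is lost

    def submit(self, payload):
        """Queue a datagram without waiting; return False if dropped."""
        try:
            self._queue.put_nowait(payload)
            return True
        except queue.Full:
            return False

    def close(self):
        self._queue.put(None)
        self._thread.join()
        self._sock.close()
```

The control loop would call submit() once per sample period. A False return means the previous sample was still waiting to go out, which is itself a usable measure of how long the send path was stalled.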
