
When did I lose packets?

 
 
Spoon
Guest
Posts: n/a

 
      04-27-2006, 09:08 AM
Hello everyone,

I have written small bits of code to test high-rate packet handling.
(Approximately 10,000 packets per second.)

I send UDP packets at a constant rate from one computer:

while ( 1 )
{
    sendto(sock, &seqno, sizeof seqno, 0,
           (struct sockaddr *)&addr, sizeof addr);
    ++seqno;
    busy_loop(100);
}

The only payload in the UDP packet is a 64-bit sequence number.
(Please ignore endianness issues.)
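
If endianness ever does matter (for example, sender and receiver with different byte orders), the sequence number could be serialized explicitly. This is only a sketch: put_be64 and get_be64 are hypothetical helper names, not part of the code above.

```c
#include <stdint.h>

/* Hypothetical helper: store v into buf[0..7], most significant byte first. */
void put_be64(unsigned char buf[8], uint64_t v)
{
    for (int i = 7; i >= 0; --i) {
        buf[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Hypothetical helper: rebuild the value from 8 big-endian bytes. */
uint64_t get_be64(const unsigned char buf[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | buf[i];
    return v;
}
```

The sender would then pass the 8-byte buffer to sendto() instead of &seqno, and the receiver would decode the buffer after recvfrom().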

I use busy_loop(int us) to do nothing for 'us' micro-seconds.

This code is run by root, on an otherwise idle system, at the default
scheduling policy, with nice -n -10



I receive the packets on a different computer:

while ( 1 )
{
    recvfrom(sock, &R_seqno, sizeof R_seqno, 0, NULL, NULL);
    while ( E_seqno != R_seqno )
    {
        ++lost; ++E_seqno;
    }
    ++E_seqno;
}

R_seqno is the _received_ sequence number.
E_seqno is the _expected_ sequence number.
lost tracks the number of packets missed.

This code is run by root, on an otherwise idle system, as a SCHED_FIFO
process, with priority 80. (Why 80? I don't know.)

param.sched_priority = 80;
if ( sched_setscheduler(0, SCHED_FIFO, &param) < 0 )
{
    perror("sched_setscheduler");
}
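
On the "why 80?" question: the valid SCHED_FIFO range can be queried at run time instead of being hard-coded (on Linux it is 1 to 99). A minimal sketch, where clamp_fifo_priority is a hypothetical helper name:

```c
#include <sched.h>

/* Hypothetical helper: clamp 'wanted' into the SCHED_FIFO range that the
 * system reports. Returns -1 if the range cannot be queried. */
int clamp_fifo_priority(int wanted)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo < 0 || hi < 0)
        return -1;
    if (wanted < lo)
        return lo;
    if (wanted > hi)
        return hi;
    return wanted;
}
```

The result could be assigned to param.sched_priority before calling sched_setscheduler().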

I registered a signal handler to print statistics:

static void catch(int sig)
{
    printf("RECEIVED=%llu LOST=%llu\n", R_seqno, lost);
}

signal(SIGQUIT, catch);

(AFAIU I'm not supposed to call printf() inside a signal handler?
However, I don't think it would explain why I drop packets. But I
could be wrong!)

I ran the setup overnight (1000 minutes) and here are my results:

According to top, the receive process ate 16.5 minutes of CPU time.
(i.e. 1.65% CPU occupancy on average.)
The system stays very responsive despite the SCHED_FIFO process.

RECEIVED=577.5 million packets
LOST=3225 packets

I don't understand why I lose ANY packet...

I forgot to mention: I increased the size of the socket buffer.
(That was my intention, at least.)

$ /sbin/sysctl net | grep rmem_
net.core.rmem_default = 1064960
net.core.rmem_max = 1064960

The link layer does not report any problem.
(errors:0 dropped:0 overruns:0 frame:0. Can someone explain what
these numbers mean exactly?)

$ /sbin/ifconfig eth0
eth0  Link encap:Ethernet HWaddr 00:13:20:0D:1F:47
      inet addr:10.10.10.208 Bcast:10.10.10.255 Mask:255.255.255.0
      inet6 addr: fe80::213:20ff:fe0d:1f47/64 Scope:Link
      UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
      RX packets:661166427 errors:0 dropped:0 overruns:0 frame:0
      TX packets:20981 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:1000
      RX bytes:1036860947 (988.8 Mb) TX bytes:1686030 (1.6 Mb)
      Interrupt:9

I noticed that I lose packets in bursts of 30-100 packets, and these
loss bursts are quite rare (~1 every 10-40 minutes). Someone told me
another high-priority process (another SCHED_FIFO??) might be running.

I checked /var/log/messages and saw:
# cat /var/log/messages
Apr 27 04:40:01 venus syslogd 1.4.1: restart.
Apr 27 05:04:33 venus -- MARK --
Apr 27 05:24:34 venus -- MARK --
Apr 27 05:44:34 venus -- MARK --
Apr 27 06:04:34 venus -- MARK --
Apr 27 06:24:34 venus -- MARK --
Apr 27 06:44:35 venus -- MARK --
Apr 27 07:04:35 venus -- MARK --
Apr 27 07:24:35 venus -- MARK --
Apr 27 07:44:35 venus -- MARK --
Apr 27 08:04:36 venus -- MARK --
Apr 27 08:24:36 venus -- MARK --
Apr 27 08:44:36 venus -- MARK --
Apr 27 09:04:36 venus -- MARK --
Apr 27 09:24:37 venus -- MARK --
Apr 27 09:44:37 venus -- MARK --
Apr 27 10:04:37 venus -- MARK --
Apr 27 10:24:37 venus -- MARK --

1 every 20 minutes... What do these log entries refer to?
Is it a high-priority process? Perhaps even a kernel thread?
Is it CPU-intensive? Could it explain why I drop packets?

If you've read this far, THANKS! :-)

Regards,

Spoon
 
 
 
 
 
Spoon
Guest
Posts: n/a

 
      04-27-2006, 12:15 PM

I forgot to mention that the two computers are on the same LAN:

SENDER <---> ETHERNET <---> RECEIVER
              SWITCH

AFAIK, the UDP stream is the only traffic on the LAN.

I turned syslogd and klogd off, because I thought HDD access might make
me drop packets. (But the HDD controller performs DMA, right? So the CPU
should be available to service network interrupts, even when the HDD is
in use?)

I'm still dropping packets (420 in 83 million).
 
 
owmtia@gmail.com
Guest
Posts: n/a

 
      05-01-2006, 06:37 AM
UDP doesn't guarantee that packets will arrive in order. It is possible
that your code is getting confused by out-of-order packets and is
reporting a series of lost packets when none have been lost and only
the order has changed.

E.g. Packets 10 11 12 14 15 16 17 13 18
In this example both packets 13 and 17 will be reported lost by your
code. This would easily explain your burst of packet loss.

 
 
Spoon
Guest
Posts: n/a

 
      05-02-2006, 02:21 PM
owmtia wrote:

> UDP doesn't guarantee that packets will arrive in order. It is possible
> that your code is getting confused by out-of-order packets and is
> reporting a series of lost packets when none have been lost and only
> the order has changed.
>
> E.g. Packets 10 11 12 14 15 16 17 13 18
> In this example both packets 13 and 17 will be reported lost by your
> code. This would easily explain your burst of packet loss.


(Next time, could you provide some context by quoting the relevant parts
of the message you are replying to?)

The two computers are connected through a Cisco Catalyst switch.

SENDER <---> ETHERNET <---> RECEIVER
              SWITCH

I don't believe a switch will re-order packets.

Has anyone ever witnessed an Ethernet _switch_ re-ordering packets?
 
 
owmtia
Guest
Posts: n/a

 
      05-04-2006, 04:02 AM
while ( 1 )
{
    recvfrom(sock, &R_seqno, sizeof R_seqno, 0, NULL, NULL);
    while ( E_seqno != R_seqno )
    {
        ++lost; ++E_seqno;
    }
    ++E_seqno;
}

I just noticed I had read your code wrong. It can't be an out of order
packet.

From my example: packets 10 11 12 14 15 16 17 13 18


It would report packet 13 missing when it hits packet 14. Then, when
packet 13 does arrive later, it would get stuck in the
while ( E_seqno != R_seqno ) loop, because R_seqno < E_seqno and
E_seqno only increases.
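
The stuck loop described above could be avoided by comparing the sequence numbers instead of stepping E_seqno one at a time. This is only a sketch under stated assumptions: struct seq_stats and seq_account are hypothetical names, a late packet is assumed to have been counted as lost earlier, and duplicate packets are not handled.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for the receiver. */
struct seq_stats {
    uint64_t expected;  /* next sequence number we expect (E_seqno) */
    uint64_t lost;      /* packets currently believed lost */
    uint64_t late;      /* packets that arrived after being counted lost */
};

/* Account for one received sequence number without looping. */
void seq_account(struct seq_stats *s, uint64_t received)
{
    if (received >= s->expected) {
        /* Everything skipped between expected and received is missing. */
        s->lost += received - s->expected;
        s->expected = received + 1;
    } else {
        /* Out-of-order arrival: assumed to have been counted as lost. */
        s->late++;
        if (s->lost > 0)
            s->lost--;
    }
}
```

With the arrival order 10 11 12 14 15 16 17 13 18 and expected starting at 10, this ends with lost == 0 and late == 1 instead of spinning when packet 13 shows up.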

The only other thing I can think of is that something like ARP requests
might be causing you to drop packets. The default timeout on the ARP
table is 4 hours. Maybe run a packet sniffer and see what else is around.

 
 
Spoon
Guest
Posts: n/a

 
      05-04-2006, 07:45 AM
Note: please take a look at http://cfaj.freeshell.org/google/

owmtia wrote:

> while ( 1 )
> {
>     recvfrom(sock, &R_seqno, sizeof R_seqno, 0, NULL, NULL);
>     while ( E_seqno != R_seqno )
>     {
>         ++lost; ++E_seqno;
>     }
>     ++E_seqno;
> }
>
> I just noticed I had read your code wrong. It can't be an out of order
> packet.
>
> From my example Packets 10 11 12 14 15 16 17 13 18
>
> It would report packet 13 missing when it hits packet 14. Then, when
> packet 13 does arrive later, it would get stuck in the
> while ( E_seqno != R_seqno ) loop, because R_seqno < E_seqno and
> E_seqno only increases.


Correct.

> The only other thing I can think of is that something like ARP requests
> might be causing you to drop packets. The default timeout on the ARP
> table is 4 hours. Maybe run a packet sniffer and see what else is around.


The problem was the switch.
 
 
 
 