Networking Forums

Unfair TCP client servicing

 
 
RR
Guest
Posts: n/a

 
      10-04-2004, 01:20 AM
I've been searching for an answer for this for at least 12 months with no
luck.

We have an internal program that processes batches of information. Call this
the "Server" program.
We have two or more Client programs that are trying to send batches of
information to the Server.

We are relying on the TCP connection scheduling algorithm (i.e. what happens
internally in the "accept" system call) to provide fair access for the
Clients to the Server.

All systems are running RedHat Linux 7.3 with a 2.4 kernel.

The problem is that the Server processes one client much more than another.

Here's a log that shows the problem:
****
Batch 89 from ClientA
Batch 90 from ClientA
Batch 5041 from ClientB
Batch 5042 from ClientB
Batch 5043 from ClientB
Batch 5047 from ClientB
Batch 5046 from ClientB
Batch 5050 from ClientB
Batch 5051 from ClientB
Batch 5052 from ClientB
Batch 5054 from ClientB
Batch 5044 from ClientB
Batch 5057 from ClientB
Batch 5058 from ClientB
Batch 91 from ClientA
Batch 5059 from ClientB
Batch 92 from ClientA
Batch 5060 from ClientB
****

During that big hunk of ClientB processing, ClientA was trying to send a
batch and wasn't getting service.

Each batch takes up to 12 seconds to process.

The average batch *send* time for ClientB is 4 seconds, and for ClientA it's
60 seconds. This measures the time from just before the socket is created,
to when it is closed, and includes the "connect" call and all "write" calls.

So, these numbers simply provide a metric for the unfair service that
ClientA is receiving.

The Server calls "listen(socket,10)" and then runs a loop of "accept,
read-process loop, close".
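
For readers following along, the loop described above might look roughly like this in Python. This is only a sketch: the real Server is not shown in this thread, and the function names, the 4096-byte read size, and the `process` callback are made up for illustration.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=10):
    """Create a listening TCP socket; backlog=10 mirrors listen(socket,10)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(backlog)
    return srv


def serve(srv, process, max_connections):
    """Handle one connection at a time: accept, read-process loop, close."""
    for _ in range(max_connections):
        conn, peer = srv.accept()  # blocks until a completed connection is available
        try:
            while True:
                data = conn.recv(4096)
                if not data:  # the client closed its side
                    break
                process(peer, data)  # no other client is accepted while this runs
        finally:
            conn.close()
```

The point of the sketch is that nothing is accepted while `process` is running, so every other client sits in the kernel's queue for the full processing time of whoever is ahead of it.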

I've looked at the 2.4 kernel's accept queueing, and it seems to be a simple
FIFO queue...when a SYN is received the connection is placed on the end of
the queue.

So, the most obvious reason is that packets are being lost or dropped
somewhere between Server and ClientA.

Can anyone suggest another reason why we are seeing this unfair servicing?

tia,
RR


 
 
 
 
 
Howard Johnson
Guest
Posts: n/a

 
      10-04-2004, 04:29 PM
In article <uz18d.13221$(E-Mail Removed)>,
RR <(E-Mail Removed)> wrote:
>We have an internal program that process batches of information. Call this
>the "Server" program.
>We have two or more Client programs that are trying to send batches of
>information to the Server.
>
>We are relying on the TCP connection scheduling algorithm (i.e. what happens
>internally in the "accept" system call) to provide fair access for the
>Clients to the Server.


This is not sufficient to provide fair service if batches arrive faster
than they can be processed.

>The problem is that the Server processes one client much more than another.


The batch server needs to have a separate queue for each client and then
process one batch per queue in a round-robin fashion.
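
One way the per-client queues could be arranged is sketched below in Python. It assumes batches have already been read from the network and tagged with a client name; the class and method names are invented for this example and are not taken from any existing server.

```python
from collections import OrderedDict, deque


class RoundRobinScheduler:
    """Keep one FIFO queue per client and serve the clients in rotation."""

    def __init__(self):
        self._queues = OrderedDict()  # client name -> deque of pending batches

    def submit(self, client, batch):
        """Queue a batch behind the same client's earlier batches."""
        self._queues.setdefault(client, deque()).append(batch)

    def next_batch(self):
        """Return (client, batch) for the client whose turn it is, or None if idle."""
        if not self._queues:
            return None
        client, pending = next(iter(self._queues.items()))
        batch = pending.popleft()
        if pending:
            self._queues.move_to_end(client)  # back of the rotation
        else:
            del self._queues[client]  # rejoins the rotation on its next submit
        return client, batch
```

With this arrangement a client that submits many batches only lengthens its own queue; it does not delay the next batch from any other client by more than one processing slot.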

>During that big hunk of ClientB processing, ClientA was trying to send a
>batch and wasn't getting service.
>
>Each batch takes up to 12 seconds to process.
>
>The average batch *send* time for ClientB is 4 seconds, and for ClientA it's
>60 seconds. This measures the time from just before the socket is created,
>to when it is closed, and includes the "connect" call and all "write" calls.


Since ClientB can send batches faster than the server can process them,
you need to provide a way to process the batches fairly (as described
above). You should probably have a flow-control mechanism as well so
you can put an upper bound on how long the queues will get.
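
As a rough illustration of bounding a queue, here is a Python sketch using the standard library's `queue.Queue`. The limit of 3 and the helper name `try_enqueue` are arbitrary choices for the example, not recommendations.

```python
import queue

MAX_PENDING = 3  # hypothetical per-client limit


def try_enqueue(pending, batch):
    """Queue a batch if there is room; return False when the bound is reached."""
    try:
        pending.put_nowait(batch)
        return True
    except queue.Full:
        return False


pending_for_client = queue.Queue(maxsize=MAX_PENDING)
```

When `try_enqueue` returns False, the server could simply stop reading from that client's socket until a slot frees up, so that TCP's own flow control slows the sender down.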
 
 
RR
Guest
Posts: n/a

 
      10-04-2004, 10:00 PM
"Howard Johnson" <(E-Mail Removed)> wrote in message
news:cjrtpj$cn1$(E-Mail Removed)...
> This is not sufficient to provide fair service if batches arrive faster
> than they can be processed.


Thanks for your reply. But, I can't see how that could be correct....

In the Linux 2.4 kernel, the accept queue is strictly FIFO (from what I can
see).

Therefore, connections should be accepted by the Server application in the
order that they are placed in the kernel's accept queue.

So, even if ClientB is faster, ClientA's batches should be processed in the
order they are placed in the accept queue.

They will be processed less often, but they shouldn't ever be pre-empted by
a ClientB connection.

For example, the kernel accept queue might look like this:
B1-B2-B4-B3-B5-A1-B6-B8-A2-B7-B9

Now, once B1 through B5 are accepted by the Server app, the next one *should*
be A1, followed by B6, B8, then A2, and so on. (Note I've put a couple out
of order to illustrate the possibility that an earlier connection may take
longer to complete the setup than a later one, due to packet loss, delay,
or a different network path.)

The point being that connections are not placed anywhere other than the end
of the kernel accept queue.
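
A toy model of this argument in Python, assuming the accept queue really is a plain FIFO and taking the example snapshot literally. The function names are invented for this sketch, and the 12-second figure is the per-batch upper bound quoted earlier in the thread. It only shows where each ClientA connection lands in the service order and the longest wait implied by the connections ahead of it.

```python
from collections import deque

MAX_BATCH_SECONDS = 12  # upper bound per batch, from the original post


def service_order(snapshot):
    """Drain a FIFO queue snapshot such as 'B1-B2-A1' from the head only."""
    pending = deque(snapshot.split("-"))
    served = []
    while pending:
        served.append(pending.popleft())
    return served


def worst_case_wait(snapshot, prefix):
    """Longest wait in seconds, counted from the snapshot, for one client's connections."""
    waits = {}
    for ahead, conn in enumerate(service_order(snapshot)):
        if conn.startswith(prefix):
            waits[conn] = ahead * MAX_BATCH_SECONDS
    return waits
```

On the example queue, A1 has five connections ahead of it, so in this model it is never pre-empted but can still wait up to 5 x 12 = 60 seconds before it is accepted.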

Do you agree? Am I missing some part of the logic?

BTW, do you know of any easy way to view the kernel's accept queue (I guess
I could write a prog that reads it in /dev/kmem) in real time?

thanks,
RR




 
 
Tauno Voipio
Guest
Posts: n/a

 
      10-08-2004, 10:29 AM
RR wrote:
> I've been searching for an answer for this for at least 12 months with no
> luck.
>
> We have an internal program that process batches of information. Call this
> the "Server" program.
> We have two or more Client programs that are trying to send batches of
> information to the Server.
>
> We are relying on the TCP connection scheduling algorithm (i.e. what happens
> internally in the "accept" system call) to provide fair access for the
> Clients to the Server.
>
> All systems are running RedHat Linux 7.3 with a 2.4 kernel.
>
> The problem is that the Server processes one client much more than another.
>
> Here's a log that shows the problem:
> ****
> Batch 89 from ClientA
> Batch 90 from ClientA
> Batch 5041 from ClientB
> Batch 5042 from ClientB
> Batch 5043 from ClientB
> Batch 5047 from ClientB
> Batch 5046 from ClientB
> Batch 5050 from ClientB
> Batch 5051 from ClientB
> Batch 5052 from ClientB
> Batch 5054 from ClientB
> Batch 5044 from ClientB
> Batch 5057 from ClientB
> Batch 5058 from ClientB
> Batch 91 from ClientA
> Batch 5059 from ClientB
> Batch 92 from ClientA
> Batch 5060 from ClientB
> ****
>
> During that big hunk of ClientB processing, ClientA was trying to send a
> batch and wasn't getting service.
>
> Each batch takes up to 12 seconds to process.
>
> The average batch *send* time for ClientB is 4 seconds, and for ClientA it's
> 60 seconds. This measures the time from just before the socket is created,
> to when it is closed, and includes the "connect" call and all "write" calls.
>
> So, these numbers simply provide a metric for the unfair service that
> ClientA is receiving.
>
> The Server does a "listen(socket,10)" then is a loop of "accept,
> read-process loop, close".
>
> I've looked at the 2.4 kernel's accept queueing, and it seems to be a simple
> FIFO queue...when a SYN is received the connection is placed on the end of
> the queue.
>
> So, the most obvious reason is that packets are being lost or dropped
> somewhere between Server and ClientA.
>
> Can anyone suggest another reason why we are seeing this unfair servicing?
>
> tia,
> RR


Take an Ethereal dump of the client connections, so we can
see the real timing of the arriving requests and responses.

To make it easier to read, limit the capture to the hosts
of interest.

Tauno Voipio
tauno voipio (at) iki fi


 
 
Howard Johnson
Guest
Posts: n/a

 
      10-08-2004, 05:15 PM
In article <GJj8d.14320$(E-Mail Removed)>,
RR <(E-Mail Removed)> wrote:
>"Howard Johnson" <(E-Mail Removed)> wrote in message
>news:cjrtpj$cn1$(E-Mail Removed)...
>> This is not sufficient to provide fair service if batches arrive faster
>> than they can be processed.

>
>Thanks for your reply. But, I can't see how that could be correct....
>
>In the linux 2.4 kernel, the accept queue is strictly FIFO (from what I can
>see).


Fairness has to be ensured at *every* network protocol layer, *including*
the application. I trust that Linux 2.4 kernels process packets fairly.
But there are a number of things applications can do to prevent clients
from being served fairly. Lower-level protocols cannot be discussed in
the abstract without considering higher-level protocol (your application's)
implementations.

>Do you agree? Am I missing some part of the logic?


One of my favorite answers to these kinds of questions is to say that
there is not enough information to give you a definitive answer. And I've
had over 10 years of professional experience diagnosing network protocol
problems.

>BTW, do you know of any easy way to view the kernel's accept queue (I guess
>I could write a prog that reads it in /dev/kmem) in real time?


No. And it's not possible for a user-mode program in a preemptively
scheduled OS like Linux to read /dev/kmem in real time. It's better to
put trace information in your application anyway.

Good events to log and/or trace include:
* when a client connection is accepted,
* when beginning to receive a client message,
* when each TCP read() is complete,
* when a message is completely received,
* when beginning processing a message,
* and when done processing a message.

Include millisecond-resolution timestamps, socket numbers, message IDs, etc.
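
A small Python sketch of what such a trace line could look like. The helper name, the field layout, and the event labels are only suggestions, and the timestamps are formatted in UTC to keep the example simple.

```python
import time

# Hypothetical labels for the events listed above.
EVENTS = (
    "accepted", "recv_start", "read_done",
    "msg_received", "process_start", "process_done",
)


def trace_line(event, sock_fd, msg_id, clock=time.time):
    """Format one trace record with a millisecond-resolution UTC timestamp."""
    now = clock()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    millis = int((now % 1) * 1000)
    return f"{stamp}.{millis:03d} fd={sock_fd} msg={msg_id} {event}"


if __name__ == "__main__":
    print(trace_line("accepted", sock_fd=5, msg_id=89))
```

Logging the socket number and message ID on every line makes it possible to line the application trace up against a packet capture afterwards.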

However, the easiest way to get reliable information is to collect a
packet trace (really, ethereal is your friend) on the *server* machine.
A trace contains enough information to diagnose these problems correctly in
perhaps half of all cases, in my experience. And for those problems that
aren't diagnosed, it usually does a good job of pointing you in the right
direction.
 
 
 
 