Networking Forums

Problems with an NFS server

 
 
Mathias Gaunard

 
      07-10-2007, 03:21 PM
Hello,

I have been having a few performance issues with an NFSv3 server
running RHEL4, and serving 400 PCs and 100 diskless computers.
Every user accesses their home directory over NFS from one of the
computers on the network.

The server has two dual-core Intel Xeon 5160 processors and 16 GB of
RAM.
The network is mostly full gigabit ethernet, and the server accesses
the main switch through an aggregation of three gigabit links.

The server is connected to one terabyte of direct-attached storage in
RAID-5 through an Ultra-320 SCSI card, although the link only runs at
160 MB/s.
All of the shared space is on a single ext3 partition.
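
As a rough sanity check on the figures above, the sketch below compares the raw bandwidth of the three aggregated gigabit links with the 160 MB/s SCSI link. It is only back-of-the-envelope arithmetic: protocol overhead, caching and the RAID controller are ignored, and the function and constant names are made up for illustration.

```python
# Back-of-the-envelope comparison of network and storage link bandwidth.
# Assumes decimal units (1 Gb/s = 1000 Mb/s, 8 bits per byte) and ignores
# Ethernet/IP/TCP/NFS overhead, so real figures will be lower.

GIGABIT_LINKS = 3          # aggregated links to the main switch
LINK_SPEED_GBPS = 1.0      # each link is gigabit Ethernet
SCSI_LINK_MBYTES = 160.0   # speed the SCSI link actually runs at, in MB/s


def gbps_to_mbytes_per_s(gbps: float) -> float:
    """Convert gigabits per second to megabytes per second."""
    return gbps * 1000.0 / 8.0


network_mbytes = gbps_to_mbytes_per_s(GIGABIT_LINKS * LINK_SPEED_GBPS)
ratio = network_mbytes / SCSI_LINK_MBYTES

if __name__ == "__main__":
    print(f"network: {network_mbytes:.1f} MB/s")
    print(f"storage link: {SCSI_LINK_MBYTES:.1f} MB/s")
    print(f"network / storage link: {ratio:.2f}")
```

Under these assumptions the network side can carry more than twice what the storage link can, so anything that is not served from the server's page cache is bounded by the storage path rather than by the network.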

NFS is configured to use TCP, with rsize and wsize set to 32k. The
NFS server runs 64 nfsd instances.

The server doesn't seem to handle the load well. Running the iozone
test suite on 20 computers is enough to push the server's load average
as high as 50 and make it very slow to respond to any request.
Basically, any operation that is even slightly write-intensive has a
serious impact on network performance (of course, iozone is a bit
extreme).

I am wondering whether I should upgrade to NFS version 4, or possibly
switch to AFS. Maybe I simply need to split the work among several
servers? Or maybe the issue has more to do with the underlying
filesystem or the DAS?
I had considered ZFS, but unfortunately there is no Linux support.
A move to Fibre Channel is also planned.

I would appreciate any comments, especially if you can spot
something obviously wrong.
Experiences with similar networks would also be interesting.

 
Tim Southerwood

 
      07-10-2007, 03:48 PM

Hi

I built a similar box with multiple bonded gig links. In order to even get
it to saturate 2 links (*very* fast Eurologic RAID controller and 10 GB RAM
for cache) I had to set a number of things. Not all of these are
necessarily optimal - I gave up when I had it working "well enough":

All of what you've done is completely sensible. Try some of these in
addition and see what gives:

OK - Mandriva/Redhat style config added to ifcfg-ethX (for each of the gig
NICs):

....
TUNING=yes
TUNINGETHTOOL="-G DEVICE rx 4096 tx 4096"
TUNINGTXQUEUELEN=10000

This is for e1000 server-class NICs. The TUNINGETHTOOL= option calls
ethtool to set the transmit and receive ring buffers to 4096 entries. I
patched the e1000.c driver slightly to force interrupt coalescing the way I
wanted, but you can get a long way by setting the IRQ coalescing options
when loading the e1000 module (BTW, what NICs do you have?)

TUNINGTXQUEUELEN=10000 sets the interface's txqueuelen; it is passed to
ifconfig (or ip, it doesn't matter which).
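
For anyone not using the Mandriva/Redhat TUNING* hooks, here is a minimal sketch of the equivalent manual commands. It assumes the hooks boil down to one ethtool call and one txqueuelen change per interface, which is my reading of the description above rather than something checked against the init scripts. The function names are made up, and nothing is executed unless apply=True, since this needs root and real hardware.

```python
# Sketch: build the commands that the TUNINGETHTOOL / TUNINGTXQUEUELEN
# settings are assumed to translate into (an assumption, not verified
# against the distribution's network scripts).
import subprocess
from typing import List


def nic_tuning_commands(device: str, ring: int = 4096,
                        txqueuelen: int = 10000) -> List[List[str]]:
    """Return argv lists for ring-buffer and transmit-queue tuning."""
    return [
        # ethtool -G changes the RX/TX ring sizes of the NIC
        ["ethtool", "-G", device, "rx", str(ring), "tx", str(ring)],
        # ip link set ... txqueuelen changes the transmit queue length
        ["ip", "link", "set", "dev", device, "txqueuelen", str(txqueuelen)],
    ]


def tune(device: str, apply: bool = False) -> None:
    """Print the commands; run them only when apply is True (needs root)."""
    for argv in nic_tuning_commands(device):
        print(" ".join(argv))
        if apply:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    tune("eth0")  # dry run: only prints the commands
```

Whether a NIC accepts 4096 ring entries depends on the driver; `ethtool -g <device>` shows the maximum it supports.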

In sysfs, I set the I/O schedulers to deadline throughout (this helped), e.g.:
/sys/block/sda/queue/scheduler = deadline
/sys/block/sdb/queue/scheduler = deadline
/sys/block/sdc/queue/scheduler = deadline
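
A minimal sketch of how the scheduler setting can be read and changed from a script, assuming sysfs is mounted at /sys (the usual location). The helper names and the sys_root parameter are made up for illustration. The kernel marks the active scheduler with square brackets when the file is read.

```python
# Sketch: read and set the per-device I/O scheduler through sysfs.
from pathlib import Path


def parse_active_scheduler(text: str) -> str:
    """The kernel lists the available schedulers and brackets the active
    one, e.g. 'noop anticipatory deadline [cfq]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active scheduler found in {text!r}")


def scheduler_path(device: str, sys_root: str = "/sys") -> Path:
    return Path(sys_root) / "block" / device / "queue" / "scheduler"


def get_scheduler(device: str, sys_root: str = "/sys") -> str:
    return parse_active_scheduler(scheduler_path(device, sys_root).read_text())


def set_scheduler(device: str, name: str = "deadline",
                  sys_root: str = "/sys") -> None:
    """Write the scheduler name; needs root when sys_root is the real /sys."""
    scheduler_path(device, sys_root).write_text(name + "\n")


if __name__ == "__main__":
    for dev in ("sda", "sdb", "sdc"):
        try:
            print(dev, get_scheduler(dev))
        except (OSError, ValueError) as exc:
            print(dev, "could not be read:", exc)
```

The setting does not survive a reboot, so it has to be reapplied from an init script or set with the elevator= kernel boot parameter.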

I only had 32 NFSD handlers, but I did set (in /etc/sysconfig/nfs)

# Increase the memory limits on the socket input queues for
# the nfs processes .. NFS benchmark SPECsfs demonstrate a
# need for a larger than default size (64kb) .. setting
# TUNE_QUEUE to yes will set the values to 256kb.
TUNE_QUEUE="yes"
NFS_QS=262144

For what it's worth, I also set:

/proc/sys/net/core/rmem_max=8388608
/proc/sys/net/core/rmem_default=8388608
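
Both values are 8 MiB (8 * 1024 * 1024 bytes). To make them persistent they would normally go into /etc/sysctl.conf; the sketch below just converts the /proc/sys paths into sysctl.conf lines. The helper names are made up for illustration.

```python
# Sketch: turn /proc/sys paths into sysctl.conf lines.
from typing import Dict

PROC_SYS_PREFIX = "/proc/sys/"


def proc_path_to_sysctl_key(path: str) -> str:
    """'/proc/sys/net/core/rmem_max' -> 'net.core.rmem_max'."""
    if not path.startswith(PROC_SYS_PREFIX):
        raise ValueError(f"not a /proc/sys path: {path}")
    return path[len(PROC_SYS_PREFIX):].replace("/", ".")


def sysctl_conf_lines(settings: Dict[str, int]) -> str:
    """Render one 'key = value' line per setting, in insertion order."""
    return "\n".join(
        f"{proc_path_to_sysctl_key(path)} = {value}"
        for path, value in settings.items()
    )


settings = {
    "/proc/sys/net/core/rmem_max": 8 * 1024 * 1024,
    "/proc/sys/net/core/rmem_default": 8 * 1024 * 1024,
}

if __name__ == "__main__":
    print(sysctl_conf_lines(settings))
```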

Questions:

How fast is your RAID array (really, not how fast are the links)?
What is it?
What NICs?
What is your max NFS throughput (ideal conditions are fine, say copying the
same large file to multiple clients)? If you manage to saturate 3 gig links,
I will congratulate you, because I couldn't.

Seriously, after that, I came to the conclusion that Xeon hardware wasn't up
to the job: too little bus bandwidth and a single RAM bus. I suspected that a
twin Opteron server might do better, but I never got around to trying it.

Cheers

Tim
 
Mathias Gaunard

 
      07-11-2007, 01:31 PM
On 10 Jul, 17:48, Tim Southerwood <t...@dionic.net> wrote:

> All of what you've done is completely sensible. Try some of these in
> addition and see what gives:
>
> OK - Mandriva/Redhat style config added to ifcfg-ethX (for each of the gig
> NICs):
>
> ...
> TUNING=yes
> TUNINGETHTOOL="-G DEVICE rx 4096 tx 4096"
> TUNINGTXQUEUELEN=10000
>
> This is for e1000 server class NICs, and the TUNINGETHTOOL= option calls
> ethtool to set a transmit and receive buffer of 4096. I patched the e1000.c
> driver slightly to force interrupt coalescing the way I wanted, but you can
> get a long way with setting the IRQ coalesce options when loading the e1000
> module (BTW what NIC do you have?)
>
> TUNINGTXQUEUELEN=10000 is an option given to ifconfig (or ip, doesn't
> matter)


What exactly is this queue, and how does it affect performance?

I had also thought of enabling jumbo frames. Do you think that would
be a good thing?

>
> In sysfs, I set the IO schedulers to deadline throughout (this helped), eg:
> /sys/block/sda/queue/scheduler = deadline
> /sys/block/sdb/queue/scheduler = deadline
> /sys/block/sdc/queue/scheduler = deadline


According to various stats I have seen on the Internet, the default
Linux 2.6 scheduler, the anticipatory one, performs way better than
the deadline scheduler.
Why do you think it is relevant to use the deadline scheduler in this
situation?


> I only had 32 NFSD handlers, but I did set (in /etc/sysconfig/nfs)
>
> # Increase the memory limits on the socket input queues for
> # the nfs processes .. NFS benchmark SPECsfs demonstrate a
> # need for a larger than default size (64kb) .. setting
> # TUNE_QUEUE to yes will set the values to 256kb.
> TUNE_QUEUE="yes"
> NFS_QS=262144


Interesting, I will try that.


> For what it's worth, I also set:
>
> /proc/sys/net/core/rmem_max=8388608
> /proc/sys/net/core/rmem_default=8388608


From the tests that coworkers have done, it seems that such a high
value is not necessarily the best.


> Questions:
>
> How fast is your RAID array (really, not how fast are the links)?
> What is it?


It's a Transtec SCSI RAID 5200, which provides two Ultra160 SCSI
links.
However, I believe the disks inside are Ultra320 SCSI, not sure about
their models.


> What NICs?


Two Broadcom NetXtreme II BCM5708 (bnx2 driver) and one onboard Intel
82545GM (e1000 driver).
Before, two Intel cards were used instead of the Broadcom ones, but
they supposedly led to kernel panics.
The three interfaces are bonded in round-robin load-balancing mode.


> What is your max NFS throughput (ideal is fine, say copying same large file
> to multiple clients). If you manage to saturate 3 gig links, I will
> congratulate you because I couldn't.


Do you mean the performance of a read on multiple clients, which would
copy the file locally?

> Seriously, after that, I came to the conclusion that Xeon hardware wasn't up
> to the job. Too little bus bandwidth, single RAM bus. I suspected that a
> twin Opteron server might do better, but I never got around to trying it.


Is Opteron really better?

Thank you for your help and tips.

 
Mathias Gaunard

 
      07-11-2007, 02:12 PM
On 11 Jul, 15:31, Mathias Gaunard <loufo...@gmail.com> wrote:
> On 10 Jul, 17:48, Tim Southerwood <t...@dionic.net> wrote:


> > In sysfs, I set the IO schedulers to deadline throughout (this helped), eg:
> > /sys/block/sda/queue/scheduler = deadline
> > /sys/block/sdb/queue/scheduler = deadline
> > /sys/block/sdc/queue/scheduler = deadline

>
> According to various stats I have seen on the Internet, the default
> Linux 2.6 scheduler, the anticipatory one, performs way better than
> the deadline scheduler.


Actually, it seems that information was outdated, because Completely
Fair Queuing (CFQ) is the default I/O scheduler now.
I have, however, found quite a few sources that say the deadline
scheduler is best for disk-intensive applications with heavy I/O.
So switching to it looks like a good idea.

 
Tim Southerwood

 
      07-11-2007, 05:06 PM
Hi

Mathias Gaunard wrote:

> On 10 Jul, 17:48, Tim Southerwood <t...@dionic.net> wrote:
>
>> All of what you've done is completely sensible. Try some of these in
>> addition and see what gives:
>>
>> OK - Mandriva/Redhat style config added to ifcfg-ethX (for each of the gig
>> NICs):
>>
>> ...
>> TUNING=yes
>> TUNINGETHTOOL="-G DEVICE rx 4096 tx 4096"
>> TUNINGTXQUEUELEN=10000
>>
>> This is for e1000 server class NICs, and the TUNINGETHTOOL= option calls
>> ethtool to set a transmit and receive buffer of 4096. I patched the
>> e1000.c driver slightly to force interrupt coalescing the way I wanted,
>> but you can get a long way with setting the IRQ coalesce options when
>> loading the e1000 module (BTW what NIC do you have?)
>>
>> TUNINGTXQUEUELEN=10000 is an option given to ifconfig (or ip, doesn't
>> matter)

>
> What exactly is this queue, and how does it affect performance?


I believe it is the queue of packets waiting to be sent, so that
applications don't have to block just because the card is busy. But I'm not
100% sure. Bigger is helpful though, up to a point.

> I had also thought of enabling the jumbo frame, do you think it would
> be a good thing?


It probably is, if your NICs, switches and clients *all* support it.
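
To put a rough number on what jumbo frames would buy, the sketch below counts how many Ethernet frames one 32k NFS transfer over TCP needs at each MTU. It assumes a 9000-byte jumbo MTU and plain IPv4 and TCP headers without options, and it ignores the RPC/NFS headers, so treat the result as an estimate only.

```python
# Sketch: frames needed to carry one rsize/wsize-sized transfer over TCP.
import math

IP_TCP_HEADERS = 40          # 20-byte IPv4 header + 20-byte TCP header, no options
NFS_TRANSFER = 32 * 1024     # rsize/wsize of 32k, as in the original post


def frames_per_transfer(payload_bytes: int, mtu: int) -> int:
    """Number of full or partial frames needed for payload_bytes."""
    payload_per_frame = mtu - IP_TCP_HEADERS
    return math.ceil(payload_bytes / payload_per_frame)


standard = frames_per_transfer(NFS_TRANSFER, 1500)   # standard Ethernet MTU
jumbo = frames_per_transfer(NFS_TRANSFER, 9000)      # assumed jumbo MTU

if __name__ == "__main__":
    print(f"MTU 1500: {standard} frames per 32k transfer")
    print(f"MTU 9000: {jumbo} frames per 32k transfer")
```

Fewer frames means fewer packets for the server to process for the same amount of data, which is where any CPU saving would come from.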

>>
>> In sysfs, I set the IO schedulers to deadline throughout (this helped),
>> eg:
>> /sys/block/sda/queue/scheduler = deadline
>> /sys/block/sdb/queue/scheduler = deadline
>> /sys/block/sdc/queue/scheduler = deadline

>
> According to various stats I have seen on the Internet, the default
> Linux 2.6 scheduler, the anticipatory one, performs way better than
> the deadline scheduler.
> Why do you think it is relevant to use the deadline scheduler in this
> situation?


I note your other post.

>
>> I only had 32 NFSD handlers, but I did set (in /etc/sysconfig/nfs)
>>
>> # Increase the memory limits on the socket input queues for
>> # the nfs processes .. NFS benchmark SPECsfs demonstrate a
>> # need for a larger than default size (64kb) .. setting
>> # TUNE_QUEUE to yes will set the values to 256kb.
>> TUNE_QUEUE="yes"
>> NFS_QS=262144

>
> Interesting, I will try that.
>
>
>> For what it's worth, I also set:
>>
>> /proc/sys/net/core/rmem_max=8388608
>> /proc/sys/net/core/rmem_default=8388608

>
> From the tests that coworkers have done, it seems that such a high
> value is not necessarily the best.


That's quite possible - my attempts were a stab in the dark with testing at
each stage - so I probably overdid these.

>
>> Questions:
>>
>> How fast is your RAID array (really, not how fast are the links)?
>> What is it?

>
> It's a Transtec SCSI RAID 5200, which provides two Ultra160 SCSI
> links.
> However, I believe the disks inside are Ultra320 SCSI, not sure about
> their models.
>
>
>> What NICs?

>
> Two Broadcom NetXtreme II BCM5708 (bnx2 driver) and one onboard Intel
> 82545GM (e1000 driver).


I don't know the BCM5708 chips. I've certainly had 4 e1000's running
together quite happily though.

> Before, two Intel cards were used instead of the Broadcom ones, but
> they supposedly led to kernel panics.
> The three controllers are bound, with load balancing round-robin.
>
>
>> What is your max NFS throughput (ideal is fine, say copying same large
>> file to multiple clients). If you manage to saturate 3 gig links, I will
>> congratulate you because I couldn't.

>
> Do you mean the performance of a read on multiple clients, which would
> copy the file locally?


Put a large file (5 GB, like a DVD ISO) on the server. Get lots of clients to
copy it to /dev/null over NFS, and measure the server-side throughput with sar.
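
If you would rather script the client side than use cp or dd, here is a minimal sketch that reads a file sequentially, throws the data away and reports the throughput seen by that client. The function name and the 1 MiB block size are arbitrary choices, and the path is whatever file you put on the NFS mount.

```python
# Sketch of the client side of the test: read a file sequentially, discard
# the data (like copying it to /dev/null) and report the throughput.
import sys
import time
from typing import Tuple


def read_throughput(path: str, block_size: int = 1024 * 1024) -> Tuple[int, float]:
    """Return (bytes_read, elapsed_seconds) for one sequential read of path."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    if len(sys.argv) > 1:
        nbytes, secs = read_throughput(sys.argv[1])
        rate = nbytes / secs / 1e6 if secs > 0 else float("inf")
        print(f"{nbytes} bytes in {secs:.2f} s = {rate:.1f} MB/s")
    else:
        print("usage: pass the path of a large file on the NFS mount")
```

A second run on the same client may be served from that client's own page cache, so only the first read after mounting says much about the server.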

>> Seriously, after that, I came to the conclusion that Xeon hardware wasn't
>> up to the job. Too little bus bandwidth, single RAM bus. I suspected that
>> a twin Opteron server might do better, but I never got around to trying
>> it.

>
> Is Opteron really better?


Definitely, in many ways. For one, it is a very decent 64-bit platform; for
another, it is a NUMA architecture, so if you have a dual Opteron (not dual
core) system, you get a memory bus per CPU.

> Thank you for your help and tips.


You're welcome - I'm interested in some feedback on better parameters than
mine too.

Cheers

Tim
 
Mathias Gaunard

 
      07-13-2007, 12:43 PM
On 11 Jul, 19:06, Tim Southerwood <t...@dionic.net> wrote:

> Large file (5GB, like a DVD ISO) on the server. Get lots of clients to copy
> it to /dev/null off NFS and measure the server side throughput with sar


I get about 1.5 Gb/s (measuring with iptraf), with 20 clients using
dd.

The problem isn't reading, though; there I get acceptable performance. The
problems I've been having are with writing. When lots of clients
write, the server becomes quite unresponsive. This doesn't happen,
however, when large numbers of processes write intensively on the server
itself.

 
Steve Wolfe

 
      07-13-2007, 08:14 PM
> I get about 1.5 Gb/s (measuring with iptraf), with 20 clients using
> dd.
>
> The problem isn't reading though, I get acceptable performance. The
> problems I've been having are with writing. When lots of clients
> write, the server becomes quite unresponsive. This doesn't happen,
> however, with great numbers of processes writing intensively locally.


If I recall, you're using a RAID 5 array, correct? If so, that may very
well be a source of the problem - writes under RAID 5 are vastly more
expensive than reads, and even with local SCSI disks in such an array, having
20 people doing heavy writes simultaneously (especially random I/O) can bog
things down in quite a hurry.

steve
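
For reference, the usual arithmetic behind the RAID 5 write cost: a write smaller than a full stripe needs four disk operations (read old data, read old parity, write new data, write new parity). The sketch below turns that into a rough random-I/O estimate. The disk count and per-disk IOPS in the example are hypothetical, not the figures of the array in this thread, and a controller with write-back cache can hide part of the penalty.

```python
# Sketch: rough random-I/O capacity of a RAID 5 array.
RAID5_WRITE_PENALTY = 4  # read data + read parity + write data + write parity


def raid5_random_read_iops(disks: int, iops_per_disk: float) -> float:
    """Random reads can be spread over all member disks."""
    return disks * iops_per_disk


def raid5_random_write_iops(disks: int, iops_per_disk: float) -> float:
    """Each small write costs RAID5_WRITE_PENALTY disk operations."""
    return disks * iops_per_disk / RAID5_WRITE_PENALTY


if __name__ == "__main__":
    disks, per_disk = 8, 150.0  # hypothetical example values
    print("random reads :", raid5_random_read_iops(disks, per_disk), "IOPS")
    print("random writes:", raid5_random_write_iops(disks, per_disk), "IOPS")
```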



 
Mathias Gaunard

 
      07-15-2007, 10:27 PM
On Jul 13, 10:14 pm, "Steve Wolfe" <h...@codon.com> wrote:
>
> If I recall, you're using a RAID 5 array, correct? If so, that may very
> well be a source of the problem - writes under RAID 5 are vastly more
> expensive, and even with local SCSI disks in such an array, having 20 people
> doing heavy writes simultaneously (especially in random I/O) can bog things
> down in quite a hurry.


As I just said, I'm not having serious problems when doing the I/O
directly on the machine, but only when doing it through NFS.

 