Re: NFS writes became extremely slow overnight

 
 
General Schvantzkoph

 
      07-12-2010, 06:43 PM
On Mon, 12 Jul 2010 10:09:04 -0500, Ignoramus15939 wrote:

> I have two NFS servers. Server A and server B. They serve unrelated
> shares.
>
> Server A is actually two servers using DRBD to stay in sync.
>
> I also have a number of NFS clients.
>
> Both servers worked great up to about last night.
>
> Somehow, NFS writes became extremely slow over Server A, for all
> clients, but not for Server B.
>
> I have no idea, yet, what changed, but wanted to ask here if anyone ever
> had a similar problem (NFS performance going to shit overnight).
>
> Some data points:
>
> 1) From any client, reading files over NFS is fast.
> 2) From any client, writing to NFS on Server A is VERY SLOW.
> 3) From any client, writing to Samba shares on Server A is fast.
> 4) From any client, writing to NFS on Server B is fast.
> 5) When logged on to Server A, copying a file from local A filesystem
>    to itself is fast.
>
> IOW, everything is fast, EXCEPT for NFS writes to server A.
>
> Any ideas what might cause this?
>
> i


Have you looked at the SMART status of the drives? If a drive went bad
you will probably see a lot of retries.
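A rough sketch of that check, assuming smartmontools is installed. The helper name flag_smart_attrs and the script name smartcheck.sh are made up for illustration; the helper only filters the attribute table that `smartctl -A` prints and reports the counters that usually go up when a drive is retrying. On a DRBD pair it would need to be run on both nodes.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print any
# retry-related attribute whose raw value (column 10) is nonzero.
flag_smart_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ && $10 + 0 > 0 {
        print $2 "=" $10
    }'
}

# Check every device named on the command line (run as root), e.g.
#   sh smartcheck.sh /dev/sda /dev/sdb
for dev in "$@"; do
    echo "== $dev"
    smartctl -H "$dev"                      # overall health verdict
    smartctl -A "$dev" | flag_smart_attrs   # nonzero retry-related counters
done
```

No output from the filter only means those particular counters are zero; it does not rule out a failing drive.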
 
General Schvantzkoph

 
      07-12-2010, 07:34 PM
On Mon, 12 Jul 2010 13:53:07 -0500, Ignoramus20495 wrote:

>
> General, yes, I looked at SMART status of drives and found NO errors.
>
> Here's something weird:
>
> Server A :~# top -b -n 1|grep ' D '
> 3005 root 15 0 0 0 0 D 2 0.0 1287:33 nfsd
> 2884 root 10 -5 0 0 0 D 0 0.0 818:53.66 kjournald
> 2999 root 15 0 0 0 0 D 0 0.0 1317:34 nfsd
> 3006 root 15 0 0 0 0 D 0 0.0 1297:07 nfsd
>
> I have lots of NFS daemons waiting on disk or something. The "disk" is
> DRBD, and maybe its performance has degraded.
>
> i


Have you done a reboot yet?
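
Before rebooting, it may be worth recording which tasks are blocked and where. The sketch below is one way to do that with ps instead of grepping top output: it matches on the state column only and adds the kernel wait channel. The helper name filter_dstate is made up, and on some kernels the WCHAN column may just show "-".

```shell
#!/bin/sh
# Hypothetical helper: keep the header row plus every row whose STAT
# column (field 2) starts with D, i.e. tasks in uninterruptible sleep.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Columns: pid, state, kernel wait channel, command name
ps -eo pid,stat,wchan:32,comm | filter_dstate
```

Running it a few times, some seconds apart, shows whether the same nfsd and kjournald tasks stay stuck or whether they are merely busy.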

 
General Schvantzkoph

 
      07-12-2010, 07:45 PM
On Mon, 12 Jul 2010 14:39:00 -0500, Ignoramus20495 wrote:

> No, I have not done so yet. I may do it tonight.
>
> i


I would power cycle the system. It's brute force, but it clears out
everything.
 
Greg Russell

 
      07-12-2010, 07:47 PM
General Schvantzkoph typed:

> Have you done a reboot yet?


<sigh>

# /etc/init.d/{nfs,portmap} restart

Individually, of course (or your distro-specific location) ... reboots are
for $lusers.
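
Spelled out, "individually" could look like the sketch below. The function name restart_services is made up, and the init directory and service names are assumptions that differ between distributions (Debian-style systems call the NFS server script nfs-kernel-server, for example).

```shell
#!/bin/sh
# Restart each service with its own init-script call. In bash, brace
# expansion turns "/etc/init.d/{nfs,portmap} restart" into the single
# command "/etc/init.d/nfs /etc/init.d/portmap restart", which passes
# the portmap path to the nfs script as an argument.
restart_services() {
    init_dir=$1
    shift
    for svc in "$@"; do
        "$init_dir/$svc" restart || return 1
    done
}

# Usage (as root; directory and service names are assumptions):
# restart_services /etc/init.d portmap nfs
```

portmap is listed first because restarting it normally drops the existing RPC registrations, so the NFS daemons have to be restarted afterwards to register again.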




 
Chris Ahlstrom

 
      07-13-2010, 12:06 AM
Ignoramus23418 stopped playing his vuvuzela long enough to say:

> An update:
>
> Restart of the NFS daemon did not help.
>
> I restarted the secondary cluster server, then the primary (the
> secondary took over). Now everything is running great. I guess 485
> days is a bit too much for those servers to go at any one time.


Would have been nice to figure out what part was causing the issue,
though. But the bottom line is you're good to go for another 485 days.

--
"When Dexter's on the Internet, can Hell be far behind?"
 
Stan Bischof

 
      07-13-2010, 12:32 AM
In comp.os.linux.misc Chris Ahlstrom wrote:
> Ignoramus23418 stopped playing his vuvuzela long enough to say:
>
>> secondary took over). Now everything is running great. I guess 485
>> days is a bit too much for those servers to go at any one time.

>
> Would have been nice to figure out what part was causing the issue,
> though. But the bottom line is you're good to go for another 485 days.


or 485 hours, or 485 minutes, since the root cause isn't known.

At least it is very easy to fix: just restart.

If it happens again on your watch, I would suggest more
investigation.

Stan
 
Chris Ahlstrom

 
      07-13-2010, 10:41 AM
Ignoramus15939 stopped playing his vuvuzela long enough to say:

>
> I blame DRBD myself. After 485 more days, these servers need to be retired.


Looks like it is relatively new to the Linux kernel.

--
Proof techniques #2: Proof by Oddity.
Topics to be covered in future issues include proof by:
Intimidation
Gesticulation (handwaving)
"Try it; it works"
Constipation (I was just sitting there and ...)
Blatant assertion
Changing all the 2's to _n's
Mutual consent
Lack of a counterexample, and
"It stands to reason"
 
Greg Russell

 
      07-13-2010, 09:26 PM
"Ignoramus23418" wrote:

> An update:
>
> Restart of the NFS daemon did not help.


.... it sounds as if you ignored the portmap restart.


 
J G Miller

 
      07-13-2010, 10:15 PM
On Tue, 13 Jul 2010 17:06:15 -0500, Ignoramus3537 wrote:
>
> What was stuck was NFSD instances writing to DRBD.


Would changing to NFSv4 using TCP rather than NFSv3 with UDP
avoid the problem?
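
To see what the clients are using right now, something like this sketch should do. The helper name nfs_transport is made up; it only reads the option field of /proc/mounts and prints the version and transport of each NFS mount.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mounts-style lines on stdin and print
# "mountpoint vers=... proto=..." for every nfs or nfs4 mount.
nfs_transport() {
    awk '$3 == "nfs" || $3 == "nfs4" {
        vers = "vers=?"; proto = "proto=?"
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++) {
            if (opt[i] ~ /^(nfs)?vers=/) vers = opt[i]
            if (opt[i] ~ /^proto=/) proto = opt[i]
        }
        print $2, vers, proto
    }'
}

if [ -r /proc/mounts ]; then
    nfs_transport < /proc/mounts
fi
```

If a mount shows proto=udp, remounting one test client with `-o vers=3,proto=tcp` (server path and mount point being whatever is in use) would show whether the transport makes any difference before switching NFS versions.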


 
J G Miller

 
      07-13-2010, 11:50 PM
On Tuesday, July 13th, 2010 at 18:31:30h -0500, Ignoramus3537 wrote:
>
> I would think, not if DRBD is the culprit. It is the underlying media
> for writing.


Can you test the DRBD independently of NFS?
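
One possible way, sketched under assumptions: write a file straight onto the DRBD-backed filesystem on the primary and force it to disk, so the page cache cannot hide a slow device the way a plain local copy can. The helper name write_test and the mount point /srv/drbd0 are made up, and bs=1M with conv=fsync assumes GNU dd.

```shell
#!/bin/sh
# Hypothetical helper: time a synced write of COUNT MiB into DIR, print
# the dd throughput summary line, then remove the test file.
# conv=fsync makes dd flush the file to the backing device before exiting.
write_test() {
    dir=$1
    count=${2:-256}
    f="$dir/ddtest.$$"
    dd if=/dev/zero of="$f" bs=1M count="$count" conv=fsync 2>&1 | tail -n 1
    rm -f "$f"
}

# Usage on the DRBD primary (mount point is an assumption):
# cat /proc/drbd              # connection and disk state of each resource
# write_test /srv/drbd0 256
```

Running the same write against a non-DRBD filesystem on the same machine gives a baseline to compare with. If the synced write is slow only on the DRBD volume, replication to the peer would be the next thing to look at.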
 