Hi all,
I run dhcpd on my fileserver and find that dhcpd
dies every morning between 5 and 6 AM.
I wrote a little script (see below) to restart
dhcpd whenever needed and run it every minute from
cron. The script fails to restart the server; its
log output is below.
The timestamps in that log seem to come from
different locales - not really a problem, but strange.
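The two timestamp styles look like what `date +%c` prints under different locales: "Mo 18 Okt 2004 14:55:20 CEST" is a German-locale form, while "Tue Oct 19 05:50:00 2004" is the C/POSIX form. That would fit the script running with a different environment under cron than in a login shell. A minimal sketch of a timestamp that does not depend on the locale (the function name log_stamp is made up for illustration):

```shell
#!/bin/bash
# Hypothetical helper: print a timestamp in a fixed numeric format, so that
# runs from cron and runs from an interactive shell log the same style.
log_stamp() {
    date '+%Y-%m-%d %H:%M:%S'
}

echo "$(log_stamp) : example log line"
```

Using a numeric format like this also keeps the log sortable, whichever way the script was started.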
Running "service dhcpd start" or
"/etc/init.d/dhcpd start" by hand works fine and gives
peace until the next morning.
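Since the manual start works while the start from cron fails, the cron environment is a plausible suspect: cron often runs jobs with a short PATH, and the script sends the output of the start command to /dev/null, so any error message is lost. A minimal sketch that keeps this output, assuming the problem is of that kind; run_logged is a made-up name, and the PATH value is an assumption about where `service` lives on this system:

```shell
#!/bin/bash
# Assumption: cron's PATH lacks the sbin directories, so add them explicitly.
PATH=/sbin:/usr/sbin:$PATH
export PATH

# Hypothetical helper: run a command, append its stdout and stderr plus the
# exit status to a log file, and return that exit status to the caller.
run_logged() {
    local logfile="$1" status
    shift
    "$@" >>"$logfile" 2>&1
    status=$?
    echo "command '$*' exited with status $status" >>"$logfile"
    return "$status"
}

# Intended use inside the check script (not executed here):
#   run_logged /var/log/services.log service dhcpd start
```

If the start really fails under cron, the log should then contain the error text instead of only the FAILED line.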
syslog shows nothing special; a snippet is below. The
only things I noticed are the entries
- Oct 19 05:49:47 dsasrv dhcpd: nss_ldap: reconnecting to LDAP server...
- Oct 19 05:49:47 dsasrv dhcpd: nss_ldap: reconnected to LDAP server after 1 attempt(s)
They appear every morning at the times dhcpd dies.
I don't know exactly what they mean, but they
don't sound terrible to me.
That's all I know. Can you point me in a
direction to search for the problem?
Thanks in advance,
Clemens
-------------
The script/logs:
--------
1) The script that should recover the dhcpd server:
#!/bin/bash
PID="/var/run/dhcpd.pid";
LOGFILE="/var/log/services.log";
function reincarnate() {
    service dhcpd start>/dev/null;
    #TMP=`ps -e | grep $(cat $PID) | wc -l`;
    if [ `ps -e | grep $(cat $PID) | wc -l` == "0" ]; then
        echo "$(date +%c) : DHCP died, reincarnation FAILED.">>$LOGFILE;
        echo "Critical error: Failed to reincarnate DHCPD!";
    else
        echo "$(date +%c) : DHCP died, reincarnation SUCCEEDED.">>$LOGFILE;
        echo "Reincarnation succeeded.";
    fi
}
if [ -f $PID ]; then
    TMP=`ps -e | grep $(cat $PID) | wc -l`;
    if [ $TMP == "0" ]; then
        echo "DHCPD died, reincarnate...";
        reincarnate;
    else
        echo "DHCPD is running.";
    fi
else
    echo "No lockfile found. Assuming service is down and reincarnate...";
    reincarnate;
    exit;
fi
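
The `ps -e | grep $(cat $PID)` test above can misfire: grep matches the PID as a substring anywhere in the line (PID 123 also matches 1234), and if the PID file is missing, grep is called without a pattern and the count is always 0. A minimal sketch of a stricter check using `kill -0`, which sends no signal and only tests whether the process exists; pid_alive is a made-up name, and the check assumes the script runs as root, as the cron entry in the syslog suggests:

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the PID file is readable, holds a
# plain number, and a process with that PID currently exists.
pid_alive() {
    local pidfile="$1" pid=""
    [ -r "$pidfile" ] || return 1
    # read strips surrounding whitespace; its exit status is ignored because
    # a file without a trailing newline still fills $pid.
    read -r pid < "$pidfile"
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely numeric
    esac
    # Signal 0 performs only the existence/permission check.
    kill -0 "$pid" 2>/dev/null
}

# Intended use in the check script (not executed here):
#   if pid_alive /var/run/dhcpd.pid; then echo "DHCPD is running."; fi
```

This only tells whether some process has that PID; it does not verify that the process is actually dhcpd.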
----------
2) The output that script no. 1 produced this morning.
The first two entries are the result of a test run
yesterday. The last entry was generated after
restarting dhcpd and "kill"ing its process:
Mo 18 Okt 2004 14:55:20 CEST : DHCP died, reincarnation SUCCEEDED.
Mo 18 Okt 2004 14:55:41 CEST : DHCP died, reincarnation SUCCEEDED.
Tue Oct 19 05:50:00 2004 : DHCP died, reincarnation FAILED.
Tue Oct 19 05:51:00 2004 : DHCP died, reincarnation FAILED.
<...>
Tue Oct 19 08:55:00 2004 : DHCP died, reincarnation FAILED.
Di 19 Okt 2004 08:55:05 CEST : DHCP died, reincarnation SUCCEEDED.
----------
3) The syslog around the time dhcpd dies (seems to be
between 05:49:47 and 06:00:00):
Oct 19 05:49:00 dsasrv CROND[12904]: (root) CMD (/usr/local/sbin/check_dhcpd)
Oct 19 05:49:00 dsasrv postfix/pickup[12417]: 9F6B8A3AF0: uid=0 from=<root>
Oct 19 05:49:00 dsasrv postfix/cleanup[11808]: 9F6B8A3AF0: message-id=<(E-Mail Removed)>
Oct 19 05:49:00 dsasrv postfix/qmgr[1437]: 9F6B8A3AF0: from=<root@dsanet>, size=445, nrcpt=1 (queue active)
Oct 19 05:49:00 dsasrv postfix/local[11810]: 9F6B8A3AF0: to=<root@dsanet>, orig_to=<root>, relay=local, delay=0, status=sent (mailbox)
Oct 19 05:49:47 dsasrv dhcpd: Wrote 0 deleted host decls to leases file.
Oct 19 05:49:47 dsasrv dhcpd: Wrote 0 new dynamic host decls to leases file.
Oct 19 05:49:47 dsasrv dhcpd: Wrote 16 leases to leases file.
Oct 19 05:49:47 dsasrv dhcpd: nss_ldap: reconnecting to LDAP server...
Oct 19 05:49:47 dsasrv dhcpd: nss_ldap: reconnected to LDAP server after 1 attempt(s)
Oct 19 05:50:00 dsasrv CROND[12914]: (root) CMD (/usr/local/sbin/check_dhcpd)
Oct 19 05:50:00 dsasrv postfix/pickup[12417]: BE02BA3AF0: uid=0 from=<root>
Oct 19 05:50:00 dsasrv postfix/cleanup[11808]: BE02BA3AF0: message-id=<(E-Mail Removed)>
Oct 19 05:50:00 dsasrv postfix/qmgr[1437]: BE02BA3AF0: from=<root@dsanet>, size=565, nrcpt=1 (queue active)
Oct 19 05:50:00 dsasrv postfix/local[11810]: BE02BA3AF0: to=<root@dsanet>, orig_to=<root>, relay=local, delay=0, status=sent (mailbox)