Server slows down after a few days of uptime

Asked by Snaglpus00

I have installed the 7.04 server version on an IBM 2.4 GHz machine with 1 GB RAM and a 160 GB HD, no GUI installed. I ran the LAMP configuration on install and had no problems getting everything up and running. The server is running a simple Twiki documentation server as well as one other web page that does some test case management (commercial software designed to run on Linux with apache2/mod_perl). There are at most about 5 users on the server at a time.

The problem is that after a few days of uptime the server grows increasingly slow. Web pages take longer to download, ssh takes a little longer to even show a login prompt, and top takes a few minutes to display anything. The real kicker is that I wrote a small script that gets the time, sleeps for 5 seconds, gets the new time and prints the difference. After a fresh reboot this works fine, and it usually keeps working for a few days; then it starts to report longer times, such as 32 seconds. So why does a script counting to 5 take 32 seconds of real time?
Script Start:
#!/bin/bash

# Loop forever: time a 5-second sleep and report how long it really took.
while true; do
    START=$(date +%s)
    echo "$START"
    echo "sleeping"
    sleep 5
    END=$(date +%s)
    echo "$END"
    DIFF=$(( END - START ))
    echo "It took $DIFF seconds"
    echo "re-running"
done

EOF
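
A variant of the script above that writes one timestamped line per run may make it easier to see when the slowdown starts and to match it against other logs. This is only a sketch: the function names `measure_sleep` and `log_line`, the 1-second tolerance, and the log file name in the usage comment are assumptions, not part of the original script.

```bash
#!/bin/bash
# Sketch: time a sleep and produce one log line per run.

# Print how many wall-clock seconds a "sleep $1" actually took.
measure_sleep() {
    local start end
    start=$(date +%s)
    sleep "$1"
    end=$(date +%s)
    echo $(( end - start ))
}

# Print a timestamped line; mark runs that overshoot by more than 1 second
# (the 1-second tolerance is an arbitrary assumption).
log_line() {
    local expected=$1 actual=$2 status="ok"
    if [ $(( actual - expected )) -gt 1 ]; then
        status="SLOW"
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') expected=${expected}s actual=${actual}s $status"
}

# Example use (hypothetical log path):
#   while true; do log_line 5 "$(measure_sleep 5)" >> "$HOME/sleep-drift.log"; done
```

Appending to a file instead of printing to the terminal means the record survives even when the ssh session itself becomes too slow to watch.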

-The PC is not swapping and has plenty of memory to spare according to free; memtest passed, and ps axv does not show any abnormal processes. However, top takes a while to respond and everything seems to run slower.
-Rebooting fixes this, but having to reboot a server every 3-4 days (the longest I got was 6) is not acceptable for a server version.
-The days on which it becomes slow are random, so I don't think a cron job is causing the problem.
-There are no full partitions (not even close; the most used partition is at maybe 5-10%).
-Load average is 0.1.
-Nothing in dmesg, apache_error.log, kern.log or other logs shows me what might be going on.
-Apache is restarted nightly for backup reasons.
-monit is monitoring the apache server as well as CPU usage, and sends no alerts when the server becomes slow (but it then does not check every 120 as configured, more like every 10 min, which seems to be linked to the counting problem above).

Any ideas are welcome, and I will gather whatever info is requested when the server is in a bad state again.

~K

Question information

Language: English
Status: Solved
For: Ubuntu
Assignee: No assignee
Solved by: Snaglpus00
Dmitry Mityugov (dmitry-mityugov) said :
#1

Any chance this is caused by a power saving kernel module for the CPU, like p4_clockmod?

Snaglpus00 (kthornberg) said :
#2

To the best of my knowledge that module is not loaded on my machine. I do have it in the lib dir, but it's not in the /etc/modules file. If this is not a good enough check to see whether it's being loaded, please let me know a better / correct way.
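
On the question of a better check: /etc/modules only lists modules to load at boot, and a module can also be loaded automatically or by hand, so looking at what the running kernel reports is more direct. A minimal sketch, assuming the usual /proc/modules listing (the same data lsmod prints); the function name `module_loaded` and its optional second argument are made up for illustration. A driver compiled into the kernel rather than built as a module would not show up here.

```bash
#!/bin/bash
# Return 0 if the named module appears in a /proc/modules-style listing.
# The optional second argument (an assumption, handy for testing) is the
# file to read; it defaults to /proc/modules.
module_loaded() {
    local name=$1 list=${2:-/proc/modules}
    # The first field of each line is the module name.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$list"
}

# Example use:
#   if module_loaded p4_clockmod; then echo "loaded"; else echo "not loaded"; fi
```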

I did add the boot flags noacpi and noapic, and the machine has been up for 3 days with no real problems yet. It usually does not start to slow down until 4-5 days, but it has been as short as 2-3 days.

Thanks and I'll try to keep this updated.

~K

Snaglpus00 (kthornberg) said :
#3

Update: It's been 2 weeks since I added the noacpi flag to the boot parameters, and the server has been up for the 2 weeks straight without a problem. It seems that this might be my fix.

So I am going to close this bug with the resolution being to TURN OFF ALL power management functionality with boot parameters if you are running a true server and want the stability.
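
For anyone trying the same fix, it may be worth confirming that the flags really reached the running kernel, since a typo in the boot loader configuration fails silently. A small sketch, assuming the standard /proc/cmdline file; the function name `has_boot_flag` and its optional file argument are hypothetical, and the flag names are simply the ones mentioned in this thread.

```bash
#!/bin/bash
# Return 0 if the given word appears on the kernel command line.
# The optional second argument (an assumption, handy for testing) is the
# file to read; it defaults to /proc/cmdline.
has_boot_flag() {
    local flag=$1 cmdline=${2:-/proc/cmdline} word
    local -a words
    # Split the single command-line string into whitespace-separated words.
    read -r -a words < "$cmdline" || true
    for word in "${words[@]}"; do
        [ "$word" = "$flag" ] && return 0
    done
    return 1
}

# Example use:
#   for f in noacpi noapic; do
#       if has_boot_flag "$f"; then echo "$f: present"; else echo "$f: missing"; fi
#   done
```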

However, I still think this bug needs to be addressed in future releases.