Invalidly high process stats for some processes on 10.04

Asked by Denis Haskin

On 2 new 10.04 servers (at AWS) that we have recently set up, we get ridiculously high stats for a few processes, even immediately after reboot (this machine has only been up 22 hours at the moment). Looks like it only happens with mysqld and console-kit-daemon.

For example, from ps aux:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
mysql 767 183065341 50.1 36772040 35958884 ? Sl Aug10 48408964:27 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/mnt/mysql/data --user=mysql --log-error=/mnt/mysql/data/mynode.err --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
root 936 208382845 0.0 188336 3996 ? Sl Aug10 46865656:43 /usr/sbin/console-kit-daemon --no-daemon
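
As a rough sanity check on the TIME column above, the sketch below compares the reported CPU time against the most CPU time the machine could have handed out since boot. It assumes the TIME field is in the BSD-style MMM:SS (minutes:seconds) form that ps aux normally prints, takes the 22 hours of uptime from the description above, and takes the 8 CPUs from the cpu000 to cpu007 lines in the atop output. The function names are made up for illustration.

```python
def bsd_time_to_seconds(field):
    """Convert a ps TIME field in MMM:SS form to seconds (assumed format)."""
    minutes, seconds = field.split(":")
    return int(minutes) * 60 + int(seconds)


def is_plausible(cpu_seconds, uptime_seconds, ncpus):
    """A process cannot have used more CPU time than uptime * number of CPUs."""
    return 0 <= cpu_seconds <= uptime_seconds * ncpus


UPTIME_SECONDS = 22 * 3600  # "only been up 22 hours"
NCPUS = 8                   # cpu000..cpu007 in the atop output

if __name__ == "__main__":
    samples = [("mysqld", "48408964:27"), ("console-kit-daemon", "46865656:43")]
    for name, field in samples:
        cpu = bsd_time_to_seconds(field)
        print(name, cpu, "plausible" if is_plausible(cpu, UPTIME_SECONDS, NCPUS) else "impossible")
```

Both values come out several thousand times larger than the 633600 CPU-seconds available in 22 hours on 8 CPUs, so the numbers cannot be real usage, whatever is producing them.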

atop shows similar strangeness:

ATOP - perf1 2010/08/11 14:04:35 10 seconds elapsed
PRC | sys 5124095h35m | user 2351959867h05m | #proc 191 | #zombie 0 | #exit 0 |
CPU | sys 28% | user 62% | irq 0% | idle 673% | wait 36% |
CPU | steal 1% | stl/cpu 0% | | | |
cpu | sys 10% | user 18% | irq 0% | idle 55% | cpu000 w 16% |
cpu | sys 8% | user 15% | irq 0% | idle 75% | cpu001 w 1% |
cpu | sys 6% | user 11% | irq 0% | idle 79% | cpu002 w 4% |
cpu | sys 2% | user 7% | irq 0% | idle 90% | cpu003 w 1% |
cpu | sys 2% | user 6% | irq 0% | idle 92% | cpu004 w 0% |
cpu | sys 0% | user 2% | irq 0% | idle 86% | cpu005 w 12% |
cpu | sys 0% | user 2% | irq 0% | idle 98% | cpu006 w 1% |
cpu | sys 0% | user 1% | irq 0% | idle 99% | cpu007 w 0% |
CPL | avg1 4.63 | avg5 4.58 | avg15 3.76 | csw 119617 | intr 124132 |
MEM | tot 68.4G | free 6.8G | cache 21.5G | buff 216.6M | slab 163.2M |
SWP | tot 0.0M | free 0.0M | | vmcom 37.9G | vmlim 34.2G |
DSK | sdb | busy 40% | read 0 | write 937 | avio 4 ms |
NET | transport | tcpi 5226 | tcpo 4696 | udpi 0 | udpo 0 |
NET | network | ipi 5226 | ipo 4696 | ipfrw 0 | deliv 5226 |
NET | eth0 ---- | pcki 5167 | pcko 4637 | si 858 Kbps | so 6140 Kbps |
NET | lo ---- | pcki 60 | pcko 60 | si 2 Kbps | so 2 Kbps |

  PID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK ST EXC S CPU CMD 1/1
  767 5124095h35m 2351959867h05m -34.1M 1508K 0K 872K -- - S 800% mysqld
31876 0.01s 0.41s 0K 0K 0K 0K -- - S 4% ruby
28193 0.01s 0.01s 0K 0K 0K 0K -- - R 0% atop
  449 0.01s 0.00s 0K 0K 0K 8K -- - S 0% kjournald
 5801 0.00s 0.00s 0K 0K 0K 0K -- - S 0% haproxy
  159 0.00s 0.00s 0K 0K 0K 4K -- - S 0% kjournald

Any thoughts? We don't see this on very similar 9.10 servers.

We don't do anything unusual in the setup for these (e.g. they are stock 64-bit Amazon AMIs, not patched, etc.). The only thing slightly unusual is that we use a third-party MySQL storage engine (Tokutek), but they have never seen this issue either (although they said they don't do as much testing on Ubuntu; they mostly test on CentOS and Fedora).

Thanks,

dwh

Question information

Language: English
Status: Answered
For: Ubuntu
Assignee: No assignee
Hans Spaans (hspaans) said :
#1

Nothing to worry about, really; just don't trust commands like top and atop too much. How they calculate certain values is sometimes fairly close to black magic, luck, and guesswork, and it has been that way since the first version of top and related tools. If you want more background, I advise you to read up on why Sun Microsystems created DTrace and how they are slowly replacing certain tools for their next big release. Hopefully some of that engineering effort will have made it into Linux by the time it goes mainstream; until there is a better top, you have to use it with care.
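
If you would rather look at the raw numbers than at what top or ps make of them, here is a minimal sketch that reads the per-process CPU counters straight from /proc/[pid]/stat. It assumes the layout described in the proc(5) man page, where utime is field 14 and stime is field 15, both in clock ticks; the function name is made up for illustration. If the raw counters are already absurd, the tools are only reporting what the kernel gives them.

```python
import os


def cpu_seconds_from_stat(stat_line, ticks_per_second):
    """Return (user_seconds, system_seconds) from one /proc/[pid]/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces, so split on the last ')' before tokenizing.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 is rest[11] and field 15 is rest[12].
    utime_ticks = int(rest[11])
    stime_ticks = int(rest[12])
    return utime_ticks / ticks_per_second, stime_ticks / ticks_per_second


if __name__ == "__main__":
    path = "/proc/self/stat"  # replace "self" with a PID such as 767
    if os.path.exists(path):
        ticks = os.sysconf("SC_CLK_TCK")  # clock ticks per second
        with open(path) as handle:
            user, system = cpu_seconds_from_stat(handle.read(), ticks)
        print("user %.2fs, system %.2fs" % (user, system))
```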
