Wednesday, October 31, 2012

TOP - The Monitoring Tool


Top command provides a real-time look at what is happening with your system. Top produces so much output that a new user may get over whelmed with all thats presented and what it means.

Lets take a look at TOP one line at a time. The server has been flooded with http requests to create some load on the server.
top output:
top - 22:09:08 up 14 min,  1 user,  load average: 0.21, 0.23, 0.30
Tasks:  81 total,   1 running,  80 sleeping,   0 stopped,   0 zombie
Cpu(s):  9.5%us, 31.2%sy,  0.0%ni, 27.0%id,  7.6%wa,  1.0%hi, 23.7%si,  0.0%st
Mem:    255592k total,   167568k used,    88024k free,    25068k buffers
Swap:   524280k total,        0k used,   524280k free,    85724k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3166 apache    15   0 29444 6112 1524 S  6.6  2.4   0:00.79 httpd
 3161 apache    15   0 29444 6112 1524 S  5.9  2.4   0:00.79 httpd
 3164 apache    15   0 29444 6112 1524 S  5.9  2.4   0:00.75 httpd
 3169 apache    15   0 29444 6112 1524 S  5.9  2.4   0:00.74 httpd
 3163 apache    15   0 29444 6112 1524 S  5.6  2.4   0:00.76 httpd
 3165 apache    15   0 29444 6112 1524 S  5.6  2.4   0:00.77 httpd
 3167 apache    15   0 29444 6112 1524 S  5.3  2.4   0:00.73 httpd
 3162 apache    15   0 29444 6112 1524 S  5.0  2.4   0:00.77 httpd
 3407 root      16   0  2188 1012  816 R  1.7  0.4   0:00.51 top
  240 root      15   0     0    0    0 S  0.3  0.0   0:00.08 pdflush
  501 root      10  -5     0    0    0 S  0.3  0.0   0:01.20 kjournald
 2794 root      18   0 12720 1268  560 S  0.3  0.5   0:00.73 pcscd
    1 root      15   0  2060  636  544 S  0.0  0.2   0:03.81 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
    5 root      10  -5     0    0    0 S  0.0  0.0   0:00.07 events/0
The first line in top:
top - 22:09:08 up 14 min,  1 user,  load average: 0.21, 0.23, 0.30
“22:09:08″ is the current time; “up 14 min” shows how long the system has been up for; “1 user” how many users are logged in; “load average: 0.21, 0.23, 0.30″ the load average of the system (1minute, 5 minutes, 15 minutes).
Load average is an extensive topic and to understand its inner workings can be daunting. The simplest of definitions states that load average is the cpu utilization over a period of time. A load average of 1 means your cpu is being fully utilized and processes are not having to wait to use a CPU. A load average above 1 indicates that processes need to wait and your system will be less responsive. If your load average is consistently above 3 and your system is running slow you may want to upgrade to more CPU’s or a faster CPU.
The second line in top:
Tasks:  82 total,   1 running,  81 sleeping,   0 stopped,   0 zombie
Shows the number of processes and their current state.
The third lin in top:
Cpu(s):  9.5%us, 31.2%sy,  0.0%ni, 27.0%id,  7.6%wa,  1.0%hi, 23.7%si,  0.0%st
Shows CPU utilization details. “9.5%us” user processes are using 9.5%; “31.2%sy” system processes are using 31.2%; “27.0%id” percentage of available cpu; “7.6%wa” time CPU is waiting for IO.
When first analyzing the Cpu(s) line in top look at the %id to see how much cpu is available. If %id is low then focus on %us, %sy, and %wa to determine what is using the CPU.
The fourth and fifth lines in top:
Mem:    255592k total,   167568k used,    88024k free,    25068k buffers
Swap:   524280k total,        0k used,   524280k free,    85724k cached
Describes the memory usage. These numbers can be misleading. “255592k total” is total memory in the system; “167568K used” is the part of the RAM that currently contains information; “88024k free” is the part of RAM that contains no information; “25068K buffers and 85724k cached” is the buffered and cached data for IO.
So what is the actual amount of free RAM available for programs to use ?
The answer is: free + (buffers + cached)
88024k + (25068k + 85724k) = 198816k
How much RAM is being used by progams ?
The answer is: used – (buffers + cached)
167568k – (25068k + 85724k) = 56776k
The processes information:
Top will display the process using the most CPU usage in descending order. Lets describe each column that represents a process.
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
3166 apache    15   0 29444 6112 1524 S  6.6  2.4   0:00.79 httpd
PID – process ID of the process
USER – User who is running the process
PR – The priority of the process
NI – Nice value of the process (higher value indicates lower priority)
VIRT – The total amount of virtual memory used
RES – Resident task size
SHR – Amount of shared memory used
S – State of the task. Values are S (sleeping), D (uninterruptible sleep), R (running), Z(zombies), or (stopped or traced)
%CPU – Percentage of CPU used
%MEM – Percentage of Memory used
TIME+ – Total CPU time used
COMMAND – Command issued

Interacting with TOP

Now that we are able to understand the output from TOP lets learn how to change the way the output is displayed.
Just press the following key while running top and the output will be sorted in real time.
M – Sort by memory usage
P – Sort by CPU usage
T – Sort by cumulative time
z – Color display
k – Kill a process
q – quit
If we want to kill the process with PID 3161, then press “k” and a prompt will ask you for the PID number, and enter 3161.

Command Line Parameters with TOP

You can control what top displays by issuing parameters when you run top.
- d – Controls the delay between refreshes
- p – Specify the process by PID that you want to monitor
-n – Update the display this number of times and then exit
If we want to only monitor the http process with a PID of 3166
$ top -p 3166
If we want to change the delay between refreshes to 5 seconds
$ top -d 5


No comments:

Post a Comment