Sophie

Sophie

distrib > Mandriva > 2010.1 > x86_64 > media > contrib-release > by-pkgid > 7e26efb37d341018703ae869a9607e64 > files > 53

collectl-3.4.0-1mdv2010.1.noarch.rpm

<html>
<head>
<link rel=stylesheet href="style.css" type="text/css">
<title>collectl - Process Monitoring</title>
</head>

<body>
<center><h1>Process Monitoring</h1></center>
<p>
<h3>Introduction</h3>
Collectl has the ability to monitor processes in pretty much the same way as
ps or top do as can be see here:

<div class=terminal-wide14>
<pre>
# collectl -sZ
# PROCESS SUMMARY (faults are /sec)
# PID  User     PR  PPID S   VSZ   RSS  SysT  UsrT Pct  AccuTime MajF MinF Command
21502  root     15  1749 S    6M    2M  0.00  0.00   0   0:06.40    0    0 /usr/sbin/sshd
21504  root     15 21502 S    4M    1M  0.00  0.00   0   0:00.79    0    0 -bash
22984  root     15     1 S    7M    1M  0.00  0.00   0   0:00.78    0    0 cupsd
23073  apache   15  1914 S   18M    8M  0.00  0.00   0   0:00.01    0    0 /usr/sbin/httpd
</pre>
</div>

You can select processes to monitor by pid, parent, owner or command name (see
section on Filters below).
When using names, you can use partial or full names or even use strings that
were part of the command invocation string such as parameters.  
The main benefit of monitoring processes with
collectl is that you can coordinate the sample times of process data with
any of the other subsystems collectl can monitor.
<p>
The way you tell collectl to monitor processes is to specify the Z subsystem
and any optional parameters with --procopts.  Since
monitoring processes is a heavier-weight function, it is recommended to use a
different interval, which can be specified after the main monitoring interval
separated by a colon.  The default is 60 seconds.  Therefore, to monitor all
the processes once every 20 seconds and the rest of the parameters every 5
simply say:

<pre>
collectl -sZ -i5:20
</pre>

The biggest mistake people make when running this command interactively is to
leave off the interval or specificy something like -i1 and not see any process
data.  That is because the default interval is 60 seconds and they just
haven't waited long enough for the output!  This should obvious since collectl
will announce it is <i>waiting for a 60 second sample</i>.
<p>
There are also a few restrictions to the way these intervals are specified.  The
process interval must be a multiple of the main interval AND cannot be less
than it.  If you specify a process interval without a main interval, the main
interval defaults to the process interval.
<p>
Finally, as with other data collected by collectl, you can play back process
data by specifying -p.  While not exactly plottable data, you can specify
-P and the output will be written to a separate file as time stamped space
delimited data, one process per line.
<p>
<h3>Options</h3>
As stated earlier, you can specify options specific to process monitoring.  These
apply to all forms of process output unless otherwise noted, including <i>--top</i>
and <i>--procanalyze</i>
<ul>
<li>c: include cpu times of children who have exited with their parent</li>
<li>f: use cumulative totals for maj/min page faults</li>
<li>i: Show alternate format which includes <i>all</i> io counters</li>
<li>m: Show alternate format which includes <i>all</i> memory sizes</li>
<li>p: Never look for new pids or threads after startup (improves performance)</li>
<li>r: Only show root name of command, leaving off path</li>
<li>t: Look for thread for all processes</li>
<li>w: Include command arguments, making a <i>wider</i> display.  Can be combined with r.</li>
<li>z: Only show processes with non-zero sort field.  This only applies to --top.</li>
</ul>
<p>
<h3>Filters</h3>
You can tell collectl to monitor a subset of processes by using the 
<i>--procfilt</i> switch followed by one or more
process selectors, separated by commas (see <i>collectl -x</i> to see a
detailed list).  For the most part, the use of filters is pretty
straighforward in that if you want to see all processes whose parent is 1234
as well as those that contain <i>http</i> in their command name, you would
specify a filter of <i>--procfilt P1234,chttp</i>.  However there is one
important distinction to keep in mind.  The c prefix says to select on the
command name only whereas the f prefix says to look at the entire command 
string including arguments.  In other words, if you're editing the file
<i>abc</i> and try to select it via <i>--procfilt cabc</i> you'll never see it.
This is particularly annoying when monitoring perl scripts since the name of
the command is perl and the name of the script shows up as an argument.
 <p>
If a plus sign immediately follows a
process selector any processes selected by it will have their threads
monitored as well.  See <i>collectl -x</i> or <i>man collectl</i> for more details.
<p>
<h3>Dynamic Process/Thread Monitoring</h3>
A unique feature of process monitoring is that processes specified with a
selection list via --procfilt do
not have to exist at the time collectl is run.  In other words, collectl will
continue to look for new processes that match this selection list during every
collection cycle!  While this is indeed a good thing if that is
what you want to do, it does
come with a price in overhead: <i>not a lot, but overhead never-the-less</i>.
<p>
If you do not want this effect and only want to look at those processes that match
the selection list at the time collectl is started, specify <i>--procopts p</i> 
to suppress dynamic process discovery.
This holds for process threads as well, suppressing looking for new ones.
<p>
Perhaps the best way to see this in effect is to run collectl with the
following command:

<pre>
collectl -i:.1 -sZ --procfilt fabc
</pre>

The .1 for an interval is not a mistake.  It is there to show that you
can indeed use collectl to spot the appearance of short
lived processes - <i>just don't do it unless you really need to</i>.  The <i>--procfilt</i>
switch is saying to look for any processes invoked with a command that contains the
string 'abc' in it.  When this command is invoked there shouldn't be any output unless someone IS
running a command with 'abc' in it.  Now go to a
different window or terminal and edit the file abc with your favorite editor.  
You will immediately see
collectl display process information for your editor and when you exit the editor the output will stop.
<p>
<h3>The Time Fields</h3>
The SysT and UsrT represent the system and user time the line item spent during
the current interval.  One might think this means that in a 60 second interval the
most time a process could spend is 60 seconds.  Not quite!  If this is a
multi-processor/multi-core system the process could actually spend up to 60 seconds
on each core, so just be careful how the times are interpretted.  The Pct field is
the percentage of the current interval the process had consumed in system and user
time, which can also exceed 100% in multi-processor situations.  Finally, since the
AccuTime field accumulates these times it can exceed the actual wall clock time.
<p>
When run in non-threaded mode, the times reported include all time consumed by
all threads.  When run in threaded mode, times are reported for indivual
threads as well as the main process.  In other words, if a
process's only job is to start threads, it will typically show times of 0.  If
you rerun collectl in non-threaded mode you will see it report aggregated
times.
<p>
<h3>Process Memory Utilization</h3>
The types of memory utilization displayed as part of the process monitoring output
are the <i>Virtual</i> and <i>Resident</i> sizes.  However there are additional
type of memory that collectl tracks and to see them as well as page faults
you can select alternate process display format as follows:

<div class=terminal-wide14>
<pre>
# collectl -sZ -i:1 --procopts m
# PID  User     S VmSize  VmLck  VmRSS VmData  VmStk  VmExe  VmLib MajF MinF Command
 9410  root     R 81760K      0 15828K 14132K    84K    16K  3620K    0   18 /usr/bin/perl
    1  root     S  4832K      0   556K   212K    84K    36K  1388K    0    0 init
    2  root     S      0      0      0      0      0      0      0    0    0 kthreadd
    3  root     S      0      0      0      0      0      0      0    0    0 migration/0
</pre>
</div>

<h3>Process I/O Statistics</h3>
As of collectl Version 2.4.0, if process I/O stats have
been built into the kernel collectl will add 2 additional columns to the
process display named <i>RKB</i> and <i>WKB</i>,
noting in the following example I've set the display interval to
1 second and removed the initialization message from the output.  As with all
fields reported as rates/sec these will show consistent values independent of
the interval and if you want the <i>unnormalized</i> value be sure to include
that option with the -o switch as -on.

<div class=terminal-wide14>
<pre>
# collectl -sZ -i:1
# PROCESS SUMMARY (faults are /sec)
# PID  User     PR  PPID S   VSZ   RSS  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
    1  root     20     0 S    4M  552K  0.00  0.00   0   0:00.68    0    0    0    0 init
    2  root     15     0 S     0     0  0.00  0.00   0   0:00.00    0    0    0    0 kthreadd
    3  root     RT     2 S     0     0  0.00  0.00   0   0:00.02    0    0    0    0 migration/0
</pre>
</div>

A particularly useful feature I've found is monitoring one or more processes by name (you
can also monitor by pid, ppid and uid) to see
what they're doing.  In this case I'm using the dt program to write a large file and telling
collectl to display any process whose command string matches <i>dt</i> as well as to include
time stamps.

<div class=terminal-wide14>
<pre>
# collectl -sZ -i:1 --procfilt cdt -oT
# PROCESS SUMMARY (faults are /sec)
#          PID  User     PR  PPID S   VSZ   RSS  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
09:01:03 13577  root     20 12775 R    1M    1M  0.04  0.00   4   0:01.92    0  16K    0    0 ./dt
09:01:04 13577  root     20 12775 D    1M    1M  0.40  0.00  40   0:02.32    0 118K    0    0 ./dt
09:01:05 13577  root     20 12775 D    1M    1M  0.24  0.00  24   0:02.56    0  65K    0    0 ./dt
</pre>
</div>

Finally, note that there is more process I/O data available but I chose to leave it off
the default display and instead have the following alternate format. This is
the same methodology used for reporting process memory utilitation, namely you only
see VSZ and RSS in the default display but much more with <i>--procopts m</i>.  
Also note in this caseI chose 1/2 second monitoring as well as showing time in msec resolution:

<div class=terminal-wide14>
<pre>
# collectl -sZ --procopts i -i:.5 --procfilt cdt -oTm
#              PID  User     S  SysT  UsrT   RKB   WKB  RKBC  WKBC  RSYS  WSYS  CNCL  Command
09:03:24.003 13614  root     D  0.12  0.00     0   32K     0   32K     0    64     0  ./dt
09:03:24.503 13614  root     D  0.14  0.00     0   32K     0   32K     0    64     0  ./dt
09:03:25.003 13614  root     R  0.10  0.00     0   24K     0   24K     0    48     0  ./dt
</pre>
</div>

<h3>The <i>--top Switch</i></h3>
A feature that has been in collectl for awhile has been the <i>--top</i> switch which generates
data in a format similar to the linux <i>top</i> command, though it was limited to process data
only.  However, since the inclusion of process I/O statistics as well as the inclusion of a few
additional handy switches, this command has become much more useful as explained below.
<p>
In its simplest form, this switch tells collectl to simply display the top consumers of cpu.
However, as of collectl V2.6.4 you can now now tell it to optionally display the list sorted by
I/O or page faults.  
Here I'm simply looking for the top processes sorted by page faults with the
command <i>collectl --top flt</i> and the display fills my window, which in this case is only 
10 lines high.  To look at the top consumers of I/O, simply use <i>--top io</i> instead:

<div class=terminal-wide14>
<pre>
# PID  User     PR  PPID S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
 3009  root     20     1 S    2M  280K  3  0.00  0.00   0   0:43.01    0    0    0    8 irqbalance
 7144  root     20  6485 R   81M   15M  2  0.00  0.06   6   0:01.70    0    0    0    5 /usr/bin/perl
    1  root     20     0 S    4M  556K  2  0.00  0.00   0   0:03.60    0    0    0    0 init
    2  root     15     0 S     0     0  2  0.00  0.00   0   0:00.00    0    0    0    0 kthreadd
    3  root     RT     2 S     0     0  0  0.00  0.00   0   0:00.10    0    0    0    0 migration/0
    4  root     15     2 S     0     0  0  0.00  0.00   0   0:00.06    0    0    0    0 ksoftirqd/0
    5  root     RT     2 S     0     0  0  0.00  0.00   0   0:00.30    0    0    0    0 watchdog/0
    6  root     RT     2 S     0     0  1  0.00  0.00   0   0:00.08    0    0    0    0 migration/1
    7  root     15     2 S     0     0  1  0.00  0.00   0   0:00.02    0    0    0    0 ksoftirqd/1
</pre></div>

As discussed earlier, threads can be considered for displaying
with <i>--procopts t</i>
which requests <i>all selected processes</i> be examined for threads.  You can also
specify a subset of processes be examined by specifying a + with 
<i>--procfilt</i> but that's getting into more advanced concepts.
Fnally, in the spirit of saving screen real estate collectl doesn't include command arguments in
the output but including <i>--procopts w</i> will request a wider display that does include
them.  In fact you can get an even narrow display by including <i>--procopts</i> which requests
only a command's root name be displayed so in the example about we would see <i>perl</i>
instead if <i>/usr/bin/per</i>.
<p>
The following 3 successive displays are the result of
monitoring a processes named <i>thread.pl</i> which creates a couple of threads 10
seconds apart which then do some I/O.  In the first display we see the main script, which is
actually run under the perl interpretter and so the string <i>thread</i> does exist as part of
the argument string, but I chose to leave it off the output to save screen real estate:

<div class=terminal-wide14>
<pre>
collectl --top io --procfilt fthread --procopt t
# PROCESS SUMMARY (faults are /sec) 06:57:42
# PID  User     PR  PPID S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF CommandnF Command
 7024  root     20  6725 S   61M    2M  2  0.00  0.00   0   0:00.00    0    0    0    0 /usr/bin/perl
</pre></div>

A few seconds later the first thread starts up and immediately goes to the top of the list since
it does have the dominant I/O:

<div class=terminal-wide14>
<pre>
# PROCESS SUMMARY (faults are /sec) 06:57:52
# PID  User     PR  PPID S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
 7065+ root     20  6725 R   73M    5M  0  0.88  0.12 100   0:01.98    0 291K    0    0 thread.pl
 7064  root     20  6725 S   73M    5M  2  0.88  0.11  99   0:01.98    0    0    0    0 /usr/bin/perl
</pre></div>

And still later the second thread shows up, it too having a higher sort order than the root script:

<div class=terminal-wide14>
<pre>
# PROCESS SUMMARY (faults are /sec) 06:58:02
# PID  User     PR  PPID S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
 7098+ root     20  6725 R   83M    8M  1  0.12  0.02  14   0:00.86    0  29K    0    0 thread.pl
 7096+ root     20  6725 R   83M    8M  0  0.16  0.00  16   0:04.24    0  27K    0    0 thread.pl
 7095  root     20  6725 S   83M    8M  2  0.28  0.02  30   0:05.13    0    0    0    0 /usr/bin/perl
</pre></div>

Naturally, as with all other data in collectl, you can record it to a file and play it
back later using various combinations of options with <i>--procopts</i> and even
<i>--top</i>.  If you are using <i>--top</i> collectl simply displays a blank line
between intervals.  Also don't forget to try different sort options
and experioment with the number of lines per process interval you want to 
examine since the default is your screen height and may be too big for 
playback purposes.
<p>
<b>Including non-process data</b><br>
The native <i>top</i> can natively show other types of data besides the top processes
and so can collectl.  Just specify those subsystems you are interested in with -s and
they will be displayed in a scrolling window above the process data - by
scrolling multiple lines of data, you are able to see history, something the
linux command cannot do.  You may also want to include timestamps with the
brief data by using -oT to make it easier to read.
<p>
But don't stop with brief data, you can even show verbose data as well.  However
in the case of multiple subsystems it just isn't practical to show scrolling
history and so you will only see the latest sample.  If you choose to show
a single verbose subsystem you will see scrolled data.
<p>
Finally, if you want to customize the way the screen real-estate is allocated between 
the process and other data, you can change the size of the process section
by including the number of lines to display as the second argument to 
<i>--top</i>.  You can also control the size of the subsystem data with
<i>--hr lines</i>, a synonym for <i>--headerrpeat lines</i>.

<p>
<h3>Experimental <i>--export proctree</i></h3>
This is an experimental (meaning subject to change) alternate format for displaying
process data.  Rather than simply show processes in the defaut PID order or sorted 
by a particular field when using the --top format, this format displays processes
in a parent/child relationship.  As with all <i>--export</i> formats, one can use 
this interactively, when playing
back data or to send the data over a socket when using <i>--address</i>.
At the very least, this could offer a good
starting point for developing your own alternative process output formats.
<p>
There are actually 2 main functional components to this format, the main one 
being to determine the parent child relationship between all processes (there IS 
some additional overhead involved here).  A second function is the aggregation of
various counters and meters.  
<p>
Proctree can also be combined with --top to limit the number of processes display OR
in playback mode with or without --top.  Consider the following output when playing 
back a file with <i>--top --export proctree</i>:

<div class=terminal-wide14>
<pre>
#  PID       PPID User     PR S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime MajF MinF Command
00001           0 root     15 S    2G  108M  0  0.03  0.03   0   0:18.21    0    0 init
 05535          1 root     15 R  106M   15M  2  0.01  0.03   0   0:07.68    0    0  /usr/bin/perl
 05452          1 haldaemo 15 S   85M    7M  1  0.02  0.00   0   0:06.99    0    0  hald
  05453      5452 root     15 S   55M    3M  0  0.02  0.00   0   0:05.42    0    0   hald-runner
   05474     5453 root     16 S    9M  652K  0  0.02  0.00   0   0:05.41    0    0    hald-addon-storage:
</pre></div>

One can quickly see that the total CPU consumption for this monitoring interval is 0.03 of
both system and user time by simply looking at root process 1.  Furthermore, of the system time
0.01 is consumed by 05535 while the other 0.02 is consumed by one of the children of 05452, actually
the grandchild 05474.  One should also note that any processes with no CPU time will be excluded
from the display to keep the output reasonable dense.  Without --top all processes are shown.
<p>
One can also use most of the process options as well (see --showsubopts for the complete set).
<p>
<b>Additional interactive --top options</b>
<br>
Proctree was really developed for real-time display with <i>--top</i> and so there are more available
options, the main one to consider is the suppression of fields with zero in them.  In the previous
example, fields with 0 CPU were suppressed because by default <i>--top</i> sorts by CPU (even though
we're not sorting).  If one were to choose a different sort field with <i>--top</i>, proctree will 
use that field to suppress entries with zero in them.  In fact, there are a number
of different switches one can select interactively, one of which is to change the suppression
value from 0 to something else.
<p>
So let's take a closer look at running in interactive mode by typing the command
<pre>
collectl --top --export proctree
</pre>
and at some time after the first data screen is displayed, type RETURN.  You will now see
a menu like this:

<div class=terminal>
<pre>
Enter a command and RETURN while in display mode:
  pid    only display this pid and its children
  a      toggle aggregation between 'on' and 'off'
  dxx    change display hierarchy depth to xx
  i      change display format to 'I/O'
  k      toggle multiplication of I/O numbers by 1024 between 'on' and 'off'
  m      change display format to 'memory'
  p      change display format to 'process'
  h      show this menu
  stype  where 'type' is a valid sorting type (see --showtopopts)
         entries with 0s in those field(s) will be skipped
  wxx    max width for display of command arguments
  z      toggle 'skip' logic between 'on' and 'off'
  Zxx    when skipping, only keep entries with I/O fields > xxKB
Press RETURN to go back to display mode...
</pre></div>

This list shows a number of <i>commands</i> which will change the display contents and/or 
format much in the way you can do with the standard linux <i>top</i> command.  First type
RETURN to get back into real-time display mode and then simply type
a command at any time while collectl is running and the command will take effect on the 
next display cycle.
<p>
These commands fall into several categories, one being those that <i>toggle</i> behavior 
such as aggregation, multiplication and the skip logic.  By default, all values are
aggregated up through their parent hierarchy and typing the a command followed by a
RETURN will turn this
behavior off.  Similarly, when the values of the I/O counters are too large to easily read
you can force their division by 1K with the k command.  And finally, you can disable the
logic that skips zero-based entries with the z command.  If you'd rather skip on some
value other 0 you can set the skip value with Zxxx.
<p>
Look at the display line above the following process data:

<div class=terminal-wide14>
<pre>
Process Tree 09:06:03 [skip when 'time'<=0 is 'on' aggr: 'on' x1024: 'off' depth 5]
#  PID       PPID User     PR S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime MajF MinF Command
00001           0 root     15 S  674M  272M  1  0.00  0.06   6   0:09.96    0    0 init
 01766          1 root     15 S   50M   24M  0  0.00  0.06   6   0:01.30    0    0  /usr/sbin/sshd
  02142      1766 root     15 S   25M   14M  1  0.00  0.06   6   0:00.88    0    0   /usr/sbin/sshd
   02144     2142 root     15 S   18M   12M  1  0.00  0.06   6   0:00.87    0    0    -bash
    02229    2144 root     19 R   14M   10M  0  0.00  0.06   6   0:00.84    0    0     /usr/bin/perl
</pre></div>

Following the time field you can see what the toggle states are of the three fields as well
as the skip value and display depth.  By default you only see 5 levels of the process 
hierarchy but can change this with the d switch.  For example d7 will set the depth to 7.
<p>
As with other process displays, you can also choose whether you want to see the default display, one
that shows all I/O fields or one focused on memory using the p, i or m commands.  You can easily
switch between these formats at any time.
<p>
If you enter a number as a command, this is interpretted as
a process PID and the display will be adjusted such that this becomes the first entry in the
display.  If you would like to skip on something other than the current field, you can easily
change that with the s command immediately following by one of the sort field names listed with
<i>--showtop</i>.  Finally, if using the wide command option with <i>--procopts w</i>, long command
string will cause wrapping and make the display unreadable.  The w command can be used to set
the maximum width of the command field.
<p>
As with other collectl options, there are simply far too many combinations to describe which
are appropriate for a particular situation (such as using --procopt) so it is recommended 
you experiment to better understand the many capabilities of <i>proctree</i>.
<p>
<h3>Process Analysis</h3>
If you've run collectl as a daemon and collected process data, you now have a huge pile of
data and up until version 2.6.5 it wasn't entirely clear what you could do with is short of
writing some scripts to try and interpret it.  Now you have an alternative and that's to
playback one or more data files with the <i>--procanalyze</i> switch.  When this switch is
passed to collectl, it generates a <i>process summary file</i> with the extension of 
<i>prcs</i> in the same direcory as specified with the -f switch.  This file will contain
one line for each unique process and the fields will be separated by collectl's field separator
which by default is a space but something you can also change with the <i>--sep</i> switch.
<p>
The fields themselves summarize all the key data elements associated with each process making
it possible to see the process start/end times, cpu consumption, I/O (if the kernel supports
I/O stats), page faults and even the ranges of the different types of memory consumed.  And since
the data elements are separated by a single character delimeter you can easily load the file
into your favorite spreadsheet and perform deeper analysis (the data is actually not very user 
friendly as written).
<p>
It is also important to remember a couple of things:
<ul>
<li>Not all processes show up in the output if they were created and exited between samples</li>
<li>You can control the time frame that is analyzed by using the <i>--from</i> and 
<i>-thru</i> switches which can be a
useful thing to do when you're interesed in a specific time period and don't want this file
too cluttered</li>
<li>The <i>--procanalyze</i> overrides collectl's default behavior of processing all the data
that has been collected and so will only generate the <i>prcs</i> file(s).  If you want to
generate plot data for other subsystems at the same time be sure to include then with -s.
<li>By default, all data is written in space-separated format which is collectl's standard
default.  Since command arguments (if any) are also space-separated they will each show up
in a different spreadsheet cell, which shouldn't be a big deal but if you want the entire
command string together, you can always use <i>--sep 9</i> to make the data tab-delimited
or even choose a different separator.</li>
</ul>
<p>

<h3>Understanding Processing Overhead</h3>
This is intended to be a brief description of how process monitoring works with
the hope that it will help one use the capability more efficiently and avoid unnecessary
processing overhead.  Normally the overhead is modest, but if you intend to run at higher
monitoring rates or looking at threads it's worth reading further.
<p>
Collectl maintains 2 data structures that control monitoring: 
<i>pids-to-monitor</i> and <i>pids-to-ignore</i>.  
These lists are built at the time collectl starts, so if
<i>--procopts p</i> is not specified, the effect is to execute a ps command 
and save all the
pids in the pids-to-monitor list.  If filters are specified with 
<i>--procfilt</i>, only those
pids that match are placed in pids-to-monitor list and the rest placed in the
pids-to-ignore list and so you can see that when filters are used there can be 
a significant reduction in overhead since collectl need not examine every
processes data.
<p>
If collectl is only monitoring a specific set of processes, either because <i>--procopts p</i>
was specified or <i>procfilt</i> was used and only specified specific pids (not ppids), on
each monitoring pass collectl only looks at the pids in the to-be-monitored
list.  In other words, this is as efficient as it gets because it needn't look
for processes if neither list, aka <i>newly created processes</i>.
<p>
If doing dynamic process monitoring, every monitoring pass collectl has to
read all the pids in <i>/proc</i> to get a list of ALL current processes.  While it ignores any in
the do-not-monitor list, it must look at the rest.  If any of these are in the
to-be-monitored list and have had thread monitoring requested, additional work
is required to see if any new threads have shown up.
Any processes not in the to-be-monitored list are obviously NEW processes and must
then be examined to see if they
match any selection criteria and this involves
reading the <i>/proc/pid/stat</i> file.  That pid is
then placed in one of the two lists.
It should be understood that during any particular interval a lot of processes
come and go, such as cat, ls, etc.  However, these are short lived enough as to
not even be seen by collectl, unless of course collectl is running at a very
fine grained monitoring level.
<p>
Occasionally a process being monitored disappears because it had terminated.
When this happens its pid is removed from the to-be-monitored list.
<p>
Finally, these data structures (and a couple of others that have not been
described) need maintenance to keep them from growing.  If the number of
processes to monitor has been fixed, this maintenance is significantly reduced.
<p>
So the bottom line is if you have to use dynamic monitoring, try to bound the
number of processes and/or threads.  If you really need to see it all, don't
be afraid to but just be mindful of the overhead.  Collecting all process
data with the default interval has been observed to take about 1 minute of CPU
time, which is less than 0.1%, on a lightly loaded Proliant DL380, but that load
will be higher with more active process.
<p>
<h3>RESTRICTIONS</h3>
<ul>
<li>You cannot specify <i>--procfilt</i> during playback mode.  If you need to look at a subset of
the data consider using a filter like <i>grep</i>.</li>
<li>Thread monitoring is limited to 2.6 kernels.</li>
<li>Process I/O monitoring is limited to kernels that have that capability enabled and that
didn't even appear before 2.6.22.  If you don't see the file <i>/proc/self/io</i>, your kernel
was not built with process I/O accounting enabled and you need to get one that has the following
enabled: CONFIG_TASKSTATS, CONFIG_TASK_XACCT and CONFIG_TASK_IO_ACCOUNTING.</li>
<li>There is a bug in the way the kernel reports I/O stats (see bugzilla 
<a href=http://bugzilla.kernel.org/show_bug.cgi?id=10702>10702</a>) such 
that when you exclude threads from the display the parent process I/O counts are not 
aggregated into the I/O counts from the threads resulting in misleading results.
Andrea Righi has provided a patch <a href=http://lkml.org/lkml/2008/5/19/463>here</a> that
provides the correct aggregation, but of course that will require a kernel rebuild.  He
has also informed me that his patch has been included in the 2.6.26 kernel so this should no
longer be an issue from that version forward.
</ul>
</body>
</html>