Sophie

Sophie

distrib > Mandriva > 2010.0 > x86_64 > by-pkgid > 1898956993b565515416aa8e5bf49e8f > files > 20

devmon-0.3.1-0.beta1.1mdv2010.0.noarch.rpm


 DEVMON TEMPLATES
=====================================================================


 What are templates?
---------------------------------------------------------------------

    Templates are one of the key concepts behind Devmon, which make
  it uniquely flexible when compared to many other monitoring scripts.

    Templates allow you to configure the way Devmon treats your 
  devices, on a per-model basis.   They allow you define the following
  specific items for different model types:

    - Preferred SNMP version(s)
    - The specific OIDs to query on (both repeater and non-repeaters)
    - Any transformations needed to be preformed on collected data
    - Model specific thresholds
    - Model specific exceptions
    - Custom output messages

    This flexibility should (in theory) allow a savvy systems
  administrator to monitor any type of device for any possible
  condition;  the only pre-requisite is that the device be SNMP
  monitorable in the first place!



 Rolling your own
---------------------------------------------------------------------

    Creating a Devmon template is (I hope) a relatively trivial
  task.  It requires no programming experience; however you will
  most likely benefit from knowing a little about regular expressions.
  
    If you aren't familiar with regular expressions (or 'regexps') you
  should take a few minutes and look over this website:
    http://www.regular-expressions.info/

    Okay, first we examine the template file structure.  All templates
  data is located in the "templates" subdirectory of your Devmon
  installation.  A single node installation of Devmon reads this
  directory once per poll-cycle, while a multi-node installation
  reads it from the Devmon database on an as-needed basis.

  MULTI-NODE NOTE: 
    I would recommend that you only keep a single copy of your 
    templates directory, preferably the one on your display server.  
    All others templates directories (i.e. the ones on your Devmon 
    nodes) are extraneous and should be removed.  This way, when you 
    sync your templates on disk to your database, there's no 
    confusion as to which set of templates on disk match the ones 
    that a multinode devmon installation is using (which are the 
    ones in the database).

    For all our examples below, we will assume that we are working in
  the template directory "/usr/local/devmon/templates".  This will
  probably vary in your installation, depending on where you have
  installed Devmon.

    The first tier of sub-directories in the templates dir are
  vendor-model specific.  So, each subdirectory represents a 
  particular model from a particular vendor (so, a Cisco 2950 would 
  have one directory, while a Cisco 3750 would have another).  Any 
  files in the templates directory will be ignored;  only directories
  are examined.  The actual name of these directories is irrelevant,
  as the actual vendor and model names are specified in the 'specs' 
  file (described below).  However, it doesn't hurt to make the 
  subdirectories somewhat descriptive, I usually use a 
  "vendor-model" style.
  

  SPECS FILE
  ------------------------------------------------------------------

    Under each vendor-model directory, there should be a single
  file, named 'specs' (for specifications), and one or more
  subdirectories, each of which represent a particular test for that
  vendor-model.  The specs file contains data that is vendor-model 
  specific, but not necessarily test specific.  

  The 'specs' file should look something like this:

  -<start file>-----------------------------

    vendor   : cisco
    model    : 2950
    snmpver  : 2
    sysdesc  : C2950

  -<end file>-------------------------------

    Note that their variables and their values are each listed on 
  their own newline, and separated by colons.  This is the format 
  used for most (if not all) of the files in the Devmon template 
  structure.

    The 'vendor' and 'model' variables are specific to this 
  particular device type, that is, there should not be another specs
  file elsewhere in the templates tree that has the same values for 
  both variables.  If there is, Devmon will complain about trying to 
  redefine a template and reject the second template.

    The 'snmpver' variable defines what version of snmp should be 
  used to query this device.  Acceptable values are 1, 2, and 2c 
  (which is the same as '2').

    The 'sysdesc' variable is used by the type auto-detection that 
  Devmon does when it initially reads the host from the bb-hosts
  file (when using the --readbbhosts command line argument).  This
  value should be unique when compared to the value of the other
  templates.  It's a regular expression, so you can match a 
  fairly complicated pattern, if you so desire.


  TEST DIRECTORY
  -------------------------------------------------------------------
  
    Each subdirectory of the vendor-model directory represents an
  individual test.  The name of the directory is significant, as
  it is what determines the name of the test reported to your
  display server!  So the subdirectory in your vendor-model 
  directory named 'cpu' defines the cpu test, the one named 'if_err'
  defines the if_err test, etc.

    Under each test subdirectory, there are five files:
  
    oids
    transforms
    thresholds
    exceptions
    message

    All five files MUST be present for the template to be read
  successfully, although the thresholds, transforms and exceptions
  files can all be empty.  So, a quick list of files needed for a 
  'cpu' test on a Cisco 2950 should look as follows:

    /usr/local/devmon/templates/cisco-2950/specs
    /usr/local/devmon/templates/cisco-2950/cpu/oids
    /usr/local/devmon/templates/cisco-2950/cpu/transforms
    /usr/local/devmon/templates/cisco-2950/cpu/thresholds
    /usr/local/devmon/templates/cisco-2950/cpu/exceptions
    /usr/local/devmon/templates/cisco-2950/cpu/message

  Note that all of these files except for the message file can
  contain comments.  Any line that starts with a pound symbol (#)
  is treated as a comment by Devmon, and ignored.
 
  Now we'll go over each of these files in detail...
 
  ----------------------------------
  -- The 'oids' file
  ----------------------------------

    The oids file contains, you guessed it, the oids that you
  want to SNMP query for this type of device.  It should look 
  something like this:

  -<start file>-----------------------------

    sysDescr        : .1.3.6.1.2.1.1.1.0               : leaf
    sysReloadReason : .1.3.6.1.4.1.9.2.1.2.0           : leaf
    sysUpTime       : .1.3.6.1.2.1.1.3.0               : leaf
    CPUTotal5Min    : .1.3.6.1.4.1.9.9.109.1.1.1.1.5.1 : leaf

  -<end file>-------------------------------

  Note that there are three values per line;  the first value is 
  the alias that Devmon uses throughout the rest of the template
  files, the second value is the *NUMERIC* value for the oid, the
  third is the repeater type ('leaf', which is a non-repeater type
  oid, vs 'branch' which is a repeater type).

  Its important that you use the numeric version of an oid for
  the second value in this file.  Devmon will not map the string
  version of an OID to its numeric version before it does a query,
  which means that your SNMP query is will fail if you use
  an alphanumeric oid instead of a numeric one (i.e. 'sysDescr'
  is alphanumeric, '.1.3.6.1.2.1.1.1.0' is numeric).  I chose
  to do this because it is a pain to keep all the various MIBs
  installed on all of the nodes in a multi-node cluster, and it
  was just easier to specify them once here.  Note that the
  oid aliases are case sensitive: 'SysDescr' is treated as a
  separate alias from 'sysDescr'.

  Also important to note is that OIDs are shared between tests
  on the same template.  So if you specify OID aliases with i
  identical names (they are case sensitive, remember) in multiple 
  tests in a template, there is only going to be a single value
  stored in memory, which both OID aliases point to.  The upshot 
  of this is, that if you use the same OID alias in multiple
  tests (and this is recommended, as it will make your template
  run faster), then they *MUST* have the numeric OID value.  If
  they dont, you are going to get inconsistent results, as
  the value stored in memory might arbirtrarily be from one SNMP
  variable or another.

  ----------------------------------
  -- The 'transforms' file
  ----------------------------------
  
    The most complicated file in your template, the transforms
  file lays out the different data transformations that Devmon
  needs to perform on the collected SNMP data before it applies
  thresholds and renders the final message.

  The cisco 2950 cpu test uses a very simple transforms file:
 
  -<start file>-----------------------------

    sysUpTimeSecs   : MATH          : {sysUpTime} / 100
    UpTimeTxt       : ELAPSED       : {sysUpTimeSecs}

  -<end file>-------------------------------

  Like the oids file, it has three values per line, separated
  by colons.  

    The first value is the OID alias.  Note that this
  should be a unique value compared to any of the aliases
  defined in the 'oids' file.  Notice in this example that
  the 'sysUpTimeSecs' alias is a transformed version of
  the 'sysUpTime' alias, which was defined in the oids file
  and whose data is collected via SNMP.  For the rest of
  this help file, we use the term 'alias' to interchangeably
  refer to either a variable containing data collected via SNMP
  or containing data from a translation.
 
    The second value in the line is the name of the type of
  transform.  These are case insensitive (i.e. 'MATH' is the
  same as 'math') but we refer to them in the uppercase form
  to distinguish them from other functions.

    The third value defines the 'data input' for the transform
  specified by the second value.  The result of this data
  put through the specified transform will be stored in the
  alias defined by the first value.  Note that any aliases
  supplied in this field are encased in curly braces (e.g. 
  {sysUpTime}).  This tells Devmon that this is an alias
  containing an snmp or translated value, and not just a normal
  string. 

    The data input for a transform can (depending on the 
  transform type) consist of one or more OID aliases defined 
  elsewhere. These aliases don't have to necessarily be defined 
  in a line prior the transform that they are used in, Devmon is 
  smart enough to figure out the hierarchy in which they should 
  be used.  If you have a dependency loop somewhere, Devmon will 
  point that out to you, as well.

  Also, note that if you use a non-repeater type data alias
  as the input for a transform, the transformed alias will also
  be a non-repeater.  Likewise for a repeater type data alias.
  If you mix repeater and non-repeater type data aliases in
  the transform input, the resulting transformed alias will
  be a repeater.
 
  With regards to duplicated OID aliases across multiple tests
  in a single template, transformed OID aliases have the same
  rules as non-transformed aliases: if you use the same 
  transformed OID alias in multiple tests (which is recommended
  as this cuts down on the time devmon spends running test logic)
  then their transform rules *must be identical*,  as must
  all OID aliases that your transformed alias depends on.  So, for 
  example, if you have this defined in your if_load test on your 
  cisco-2950 template:

    ifInBps         : MATH          : {ifInOps} x 8

  and this defined in your if_stat test on your cisco-2950 template:
  
    ifInBps         : MATH          : {ifInOctets} x {time} x 8

  you are going to be in trouble, because the 'time' OID alias
  might not even exist in the if_load test.  So try to keep your 
  duplicated OID aliases as simple as possible, so you dont have
  you tests stepping on each others toes (although if you do have
  two transformed OIDs doing the same transform on the same data,
  you should by all means duplicate them, as this will make your
  tests run much faster).

  There are a number of different types of transforms, which
  we will discuss below: (listed in alphabetical order)


  'BEST' transform:

    This transform takes two data aliases as input, and stores
    the values for the one with the 'best' alarm color (green being 
    the 'best' and red being the 'worst') in the transformed data 
    alias. The oids can either be comma or space delimited.

  'CHAIN' transform:

    Occasionally a device will store a numeric SNMP oid (AKA the 
    'data' oid) as a string value under another OID (the 'leaf' oid).
    The CHAIN transform will create a third 'transformed' oid, 
    containing the leaves of the 'leaf' oid and the values of
    the 'data' oid.  A quick example:

      In your oids file, you have defined:
        leafOid  : .1.1.2     : branch
        dataOid  : .1.1.3     : branch

      After walking leafOid and dataOid, they return the values:
        .1.1.2.1 = '.1.1.3.1194'
        .1.1.2.2 = '.1.1.3.2342'
          and
        .1.1.3.1194 = 'CPU is above nominal temperature'
        .1.1.3.2342 = 'System fans are non-operational'

      Chances are that you won't know what leaf values will be returned
      for .1.1.3, but you know that .1.1.2 returns consistant values.
      You can use the CHAIN transform to 'chain' these two oids together
      to make the data more accessible.  The format for the CHAIN
      transform is:
        chainedOid   : CHAIN    : {leafOd} {dataOid}

      If you used the above transform with the previously mentioned
      data, you would end up with:
        chainedOid.1 = 'CPU is above nominal temperature'
        chainedOid.2 = 'System fans are non-operational'


  'CONVERT' transform:
    
    Convert a string in either hexidecimal or octal to its
    decimal equivalent.  Takes two arguments, a target OID alias
    and a conversion type, which must be either 'hex' or 'oct'.

    For instance, to convert the hex string '07d6' to its decimal
    equivalent (2006, as it so happens), do this:

      intYear : CONVERT: {hexYear} hex


  'DELTA' transform:

    The DELTA transform performs a 'change over time' calculation
    on the supplied data.  It takes a single data alias, with an
    optional 'upper limit' (separated from the alias by whitespace)
    as input.  

    The change over time calculation will be performed between
    one poll interval and the next, and returns a measurement
    of data units per second.

    The limit is used as the maximum value of the data alias, and 
    comes in to play when the value from supplied data alias from 
    last polling cycle is more than the value from your current 
    polling cycle.  

    This typically occurs when you have counter-wrap issues in 
    SNMP (as most counters are still 32 bit;  an interface with
    heavy traffic can wrap its ifOctet counters in less than 
    two minutes).

    If you don't specify a limit and Devmon detects a counter
    wrap, it will use either the 32bit or 64bit upper limit,
    accordingly.  

    The upshot of this is that you CANNOT MEASURE NEGATIVE
    DELTAS WITH THIS TRANSFORM.  If you really really need to,
    please contact the software author and make a feature request.

    Delta examples:
      changeInValue  : DELTA : {value}
        or
      changeInValue  : DELTA : {value} 2543456983

    Keep in mind that the DELTA transform takes at least two
    poll cycles to return meaningful data, so in the mean time
    you will get a 'wait' result stored in the target OID alias
    (as well as an aliases that are transformed based off the
    target alias).


  'DATE' transform:

    This transform takes a single data alias as input, the value of 
    which Devmon assumes to be seconds in "Unix time" (i.e. seconds 
    since the Epoch [00:00:00 GMT, January 1, 1970]) It then stores
    in the transformed data alias a text string containing the date
    corresponding to the number of seconds input, in the format 
    CCYY-MM-DD, HH:MM:SS (24 hour time).  


  'ELAPSED' transform:

    This transform takes a single data alias as input, the value of 
    which Devmon assumes to be in seconds.  It then stores a text 
    string in the transformed data alias containing the number of 
    years, days, hours, minutes and seconds equal to the number of 
    seconds provided as input to the transform.


  'INDEX' transform:
    This transform allows you to access the index part of a
    numerical OID in a repeater OID.

    For example, in the cdpCache table for the Cisco CDP MIB,
    walking the cdpCacheDevicePort OID will return values such as:

    CISCO-CDP-MIB::cdpCacheDevicePort.4.3 = STRING: GigabitEthernet4/41
    CISCO-CDP-MIB::cdpCacheDevicePort.9.1 = STRING: GigabitEthernet2/16
    CISCO-CDP-MIB::cdpCacheDevicePort.12.14 = STRING: Serial2/2

    The value is the interface on the remote side, and there is no
    OID for the interface on the local side. To get the interface
    on the local side, you must use the last value in the index
    (e.g. 3 for GigabitEthernet4/41) and look in the ifTable:

    IF-MIB::ifName.3 = STRING: Fa0/0

    The index transform allows you to get the index value (4.3 
    in this case) as an OID value. Any operations you need to
    do on the index value should be possible with existing
    transforms.

  'MATH' transform: 
   
    The MATH transform performs a mathematical expression defined
    by the supplied data.  It can use the following mathematical
    operators:
       '+'           (Addition)
       '-'           (Subtraction)
       'x'           (Multiplication)
       '/'           (Division)
       '^'           (Exponentiation)
       '(' and ')'   (Expression nesting)

    This transform is not whitespace sensitive, so both:
       {sysUpTime} / 100
         and
       {sysUpTime}/100

    ...would be accepted, and are functionally equivalent.
    The mathematical expressions you can perform can be
    quite complex, such as:
       ((({sysUpTime}/100) ^ 2 ) x 15) + 10

    Note that the syntax of the MATH transform is not stringently
    checked at the time the template is loaded, so if there are
    any logic errors, they will not be apparent until you attempt
    to use the template for the first time (any errors will be 
    dumped to the devmon.log file on the node that they occurred on).

    Decimal precision can also be controlled via an additional 
    variable seperated from the main expression via a colon:

       transTime : MATH : ((({sysUpTime}/100) ^ 2 ) x 15) + 10 : 4 
 
    This would ensure that the transTime alias would have a 
    precision value (zero padded, if needed) of exactly 4 characters
    (i.e. 300549.3420).  The default value is 2 precision characters.
    To remove the decimal characters alltogether, specify a value of 0.
    

  'REGSUB' transform:

    One of the most powerful and complicated transforms, the regsub
    transform allows you to perform a regular expression substitution
    against a single data alias input.  The data input for a regsub
    transform should consist of a single data alias, followed by a
    regular expression substitution (the leading 's' for the 
    expression should be left off).  For example:

      ifAliasBox : REGSUB  : {ifAlias} /(\S+.*)/ [$1]/

    The transform above takes the input from the ifAlias data 
    alias and, assuming that it is not an empty string (ifAlias
    has to have at least one non-whitespace character in it) it
    puts square braces around the value and puts a space in front
    of it.  This example is used by all of the Cisco interface
    templates included with Devmon, to include the ifAlias
    information for an interface, but only if it has a value
    defined.  A very powerful, but easily misused transform.  
    Should be interest in using it but not know much about 
    substitution, you might want to google 'regular expression
    substitution' and try reading up on it.


  'SPEED' transform:

    This transform takes a single data alias as input, which it 
    assumes to be a speed in bits.  It then stores a value in the 
    transformed data alias, corresponding to the largest whole 
    speed measurement.  So a value of 1200 would render the string
    '1.2 Kbps', a value of 13000000 will return a value of '13 
    Mbps', etc.


  'SUBSTR' transform:

    The substr transform is used to extract a portion of the 
    text (aka a 'substring') stored in the target OID alias.
    This transform takes as arguments: a target alias, a
    starting position (zero based, i.e. the first position is
    0, not 1), and an optional length value.  If a length
    value is not specified, substr will copy up to the end of
    the target string.

    So, if you had an OID alias 'systemName' that contained
    the value 'Cisco master switch', you could do the following:

      switchName : SUBSTR : {systemName} 0 12
        stores 'Cisco master' in the 'switchName' alias, or:

      switchName : SUBSTR : {systemName} 6
        stores 'master switch' in the 'switchName' alias.


  'SWITCH' transform:

    The switch transform transposes one data value for another.  
    This is most commonly used to transform numeric values 
    returned by an snmp query into its textual equivalent. The 
    first  argument in the transform input should be the oid to be 
    transformed.  Following this should be a list of  comma-
    delimited pairs of values, with each pair of values being 
    separated by an equals sign.  For example: 

      upsBattRep : SWITCH : {battRepNum} 1 = Battery OK, 2 = Replace battery

    So this transform would take the input from the 'upsBattRepNum'
    data alias and compare it to its list of switch values.  If
    the value of upsBattRepNum was 1, it would store a 'Battery OK'
    value in the 'upsBattRep' data alias. 

    You can use simple mathmatical tests on the values of the 
    source OID alias, as well as assigning values for different
    OIDs to the target alias.  For instance:

      dhcpStatus : SWITCH : {dhcpPoolSize} 0 = No DHCP, >0 = DHCP available

    The format for the tests are as follows (assuming 'n','a' and 'b' are 
    floating point numerical value [i.e. 1, 5.33, 0.001, etc], and 's'
    is a alphanumeric string):

 
       n     :   Source alias is equal to this amount
       >n    :   Source alias is greater than this amount
       >=n   :   Source alias is greater than or equal to this amount
       <n    :   Source alias is less than this amount
       <=n   :   Source alias is less than or equal to this amount
       a - b :   Source alias is between 'a' and 'b', inclusive
       's'   :   Source alias matches this string exactly (case sensitive)
       "s"   :   Source alias matches this regular expression (non-anchored)

      
    Note that switch statements are applied in a left to right order;
    so if you have a value that matches the target value on multiple
    switch statements, the leftmost statement will be the one applied.

    The switch statement can also assign values from another OID 
    to the target OID alias, depending on the value of the source OID
    alias, like this:


      dhcpStatus : SWITCH : {dhcpPoolSize} 0 = No DHCP, >0 = {dhcpAvail}

    This would assign the value 'No DHCP' to the 'dhcpStatus' alias
    if and only if the 'dhcpPoolSize' alias contained a value equal 
    to zero.  Otherwise, the value of the 'dhcpAvail' alias would
    be assigned to dhcpStatus.  Note that threshold stats for
    the 'dhcpAvail' status (i.e. the 'color' and 'message' assigned
    to 'dhcpAvail' by any threshold tests) would not be inherited
    by the 'dhcpStatus' variable;  if you want to inherit threshold
    information, use the TSWITCH transform instead.

  'TSWITCH' transform:
    
    The TSWITCH transform is functionally equivalent to the SWITCH
    transform in every way, with one exception:  if any OID alias
    is used as a data source for the target alias, such as the
    'dhcpAvail' alias in this transform:

      dhcpStatus : TSWITCH : {dhcpPoolSize} 0 = No DHCP, >0 = {dhcpAvail}

    The threshold values for that alias will be copied to the target
    alias (in this case, 'dhcpStatus'), and no further thresholds
    will be applied to the target alias.  Any non-OID data sources
    can still have thresholds applied against them (for instance, if
    'dhcpStatus' had been assigned the string 'No DHCP' by this
    transform, you could have matched a threshold against that 
    value).  This is useful if you have two seperate OIDs and you
    need to do a compound threshold involving them both.

  'UNPACK' transform:

    The unpack transform is used to unpack binary data into any
    one of a number of different data types (all of which are 
    eventually stored as a string by Devmon).  This transform
    requires a target OID alias and an unpack type (case sensitive),
    separated by a space.

    As an example, to unpack a hex string (high nybble first), try this:

      hexString : UNPACK : {binaryHex} H

    The unpack types are as follows:
    
      Type  |  Description              
     ----------------------------------------------------
         a  | ascii string, null padded
         A  | ascii string, space padded
         b  | bit string, low to high order
         B  | bit string, high to low order
         c  | signed char value
         C  | unsigned char value
         d  | double precision float
         D  | single precision float
         h  | hex string, low nybble first
         H  | hex string, high nybble first
         i  | signed integer
         i  | unsigned integer
         l  | signed long value
         L  | unsigned long value
         n  | short integer in big-endian order
         N  | long integer in big-endian order
         s  | signed short integer
         S  | unsigned short integer
         v  | short integer in little-endian order
         V  | long integer in little-endian order
         u  | uuencoded string
         x  | null byte
       
    
  'WORST' transform:

    This transform takes two data aliases as input, and stores
    the values for the one with the 'worst' alarm color (red being 
    the 'worst' and green being the 'best') in the transformed data 
    alias. The oids can either be comma or space delimited.

  ----------------------------------
  -- The 'thresholds' file
  ----------------------------------

    The thresholds file defines the limits against which the
  various data aliases that you have created in your 'oids'
  and 'transforms' files are measured against.  And example
  thresholds file is as follows:

  -<start file>-----------------------------

    upsLoadOut  : red     : 90          : UPS load is very high.
    upsLoadOut  : yellow  : 70          : UPS load is high.

    upsBattStat : red     : Battery low : Battery time remaining is low.
    upsBattStat : yellow  : Unknown     : Battery status is unknown.

    upsOutStat  : red     : On battery|Off|Bypass              : {upsOutStat}
    upsOutStat  : yellow  : Unknown|voltage|Sleeping|Rebooting : {upsOutStat}

    upsBattRep  : red     : replacing : {upsBattRep}

  -<end file>-------------------------------

    As you can see, the thresholds file consists of one entry per 
  line, with each entry consisting of three to four fields separated 
  by colons.  The first field in an entry is the data alias that the 
  threshold is to be applied against.  The second field is the color
  that will be assigned to the data alias should it match this
  threshold.  The third field are the threshold values, which are
  the values that the data alias in the first field will be compared
  against.   You can have multiple values, delimited by commas, in
  the third field.  The fourth field is the threshold message, which
  will be assigned to the data alias in the first field if it
  matches this threshold.  

  The threshold message can contain other data aliases.  If the data
  alias in field one is a repeater type alias and the alias in field
  four is also a repeater type alias, then the data in the fourth
  field will match that in the first field on a per-leaf basis.

  You typically do not need to specify a 'green' threshold, as
  Devmon will assign a green value to a data alias if it doesn't 
  match a red or yellow threshold.  If you want to have a message
  associated with a green threshold, you can specify it with
  a green color and a threshold value of '_AUTOMATCH_' (with the
  single quotes).  This will cause devmon to automatically match the
  threshold when it gets to it, and will assign the green message
  to the data alias.

  Keep in mind that Devmon attempts to match thresholds in order
  from highest severity to lowest severity (the severity list
  being: red->yellow->clear->green).  If more than one threshold
  matches, the highest is considered to be the most accurate.

  One important thing to note about thresholds is that they
  are lumped into one of two categories: numeric and non-numeric.

  Numeric thresholds should consist of only numbers, possibly
  preceded by one of the following logical math operators:
    >    (greater than)
    <    (less than)
    >=   (greater than or equal to)
    <=   (less than or equal to)
    =    (equal to)

  If no math operator is defined in the threshold, Devmon assumes
  that it is a 'greater than' type threshold.  That is, if the
  value obtained via SNMP is greater than this threshold value, the
  the threshold is considered to be met and devmon will deal with
  it accordingly.

  If a threshold value contains even one non-numeric character
  (other than the math operators illustrated above), it is
  considered a non-numeric threshold.  Non-numeric thresholds are
  treated as regular expressions, and devmon tries to match
  them against the value of the data contained in the oid that
  the threshold is applied against.

  Regular expressions in threshold matches are non-anchored,
  which means they can match any substring of the compared data.
  So be careful how you define your thresholds, as you could
  match more than you intend to!  If you want to make sure your
  pattern matches explicitly, precede it with a '^' and terminate
  it with a '$'.

  
  ----------------------------------
  -- The 'exceptions' file
  ----------------------------------

    The exceptions file is contains rules which are only applied
  against repeater type data aliases.

  An example of a exceptions file is as follows:

  -<start file>-----------------------------

    ifName : alarm  : Gi.+
    ifName : ignore : Nu.+|Vl.+

  -<end file>-------------------------------

  You can see that each entry is on its on line, with three
  fields separated by colons.  The first field is the primary
  data alias that the exception should be applied against.  The
  second field is the exception type, and the third field is 
  the regular expression that the primary alias is matched
  against.  Exception regular expressions (unlike non-numeric
  thresholds) ARE anchored, and thus need to match the primary
  oid EXACTLY.

  Exceptions are only applied against the first (primary) alias 
  in a repeater table (which is described below). There are four 
  types of exceptions types that you can use, they are:

    ignore:
      The 'ignore' exception type causes Devmon not display
      rows in a repeater table which have a primary oid that
      matches the exception regexp.

    only:
      The 'only' exception type causes Devmon to only display 
      rows in a repeater table which have a primary oid that
      matches the exception regexp.

    alarm:
      The 'alarm' exception causes Devmon to only generate
      alarms for rows in a repeater table that have a 
      primary oid that matches the exception regexp.

    noalarm:
      The 'noalarm' exception causes Devmon to not generate
      alarms for rows in a repeater table that have a 
      primary oid that matches the exception regexp.

  The exceptions are applied in the order above, and one primary
  alias can match multiple exceptions.  So if you have a primary
  alias that matches both an 'ignore' and an 'alarm' exception,
  no alarm will be generated (in fact, the row wont even be
  displayed in the repeater table). 
    
  The example file listed above, from a cisco 2950 if_stat test, 
  tells Devmon to only alarm on repeater table rows which
  have a primary oid (in this case, ifName) that starts
  with 'Gi' and has any number of characters after that (which
  will match any Gigabit interfaces on the switch).  Also, it
  tells Devmon not to display any rows with a primary alias
  that has a value that behind with Nu (a Null interface) or
  Vl (A VLAN interface).


  ----------------------------------
  -- The 'messages' file
  ----------------------------------

    The messages file is what brings all the data collected from the
  other files in the template together in a single cohesive entry.
  It is basically an web page (indeed, you can add html to it, if you
  like) with some special macros embedded in it.

  An example of a simple messages file is as follows:

  -<start file>-----------------------------
 
  {upsStatus.errors}
  {upsBattStat.errors}
  {upsLoadOut.errors}
  {upsBattRep.errors}

  UPS status:

  Vendor:              apc
  Model:               {upsModel}

  UPS Status:          {upsOutStat}
  Battery Status:      {upsBattStat}

  Runtime Remaining:   {upsMinsRunTime} minutes
  Battery Capacity:    {upsBattCap}%
  UPS Load:            {upsLoadOut}%

  Voltage in:          {upsVoltageIn}v
  Voltage out:         {upsVoltageOut}v

  Last failure due to: {upsFailCause}
  Time on battery:     {upsSecsOnBatt} secs

  -<end file>-------------------------------
  
  You can see in this file that it is just a bunch of data aliases,
  with one or two special exceptions.  Most of these will just be 
  replaced with their corresponding values.  You can see at the
  top of the file, however, that there are a few weird looking
  data aliases (the ones that end in .errors).  These are just
  normal data aliases with a special flag appended to them, that
  lets Devmon know that you want something from them than just
  their data value.

  Here are all of the alias flags, and their functions:

    color:

      This flag will print out the bb/hobbit color string
    assigned to this data alias by the thresholds (this string
    looks like '&red' or '&green', etc).  This color string
    will be interpreted by hobbit as a colored icon, which
    makes alarm conditions much easier to recognize.  Like the
    'errors' flag, it will also modify the global color.

    errors: 

      The errors flag on a data alias will list any errors on
    this data alias.  In this case, 'errors' refers to the message
    assigned to the alias from a non-green threshold match (the
    message is the value assigned in the fourth field of an entry
    in the thresholds file, remember?).  If the value assigned
    to an data alias is green, then this value that replaces this
    flag will be blank.  
    
    Error messages will always be printed as the TOP of the message
    file, regardless of where they are defined with in it.   This 
    is done to make sure that the user sees any errors that might
    have occurred, which they might miss if the messages file is 
    too long.

    The errors flag will also modify the global color of the 
    message.  So if this error flag reports a yellow error, and the
    global color is currently green, it will increase the global 
    color to yellow.  If the error flag reports a red error, it 
    will increase the global color to red.  The global color
    of a message defaults to green, and is modified upwards
    (if you consider more severe colors to be 'up') depending
    on the contents of the 'error' and 'color' flags.

    msg:

      The msg flag prints out the message assigned to the data
    alias by its threshold.  Unlike the errors flag, it prints
    the message even if the data alias matches a green threshold
    and it also does NOT modify the global color of the message. 
    
    thresh:

      The syntax for the threshold flag is {oid.thresh:<color>}.
    It displays the value in the threshold file (or custom threshold)
    that corresponds with the supplied color.  So,
    {CPUTotal5Min.thresh:yellow} would display the template value
    for the yellow threshold for the CPUTotal5Min oid, or a per-device
    custom threshold if one was defined.

    A more complicated message file is this one, taken from a 
  Cisco 2950 switch if_stat test:


  -<begin file------------------------------

  Ifc name|Ifc speed|Ifc status
  {ifName}{ifAliasBox}|{ifSpeed}|{ifStat.color}{ifStat}{ifStat.errors}

  -<end file>-------------------------------

    In this message file, we are using a repeater table.  Repeater 
  tables are used to display repeater-type data aliases (which 
  ultimately stem from 'branch' type snmp oids).  The 'TABLE:' 
  keyword (case sensitive, no leading whitespace allowed) is what
  alerts Devmon that the next one to two lines are a repeater
  table definition.

  Devmon basically just builds an HTML table out of the repeater 
  data.  It can have an optional header, which should be specified 
  on the line immediately after the 'TABLE:' tag.  If not table 
  header is desired, the line after the table tag should be the 
  row data identifier.

  The row data identifier is the one that contains one or more
  data aliases.  The first of these aliases is referred to as the
  'primary' alias, and must be a repeater-type alias.  Any other
  repeater type aliases in the row will be keyed off the primary
  alias;  that is, if the primary aliases has leaves numbered
  '100,101,102,103,104', the table will have five rows, with the 
  first row having all repeater aliases using leaf 100, the 
  second row having all repeaters using leaf 101, etc.  Any
  non-repeaters defined in the table will have a constant value
  throughout all of the rows.

  The TABLE: key can have one or more, comma-delimited options 
  following it that allow you to modify the way in which Devmon
  will display the data.  These options can have values
  assigned to them if they are not boolean ('nonhtml', for example,
  is boolean, while 'border' is not boolean).

  The TABLE: options 
    nonhtml
      Don't use HTML tags when displaying the table.  Instead
      all columns will be separated by a colon (:).  This is
      useful for doing NCV rrd graphing in hobbit.

    plain
      Don't do any formatting. This allows repeater data (each 
      item on it's own line), without colons or HTML tables.
      One use of this option is to format repeater data with 
      compatibility with a format Hobbit already understands.
      An example is available in the disk test for the 
      linux-openwrt template.

    noalarmsmsg
      Prevent Devmon from displaying the 'Alarming on' header
      at the top of a table.

    alarmsonbottom
      Cause Devmon to display the 'Alarming on' message at the
      bottom of the table data, as opposed to the top.

    border=n
      Set the HTML table border size that Devmon will use
      (a value of 0 will disable the border)

    pad=n
      Set the HTML table cellpadding size that Devmon will use

  An example of the some TABLE options in use:

    TABLE: alarmsonbottom,border=0,pad=10

  The STATUS: key allows you to extend the first line of the status
  message that Devmon sends to BB/Hobbit. For example, if you need
  to get data to a Hobbit rrd collector module that evaluates data
  in the first line of the message (such as the Hobbit la collector
  which expects "up: <time>, %d users, %d procs load=%d.%d" you can 
  use this key as follows to get a load average graph:

    STATUS: up: load={laLoadFloat2}
  
  ----------------------------------
  -- Done!
  ----------------------------------

    That's it!  Once you've completed the five files mentioned
   above, you should, in theory, have a working template.  I would
   recommend building the template under a separate 'test' 
   installation of Devmon, as the single-node version of Devmon
   re-reads the template directory once per poll period, and
   having an incomplete or broken template will cause Devmon to
   throw error messages into its log.  

   Try extracting the Devmon tarball to somewhere like 
   "/usr/local/devmontest", and fiddle with the templates from
   there.  Run Devmon from this directory in single-node mode, 
   using a dummy bb-hosts file (even if your production Devmon
   cluster runs in multi-node mode, running the test Devmon in
   single node mode prevents you from having to create an 
   additional database for your devmon "test" installation).
   With the -vv and -p flags (i.e. devmon -vv -p), you will 
   get verbose output from Devmon, and if you have a host 
   in the bb-hosts file that matches the sysdesc in the specs
   file of the the model-vendor for the new template you created,
   you will also get textual output of your new template!
   (the -p flag causes devmon to not run in the background and
   to print messages to STDOUT as opposed to sending them to the 
   display server, and the -vv flag causes Devmon to log verbosely).

   Once you are satisfied that your template is working correctly,
   you can put it to work in your production installation.  In
   a single-node installation, this is as simple as copying the
   template directory to the appropriate sub-directory of your
   templates/ dir.  On the next poll cycle, Devmon will pick up
   the new template, and any new hosts discovered by your
   readbbhosts cron job will be added to the Devmon database
   using this new template.

   In a multinode installation, adding a new template is only
   slightly more difficult.  Copy the template directory to
   the appropriate place on the machine where you keep all your
   templates (earlier we recommended using your display server,
   and deleting all the template directories on the node machines).
   Once you have it in place, run devmon with the --synctemplates
   flag.  This will read in the templates, update the database
   as necessary, and then notify all the devmon nodes that they
   need to reload their templates.  A full template reload on
   all your machines can take up to twice the interval of your
   polling cycle, so be patient!
    
$Id: TEMPLATES 121 2009-01-23 09:15:00Z buchanmilne $