<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> 

<HTML>
<HEAD>

<TITLE>NMIS Documentation</TITLE>

<META HTTP-EQUIV="Content-Language" CONTENT="en-us">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">

<STYLE TYPE="text/css">
TD {
	border-width : 1px; 
	border-style : solid; 
	border-color : #aaaaaa; 
	font-family : Arial; 
}
TABLE {
	color : Black;
	background : White;
	border-width : 1px; 
	border-style : solid; 
	border-color : #aaaaaa; 
	width : 100%;
	font-family : Arial; 
}
P {
	font-family : Arial; 
}
BODY {
	font-style : normal; 
	font-variant : normal; 
	font-size : small; 
	color : white; 
	background-color : #190032; 
	text-decoration : none; 
	font-family : Arial; 
}
A:active {
	color : red;
	background-color : White; 
	text-decoration : none
}
A:link {
	color : blue;
	background-color : White; 
	text-decoration : underline
}
A:visited {
	color : blue;
	background-color : White; 
	text-decoration : underline
}
A:hover {
	color : red;
	background-color : White; 
	text-decoration : underline
}
.heading {
	font-style : normal;
	font-weight : bold;
	font-size : x-large;
	font-family : "Arial Rounded MT Bold";
	background-color : #190032; 
	color : White;
}
#rrdtool { 
	position : absolute; 
	left : 200px; 
	top : 150px;  
}

</STYLE>
</HEAD>
<BODY>
<div class="heading">NMIS Documentation</div>

<table>
  <tr>
    <td width="33%">Last updated 21 June 2001</td>
    <td width="33%">
      <p align="center"><a href="http://www.sins.com.au/nmis/nmis-doc.html">Online
      Version</a></p>
    </td>
    <td width="33%">
      <p align="center"><a href="http://www.sins.com.au/nmis/">NMIS Home Page</a></p>
    </td>
  </tr>
  <tr>
    <td width="50%" colspan="3"> 
    <ul>
      <li><a href="http://www.sins.com.au/sins/nmis/">NMIS Home Page</a>&nbsp;</li>
      <li><a href="#Introduction">Introduction</a></li>
      <li><a href="#Concepts">Concepts</a></li>
      <li><a href="#RolesAndGroups">Roles and Groups</a></li>
      <li><a href="#Health">Health</a></li>
      <li><a href="#Events">Events</a></li>
      <li><a href="#Thresholds">Thresholds</a></li>
      <li><a href="#Updates">Updates</a></li>
      <li><a href="#Interfaces">Interfaces</a></li>
    </ul>
    </td>
  </tr>
  <tr>
    <td width="100%" colspan="3"><b><a name="Introduction">Introduction</a></b><p>
    NMIS stands for Network Management Information System.&nbsp; It is a Network
    Management System which performs multiple functions from the OSI Network Management
    Functional Areas, namely Performance, Configuration and Fault.&nbsp; A
      primary function of NMIS is to make information about your network
      available quickly.&nbsp; Some of this information is provided
      &quot;raw&quot;; other information is provided in a related manner.</p>
    <p>It started as an SNMP polling and statistics viewer front end to Tobi
    Oetiker's <a href="http://ee-staff.ethz.ch/~oetiker/webtools/rrdtool/">RRDTool</a>.&nbsp; <a
    href="http://ee-staff.ethz.ch/~oetiker/webtools/rrdtool/">RRDTool</a> replaces <a
    href="http://ee-staff.ethz.ch/~oetiker/webtools/mrtg/mrtg.html">MRTG</a> but doesn't
    include a front end or back end to handle SNMP polling, display the resulting web pages,
    etc.&nbsp; The original NMIS evolved quite rapidly to meet the demands of production
    environments.</p>
    <p>The back end polling engine uses SNMP to collect interface and health
    statistics for Cisco routers, certain Cisco Catalyst switches and generic SNMP devices
    every 5 minutes.&nbsp; It stores the collected statistics in RRDs (Round Robin Databases),
    checks that devices are up, issues alerts, etc.&nbsp; The
    front end reads the information stored in the RRDs and displays the
    resulting statistics, graphs, reports, etc.</p>
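    <p>A Round Robin Database keeps a fixed number of samples and overwrites the
    oldest one once it is full, so storage does not grow over time.&nbsp; The
    following Python sketch illustrates that idea only; it is not RRDTool's file
    format or API, and the class name RoundRobinSeries and its methods are
    hypothetical.</p>

```python
from collections import deque


class RoundRobinSeries:
    """Fixed-size series of (timestamp, value) samples (hypothetical helper)."""

    def __init__(self, slots):
        # A deque with maxlen drops its oldest entry when a new one is
        # appended to a full series, which gives the round-robin behaviour.
        self._samples = deque(maxlen=slots)

    def update(self, timestamp, value):
        """Record one sample, overwriting the oldest if the series is full."""
        self._samples.append((timestamp, value))

    def values(self):
        """Return the stored values, oldest first."""
        return [value for _, value in self._samples]

    def average(self):
        """Return the mean of the stored values, or None if empty."""
        values = self.values()
        return sum(values) / len(values) if values else None


# Three slots, one sample per 5 minute (300 second) poll cycle.
series = RoundRobinSeries(slots=3)
for step, octets in enumerate([100, 200, 300, 400]):
    series.update(step * 300, octets)
```

    <p>After the fourth update the first sample has been overwritten, so only
    the three most recent values remain available to a front end.</p>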
    <p>Both the front and back ends are highly extensible and features are
    easy to add once the structure is learnt.&nbsp; For example, the back end originally collected
    only interface statistics every poll cycle; it was easy to add collection of health (CPU,
    memory, buffers, etc.), response time and availability.</p>
    <p>NMIS uses a back end to collect data and maintain the information about
    the data.&nbsp; It relies on RRD for databases; additional tables hold text
    based configuration information.&nbsp; The front end is independent, it just
    reads information from the RRDs and text tables and displays the
    information.&nbsp; Simple.</p>
    <p>It is intended that NMIS be low maintenance once it is running; it should
    just go and go.&nbsp; More work needs to be done on this but I think it
    is going well so far.</p>
    </td>
  </tr>
  <tr>
    <td width="100%" colspan="3"><b><a name="Concepts">Concepts</a></b>
      <p>The basic concept is that NMIS collects interface, CPU, memory, buffer
      and packet statistics from Cisco routers and switches; it is also capable
      of supporting generic SNMP MIB-II collection.&nbsp; Getting slightly
      deeper, NMIS pings a device every poll cycle to verify that it is
      &quot;up&quot;.&nbsp; This is called &quot;reachability&quot;, and NMIS holds the result in
      memory.</p>
      <p>If no system information is available for the device, it must be a new
      device, so NMIS performs a capabilities discovery on it; this is the
      subroutine getNodeInfo.&nbsp; Otherwise NMIS loads the cached system information
      with the loadSystemFile subroutine, then runs the updateUptime subroutine, which gets
      sysObjectID, sysUpTime and ifNumber.&nbsp; NMIS compares these with the cached
      information to check that the same number of interfaces is present, that the
      uptime has increased and that the sysObjectID is the same.</p>
      <p>If the number of interfaces has changed, the createInterfaceFile
      subroutine is run to update this information.&nbsp; (This should send a
      configuration change event.)</p>
      <p>If the sysObjectID has changed, the getNodeInfo subroutine is run. (This
      should send a configuration change event.)</p>
      <p>If the sysUpTime is less than the cached value, the getNodeInfo
      subroutine is run. (This should send a node reload
      event.)</p>
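      <p>The checks above can be sketched as a small decision function.&nbsp; This
      is a Python illustration, not the NMIS code itself: the function name
      check_node, the dictionary layout and the action strings are all
      assumptions, and the events are only the ones the text says
      &quot;should&quot; be sent.</p>

```python
def check_node(cached, current):
    """Return the actions one poll cycle would trigger (illustrative only).

    cached  -- dict of stored system information, or None for a new node
    current -- dict with sysObjectID, sysUpTime and ifNumber read via SNMP
    """
    if cached is None:
        # No system information yet: treat the node as a new device.
        return ["getNodeInfo"]

    actions = []

    def add(action):
        # Avoid listing the same subroutine or event twice.
        if action not in actions:
            actions.append(action)

    if current["ifNumber"] != cached["ifNumber"]:
        add("createInterfaceFile")
        add("event: configuration change")
    if current["sysObjectID"] != cached["sysObjectID"]:
        add("getNodeInfo")
        add("event: configuration change")
    if cached["sysUpTime"] > current["sysUpTime"]:
        # Uptime went backwards, so the node has presumably reloaded.
        add("getNodeInfo")
        add("event: node reload")
    return actions
```

      <p>A poll in which only the uptime has increased returns an empty list,
      which matches the normal case where nothing needs to be rediscovered.</p>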
      <p>The runHealth subroutine is then run; it collects CPU, memory, buffers,
      etc., whatever is deemed necessary for that device type, and stores it all in
      an RRD.</p>
      <p>Then the runInterfaces subroutine is run.&nbsp; It loads the cached interface
      information; if none exists, it creates it with createInterfaceFile.&nbsp;
      Then for each interface it collects ifDescr, ifOperStatus, ifInOctets and
      ifOutOctets.&nbsp; If the ifDescr is different, the cached interface
      information must be out of date (this is how a shifting ifIndex is handled),
      so it is created again with createInterfaceFile.&nbsp; If the ifOperStatus shows
      down when the interface is supposed to be up, an event is raised.&nbsp;
      Otherwise ifOperStatus, ifInOctets and ifOutOctets are stored in an RRD, and
      ifOperStatus is added to the total interface availability of the device.</p>
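      <p>The per-interface logic described above might look roughly like the
      following Python sketch.&nbsp; The function name poll_interfaces, the
      expected_up flag, the return values and the percentage form of the
      availability figure are assumptions made for illustration; the sketch also
      assumes that polling stops as soon as a stale cache is detected, which the
      text does not state.</p>

```python
def poll_interfaces(cached, polled):
    """Process one poll of all interfaces (illustrative only).

    cached -- {ifIndex: {"ifDescr": str, "expected_up": bool}}
    polled -- {ifIndex: {"ifDescr": str, "ifOperStatus": "up" or "down",
                          "ifInOctets": int, "ifOutOctets": int}}
    Returns (rebuild_needed, events, samples, availability_percent).
    """
    events, samples = [], []
    expected = up = 0
    for index in sorted(polled):
        data = polled[index]
        entry = cached.get(index)
        if entry is None or entry["ifDescr"] != data["ifDescr"]:
            # The ifIndex has shifted: the cache is out of date and would be
            # recreated (createInterfaceFile in the text).
            return True, [], [], None
        is_up = data["ifOperStatus"] == "up"
        if entry["expected_up"]:
            expected += 1
            up += 1 if is_up else 0
        if entry["expected_up"] and not is_up:
            events.append("interface down: " + data["ifDescr"])
        else:
            # These values would be written to the interface RRD.
            samples.append((index, data["ifOperStatus"],
                            data["ifInOctets"], data["ifOutOctets"]))
    availability = 100.0 * up / expected if expected else None
    return False, events, samples, availability
```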
      <p>After the interfaces are complete, NMIS calculates the response time for the
      device with another ping and stores some health metrics in another RRD: the
      reachability of the device, the interface availability of the
      device and the response time.&nbsp; It also creates a health metric from a simple
      algorithm which weights the various collections to
      indicate the overall health of that device; more on this in the <a href="#Health">health</a>
      section.</td>
  </tr>
  <tr>
    <td width="100%" colspan="3"><b><a name="RolesAndGroups">Roles and Groups</a></b>
      <p>Nodes can be put into two types of groups.&nbsp; The first
      group is a role, which is core, distribution or access; the second group
      is used to group devices together for reports and general
      information.&nbsp; It is logical that the second group be something like
      the building name or city/suburb of the device, as this helps identify
      problem areas.<p>Roles play an important part in NMIS; they allow things
      to be weighted for events and various other functions.&nbsp; The concept
      of weighting according to role is simple: if it is a core device then it
      is important and should be treated as such; if it is an access device then it
      is less important.&nbsp; The idea is to remove the noise, i.e. to avoid all
      events coming in as critical and to show which ones really are.</td>
  </tr>
  <tr>
    <td width="100%" colspan="3"><b><a name="Health">Health</a></b>
      <p>The following statistics are considered part of the health of the
      device:</p>
      <ul>
        <li>Reachability - is it up or not;&nbsp;</li>
        <li>Availability - interface availability of all interfaces which are
          supposed to be up;&nbsp;</li>
        <li>Response Time;&nbsp;</li>
        <li>CPU;&nbsp;</li>
        <li>Memory;&nbsp;</li>
      </ul>
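      <p>As a rough illustration of how the statistics listed above could be
      combined into one number, the Python sketch below computes a weighted
      score.&nbsp; The weights, the 0 to 100 scaling, the response time ceiling
      and the function name health_metric are all assumptions; the actual NMIS
      calculation is the one in the runReachability subroutine.</p>

```python
# Assumed weights in percent; they add up to 100 and are NOT taken from NMIS.
WEIGHTS = {
    "reachability": 30,
    "availability": 20,
    "response": 20,
    "cpu": 15,
    "memory": 15,
}


def health_metric(reachability, availability, response_ms,
                  cpu_used, memory_used, worst_response_ms=1000.0):
    """Combine the statistics into a 0-100 score (illustrative only).

    reachability, availability -- percentages, higher is better
    response_ms                -- ping response time in milliseconds
    cpu_used, memory_used      -- utilisation percentages, lower is better
    """
    # Response times at or above the assumed ceiling score zero.
    capped = min(response_ms, worst_response_ms)
    response_score = 100.0 * (1.0 - capped / worst_response_ms)
    scores = {
        "reachability": reachability,
        "availability": availability,
        "response": response_score,
        "cpu": 100.0 - cpu_used,
        "memory": 100.0 - memory_used,
    }
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0
```

      <p>Because every input is scaled the same way, the result can be compared
      from one poll to the next for the same device.</p>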
      <p>All of these metrics are weighted and a health metric is created.&nbsp;
      When compared over time, this metric should always indicate the relative
      health of the device.&nbsp; Interfaces which aren't being used should be
      shut down so that the health metric remains realistic.&nbsp; The exact
      calculations can be seen in the runReachability subroutine.</td>
  </tr>
  <tr>
    <td width="100%" colspan="3"><b><a name="Events">Events</a></b>
      <ul>
        <li>Escalation</li>
        <li>Events based on device role</li>
        <li>Stateful</li>
      </ul>
    </td>
  </tr>
  <tr>
    <td width="100%" colspan="3"><b><a name="Thresholds">Thresholds</a></b><p>The thresholds
      routine can be run whenever you like.&nbsp; It processes the collected statistics in the
      RRDs, compares the numbers to stored thresholds and, if a threshold is exceeded, raises
      an event for that device.&nbsp; The thresholds use the device role to
      weight the events.</td>
  </tr>
  <tr>
    <td width="100%" colspan="3"><b><a name="Updates">Updates</a></b><p>Updates ensures that
      all the cached system and interface information is kept up to date.&nbsp;
      If the network is constantly changing then it should be run frequently,
      otherwise it could be run less frequently.</td>
  </tr>
  <tr>
    <td width="100%" colspan="3"><b><a name="Interfaces">Interfaces</a></b>
      <p>Interfaces which aren't in use should be shut down (admin down) so that
      NMIS doesn't think it is supposed to manage them.&nbsp; A simple lookup is
      done on interface types to determine whether NMIS should collect statistics on
      them.&nbsp; This is done during the createInterfaceFile subroutine.</td>
  </tr>
</table>
</BODY>
</HTML>