mon-1.2.0-8mdv2010.1.x86_64.rpm

The protocol for agents (remote or local monitor scripts)
to deliver failures to the mon server:

A trap consists of tag/value pairs separated by newlines. The first
tag must be "pro", which gives the protocol version.

Tags which are understood are:

#
# MON-specific tags
# pro   protocol
# aut   auth
# typ   type (0=mon, 1=snmpv1)
# spc   specific type (TRAP_*)
# seq   sequence
# grp   group
# svc   service
# hst   host
# sta   status (opstatus)
# tsp   timestamp as time(2) value
# sum   summary output
# dtl   detail (terminated by \n.\n)
#
# SNMP-specific tags
# ent   enterprise OID
# agt   agent address
# gtp   generic trap type
# stp   enterprise-specific trap type
# tmp   sysUptime timestamp
# vbl   varbindlist (OID = value)
#

SNMP-specific tags do nothing at this time.

Rather than formulating the trap PDU yourself, it's a good idea to use
Mon::Client::send_trap. See the POD for Mon::Client for more details,
or see remote.alert for an example.
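
To illustrate the layout described above, here is a minimal Python
sketch that serializes a set of MON-specific tags. It is not the
Mon::Client implementation and is no substitute for send_trap. The
"tag=value" separator is an assumption (the text above only says the
pairs are newline-separated), and build_trap, MON_TAGS and the sample
values are hypothetical names chosen for this example.

```python
# Illustrative sketch only; prefer Mon::Client::send_trap in practice.
# ASSUMPTION: each pair is written as "tag=value". The separator is a
# parameter because the protocol notes do not spell it out.

# MON-specific tags from the list above, with "pro" first.
MON_TAGS = ("pro", "aut", "typ", "spc", "seq", "grp",
            "svc", "hst", "sta", "tsp", "sum", "dtl")


def build_trap(fields, sep="="):
    """Serialize a dict of tag -> value into newline-separated pairs."""
    if "pro" not in fields:
        raise ValueError('the "pro" tag is required and must come first')
    unknown = set(fields) - set(MON_TAGS)
    if unknown:
        raise ValueError(f"unknown tags: {sorted(unknown)}")

    lines = []
    for tag in MON_TAGS:  # iteration order guarantees "pro" is emitted first
        if tag == "dtl" or tag not in fields:
            continue
        value = str(fields[tag])
        if "\n" in value:
            raise ValueError(f'tag "{tag}" must be a single line')
        lines.append(f"{tag}{sep}{value}")
    text = "\n".join(lines) + "\n"

    if "dtl" in fields:
        # The detail may span several lines and is terminated by "\n.\n".
        detail = str(fields["dtl"]).rstrip("\n")
        text += f"dtl{sep}{detail}\n.\n"
    return text


if __name__ == "__main__":
    # Placeholder values, not taken from a real mon installation.
    print(build_trap({
        "pro": 1,
        "typ": 0,  # 0 = mon, per the tag list
        "grp": "pr-internet",
        "svc": "http_tp",
        "sum": "http check failed",
        "dtl": "first detail line\nsecond detail line",
    }), end="")
```

Keeping "dtl" last means its multi-line body and the "\n.\n"
terminator cannot be mistaken for the start of another tag.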

If an alert for a watch or service is delivered to a mon server whose
configuration does not include that watch or service, the server uses
the watch "default" with the service "default" to deliver the alert.
If "default" is not defined in mon.cf, the alert is logged and then
discarded.
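
The fallback rule in the paragraph above can be sketched in Python as
follows. This is only a model of the described behaviour, not mon's
own code: the config dictionary (watch name mapped to a set of service
names) and the function name resolve_trap_target are assumptions made
for the example.

```python
import logging

log = logging.getLogger("trap-sketch")


def resolve_trap_target(config, group, service):
    """Pick the watch/service that handles an incoming trap.

    `config` is a hypothetical dict: watch name -> set of service names.
    Returns a (watch, service) tuple, or None when the trap is discarded.
    """
    # Known watch and service: handle the trap where it was addressed.
    if service in config.get(group, ()):
        return group, service
    # Unknown target: fall back to watch "default", service "default".
    if "default" in config.get("default", ()):
        return "default", "default"
    # No default defined: log the trap, then drop it.
    log.warning("trap for unknown %s/%s discarded", group, service)
    return None
```

For example, with config = {"default": {"default"}} a trap addressed
to any other watch/service resolves to ("default", "default"), and with
an empty config it resolves to None.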

NOTE: alert/upalert stats are not handled specially for 'default' traps,
so if one unknown alert trap comes in, followed by an unknown upalert
from a different host, then the alert output from mon may be confusing.
Set up a default watch, and use it as a debugging guide to catch stray
traps and to remind you to update your mon config file.

watch default
    service default
        period wd {Sun-Sat}
            alert some.alert
            upalert some.alert -u

See the mon.1 man page for the list of environment variables available
to monitor and alert programs. One environment variable to note in
particular is MON_TRAPINTEND. It holds a colon (:) separated watch
group / service pair: the intended recipient of a trap that was
handled by the default watch group and service instead. This hopefully
gives you some ability to figure out what to do with a trap caught by
"default", and could be exploited to allow a lazy administrator to
send useful information from alerts ;)
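
As a small illustration, an alert program attached to the default
watch could recover the intended recipient along these lines. This is
a hedged Python sketch: the function name intended_recipient is
hypothetical, and splitting on the first colon assumes the watch group
name itself contains no colon.

```python
import os


def intended_recipient(environ=os.environ):
    """Return (group, service) from MON_TRAPINTEND, or None if unusable."""
    value = environ.get("MON_TRAPINTEND")
    if not value:
        return None  # not invoked through the default watch/service
    # ASSUMPTION: the group name contains no colon, so split on the first.
    group, sep, service = value.partition(":")
    if not sep:
        return None  # no colon found: not a group/service pair
    return group, service


if __name__ == "__main__":
    target = intended_recipient()
    if target is None:
        print("no intended recipient recorded")
    else:
        print("trap was intended for watch %s, service %s" % target)
```

Passing the environment as a parameter keeps the function easy to
exercise without changing the real process environment.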

There is a (very simple) alert script called "remote.alert" which
delivers a failure detected locally to a remote mon process. This
allows centralization of alert handling, and it allows distributed
mon processes. Pass the mon host name via -H <host> and the port via
-P <port>.

You can use remote.alert to send a trap from one mon server to another
mon server. This is useful for implementing a hierarchy of mon
servers, where the topmost level serves as the alert management node
for the lower leaf nodes. For example:

mon server "highlevel":

watch pr-internet
    service http_tp
        period wd {Sun-Sat}
            alert mail.alert name@address.com


mon server "lowlevel":

watch pr-internet
    service http_tp
        monitor http_tp.monitor
        interval 5m
        period wd {Sun-Sat}
            alert remote.alert -H highlevel


When the pr-internet/http_tp service fails on the mon server "lowlevel",
that server sends a trap to the mon server "highlevel", which then sends
the email alert.