Sophie

Sophie

distrib > Fedora > 18 > i386 > by-pkgid > 2f005288f02807767c03d63072cff88a > files > 87

freeipmi-1.2.4-1.fc18.i686.rpm

petalert.pl
-----------------------

This is a snmptrapd handler script to alert when Platform Event Traps
(PET) occur.  It was written because traptoemail distributed with
net-snmp-5.3.2.2 is incapable of handling multi-line hexstrings and
restricted to email alert.

This script operates in two modes, traphandle or embperl.  When in
traphandle mode, it concatenates the quoted hex string into one long
line, then builds structures to resemble embperl mode. Both modes then
invokes helper decoder, ipmi-pet(8) from FreeIPMI, parses the output
and alerts in given way like email, nagios external command, nsca, etc.


1. REQUIREMENTS

freeipmi-1.1.1 and above is required for the script to function. Both
FreeIPMI and the script imply Unix-like system, notably GNU/Linux;
Windows is not supported as of this writing, Dec 13, 2011.

Net-SNMP 5.3.2.2 and above is required to make a complete alerting
solution.  Actually only snmptrapd is related which acts as the trap
receiver.

If you prefer to running it as embedded perl handler, your version of
Net-SNMP should have embedded perl support enabled, see "Embedded Perl
Support" from snmpd.conf(8) for more infomration. Usually it's
enabled, and you can verify with the following command:

# net-snmp-config --configure-options | tr ' ' '\n' | grep perl
'--enable-embedded-perl'

If you prefer to invoking the handler directly rather than invoking it
with perl(1), make sure the script itself has execute permission. Both
cases require a working Perl installation, better Perl-5.8.8.

If you prefer to built-in email alerting, make sure Net::SMTP is
installed.

If you prefer to Nagios monitoring system, make sure the Nagios
process and snmptrapd is on the same host. Usually you don't need to
worry about write permission of Nagios external command file, because
the handler is invoked as root by snmptrapd. If that's not your case,
you need to ensure write permission on the command file.

You might prefer to other alerting methods, bad news is it is not
implemented yet. Please drop me a mail, then I might take my time to
go on with plugin support.

Paranoids might check firewall rules allowing only traffic from
trusted hosts.


2. CONFIGURATION

(Note backslash-newline concatenates adjacent lines, so put them in
one) Put a line like these in your snmptrapd.conf file:

  traphandle .1.3.6.1.4.1.3183.1.1 /usr/bin/perl \
  /usr/share/doc/freeipmi/contrib/pet/petalert.pl --mode=traphandle \
    --alert=email --sdrcache SDRCONF -- -f FROM -s SMTPSERVER ADDRESSES

Or, if you prefer embedded perl, 

  perl do "/usr/share/doc/freeipmi/contrib/pet/petalert.pl";
  perl IpmiPET::main(qw(--mode=embperl --trapoid=OID --sdrcache=SDRCONF \
    --alert=email -- -s SMTPSERVER -f FROM ADDRESSES));

where:
    only --mode is required, see "petalert.pl -h".

Make sure execute permission is granted to execute handlers, for
example,

  authCommunity execute COMMUNITY_STRING

see "ACCESS CONTROL" from snmptrapd.conf(8) for more information.  Bad
news is that you have to use numeric representation, so in addition
add "-Of -On" to snmptrapd options.

You have to enable PET on IPMI nodes as well, including LAN access,
PEF alerting, community, alert policy and destination. You may use
bmc-config and ipmi-pef-config from FreeIPMI to do those
configuration. See "IPMI NODES".

You might wish to set up PTR records for IPMI nodes, otherwise,
snmptrapd reports <UNKNOWN> to traphandle and the script will fall
back to use ip.

2.1 ACKNOWLEDGE

Platform event trap is over UDP, you might worry about trap loss. IPMI
spec allows the trap receiver to acknowledge the trap. Use --ack to
acknowledge the trap before alerting. You may need workarounds for
acknowledgement.  See BUGS. So in a acknowlege setup, it might be like
this:

  perl do "/usr/share/doc/freeipmi/contrib/pet/petalert.pl";
  perl IpmiPET::main(qw(--mode=embperl --trapoid=OID --sdrcache=SDRCONF \
    --ack -W malformedack \
    --alert=email -- -s SMTPSERVER -f FROM ADDRESSES));

2.2 NAGIOS INTEGRATION

Nagios monitoring system could be plugged into by writing to its
external command file as passive check. See ipminodes.cfg and
check_rmcping for related Nagios configuration.

Assuming Nagios process is local, use:

  perl do "/usr/share/doc/freeipmi/contrib/pet/petalert.pl";
  perl IpmiPET::main(qw(--mode=embperl --trapoid=OID --sdrcache=SDRCONF \
    --alert=nagios -- -H short -S PET NAGIOS_COMMAND_FILE));

where "-H short" means if 10.2.3.4 resolves to foo.example.com, Nagios
passive check gets foo as host; use "-H fqdn" to pass foo.example.com
to Nagios. In addition, "-S PET" sets service description.

If Nagios process is on remote host, normally you turns to NSCA which
consists of NSCA daemon on the Nagios host and the send_nsca client 
program. To alert by send_nsca,

  perl do "/usr/share/doc/freeipmi/contrib/pet/petalert.pl";
  perl IpmiPET::main(qw(--mode=embperl --trapoid=OID --sdrcache=SDRCONF \
    --alert=nsca -- --prog /usr/bin/send_nsca -H short -S PET \
    -- -H NAGIOS_HOST -c SEND_NSCA_CONF));

Notice the unattached -- appears two times in the configuration line
separating three steps of arguments processing, namely generic args, 
alert specific args, and external helper args.


3. SDR CACHE FILE MAPPING

Notice the underlying helper program ipmi-pet(8) normally depends on
some sdr cache file, either preinitialized or created on demand. If no
credential is supplied, ipmi-pet(8) simply assumes localhost and
creates sdr cache which is usually
~/.freeipmi/sdr-cache/sdr-cache-<hostname>.localhost.  You may wish to
supply preinitialized ones, then use -c sdrmapping.conf to associate
them with IPMI nodes.

The sdr cache config syntax is: every unindented line starts an sdr
cache file, followed by any number of indented lines of IPMI
nodes. Every IPMI node line may consist of multiple nodes delimited by
whitespaces. Comments follow Shell-style, trailing whitespaces are
trimmed, empty lines skipped.

For example,

|/path/to/sdr-cache-file-1
|  10.2.3.10    # comment
|
|/path/to/sdr-cache-file-2
|  10.2.3.4           # one node
|  10.2.3.5  10.2.3.6 # two nodes
|  10.2.3.[7-9]       # trhee nodes in range form
|  
^-- this is the beginning of lines

3.1 SDR CACHE INITIALIZATION

The sdr cache file can be initialized by ipmi-sel(8) and the
--sdr-cache-file option.

# ipmi-sel -h 10.2.3.4 -u root -P --sdr-cache-file=/path/to/sdr-cache-file-X
Password: 
Caching SDR repository information: /path/to/sdr-cache-file-X
Caching SDR record 125 of 125 (current record ID 125) 
ID  | Date        | Time     | Name             | Type                     | Event
1   | Dec-12-2011 | 16:41:51 | SEL              | Event Logging Disabled   | Log Area Reset/Cleared
...


4. IPMI NODES

For PET to be generated, configurations on IPMI nodes have to be done,
including LAN access, PEF alerting, trap community, alert policy and
destination. You may use bmc-config and ipmi-pef-config from FreeIPMI
to do those configuration.

However, before doing configurations and facing unexpected firmware
issues, you'd better verify that the trap receiver end works
well. Simply modify the following example traphandle input to meet
your setup, then feed it to stdin of petalert.pl like this, assuming
you prefer to alert email:

# perl petalert.pl -D :all --mode=traphandle  --sdrcache SDRCONF \
 --alert=email -- -f FROM -s SMTPSERVER ADDRESSES <<EOF
pet.example.com
UDP: [10.2.3.4]:32768
.1.3.6.1.2.1.1.3.0 60:5:11:46.26
.1.3.6.1.6.3.1.1.4.1.0 .1.3.6.1.4.1.3183.1.1.0.356096
.1.3.6.1.4.1.3183.1.1.1 "44 45 4C 4C 50 00 10 59 80 43 B2 C0 4F 33 33 58 
00 42 19 EE AB 64 FF FF 20 20 00 41 73 18 00 80 
01 FF 00 00 00 00 00 19 00 00 02 A2 01 00 C1 "
.1.3.6.1.6.3.18.1.3.0 10.2.3.4
.1.3.6.1.6.3.18.1.4.0 "public"
.1.3.6.1.6.3.1.1.4.3.0 .1.3.6.1.4.1.3183.1.1
EOF

Then normally you will get a notification about the test input. Notice
that the three-line octet string "44 45 ... C1 " includes one
whitespace at the ending of each line, that's the way snmptrapd(8)
works.

4.1 NODE CONFIGURATION

You'd better save your customization, if any, before changing
anything. Guys with factory default settings feel free to go
on. FreeIPMI is outstanding with its ability to save your overall
customization with --checkout option; there are several config
programs, of which only bmc-config(8) and ipmi-pef-config(8) are
needed in this senario.

To save configurations,
# bmc-config --checkout -f saved-bmc-config.txt
# ipmi-pef-config --checkout -f saved-pef-config.txt

Notice the checked out configuration is well commented, you can refer
to it for help about the following configuration options.

To observe differences between current configuration and pet
configuration,

# bmc-config --diff <<EOF
Section Lan_Channel
    Volatile_Access_Mode                     Always_Available
    Volatile_Enable_Pef_Alerting             Yes
    Non_Volatile_Access_Mode                 Always_Available
    Non_Volatile_Enable_Pef_Alerting         Yes
EndSection
Section Lan_Conf
    IP_Address_Source                        Static
    IP_Address                               10.2.3.4
    Subnet_Mask                              255.255.255.0
    Default_Gateway_IP_Address               10.2.3.1
EndSection
Section User2
    Username                                 root
    Password                                 CHANGE_ME
    Enable_User                              Yes
    Lan_Privilege_Limit                      Administrator
EndSection
EOF

# ipmi-pef-config --diff <<EOF
Section Lan_Alert_Destination_1
    Alert_Destination_Type                   PET_Trap
    Alert_Acknowledge                        Yes 
    Alert_Acknowledge_Timeout                5   
    Alert_Retries                            2   
    Alert_Gateway                            Default
    Alert_IP_Address                         10.2.3.254
    Alert_MAC_Address                        00:00:00:00:00:00
EndSection
Section Community_String
    Community_String                         COMMUNITY_STRING
EndSection
Section Alert_Policy_1
    Policy_Type                              Always_Send_To_This_Destination
    Policy_Enabled                           Yes
    Policy_Number                            1
    Destination_Selector                     1
    Channel_Number                           1
    Alert_String_Set_Selector                0
    Event_Specific_Alert_String              No
EndSection
EOF

Notice ipmi-pef-config(8) could be used out of band once IPMI LAN
access is OK, which means you no longer need to logon the host to do
further configuration.

# ipmi-pef-config -h 10.2.3.4 -u root -P --diff <<EOF
...
EOF
Password: 

bmc-config(8) works out of band as well with the same prerequisite.
In addition, FreeIPMI is featured by configuring many hosts in one
run, see "HOSTRANGED SUPPORT" of ipmi-pef-config(8) and others from
FreeIPMI.

When you are sure about the changes, use --commit to apply. Then you
can use bmc-config to generate a software generated platform event:

# bmc-device --platform-event="41 04 05 73 6f assertion 80 01 ff"

You should get notified about the event a few seconds later, otherwise
check vendor rules. If the node is Dell PowerEdge R610 or 1950, you 
have to make a catch-all filter for the software generated event,

# ipmi-pef-config --commit <<EOF
Section Event_Filter_40
    Filter_Type                                   Software_Configurable
    Enable_Filter                                 Yes
    Event_Filter_Action_Alert                     Yes
    Event_Filter_Action_Power_Off                 No
    Event_Filter_Action_Reset                     No
    Event_Filter_Action_Power_Cycle               No
    Event_Filter_Action_Oem                       No
    Event_Filter_Action_Diagnostic_Interrupt      No
    Event_Filter_Action_Group_Control_Operation   No
    Alert_Policy_Number                           1
    Event_Severity                                Unspecified
    Group_Control_Selector                        0
    Generator_Id_Byte_1                           0xFF
    Generator_Id_Byte_2                           0xFF
    Sensor_Type                                   Any
    Sensor_Number                                 0xFF
    Event_Trigger                                 0xFF
    Event_Data1_Offset_Mask                       0xFFFF
    Event_Data1_AND_Mask                          0x00
    Event_Data1_Compare1                          0xFF
    Event_Data1_Compare2                          0x00
    Event_Data2_AND_Mask                          0x00
    Event_Data2_Compare1                          0xFF
    Event_Data2_Compare2                          0x00
    Event_Data3_AND_Mask                          0x00
    Event_Data3_Compare1                          0xFF
    Event_Data3_Compare2                          0x00
EndSection
EOF

If you still have problems you may wish to read "Figure 17-2 Event
Filter, Alert Policy, and Alert Destination, String Relationships"
from IPMIv2 spec, and/or check your network traffic. You might be the
unlucky guy triggering unspotted firmware bugs. See BUGS.


5. LOGGING

petalert.pl supports sophisticated logging. Use "-D token" to enable
logging of specific parts, or "-D :all" to enable very verbose
output. Internally it uses Data::Dumper to represent Perl structs, you
can simply copy to files to evaluate. For example, an acknowledge
request invocation was logged as

[Sun Dec 11 13:47:51 2011] ack command line  [
          '/usr/sbin/ipmi-pet',
          [
            '--pet-acknowledge',
            '-h',
            '10.2.3.4',
            '356096',
            '44',
            '45',
            '4C',
            '4C',
...
          ]
        ]

The above request invocation could be rebuilt by

# perl -le '@v=<STDIN>; $x=eval "@v"; print join(" ", $x->[0], @{$x->[1]})."\n"'
[
          '/usr/sbin/ipmi-pet',
          [
            '--pet-acknowledge',
            '-h',
            '10.2.3.4',
            '356096',
            '44',
            '45',
            '4C',
            '4C',
...
          ]
        ]
Ctrl-D
/usr/sbin/ipmi-pet --pet-acknowledge -h 10.2.3.4 356096 44 45 4C 4C ...

Then you could simply paste the command in the shell to simulate a
manual acknowledge. Looks like acknowledge requests without previous
PET is also accepted and responded as usual.

snmptrapd(8) itself allows for logging of traps into syslog which
requires log permission, see "ACCESS CONTROL" from snmptrapd.conf for
more information.

NSCA daemon logs to syslog, set "debug=1" in nsca.cfg to get detailed
connection handling. Nagios is also able to log to syslog, set 
"use_syslog=1" in nagios.cfg to help debugging alert.


6. PET TRAFFIC

On a no-acknowledge setup, usually there should be only one packet on
behalf of the PET from the ipmi node targeting the trap receiver,
however, firmware defect was spotted resulting in additional traffic,
see BUGS.

On an acknowledge setup, there should be three packets per event, one
PET, one PET acknowledge request from trap receiver targeting the ipmi
node, and one PET acknowledge response in the other direction. More
bugs were spotted, see BUGS.

Any setup, packets could be captured like this
# tcpdump -i any -nn -vvv -s0 -w pet.pcap 'host 10.2.3.4 and udp'

Then you can browse the interactions with the help of Wireshark. 


7. BUGS

It's spotted that factory default rules of iDRAC Express on Dell
PowerEdge R610 don't match software generated events. You need to make
a catch-all filter rule to report those events. However, hardware
gernerated events are not subject to such limitation. To verify this
situation, open the case to generate a hardware generated intrusion
event. Dell PowerEdge 1950 with BMC has similiar problem. The 
difference is that 1950 has 31 filter rules, so you don't worry about
overwriting an existent one.

It's spotted that Dell PowerEdge 1950 with BMC suffers clock drift, 
remarkably SEL timestamps. bmc-device(8) from FreeIPMI could be used 
to adjust SEL time and SDR repository time,
# bmc-device --set-sdr-repository-time=now
# bmc-device --set-sel-time=now
Notice that 'now' refers to current timestamp on the host where the
commands are issued. bmc-device(8) works out of band, so simply issue
the commands on a host where clock is synchronized.

It's spotted that iDRAC Express on Dell PowerEdge R610 generates two
traps per hardware event. Notice session id from the two traps differ,
they are different traps instead of duplication, even though other
contents of payload are identical.

It's spotted that iDRAC Express on Dell PowerEdge R610 produces
malformed PET acknowledge responses. In this case, ipmi-pet exits with
timeout error "ipmi_cmd_pet_acknowledge: message timeout". You may use
'-W malformedack', which is simply passed through, to instruct the
underlying helper ipmi-pet(8) to disable such detection and to
immediately return. Timeout hurts snmptrapd because slow handler
hinders the main loop. To discover potential time consuming cases, 
use "-D perf" and observe the log.

It's spotted that some DNS servers return "localhost" on private ip
addresses rather than NXDOMAIN, in this case, snmptrapd(8) passes
"localhost" as resolved hostname to petalert.pl which is
confused. You'd better switch to a correctly configured DNS server, or
contact the administrator to solve the problem.


Kaiwang Chen
kaiwang.chen@gmail.com