Sophie

Sophie

distrib > Fedora > 14 > x86_64 > media > updates > by-pkgid > 117381deb97b1d9f63c194220b1ddc14 > files > 5

petit-1.1.1-1.fc14.noarch.rpm

Download and Information
===============================================================================
http://opensource.eyemg.com/Petit

Why
===============================================================================
Log analysis is something that all systems administrators know they need to do. 
Many of us come to this point, either because there is a problem, there is a 
security requirement from the organization, or it keeps you up all night 
wanting to know what is going on in all of that data.

Looking for best practices for log analysis on this Internet is difficult at 
best. Many years ago, I discovered a script that hashed log files by removing 
all of their numbers and replacing them with "#" characters. The results of 
this simple algorithm were phenomenal, logs could be reduced by a factor of 
ten. This was much more readable, yet left much of the quality data that I 
needed to determine if there was a problem.

In the years since I discovered that simple algorithm, I have come to discover 
many techniques on text analysis which are commonly used in linguistics and 
anthropology to analyze natural languages. This has led me to develop very 
simple best practices for analyzing logs.

The Basics
===============================================================================
 1. Logs are made up of output which are programmed by human beings. There 
are no real restraints on what is output, other than, some cultural rules on 
being professional. This makes the output from programs very much a natural i
language. This also makes the output of someones program an approximation of 
the reality of what is happening inside a program. This is important to 
remember, logs are not perfect.

 2. When a systems administrator analyzes logs by changing them, he is 
creating an approximation of an approximation of reality in side a working 
program. This is not necessarily a bad thing, especially, when the programmer 
never gives you better than his approximation of reality anyway.

 3. In practice logs are made up of certainty and uncertainty. For example, I 
know what OpenSSH puts in the log during a login, because it is common. On the 
other hand, I do not now what a Compaq DL380 G3 will put in the log when it has 
a disk controller error. This is important to remember.

 4. The basic log analysis algorithm in Petit works to remove certainty, while 
leaving uncertainty. Stated another way, Petit quantitatively removes certainty, 
thereby leaving uncertainty, which by necessity requires qualitative analysis 
from a systems administrator

 5. After the algorithm has been applied, the output must be read by a systems 
administrator to determine if it is a normal or abnormal. Then abnormal entries 
can be acted on, hopefully before there is noticeable impact to your system. 


Installing/Uninstalling
===============================================================================
Installation can be done with RPM, DEB or TAR

INSTALL
========================================

TAR
 make install

RPM
 rpm -ivh petit.rpm

DEB
 dpkg -i petit.deb

UNINSTALL
========================================

TAR
 make uninstall

RPM
 rpm -e petit

DEB
 dpkg -r petit


Building Packages
===============================================================================
Two forms of building are maintained for sanity, deb and rpm. The build scripts
for these two package managers are distributed with petit for convenience and 
are used internally by the project  as part of a larger script to help 
distribute snapshots on our site. Usage is simple, currently no make install is
supported, but may be part of future distributions.

RPM
 make rpm

DEB
 make deb


Routine Operations
===============================================================================

Hash a syslog, removing reboots and all standard filters. By default petit will 
show a sample for all entries which are found three or less times.

	petit --hash --fingerprint /var/log/messages

Hash an Apache log

	petit --hash /var/log/httpd/access_log

Get a daemons report

	petit --daemon /var/log/messages

Get a host report

	petit --host /var/log/messages

Find qualitatively important words in your log. This is especially useful to 
help determine what should be monitored in swatch.

	petit --wordcount /var/log/messages

Graph first 60 seconds in a syslog

	petit -sgraph /var/log/messages

Track a special work you are interested in by minute

	cat /var/log/messages | grep error | petit --mgraph

Show samples for each entry

	petit --hash --allsample /var/log/messages

Special Operations
===============================================================================

Create an on the fly driver for a nonstandard file format, then pipe it to Petit. 
Petit can hash files of non-standard types ok, but graphing requires the time 
values to be in the correct columns.

	cat /var/log/httpd/error_log | awk '{$1="";$5="";print}' | lt --sgraph