<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<!--Converted with LaTeX2HTML 2K.1beta (1.48) original version by: Nikos Drakos, CBLU, University of Leeds * revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan * with significant contributions from: Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>Scan engine</TITLE>
<META NAME="description" CONTENT="Scan engine">
<META NAME="keywords" CONTENT="clamdoc">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="LaTeX2HTML v2K.1beta">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="clamdoc.css">
<LINK REL="next" HREF="node29.html">
<LINK REL="previous" HREF="node27.html">
<LINK REL="up" HREF="node26.html">
</HEAD>
<BODY >
<!--Navigation Panel-->
<A NAME="tex2html347" HREF="node29.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next" SRC="/usr/share/latex2html/icons/next.png"></A>
<A NAME="tex2html345" HREF="node26.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up" SRC="/usr/share/latex2html/icons/up.png"></A>
<A NAME="tex2html339" HREF="node27.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous" SRC="/usr/share/latex2html/icons/prev.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html348" HREF="node29.html">Threads</A>
<B> Up:</B> <A NAME="tex2html346" HREF="node26.html">Technicals</A>
<B> Previous:</B> <A NAME="tex2html340" HREF="node27.html">Security</A>
<BR>
<BR>
<!--End of Navigation Panel-->
<H2><A NAME="SECTION00062000000000000000"></A><A NAME="engine"></A>
<BR>
Scan engine
</H2>
New versions of Clam AntiVirus use a variant of the Aho-Corasick pattern matching algorithm. The algorithm is built on a finite-state pattern matching automaton [<A HREF="node32.html#clr">1</A>].
The algorithm itself is a generalization of the Knuth-Morris-Pratt algorithm to multiple patterns. See <I>matcher.h</I> for the data type definitions. The automaton is represented by a trie, a rooted tree with some specific properties [<A HREF="node32.html#acwww">2</A>]. Each node of the trie represents a state of the automaton. In the implementation, a node is defined as follows:
<PRE>
struct node {
    int islast;
    struct patt *list;
    int maxpatlen;
    struct node *next[NUM_CHILDS], *trans[NUM_CHILDS], *fail;
};
</PRE>
[To be continued...]
<P>
<BR><HR>
<ADDRESS>
Tomasz Kojm
2002-11-21
</ADDRESS>
</BODY>
</HTML>