Customization of notification messages and log entries
======================================================
  Mark Martinec <Mark.Martinec@ijs.si>, 2002

Since March 2002 amavisd-new provides a way to customize e-mail
notification messages that are sent in response to a virus (and spam)
detection, without having to resort to modifications of Perl code.

Three types of messages are generated:
- administrator notifications are sent to the administrator;
  (two types: virus, spam)
- sender (non)delivery notifications may be sent to the mail originator;
  (three types: virus, spam, rejects by outgoing MTA)
- recipient warnings may be sent to envelope recipients of the e-mail
  containing a virus. These notifications are normally disabled
  since they are more of a nuisance than a benefit.
  (only for viruses)

Besides the three types of e-mail notifications, the same customization
principle is applied to the most common and useful log entry (present
even at log level 0), which is generated when an e-mail containing
unwanted content is detected.

Default template texts are glued to the end of the 'amavisd' file,
separated one from another by __DATA__ lines. Please see comments
in these templates. These template texts are assigned to variables:
  $log_templ,
  $notify_sender_templ,
  $notify_virus_sender_templ,
  $notify_virus_admin_templ,
  $notify_virus_recips_templ,
  $notify_spam_sender_templ,
  $notify_spam_admin_templ

The value of these variables may be overruled by assignment or by reading
into them in the amavisd.conf file, which is run at amavisd startup time.
If assigning to variables, care must be taken to properly quote certain
special characters (like backslash), as required by Perl quoting rules.
Text read from the amavisd file or from external files is not subject
to Perl quoting rules.

Template text is subject to simple run-time macro expansion
as described in the next section.

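As an illustration of the two ways of overruling a template, here is
a minimal sketch of an amavisd.conf fragment. The template text and the
file path are made up for the example and are not the shipped defaults;
the file is read with plain core Perl, and falling back to the built-in
default when the file is missing is a choice made for this sketch only.

```perl
# amavisd.conf fragment (illustrative sketch, not the shipped defaults)

# (a) Plain assignment. A single-quoted string is the least surprising:
#     Perl processes only \\ and \' in it, so %, [, ], | and \n reach
#     expand() unchanged. In a double-quoted string Perl itself would
#     interpret \n, $ and @ before expand() ever sees the text.
$log_templ = 'INFECTED ([%V|, ]), %s -> [<%R>|, ], id=%n';

# (b) Reading from an external file. The text is taken as-is and is
#     not subject to Perl quoting rules.
{
  my $file = '/var/amavis/notify_virus_admin.txt';  # hypothetical path
  if (-r $file) {                  # otherwise keep the built-in default
    open(my $fh, '<', $file) or die "Can't open $file: $!";
    local $/;                      # slurp mode: read the whole file
    $notify_virus_admin_templ = <$fh>;
    close($fh);
  }
}
```
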
Macro expansion is performed by the routine expand(), which receives
substitution text of simple macros from the 'amavisd' program. expand()
takes a template string as its argument, performs macro expansion, and
returns the resulting multiline string back to 'amavisd', which uses it
to send mail notifications or to write log entries.

The substitution text for the following simple macros is built-in:

- to be used in forming a mail header
  (properly quoted addresses as required by RFC 2822):

  %f  administrator's e-mail address (typically used in 'From:' header
      of notification messages);
  %T  a list of recipients to be used in 'To:' header of the notification;
  %C  a list of recipients to be used in 'Cc:' header of the notification;
  %B  a list of recipients to be used in 'Bcc:' header of the notification;
      (the %T, %C and %B lists are determined by each notification
      subroutine)

- to be used in forming a mail body:

  %h  DNS name of this host, or configurable name (variable $myhostname)
  %n  amavis internal message id (as seen in log entries)
  %b  message digest of mail body (MD5, hex)
  %d  RFC 2822 date-time (current time)
  %m  'Message-ID' header field body
  %j  'Subject' header field body
  %t  first entry in the 'Received' trace of the mail header
  %H  a list of all header lines (a field may be wrapped over more than
      one line); this does not include the 'Return-Path:' or
      'Delivered-To:' headers, which would have been added (or will be
      added later) by the local delivery agent if mail would have been
      delivered to a mailbox.
  %s  original envelope sender, rfc2821-quoted and enclosed in angle
      brackets
  %l  (letter ell) true if sender belongs to local_domains, undef otherwise
  %o  best attempt at determining true sender of the virus - normally
      same as %s
  %S  address that will get sender notification; this is normally
      a one-entry list containing sender address (%s), but may be
      unmangled/reconstructed in an attempt to undo the address forging
      done by some viruses;
  %R  a list of original envelope recipients
      (for use in notification body, not headers)
  %D  a list of recipients with successful delivery status (will get mail)
  %N  a list of recipients with UNsuccessful delivery status (will NOT
      get mail) with included short per-recipient delivery reports as used
      in the free format first MIME part of delivery status notifications
  %i  quarantine id, e.g. name of the quarantine file, or empty ($VIRUSFILE)
  %q  list of quarantine mailbox names, or empty if not quarantined
  %v  output of the (last) virus checking program
  %V  a list of virus names found; contains at least one entry (possibly
      empty) if a virus was found, otherwise a null list
  %F  a list of banned file names
  %W  a list of av scanner names detecting a virus
  %A  a list of SpamAssassin report lines
  %c  spam level/hits (mnemonic: sCore) as provided by SpamAssassin

The choice of capital letters for lists, and lower case letters for simple
strings is purely a convention and is not enforced, nor do all built-in
macros adhere to the convention.

Further built-in macros are easy to add if a special need arises: just
append new key/value pairs to the hash which is passed to expand().
Besides a simple string or an array reference, a hash value may also be
a subroutine reference which will be called later during macro expansion.
This way one can provide a method for obtaining information which is not
yet available during initial construction of the hash (such as AV scanner
results), or provide a lazy evaluation for more expensive calculations.
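
The three kinds of hash values can be pictured with the following
stand-alone sketch. It does not call the real expand(); the hash name
%builtins, the helper macro_value() and all addresses and names in it
are hypothetical and serve only to show a string, an array reference
and a lazily evaluated subroutine reference side by side.

```perl
use strict;
use warnings;

my @virus_names;    # not known yet when the hash is constructed

# Hypothetical macro table: one key per single-letter macro name.
my %builtins = (
  h => 'mail.example.com',                     # simple string
  R => ['a@example.com', 'b@example.com'],     # array reference (a list)
  V => sub { [@virus_names] },                 # called only when needed
);

# Hypothetical helper: fetch a macro value, calling a subroutine
# reference in scalar context if that is what the hash holds.
sub macro_value {
  my ($name) = @_;
  my $v = $builtins{$name};
  $v = $v->() if ref($v) eq 'CODE';    # scalar assignment: scalar context
  return $v;
}

# Later, once the scanner results are in:
@virus_names = ('Eicar-Test-Signature');
print join(', ', @{ macro_value('V') }), "\n";
```

Because the subroutine is called only when the macro is actually used,
the hash can be built before the information it describes exists.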
The subroutine is evaluated in scalar context. It may return a string
or an array reference.

The rest of this text explains what expand() does with its arguments.


EXPAND

This is a simple, yet fully fledged macro processor with proper lexical
analysis, call stack, implied quoting levels, user supplied builtin macros,
two builtin flow-control macros: selector and iterator, plus a macro #,
which discards input tokens until NEWLINE (like 'dnl' in m4).
Also recognized are the usual \c and \nnn forms for specifying special
characters, where c can be any of: r, n, f, b, e, a, t.

Lexical analysis of the input string is performed only once, so macro
result values are not in danger of being lexically parsed again. No new
macros can be defined by processing the input string (at least in this
version).

Simple caller-provided macros have a single character name (usually
a letter) and can evaluate to a string (possibly empty or undef),
or an array of strings. The value can also be a subroutine reference,
in which case the subroutine will be called whenever the macro value
is needed. The subroutine must return a scalar: a string, or an array
reference, which will be treated as if it were specified directly.

Two forms of simple macro calls are known: %x and %#x (where x is a single
letter macro name, i.e. a key in a user-supplied hash):
  %x   evaluates to the hash value associated with the name x;
       if the value is an array ref, the result is a single concatenated
       string of values separated with comma-space pairs;
  %#x  evaluates to a number: if the macro value is a scalar, returns 0
       for an all-whitespace value, and 1 otherwise. If the value is
       an array ref, evaluates to the number of elements in the array.

The simple macro is evaluated only in nonquoted context, i.e. top-level
text or in the first argument of a selector (see below). A literal percent
character can be produced by %% or \%.

More powerful expansion is provided by two builtin macros, using syntax:
  [? arg1 | arg2 | ... ]    a selector
  [ arg1 | arg2 | ... ]     an iterator
where [, [?, | and ] are required tokens. To take away the special meaning
of these characters they can be quoted by a backslash, e.g. \[ or \\ .
Arguments are arbitrary text, possibly multiline; whitespace counts.
Nested macro calls are permitted; proper bracket nesting must be observed.

SELECTOR lets its first argument be evaluated immediately, and implicitly
protects the remaining arguments. The first argument chooses which of the
remaining arguments is selected as a result value. Only the selected
argument is then evaluated; the remaining arguments are discarded without
evaluation. The first argument is usually a number (with optional leading
and trailing whitespace). If it is a non-numeric string, it is treated as 0
for all-whitespace, and as 1 otherwise. Value 0 selects the very next
(second) argument, value 1 selects the one after it, etc. If the value is
greater than the number of available arguments, the last one is selected.
If there is only one alternative available but the value is greater
than 0, an empty string is returned.

Examples:
  [? 2   | zero | one | two | three ]   -> two
  [? foo | none | any | two | three ]   -> any
  [? 24  | 0 | one | many ]             -> many
  [? 2   |No recipients]                -> (empty string)
  [? %#R |No recipients|One recipient|%#R recipients]
  [? %q  |No quarantine|Quarantined as %q]

Note that a selector macro call can be used as a form of if-then-else,
except that the 'then' and 'else' parts are swapped!

ITERATOR in its full form takes three arguments (and ignores any extra
arguments):
  [ %x | body-usually-containing-%x | separator ]
All of the iterator's arguments are implicitly quoted; the iterator
performs its own substitutions, described below. The result of an iterator
call is a body (the second argument) repeated as many times as there are
elements in the array denoted by the first argument. In each instance of
a body all occurrences of token %x in the body are replaced with each
successive element of the array.
Resulting body instances are then glued together with a string given
as the third argument. The result is finally evaluated as any top-level
text for possible further expansion.

There are two simplified forms of iterator call:
  [ body | separator ]
or
  [ body ]
where a missing separator is considered a null string, and the missing
formal argument name is obtained by looking for the first token of the
form %x in the body.

Examples:
  [%V| ]    a space-separated list of virus names

  [%V|\n]   a newline-separated list of virus names

  [%V|
]           same thing: a newline-separated list of virus names

  [
    %V]     a list of virus names, each preceded by NL and spaces

  [ %R |%s --> <%R>|, ]
            a comma-space separated list of sender/recipient name pairs
            where recipient is iterated over the list of recipients.
            (Only the (first) token %x in the first argument is
            significant, other characters (e.g. whitespace) are ignored.)

  [%V|[%R|%R + %V|, ]|; ]
            produce all combinations of %R + %V elements

A combined example:
  [? %#C |#|Cc: [<%C>|, ]]
  [? %#C ||Cc: [<%C>|, ]\n]#     ... same thing
evaluates to an empty string if there are no elements in the %C array,
otherwise it evaluates to a line:  Cc: <addr1>, <addr2>, ...\n

The '#' removes input characters up to and including the newline after it.
It can be used for clarity, to allow newlines to be placed in the source
text without producing empty lines in the expanded text. In the second
example above, a backslash at the end of the line would achieve the same
result, although the method is different: \NEWLINE is removed during
initial lexical analysis, while # is an internal macro which, when called,
actively discards tokens following it, until NEWLINE (or end of input)
is encountered.

Whitespace (including newlines) around the first argument %#C of a
selector call is ignored and can be used for clarity.

These all produce the same result:
  To: [%T|%T|, ]
  To: [%T|, ]
  To: %T

See further practical examples in the supplied notification messages
at the end of file 'amavisd'.
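
The value rules for %x and %#x, the choice made by a selector and the
repetition done by an iterator can be summarized in a small stand-alone
sketch. This is not the code of expand(): there is no lexical analysis,
quoting or nesting here, the arguments are passed already split, and all
subroutine names are hypothetical. Treating undef like an all-whitespace
value in %#x is an assumption of the sketch.

```perl
use strict;
use warnings;

# Resolve a hash value: call it first if it is a subroutine reference.
sub resolve {
  my ($v) = @_;
  return ref($v) eq 'CODE' ? scalar($v->()) : $v;
}

# %x : a scalar as-is, an array ref joined with comma-space pairs
sub macro_string {
  my $v = resolve($_[0]);
  return join(', ', @$v) if ref($v) eq 'ARRAY';
  return defined($v) ? $v : '';
}

# %#x : element count for an array ref; for a scalar 0 if it is
#       all-whitespace (undef assumed to count as such), 1 otherwise
sub macro_count {
  my $v = resolve($_[0]);
  return scalar(@$v) if ref($v) eq 'ARRAY';
  return (defined($v) && $v =~ /\S/) ? 1 : 0;
}

# Selector: the evaluated first argument picks one of the alternatives.
sub select_alternative {
  my ($selector, @alt) = @_;
  return '' unless @alt;
  my $i = $selector =~ /^\s*(\d+)\s*$/ ? $1     # a number
        : $selector =~ /\S/            ? 1 : 0; # non-numeric string
  return '' if @alt == 1 && $i > 0;   # single alternative, value > 0
  $i = $#alt if $i > $#alt;           # too large: take the last one
  return $alt[$i];
}

# Iterator: one copy of the body per list element, %x replaced by the
# element, copies glued together with the separator.
sub iterate {
  my ($name, $list, $body, $sep) = @_;
  $sep = '' unless defined $sep;
  my @instances = map {
    my $elem = $_;
    (my $copy = $body) =~ s/%\Q$name\E/$elem/g;
    $copy;
  } @$list;
  return join($sep, @instances);
}

my %macro = (R => ['a@example.com', 'b@example.com'], q => '');

print select_alternative(macro_count($macro{R}),
        'No recipients', 'One recipient', 'Several recipients'), "\n";
print iterate('R', $macro{R}, '<%R>', ', '), "\n";
print select_alternative(macro_string($macro{q}),
        'No quarantine', 'Quarantined'), "\n";
```

With the sample values this prints "Several recipients", then
"<a@example.com>, <b@example.com>", then "No quarantine", matching the
selector and iterator rules given in the text.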