Sophie

Sophie

distrib > Mandriva > 9.2 > i586 > by-pkgid > 960682759d6b16198e74613b5f73b4e2 > files > 24

amavisd-new-0.20030616-9mdk.noarch.rpm

This file README.performance is part of the amavisd-new distribution,
which can be found at http://www.ijs.si/software/amavisd/

Updated: 2002-05-13, 2002-08-01, 2003-01-09


Here are some excerpts from my mail(s) on the topic of performance.

  Mark

[...]
| What I use now is freebsd+postfix+amavisd+sophie,

Good choice in my opinion.

Hopefully hardware matches expectations,
fast disks and enough memory are paramount.

You may want to put Postfix spool on different disk than /var/amavis,
where amavisd does mail unpacking.

| is there any suggested configuration for this
| environment? Especially if my server is a high loaded
| busy mail hub/gateway? Any parameters for performance tuning?

| Do I need to increase this number to fit a busy server?
| Or any other related parameters should I notice?

How many messages per day are we talking about?

Both the amavisd child processes, and Postfix smtpd services
consume quite some chunks of memory, so the memory size
can determine how many parallel processes you can run.

Note that the compiled Perl code sections in amavisd-new processes
occupy the same memory, if fork on your Unix system uses copy-on-write
for memory pages, as most modern Unixes do.

I would start small, e.g. by 2 or 3 child processes per CPU
(parametere $max_servers), then see how machine behaves.
If you see heavy swapping or load regularly going beyond 2 or 3 (per CPU),
decrease the number of parallel streams, otherwise increase it - gradually.
This number is probably the most important tuning parameter.
Going beyong 10 usually brings no more improvements in overall system
throughput, it just wastes memory.

If this does not come close to your needs, you may want to place
amavisd-new with Sophie on a different host than Postfix.
They can now talk via SMTP so there is no advantage in having
both MTA and amavisd on the same host.

Actually there are now three quite independent modules,
which can share the same host, or not:
  incoming Postfix (MTA-IN) -> amavisd+Sophie -> outgoing Postfix (MTA-OUT)

Both MTA-IN and MTA-OUT can be the same single Postfix, but need not be.
If you decide to split MTA-IN and MTA-OUT, you can position
one of them on the same host as amavisd, although I guess it
would be better to either have three boxes, or have MTA-IN
and MTA-OUT be a single Postfix, as in the normal setup,
while optionally moving amavisd+Sophie to a different host.

As amavisd-new is just a regular SMTP server/client to Postfix,
you can use the usual load sharing mechanisms as available for
normal mail delivery, like having multiple MX records for the
content filter.

[...]

| I would like to know the possibility of email loss? Especially
| under unawareness! What if amavisd or sophie suddenly/abnormally
| terminated? Is there any recovery procedures should be take?

Mail loss should not be possible any longer (except with total disk failure).
I am continually testing some awkward situations like disk full,
process restarts, child dies, even programming errors :) ... .
Amavisd never takes the responsibility for mail delivery away from MTA,
it just acts as an intermediary between MTA-IN and MTA-OUT.
Only when MTA-OUT confirms it has received mail, the MTA-IN does
a SMTP session closedown with a success status code. All breakdowns
and connection losses are handled by MTA, and Postfix is very good
in doing it in a reliable way.

The only cause of concern is DoS in some unpackers. This part of code
in amavisd-new is still mostly the same as in the amavisd version,
and although it does exercise some care, there is still a lot
to be desired. I do not consider a clever thing to first let
unpacker run for 20 minutes unpacking some strange archive,
ten abort everything and tell MTA to retry later.

Let me tell you a heretic secret: if your AV scanner (e.g. Sophie)
can handle all archives used by current viruses (except MIME decoding,
which is done by amavisd), it is quite safe, good and fast
to set $bypass_decode_parts to 1 (see amavisd.conf).

And more: later Postfix versions can do the MIME syntax checking
and enforce 7bit header RFC 2822 requirements (see parameters like:
  $ postconf | egrep 'mime|[78]bit' ) so you can block invalid MIME
even before it hits the MIME::Parser Perl module.

Instead of wasting 5 minutes for some particularly nasty archive,
Sophie can do it in 5 seconds !!!  I have yet to see a virus (in the
wild) that sophos would ONLY detect if first unpacked by amavisd.

This does not take care of manual malicious intents,
but one can always bring in a virus on a floppy, or download it
some other way (e.g. PGP encrypted), if one really wants to.


---------
See article by Cor Bosman for a high-end installation:
  http://www.xs4all.nl/~scorpio/sane2002/paper.ps

---------
Limit the number of AV-scanning processes, don't let MTA run
arbitrary number of AV-scanning processes. Also limiting based on CPU load
is not a good idea in my opinion - set the fixed limit based on the number
of concurrent AV-checking processes you host (memory,disk,cpu) can handle,
not on the current load or mail rate, otherwise when the situation goes
bad, it is more likely it will go bad all the way - disk and memory
thrashing is the last thing you desire when load goes high.

---------
| I have a question about how to distribute amavisd-new directories across 
| different disks for optimal performance.  There are usually 4 directories 
| in the amavisd-new mail path.  
| 1) The amavis TEMPBASE directory (Where incoming emails are scanned)
| 2) The postfix queue directory
| 3) The directory for amavis and mail system logs
| 4) The directory where mail is delivered.
| What would be the best distribution of these directories over multiple disks? 
| Obviously, having each one on a different disk would be best.  However, if 
| you only have 3 disks to use, which two services should be combined?  If you 
| have only two disks, which services should be put together?

Let amavisd-new log via syslog and make sure your syslogd does
not call flush for every log entry (as Linux does by default,
but is configurable per log file). This way the disk with log files
becomes non-critical.

The disk with Postfix mail queue is likely to be most heavily
beaten by file creates/deletes. I would put it on its own disk.

The $TEMPBASE (amavis work directory) is probably not as heavily
exercised (in the SMTP-in/SMTP-out amavisd-new setup, as with Postfix),
unless your mail messages often contain many MIME parts that need
to be decoded. If you can afford it, it can even reside on a
RAM disk / tempfs or with delayed-syncing without risking any mail loss.