Version 3.9.0
-------------

[20091215:2335] sbajic: Adding contrib to distribution

[20091215:1710] sbajic: Fixing small typo in pt-BR template

[20091215:0330] sbajic: Removed LDAP driver (replaced by ExtLookup)

[20091215:0100] sbajic: Replacing non-ASCII characters in Romanian template with HTML character entities

[20091214:1420] sbajic: Web UI changes: Fix display for history page selection; add and use a function to transform some special characters (sender and subject) into HTML character entities

[20091212:0550] sbajic: Adding more comments to configure.pl.in for the Web UI

[20091212:0545] sbajic: Fixing various issues in the CGI scripts for the Web UI

[20091209:0130] sbajic: Fixing issue with empty historical data by preinitializing empty variables for the history chart/view

[20091206:1840] sbajic: Fixing problem with fallbackDomains

[20091206:1730] sbajic: Prettifying dspam.conf and extending the notes on some configuration options

[20091205:1550] sbajic: Fixing HTML tags and some translation errors in German Web UI template

[20091204:1430] sbajic: Removed old and obsolete .cvsignore files

[20091204:1255] sbajic: Translating some remaining English elements in the Spanish templates for Web UI

[20091203:0255] sbajic: Adding Spanish templates for Web UI (translated by Daniel Sánchez Pearson <danielsanchez@hachete.com>)

[20091202:0125] sbajic: Gentoo patch 8: Adding "virus" option to TrackSources

[20091201:2310] sbajic: Fixed m4 macros for SQLite/SQLite3

[20091129:1650] sbajic: Fixed and updated man files

[20091128:1000] sbajic: Replaced umlauts in German template to use HTML character encodings

[20091128:1000] sbajic: Fixed HTML markup error in pt-br/nav_fragment.html

[20091128:0030] sbajic: Fixed HTML markup error in dspam.cgi

[20091126:0050] sbajic: Adding Brazilian Portuguese templates for Web UI (translated by Felipe Szczesny Rout <felipe.rout@al.rs.gov.br>)

[20091125:0940] sbajic: Fix Web UI access in admin area to userdir in case of domainscale and a username beginning with @ or in case of empty domain

[20091125:0930] sbajic: Fix Web UI access to userdir in case of domainscale and a username beginning with @

[20091118:1700] sbajic: Require @ in username for fallbackDomains to work and don't allow calling ctx_init() without or with empty username

[20091117:1705] sbajic: Enhancing HTML message processing

[20091115:2050] sbajic: Updating documentation

[20091115:1250] sbajic: Fixing missing <HTML> start tag in nav_fragment.html for EN, FR and RO

[20091114:1140] sbajic: Fixing bugs in _ds_get_nextuser(), _ds_get_nexttoken() and _ds_get_nextsignature() in the PostgreSQL driver

[20091114:0120] sbajic: Fixing compile failure of the PostgreSQL driver tool dspam_pg2int8

[20091113:2225] sbajic: Adding documentation into MySQL driver source code for _ds_get_nextuser(), _ds_get_nexttoken() and _ds_get_nextsignature()

[20091113:2155] sbajic: Fixing bug in _ds_get_nextuser() and _ds_get_nexttoken() in MySQL driver

[20091113:2050] sbajic: Fixing bug in _ds_get_nextsignature() in MySQL driver

[20091113:1410] sbajic: Reverting one block from commit ec9a6de09f178eaf838aad5b66b07b15951d6ef3 that breaks decoding base64/quoted-printable

[20091113:0400] sbajic: Fixing memory leak in dspam_merge.c

[20091113:0345] sbajic: Fixing file descriptor leak in PostgreSQL driver

[20091113:0340] sbajic: Fixing file descriptor leak in MySQL driver

[20091113:0310] sbajic: Speeding up various functions in dspam.c

[20091112:2325] sbajic: Fixing bug in quoted printable decoding

[20091105:2025] sbajic: Adding German templates for Web UI

[20091105:1655] sbajic: Added missing files into various make files

[20091104:0255] sbajic: Removed improperly working OSB for PValue

[20091104:0255] sbajic: Speeding up _ds_calc_result()

[20091104:0145] sbajic: Updating make files to include changes mentioned in log entry [20090818:0100]

[20091103:1843] sbajic: Typo fixes in the French Web UI (patch provided by Julien Valroff <julien@kirya.net>)

[20091103:0945] sbajic: Fixing build issues in Mac OS X

[20091029:0810] sbajic: Fixing build issues in Mac OS X

[20091028:1018] sbajic: Fixing build issues of dynamically linked libraries

[20091028:0028] sbajic: Changes in External Lookup
* Removing unneeded header files (agent_shared.h)
* Decoupling the dependency on OpenLDAP (if the libraries are present,
  LDAP lookups are enabled in External Lookup; otherwise LDAP
  lookups are disabled in the External Lookup module)

[20091027:1300] sbajic: Fixing various issues (memory leaks and logical errors) in PostgreSQL driver

[20091016:1022] sbajic: Fixing (potential) branching on uninitialized variable in SQLite3 driver

[20091016:0116] sbajic: Removing unused variables in SQLite3 driver

[20091016:0112] sbajic: Fixing (potential) call to readdir with NULL argument in SQLite3 driver

[20091012:1012] sbajic: Fixing compiler warnings

[20091012:1010] sbajic: Fixing compiler warnings

[20091012:1005] sbajic: Fixing compiler warnings

[20091012:1000] sbajic: Fixing compiler warnings

[20091012:0905] sbajic: Fixing compiler warnings

[20091012:0903] sbajic: Fixing compiler warnings

[20091012:0847] sbajic: Fixing compiler warnings

[20091012:0832] sbajic: Fixing compiler warnings

[20091012:0824] sbajic: Fixing compiler warnings

[20091012:0743] sbajic: Fixing compiler warnings

[20091009:2228] sbajic: Fixing compiler warnings

[20091009:2218] sbajic: Fixing compile error introduced by last commit

[20091009:2211] sbajic: Fixing compiler warnings

[20091009:2203] sbajic: Fixing compiler warnings

[20091009:2155] sbajic: Fixing compiler warnings

[20091009:2125] sbajic: Removing unused variables in retrain_message()

[20091009:2120] sbajic: Removing unused variables in process_message()

[20091009:2115] sbajic: Removing unused variables in process_message()

[20091009:2115] sbajic: Fixing (potential) dereference of null pointer in daemon.c

[20091009:2110] sbajic: Fixing (potential) dereference of null pointer in dspam_clean.c

[20091009:2055] sbajic: Fixing (potential) dereference of null pointer in dspam_clean.c

[20091009:2048] sbajic: Fixing (potential) call to strncasecmp with NULL argument in daemon.c

[20091009:2035] sbajic: Fixing (potential) call to readdir with NULL argument in hash driver

[20091008:2321] sbajic: Fixing variable declaration in MySQL driver introduced in last commit

[20091008:2311] sbajic: Code cleanup (limiting scope of variables)

[20091008:2122] sbajic: Fixing return code in send_socket()

[20091008:2110] sbajic: Removing unused code in client_process()

[20091008:2045] sbajic: Removing unused code in base64decode() when using NCORE

[20091008:2045] sbajic: Removing unused variables in base64decode() when using NCORE

[20091008:2045] sbajic: Removing code in _ds_calc_result() (speeding up Robinson algorithm)

[20091008:2038] sbajic: Removing unused variables in csscompress()

[20091008:2031] sbajic: Removing unused variables in cssstat()

[20091008:2025] sbajic: Removing unused variables in bnr_hash_set()

[20091008:2020] sbajic: Removing unused code from _ds_calc_result()

[20091008:2010] sbajic: Removing unused variables from MySQL driver in _ds_get_nextuser()

[20091008:1915] sbajic: Removing unused variables in bnr_hash_value()

[20091008:1908] sbajic: Removing unused variables in find_signature()

[20091008:1900] sbajic: Fixing broken return code in process_users()

[20091008:1850] sbajic: Removing unused variables in _ds_ff_pref_load()

[20091008:1840] sbajic: Removing unused variables in _ds_degenerate_message()

[20091005:2138] sbajic: The README lists the members twice. Updated the second list to be in sync with the first one.

[20091005:2135] sbajic: Changing how control tokens are matched when computing probability for Markovian weighting and OSB/OSBF/WINNOW weighting

[20091005:1600] sbajic: Adding Hugo Monteiro to the README

[20090925:2005] sbajic: Take care of internal tokens (control, whitelist, frequency and BNR) when computing probability for Markovian weighting and OSB/OSBF/WINNOW weighting

[20090924:2356] sbajic: Adding OSB/OSBF/WINNOW weighting

[20090917:1040] sbajic: Fixing memory leak in dspam.c (closing SF Bug ID #2853124)

[20090917:1025] sbajic: Fixing missing close of sockets in has_virus() and feed_clam() (closing SF Bug ID #2853156)

[20090917:1010] sbajic: Fixing freeing of unallocated pointers (closing SF Bug ID #2853164)

[20090912:0925] sbajic: Fixing LOG/LOGFILE writing (closing SF Bug ID #2837832)

[20090910:2105] sbajic: Fixing memory leak in dspam.c (closing SF Bug ID #2853133)

[20090910:2100] sbajic: Fixing missing {} block in dspam.c (closing SF Bug ID #2853138)

[20090910:2045] sbajic: Fixing null pointer deallocation in libdspam (closing SF Bug ID #2853177)

[20090910:2040] sbajic: Fixing memory leak in libdspam (closing SF Bug ID #2853181)

[20090910:2035] sbajic: Fixing memory leak in libdspam (closing SF Bug ID #2853188)

[20090821:0115] sbajic: Fixing CSS and JavaScript issues in IE7/IE8 (The absolute worst browser when it comes to supporting the standards is Internet Explorer!) (closing SF Bug ID #2841370)

[20090819:1520] sbajic: Fixing one more compiling issue under uClibc

[20090819:1520] sbajic: Fixing compiling issue under uClibc

[20090818:1650] sbajic: Fixing one missed label to be swapped out in the CGIs

[20090818:1055] sbajic: Fixing typo in dspam.conf

[20090818:0120] sbajic: Adding Romanian and French templates to be built by the regular make file

[20090818:0115] sbajic: Adding French templates for Web UI (provided and translated by Julien Valroff <julien@kirya.net>)

[20090818:0100] sbajic: Swap out text/labels from the CGIs into external file (patch from Julien Valroff <julien@kirya.net> slightly modified)

[20090815:1400] sbajic: Fixing memory leak in dspam.c

[20090804:1150] sbajic: Adding Hebrew templates for Web UI (translated by Dudi Goldenberg)

[20090803:0730] sbajic: Fixing compiler warnings

[20090803:0655] sbajic: Fixing dynamic build of storage drivers

[20090802:2010] sbajic: Fixing resource leak in hash driver

[20090802:1915] sbajic: Added script to contrib for purging old tokens and log entries

[20090802:1745] sbajic: Fixing dynamic build of storage drivers (closing SF Bug ID #2811139)

[20090802:1630] sbajic: Fixing location for notification files (closing SF Bug ID #2825171)

[20090801:2057] itetcu: contrib/Lotus Notes --> contrib/lotus_notes

[20090801:1455] sbajic: Adding German language files for Thunderbird plugin

[20090729:1600] sbajic: Fixing typos and logical errors from last commit (closing SF Bug ID #2829650)

[20090728:1600] sbajic: Fixing segmentation fault when using groups and the calling username does not have a '@' character

[20090726:1215] sbajic: Fixing dspam.cgi error from last commit

[20090724:2300] sbajic: Fixing bug when retraining (with signature) a virus tagged message (closing SF Bug ID #2826644)

[20090724:0120] sbajic: Fixing bug in hash driver that prevented retraining with signature

[20090720:2345] sbajic: Setting/reverting libdspam version back to 3.6.0 since no ABI/API changes have been made to libdspam for 3.9.0

[20090720:2330] sbajic: Fixing typos and errors from last commit (more logging and updates to Web UI)

[20090720:0100] sbajic: Added more logging tags and updated Web UI to support them

[20090719:2130] sbajic: Adding multi-language support to Web UI

[20090719:1750] sbajic: Fixing version display in Web UI

[20090712:2325] sbajic: Code cleanup (limiting scope of variables) and fixing resource leak in dspam_pg2int8

[20090711:1655] sbajic: Fixing bug in cssconvert and csscompress (closing SF Bug ID #2819960)

[20090711:0910] sbajic: Casting call in _ds_degenerate_message() to _ds_strip_html() function

[20090707:0930] sbajic: Fixing bug in src/hash_drv.c (closing SF Bug ID #2817736)

[20090705:1910] sbajic: Removing 3rd parameter in calls to _ds_find_header()

[20090705:1905] sbajic: Removing parameter from _ds_find_header() for case insensitivity

[20090705:1850] sbajic: Adding more debug output if verbose is turned on

[20090705:1800] sbajic: Fixing typo in src/tokenizer.c

[20090705:1435] sbajic: Fixing comparison of header field names to be case insensitive (see RFC822 section B.2)
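
The case-insensitive matching of header field names can be illustrated with a
minimal Python sketch. This is not the DSPAM C code; header_name_matches() is a
hypothetical helper introduced only for this example.

```python
def header_name_matches(header_line: str, wanted: str) -> bool:
    """Return True if header_line carries the field name `wanted`.

    Field names are compared without regard to case, so "SUBJECT",
    "Subject" and "subject" all name the same header.
    """
    name, sep, _value = header_line.partition(":")
    if not sep:
        return False  # no colon, so this is not a header line
    return name.strip().lower() == wanted.lower()
```

Only the field name is folded to lower case; the field value is left untouched.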

[20090701:1430] sbajic: Fixing a case where a memory allocation error in decode.c was not handled correctly

[20090629:2145] sbajic: Adding script for Lotus Notes into contrib directory
 * Added Lotus Notes/Domino LotusScript Library (libDSPAMReporting.lss) into
   the contrib directory. The LotusScript Library can be used together with
   IBM Lotus Notes/Domino for retraining messages with the Lotus Notes
   Client.

[20090628:1350] sbajic: Closing SF Bug ID #2813474

* Applying patches submitted by Andreas Schneider <mail@cynapses.org>:
  * Fix a build warning in util.h/util.c
  * Check return values of system functions for the hash driver
  * Fix a compile warning that init_pwent_cache() has no prototype

[20090627:0404] sbajic: Removing Mac OS X patch for the duplicate symbol issue

[20090627:0228] sbajic: Fixing compiler warnings in cssclean.c and util.c

[20090627:0222] sbajic: Fixing broken return value in bnr_list_node_create()

[20090627:0204] sbajic: Fixing compiling issue (duplicate symbol _agent_config in .libs/read_config.o and .libs/pref.o) on Mac OS X

[20090627:0157] sbajic: Added csscompress to be built for the hash driver

[20090627:0143] sbajic: Fixing PostgreSQL driver
* Fixing SQL query in _ds_getall_spamrecords when using lookup_tokens

[20090625:0242] sbajic: Fixing resource leak in dspam.c

[20090625:0115] sbajic: Code cleanup in MySQL driver

[20090624:1622] sbajic: Fixing typo in configure.pl.in (SF Bug ID #2811562)

[20090623:2300] sbajic: Speed up of PostgreSQL driver

* Avoid calling _pgsql_drv_token_type() in _ds_get_nexttoken(). The token
  field of the dspam_token_data table is either NUMERIC(20) or BIGINT for
  the lifetime of a connection to the PostgreSQL database. No need to check
  that for each and every token in _ds_get_nexttoken().

[20090623:2255] sbajic: Code cleanup in MySQL and PostgreSQL driver

[20090623:2157] sbajic: Removing unneeded index from PostgreSQL schema

[20090620:2130] sbajic: Updating documentation for Postfix integration

[20090617:0100] sbajic: Code cleanup in PostgreSQL driver

[20090615:0300] sbajic: Fixing typo in MySQL documentation

[20090615:0215] sbajic: MySQL driver changes

* Properly close the diction when exiting
* Update documentation for MySQL driver

[20090615:2355] sbajic: SQLite3 driver, purge script and documentation update

Change the SQLite3 driver to be closer to the changes recently done in the
MySQL and PostgreSQL driver. The changes for the SQLite3 driver include:
* Set all statistical counters to be unsigned
* Fix memory leaks
* Reduce memory footprint
* Fix issues with escaped SQL queries
* Add more debug output (only if debug is enabled)
* Micro speed and memory consumption improvements by removing unneeded
  whitespace from SQL queries.
* Update SQLite3 purge script
* Update documentation for SQLite driver

The database schema has not changed. Users of pre-3.9.0 versions can just use
the new driver without the need to run migration commands on the database.

[20090611:0140] sbajic: Reduce calls to dspam binary in the WebUI

[20090610:2255] sbajic: Fixing build problems when building on uClibc [Bug ID #2803122]

* Fixed use of inet_ntoa_r on uClibc

[20090606:1700] sbajic: Fixing potential free of null pointer in pwent cache

[20090606:1043] sbajic: Fixing build problems when building against SQLite3 3.6 series

* Fixed issues with m4 macro when building against SQLite 3.6.x [Bug ID #2774657]

[20090605:0925] sbajic: Fixing access of deallocated variables in hash driver

* Fixed using of deallocated variable in _ds_get_nextuser

[20090604:1515] sbajic: Fixing memory leaks and using of deallocated variables in SQLite/SQLite3 driver

* Fixed memory leak in _ds_set_signature (SQLite3 driver)
* Fixed memory leak in _ds_get_nextsignature (SQLite driver)
* Fixed using of deallocated variable in _ds_get_nextuser (SQLite3 and SQLite driver)

[20090602:1329] sbajic: Fixing primary dspam processing agent

* Properly terminating primary dspam processing agent when called in client mode

[20090602:0244] sbajic: Fixing primary dspam processing agent

* Properly terminating primary dspam processing agent
* Reducing still reachable memory to be on par with the lightweight client-only call

[20090602:0133] sbajic: MySQL driver changes

* Fixing loop in MySQL driver introduced by latest MySQL driver patch

[20090602:0124] sbajic: Fixing compiler warnings

[20090602:0043] sbajic: PostgreSQL and MySQL driver changes

Changing the PostgreSQL and MySQL driver source to be more congruent with each
other. 

[20090601:1610] sbajic: PostgreSQL driver and schema updates

Finally things like 'dspam_admin change preference myuser "trainingMode" "toe"'
work without writing garbage in the PostgreSQL table if the "myuser" did not
exist before calling dspam_admin. The changes in the PostgreSQL driver push
the driver to be on par with the recent changes done in the DSPAM MySQL driver.
The changes for the PostgreSQL driver include:
* Set all statistical counters to be unsigned
* Fix memory leaks
* Reduce memory footprint
* Fix issues with escaped SQL queries
* Add more debug output (only if debug is enabled)
* Micro speed and memory consumption improvements by removing unneeded
  whitespace from SQL queries.
* Update PostgreSQL database schema (extending the number of users DSPAM can
  handle with the PostgreSQL driver from 32767 to 2147483647 users)
* Update PostgreSQL purge scripts

The documentation for upgrading to DSPAM 3.9.0 has information on how to update
an existing DSPAM PostgreSQL schema from pre-3.9.0 to the new 3.9.0 schema.

[20090530:1925] sbajic: Enhancing _ds_strip_html

* Adding more tags to be stripped/handled in _ds_strip_html

[20090530:1210] sbajic: Fixes for bug id #2796390

* Adding function _ds_strip_html to strip html tags and decode a bunch of html encoded characters
* Replacing stripping of html in _ds_degenerate_message with call to new function
* Speeding up and enhancing html message processing

[20090530:1020] sbajic: Fixes for bug id #2796390

* Adding function _ds_decode_hex8bit
* Replacing decoding of hexadecimal 8-bit in _ds_degenerate_message with call to new function
* Speeding up hexadecimal 8-bit decoding

[20090530:0145] sbajic: Fixes for bug id #2796390

* Speeding up and fixing function _ds_decode_quoted()
* Adding function _ds_hex2dec()

[20090529:2350] sbajic: MySQL driver updates

* Updating _ds_setall_spamrecords to honor MySQL's max_allowed_packet for the insert query.
* Updating _ds_setall_spamrecords to honor MySQL's max_allowed_packet for the update query.
* Micro speed and memory consumption improvements by removing unneeded whitespace from SQL queries.

[20090525:1240] sbajic: Closing bug id #2796340

Applying patch submitted by Andreas Schneider: Don't exit() if malloc fails.

When a library function calls exit(), it prevents the calling program from
handling the error, reporting it to the user, closing files properly, and
cleaning up any state that the program has. It is preferred for the
library to return an actual error code and let the calling program decide
how to handle the situation.
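
The principle can be sketched in Python as follows. The names lib_allocate(),
main_program() and the EFAILURE value are hypothetical and chosen only for
this example; a size limit stands in for a failing malloc().

```python
EFAILURE = -5  # hypothetical error code used only in this sketch


def lib_allocate(size: int, limit: int = 1 << 20):
    """Library-style function: report failure to the caller, never exit."""
    if size > limit:  # stands in for malloc() returning NULL
        return EFAILURE, None
    return 0, bytearray(size)


def main_program(size: int) -> str:
    """The calling program decides how to handle the error."""
    rc, buf = lib_allocate(size)
    if rc != 0:
        # Here the program can log, close files and clean up its state.
        return f"allocation failed (rc={rc})"
    return f"allocated {len(buf)} bytes"
```

Because the library only returns a code, the caller keeps control over
reporting and cleanup.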

[20090525:1215] sbajic: Updating MySQL related items

* Purge scripts
* Schema
* Documentation regarding upgrade to 3.9.0

[20090525:1110] sbajic: Speeding up sbph/osb

Slightly speed up algorithm for calculating complexity and algorithm
for calculating sparse.

[20090525:0240] sbajic: Speeding up sbph/osb

Using fast exponentiation (reducing existing algorithm complexity from
O(n) to O(log n)) for sparse token and bitpattern generation.
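
For reference, fast exponentiation (exponentiation by squaring) looks like this
in a minimal Python sketch. fast_pow() is a hypothetical name; the DSPAM
tokenizer code itself is written in C and is not reproduced here.

```python
def fast_pow(base: int, exponent: int) -> int:
    """Exponentiation by squaring: O(log exponent) multiplications."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1
    while exponent:
        if exponent & 1:      # lowest bit set: multiply this power in
            result *= base
        base *= base          # square for the next bit
        exponent >>= 1
    return result
```

A naive loop multiplies once per unit of the exponent, while this version does
one squaring per bit of the exponent.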

[20090524:2315] sbajic: Reverting back Markovian weight calculation

Again use a bunch of "if" statements instead of calculating the Markovian
weight. It turns out that using "if" statements is way faster than
calling the pow() function. Reordered the "if" statements in order to
speed up weight calculation.

[20090523:2240] sbajic: MySQL driver fixes

Fixing typo in last MySQL patch

[20090523:1940] sbajic: Improper client exit

Fix improper exit in client code.

[20090523:1920] sbajic: Gentoo patch 30 (b.g.o id #231175)

Cut the domain (including the at sign) from recipients.

[20090523:1535] sbajic: Simplify Markovian weight calculation

Calculate Markovian weight instead of using a bunch of if conditions

[20090523:1530] sbajic: Improve buffer management

Speedup/Improve buffer management

[20090523:1440] sbajic: MySQL driver changes

* Setting all statistical counters to be unsigned
* Fixing memory leaks
* Enhancing speed for MySQL (especially for >= 4.1)
* Honor MySQL 'max_allowed_packet' setting
* Reduce memory footprint
* Updating MySQL database schema

[20090523:0400] sbajic: MySQL driver changes

Adding more debug output (in case of errors) for the MySQL driver

[20090523:0300] sbajic: MySQL driver changes

Added support for MySQL >= 5.0.3 client libraries

[20090523:0200] sbajic: Closing Bug-ID: 2527286 and 2527289

* Added ability to switch RBLInoculate as a user preference.
* Added ability to ignore RBLLookups as a user preference.
See ignoreRBLLookups and RBLLookups in README

[20090523:0000] sbajic: Code cleanup / fixing memory leaks in libdspam

* Fixing some memory leaks in libdspam.c
* Cleaning up code in libdspam.c
* Total Innocent and Total Spam set to be unsigned long instead of long
* Fixing calculation for robinson to use unsigned long instead of signed int
  for spam_hits + innocent_hits
* Adding more verbose output in case of CTX errors

[20090523:0000] sbajic: Fixing libdspam objects and structs

Setting the 'length' of the DSPAM signature struct to be unsigned long. There
is no need to have that value signed long.

[20090523:0000] sbajic: Fixing memory leaks / enhancing stats output

* Fixing a bunch of memory leaks in dspam.c
* Fixing a bunch of memory leaks in tokenizer.c
* Adding PPV (Positive predictive value) to the output of dspam_stats
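
The positive predictive value is the share of messages flagged as spam that
really were spam. A minimal Python sketch of the formula follows; the function
name and the handling of the empty case are assumptions of this example, not
taken from dspam_stats.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP)."""
    flagged = true_positives + false_positives
    if flagged == 0:
        return 0.0  # assumption: report 0 when nothing was flagged at all
    return true_positives / flagged
```

For example, 90 correctly caught spam messages and 10 false positives give a
PPV of 0.9.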

[20090522:2130] steeeeeveee: Closing Bug-ID: 2692425

Blacklist if the RBL lookup returns an address in the 127.0.0.0/8 network
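
The check can be sketched in Python with the standard ipaddress module. The
function name rbl_answer_is_listed() is hypothetical, and the DNS query itself
is left out; only the test on the returned address is shown.

```python
import ipaddress

# RBL answers inside this network are treated as "listed".
RBL_LISTED_NET = ipaddress.ip_network("127.0.0.0/8")


def rbl_answer_is_listed(answer: str) -> bool:
    """Return True if the RBL lookup answer lies in 127.0.0.0/8."""
    try:
        return ipaddress.ip_address(answer) in RBL_LISTED_NET
    except ValueError:
        return False  # not an IP address at all
```

Testing the whole /8 rather than a single address covers lists that use
different 127.x.y.z return codes.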

[20090522:2100] steeeeeveee: Fixing output of syntax/switches for DSPAM

* Added undocumented (but available) switches to DSPAM agent syntax output:
    --debug
    --mail-from=
    --rcpt-to
    --signature=
    --help
    --version

* Added undocumented (but available) mode "unlearn" into the syntax output

* Only "no" (noise), "wh" (whitelist) and "tb" (training buffer) are supported
  features. The features "ch" (chain/chained) and "sbph" (Sparse Binary Polynomial
  Hashing) were moved to tokenizers long ago and are no longer available
  as features.

[20090522:2030] steeeeeveee: Fixing memory leaks

Fixing a memory leak condition in libdspam.c

[20080507:1500] mjohnson: Added external_lookup.* source files

These files were missed in the previous round of CVS updates. 

[20080503:1400] mjohnson: Dspam train with MBOX files

Submitted by Vadim Zeitlin. Allows dspam_train to work with both maildir-like
directories and also MBOX folders. 

[20080503:1400] mjohnson: Dspam dump fixes

Submitted by Vadim Zeitlin. Allows dspam_dump to be used by normal
(non-trusted) users for their own usernames, while trusted users can use this
for other usernames as well. Also updated the manpage to more correctly
reflect the current usage syntax.

[20080503:1400] mjohnson: Dspam train fixes

Submitted by Vadim Zeitlin. Proper string comparison of spam_dir and ham_dir
using 'ne' instead of '!='. Fixed disparity between operation and man page for
username parameter.

[20080503:1100] mjohnson: MySQL driver bug fixes

Corrects error handling of name logging for both user and group names.

[20080503:1100] mjohnson: External user lookup

Submitted by Hugo Monteiro. Allows dspam to look up a username for the dspam
database in an external source. This extends the existing
ldap_client interface, and also provides a generic interface for scripted
programs. More detail and the latest versions. This allows the administrator 
to define custom usernames to be used, not only system usernames or email addresses. 

This gives the possibility to change a user's username and/or email address
without having to worry about losing or migrating the user's DSPAM data. It
also mitigates the problem of user address and email alias matching in mail 
gateway/anti-spam appliance types of setup.

[20080502:1600] mjohnson: WebUI changes

Submitted by Kyle Johnson. Number of changes including:
* Moved the Version listing to underneath the dspam logo.
* Completely removed the #footer.
* Fixed the navigation list to display correctly (correct use of css (floats and clears)).
* Moved the history page numbers ([ 1  2  3  4  5  6  7  8  > ]) into their own thingy ($HISTORYPAGES$).
* Moved the "Sort by:" links in the Quarantine tab into the table headers ($SORT_QUARANTINE$).

[20080502:1600] mjohnson: Post quarantine method

Submitted by Scott Worley. Changes from GET to POST method.

[20080502:1600] mjohnson: Untrusted users statistics

Submitted by Vadim Zeitlin. Untrusted users can see their own statistics.

[20080502:1600] mjohnson: MySQL UIDinSignature option

Submitted by Xavier De Cock. Looks for the userid in sig first, then tries the
username.

[20080502:1530] mjohnson: Default username for webui patch

Submitted by Hugo Monteiro. Uses 'default' username for default preferences.

[20080310:0556] steveb: Allow daemon to listen to specific TCP host

User-configurable preference ServerHost added (set to '127.0.0.1') to allow using
specific TCP host when running in daemon mode. Not specifying ServerHost will
bind to all available interfaces when running in daemon mode and using TCP sockets.

[20080202:1330] mjohnson: PostgreSQL patch 8+ performance patch

Submitted by Kenneth Marshall. 

[20080201:1300] mjohnson: shift-click multiple row-selection

Submitted via web interface. Allows users to select multiple rows by clicking
on the initial row, holding shift, and clicking on the final row.

[20080201:1300] mjohnson: Select 200 patch

Submitted via web interface. Adds a "select 200" button to the quarantine page. 

[20080201:1300] mjohnson: configure.ac fix

Submitted by Niki Guldbrand. Removed some junk from a previous merge. 

[20071220:1800] mjohnson: Feature request #23 - dspam_notify

Submitted by Kyle Johnson. Also added Daily Quarantine Summary option. 

[20071220:1730] mjohnson: Gentoo bug 201656

Submitted by Alin Nastac. Hash driver work. 

[20071213:1300] mjohnson: Advertise 8-bit MIME support 

Submitted by Aleksander Kemanik.

[20071213:1300] mjohnson: Write checking for hash driver 

Submitted by Boguslaw Juza - Uses ERR_IO_FILE_WRITE instead of
ERR_IO_FILE_OPEN, improved error handling

[20071213:1600] mjohnson: Lock timeout

Submitted by Boguslaw Juza - Now uses signal.h

[20071213:1300] mjohnson: CSS clean copy header 

Submitted by Boguslaw Juza - Copies the older header into the new header for
the hash driver

[20071213:1300] mjohnson: Hash time stamp

Submitted by Boguslaw Juza - Adds a README and (heavy) optional flag to hash
driver cssclean

[20071213:1300] mjohnson: Feature Request 22

Submitted by Kyle Johnson - Add version number to bottom of web interface

[20071213:1300] mjohnson: Gentoo patch 29

Submitted by Steve - Debug fix for BNR

[20071213:1300] mjohnson: Gentoo patch 28

Submitted by Alin Nastac - Fixes to the hash driver

[20071213:1300] mjohnson: fix-unused-variables warnings

Submitted by Steve to fix the warnings generated by the make-daemon-quiet
patch.

[20071206:1530] mjohnson: Gentoo patch 26 

Relaxed group member matching

[20071206:1530] mjohnson: Gentoo patch 25 

Improved error handling on shutdown

[20071206:1530] mjohnson: Gentoo patch 24 

Ensure notification paths exist

[20071206:1530] mjohnson: Gentoo patch 23 

MySQL reconnect patch

[20071206:1500] mjohnson: Gentoo patch 21 

On client sending errors, send partial message with \r\n instead of \n

[20071206:1500] mjohnson: Gentoo patch 20 

Training skips broken tests and signatures, rather than exiting

[20071206:1500] mjohnson: Gentoo patch 19

Look for the last @ symbol in addresses (not the first)
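
Splitting at the last @ matters when the local part itself contains an @. A
minimal Python sketch of the idea follows; split_address() is a hypothetical
helper, not a function from the DSPAM sources.

```python
def split_address(address: str):
    """Split an address at the LAST '@' and return (local_part, domain)."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address, ""  # no '@' at all: everything is the local part
    return local, domain
```

Splitting at the first @ would turn '"odd@local"@example.com' into the domain
'local"@example.com', which is wrong.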

[20071206:1500] mjohnson: Gentoo patch 18

Optimize database tables with dspam tokens and signature data

[20071206:1500] mjohnson: Gentoo patch 17

Deprecated LDAP

[20071206:1500] mjohnson: Gentoo patch 16

Pkglibdir in Makefile.am

[20071206:1500] mjohnson: Gentoo patch 15

Hash driver and cssclean.cgi

[20071206:1500] mjohnson: Gentoo patch 14 

Improved line-ending for CGI scripts

[20071206:1500] mjohnson: Updated copyright in configure.ac 

NodalCore(r) and Sensory Networks

[20071206:1500] mjohnson: Gentoo patch 13 

Improve config autodetection in configure.ac

[20071206:1330] mjohnson: Gentoo patch 12 

Take the configuration directory value from sysconfdir

[20071206:1330] mjohnson: Gentoo patch 11 

Preserve uid, gid, and permissions on the logfile during rotation

[20071206:1330] mjohnson: Gentoo patch 10 

Updates the storage driver location and trust setting in dspam.conf.in.

[20071206:1300] mjohnson: Gentoo patch 9

Enable domain quarantine.

[20071206:1300] mjohnson: Gentoo patch 7

Set the default pid file to be /var/run/dspam/dspam.pid if not specified.

[20071206:1300] mjohnson: Gentoo patch 6

Fix quotes in PGSQL driver.

[20071206:1300] mjohnson: Gentoo patch 5

Print dspam startup message to stdout instead of stderr

[20071206:1100] mjohnson: Gentoo patch 4

Silences dspam daemon startup

[20071206:1100] mjohnson: Gentoo patch 3

Fixed warnings for MySQL and PGSQL.

[20071206:1100] mjohnson: Gentoo patch 2

Link dynamically instead of statically.

[20071206:1100] mjohnson: Gentoo patch 1

Moved manpages into the right sections.

[20071206:1100] mjohnson: Manpages

Updated copyright to 2007 and changed website to dspam.nuclearelephant.com.

[20071205:1600] mjohnson: FilteringHistory

Martin Mares posted on 05-25-07. Feature Request #34. Filtering now all | spam
| innocent | whitelisted

[20071205:1600] mjohnson: 8-bit headers patch

Martin Mares posted on 05-25-07. All characters in X-DSPAM-Factors headers except for obviously safe
regions are encoded using %xx syntax.
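
The %xx encoding can be sketched in Python as below. Which characters count as
"obviously safe" is not stated in this entry, so the SAFE set here is purely an
assumption of the example, as is the function name encode_header_value().

```python
import string

# Assumption: letters, digits and a little punctuation are "safe".
SAFE = set((string.ascii_letters + string.digits + " .,_-+*#/:").encode("ascii"))


def encode_header_value(raw: bytes) -> str:
    """Encode every byte outside SAFE as %xx (two lowercase hex digits)."""
    out = []
    for byte in raw:
        if byte in SAFE:
            out.append(chr(byte))
        else:
            out.append("%%%02x" % byte)
    return "".join(out)
```

The percent sign itself is not in the safe set, so an encoded value can be
decoded again without ambiguity.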

[20071205:1600] mjohnson: Virus not spam patch

Julien Valroff posted on 11-29-07 for 3.6.8. Updated. 

[20071204:1200] mjohnson: dspam.cgi 'Retrain Checked' bug for false positives

Remi Broemeling discovered on 10-26-07. Combination of
checkboxes and 'Retrain Checked' button fails for messages marked Spam that
should be Innocent. 

[20071204:1330] mjohnson: process_parseto bug

Doug Hardie discovered on 06-09-07. Incorrectly parses forwarded messages with
lines like 'To: spam-trash'. 

[20071204:1530] mjohnson: libdspam linking

Andreas Schneider posted on 10-22-07. Links dspam modules against libdspam
for compiling with shared storage drivers.

[20071204:1530] mjohnson: read_config support for libdspam

Andreas Schneider posted on 10-22-07. 

[20071204:1530] mjohnson: No such user bug

Andreas Schneider posted on 10-23-07. dspamc wasn't declaring needed entity
variables (name, uid) for the agent.

[20071204:1600] mjohnson: Enables '+' in addresses, ignores case

Benjamin Donnachie posted on 05-30-07. 

[20071204:1600] mjohnson: Improved random seed initialization

Martin Mares posted on 05-25-07.


Version 3.8.0
-------------

[20061210.1435] jonz: fixed message corruption problems with direct delivery

when using direct delivery (e.g. DeliveryHost), certain servers require a
linefeed after carriage return otherwise the message will become malformatted.

[20060818.0700] jonz: added msg tagging support

added ability to add tagline to messages based on their classification; see
tagSpam and tagNonspam preferences in README

[20060607.1200] jonz: removed deprecated oracle driver

removed outdated oracle driver; no maintainer, lack of interest

[20060606.0000] jonz: added ldap client to build

added ldap client headers to makefile, would not build on some systems

[20060601.0500] jonz: fix for dynamic storage drivers api

fixed _ds_pref_del call to storage library

[20060601.0300] jonz: webui history fix for 12:00 noon

bugfix to display 12 noon as 12p, not 12a
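
The web UI is a Perl CGI, so the following C function is only an
illustration of the rule, with a made-up name: hour 0 is "12a", hour 12 is
"12p", and the a/p suffix is decided by the 24-hour value rather than the
12-hour one.

```c
#include <stdio.h>

/* Format an hour in 0..23 as a short 12-hour label such as "12a" or "3p". */
void hour_label(int hour24, char *out, size_t outlen)
{
    int h12 = hour24 % 12;

    if (h12 == 0)
        h12 = 12;                       /* midnight and noon both show 12 */
    snprintf(out, outlen, "%d%c", h12, hour24 < 12 ? 'a' : 'p');
}
```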

[20060530.0145] jonz: added connect check for pgsql

added a connection check for pgsql, to reconnect on failure in daemon mode

[20060530.0130] jonz: added logging of viruses

added logging of viruses (and the source) to agent

[20060527.1700] jonz: added HashPctIncrease option in dspam.conf

HashPctIncrease: Increase the next extent size by n% from the size of the
last extent. The default behavior, when HashPctIncrease is not used, is to
always use HashExtentSize with no increase. This is useful in accommodating
systems where the default HashExtentSize can be too small for certain
high-volume users. 
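
The growth rule can be read as the small calculation below. This is a sketch
under assumptions: the function name is hypothetical, sizes are counted in
records, and pct_increase == 0 stands in for "HashPctIncrease not set".

```c
/* Hypothetical helper: size of the next extent, in records.
 * pct_increase == 0 models the default (HashPctIncrease unset), in which
 * case every extent is simply hash_extent_size. */
unsigned long next_extent_size(unsigned long hash_extent_size,
                               unsigned long last_extent_size,
                               unsigned int pct_increase)
{
    if (pct_increase == 0)
        return hash_extent_size;
    /* grow by pct_increase percent of the previous extent */
    return last_extent_size + (last_extent_size * pct_increase) / 100;
}
```

With a 10% increase, extents of 1000 records would grow to 1100, then 1210,
so heavy users need fewer extents over time.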

[20060527.1530] jonz: cache runtime user information

added caching of runtime user information, so this information is not polled
every message when running in daemon mode. also eliminates the need for
getpwuid_r when running in daemon mode (unless using mysql or pgsql), 
which some operating systems do not have.

[20060527.1530] jonz: moved TIME_ME into DEBUG

when debug is active, TIME_ME automatically runs, reporting processing time
to debug

[20060526.1900] jonz: fix for library TIME_ME measurements

fixed bug where negative processing times were reported using TIME_ME

[20060526.1600] jonz: turned off locking when not using syslog or logging

no need to lock on LOG() when not logging

[20060526.0230] jonz: rewrite for hash_drv offset caching

rewrote offset caching in hash_drv; fixed some bugs which may have caused
a crash on extent addition

[20060525.1100] jonz: fix for segfault on undefined DeliveryHost or ClientHost

fix for segfault in daemon mode when DeliveryHost or ClientHost is not 
specified

[20060524.0300] jonz: added --client support for dspam_train

use --client after username

[20060523.0300] jonz: more code optimizations

various optimizations to:
- tokenizer core
- hash_drv driver (store offset for writes)
- libdspam (preference lookups)
- optimizations for osb/sbph

[20060522.0300] jonz: added ProcessorURLContext

ProcessorURLContext creates Url* context-specific tokens for URLs; this is
the default in previous (and current) versions

[20060522.0300] jonz: optimized osb/sbph tokenizer

replaced several strlcat's with simple len counting to eliminate thousands of
unnecessary calls to strlen() and speed up osb/sbph tokenization process
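
The general technique looks like the sketch below: the caller keeps the
current length of the buffer, so each append is a memcpy() at a known offset
instead of a rescan from the start. The function name and signature are
illustrative, not taken from the tokenizer.

```c
#include <string.h>

/* Append src (of known length srclen) to buf, whose current length is *len.
 * Tracking *len avoids rescanning buf on every append, which is what
 * strlcat()-style calls have to do. Returns 0 on success, -1 if the result
 * (including the terminating NUL) would not fit in bufsize bytes. */
int append_tracked(char *buf, size_t bufsize, size_t *len,
                   const char *src, size_t srclen)
{
    if (*len >= bufsize || srclen >= bufsize - *len)
        return -1;
    memcpy(buf + *len, src, srclen);
    *len += srclen;
    buf[*len] = '\0';
    return 0;
}
```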

[20060519.0300] jonz: fix for segfault in vsyslog()

fix segfault caused by bad use of va_args when vsyslog is called
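
The entry does not say what the misuse was. A common mistake of this kind is
passing the same va_list to two consumers; the sketch below shows the safe
pattern with va_copy(), using two vsnprintf() calls as stand-ins for "write
to the log" and "pass to vsyslog()". The function name is hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Format the same message into two buffers. Each consumer gets its own
 * va_list: after a v*printf-style call, the list it was given is
 * indeterminate and must not be reused. */
void format_twice(char *a, size_t alen, char *b, size_t blen,
                  const char *fmt, ...)
{
    va_list ap, ap2;

    va_start(ap, fmt);
    va_copy(ap2, ap);            /* must be taken before ap is consumed */
    vsnprintf(a, alen, fmt, ap);
    vsnprintf(b, blen, fmt, ap2);
    va_end(ap2);
    va_end(ap);
}
```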

[20060519.0130] jonz: fix for segfault in dlopen() failure

fixed bug causing segfault when dlopen() to storage driver library fails.
dspam still won't work any better if dlopen is failing but huzzah.

[20060519.0100] jonz: fix for performance template / local domain

added fix to display correct local domain in performance template, and only
display local domain if the username doesn't include an @ sign

[20060517.0700] jonz: fix for preference delete

fixed infinite loop on all non-preference-extension calls to delete a preference

[20060516.0200] jonz: changed SupressWebStats

SupressWebStats is now WebStats in dspam.conf, and setting is inverted.

[20060516.0200] jonz: fix for agent flags

discovered that agent flags required a 64-bit variable to hold all flags, but
only 32-bit variable was being used; this may have caused unpredictable
behavior when using SBPH, "unlearning" a message, or processing summaries.
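
The effect of a too-narrow flag variable can be reproduced in isolation. The
flag names and bit positions below are invented for the demonstration and are
not dspam's real agent flags.

```c
#include <stdint.h>

/* Hypothetical flag values: one below and one above bit 31. */
#define FLAG_LOW   (UINT64_C(1) << 3)
#define FLAG_HIGH  (UINT64_C(1) << 33)

/* Storing flags in a 32-bit variable silently drops every bit above 31. */
int high_flag_survives_32(void)
{
    uint32_t flags = (uint32_t)(FLAG_LOW | FLAG_HIGH);
    return (flags & FLAG_HIGH) != 0;
}

/* A 64-bit variable keeps the high flag intact. */
int high_flag_survives_64(void)
{
    uint64_t flags = FLAG_LOW | FLAG_HIGH;
    return (flags & FLAG_HIGH) != 0;
}
```

Any feature whose flag lives above bit 31 simply reads as "off" in the
32-bit case, which matches the unpredictable behavior described above.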

[20060516.0200] jonz: added OSB tokenizer

osb (orthogonal sparse bigram) is similar to sbph, however only bigrams are
used to form sparse tokens; this uses far fewer resources than sbph with
very similar results
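
A rough sketch of how orthogonal sparse bigrams are formed, as the technique
is commonly described: each token is paired with each of the next few tokens
and the distance between them is kept. The window size of 5, the "+" joiner
and the "#" skip marker are assumptions here, and the function is not
dspam's tokenizer.

```c
#include <stdio.h>
#include <stddef.h>

#define OSB_WINDOW 5   /* assumed window size */

/* Pair each token with each of the following OSB_WINDOW - 1 tokens.
 * When out is not NULL, one bigram is written per line as
 * "first+#+...+second", with one '#' per skipped token.
 * Returns the number of bigrams produced. */
int osb_bigrams(const char *const *tok, size_t ntok, FILE *out)
{
    int count = 0;

    for (size_t i = 0; i < ntok; i++) {
        for (size_t dist = 1; dist < OSB_WINDOW && i + dist < ntok; dist++) {
            if (out != NULL) {
                fputs(tok[i], out);
                for (size_t skip = 1; skip < dist; skip++)
                    fputs("+#", out);
                fprintf(out, "+%s\n", tok[i + dist]);
            }
            count++;
        }
    }
    return count;
}
```

Each token yields at most four bigrams under these assumptions, which is
where the saving over sbph (which forms every sparse combination inside the
window) comes from.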

[20060516.0200] jonz: interface change: added tokenizer variable

added tokenizer variable to DSPAM_CTX and added following tokenizer flags:

      DSZ_WORD                Use WORD (uniGram) tokenizer
      DSZ_CHAIN               Use CHAIN (biGram) tokenizer
      DSZ_SBPH                Use SBPH (Sparse BP Hashing) tokenizer
      DSZ_OSB                 Use OSB (Orthogonal Sparse biGram)

WARNING: This is an API change and constitutes a new major version. Third
         party applications may fail to compile/run against this.

[20060414.1145] jonz: fix for segfault on log write err

when using --with-logfile, if file cannot be opened, dspam segfaulted

[20060513.1100] jonz: fixed compiler warnings on sqlite drivers

signed-ness warnings, nothing significant

[20060514.0900] jonz: discontinued support for berkeley db

deprecated bdb drivers finally removed from distribution

[20060512.2105] jonz: copyright modifications

reassignment to Jonathan Zdziarski instead of using my corporate face

[20060512.2100] jonz: removed some legacy pieces

- removed dspam_corpus (replaced by newer dspam_train)
- removed dspam_genaliases (replaced by parse-to-headers, virtual users, etc)

[20060512.0100] jonz: segfault fix for UIDInSignature

fixed a critical bug that can cause segfaults when correcting messages using
UIDInSignature options. database handle is refreshed, but new pointer is never
used.

[20060510.0800] jonz: fix to recognize trainPristine "off" in preferences

preference turned "off" should override config turned "on"
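
The intended precedence can be sketched as follows. The function name is
hypothetical, a NULL pref stands for "user has not set the preference", and
the case-insensitive comparison relies on POSIX strcasecmp().

```c
#include <stddef.h>
#include <strings.h>

/* Hypothetical resolver: a user preference, when present, wins over the
 * system-wide setting in either direction. Returns 1 for on, 0 for off. */
int resolve_train_pristine(const char *pref, int config_on)
{
    if (pref == NULL)
        return config_on;
    if (strcasecmp(pref, "on") == 0)
        return 1;
    if (strcasecmp(pref, "off") == 0)
        return 0;            /* explicit "off" overrides config "on" */
    return config_on;        /* unrecognized value: fall back to config */
}
```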

Version 3.6.8
-------------

[20060606.0000] jonz: fixes for pgsql_drv

fixed bugs from last release causing pgsql to fail on connection

[20060606.0000] jonz: added ldap_client headers to build

some operating systems refused to build ldap client due to missing header
in makefile

Version 3.6.7
-------------

[20060602.2300] jonz: fix for UIDInSignature with groups

fixed a bug causing the wrong uid to be written when UIDInSignature is used
in conjunction with groups

[20060530.0145] jonz: added connect check for pgsql

added a connection check for pgsql, to reconnect on failure in daemon mode

[20060530.1100] jonz: fix for incorrect reporting of X-DSPAM-Probability

fixed a bug causing X-DSPAM-Probability to be misreported when using multiple
algorithms

[20060525.1100] jonz: fix for segfault on undefined DeliveryHost or ClientHost

fix for segfault in daemon mode when DeliveryHost or ClientHost is not
specified

[20060519.0300] jonz: fix for segfault in vsyslog()

fix segfault caused by bad use of va_args when vsyslog is called

[20060519.0130] jonz: fix for segfault in dlopen() failure

fixed bug causing segfault when dlopen() to storage driver library fails.
dspam still won't work any better if dlopen is failing but huzzah.

[20060517.0700] jonz: fix for preference delete

fixed infinite loop on all non-preference-extension calls to delete a preference

[20060516.0200] jonz: fix for agent flags

discovered that agent flags required a 64-bit variable to hold all flags, but
only 32-bit variable was being used; this may have caused unpredictable
behavior when using SBPH, "unlearning" a message, or processing summaries.

Version 3.6.6
-------------

[20060513.1100] jonz: fixed compiler warnings on sqlite drivers

signed-ness warnings, nothing significant

[20060514.0900] jonz: discontinued support for berkeley db

deprecated bdb drivers finally removed from distribution

[20060512.2105] jonz: copyright modifications

reassignment to Jonathan Zdziarski instead of using my corporate face

[20060512.2100] jonz: removed some legacy pieces

- removed dspam_corpus (replaced by newer dspam_train)
- removed dspam_genaliases (replaced by parse-to-headers, virtual users, etc)

[20060512.0100] jonz: segfault fix for UIDInSignature

fixed a critical bug that can cause segfaults when correcting messages using
UIDInSignature options. database handle is refreshed, but new pointer is never
used.

[20060510.0800] jonz: fix to recognize trainPristine "off" in preferences

preference turned "off" should override config turned "on"

Version 3.6.5
-------------

[20060421.1645] jonz: do not quarantine when delivering summary

bugfix to prevent quarantining of message when delivering summary

[20060421.1630] jonz: pgsql performance enhancements

improvements to purge scripts and object creation script

[20060419.1300] jonz: admin graph fixes

prevents carriage returns in subjects/fromlines from being written
improves parsing of admin graphs to avoid "last day stackup" scenario 

[20060419.1200] jonz: webui patch

Applied patch submitted by Stefan Huelswitt <s.huelswitt@gmx.de>

Using HTTP redirect to redirect the browser back to the original template
  after the user has executed a link. Doesn't allow the browser to show the URL
  location with the embedded command e.g. retrain=spam. Protects the user from
  accidental re-execute of the command by reloading the page.
Changed DisplayHistory to scan the entire logfile first and only then decide
  which information has to be displayed based on history_page. Avoids wrong
  display due to incomplete information available.
Discard 'resent' messages in history display.
Quote single-quote ' (mostly from subject) in javascript command for fragment
  display. Prevents execution of the command.
Added links to previous/next history page.
Added 'history_page=1' to all templates for consistency.
Apply MAX_COL_LEN to history display as well.
Added DATE_FORMAT to configure.pl to allow customized date format in history
   and quarantine (using strftime).
Added OPTMODE to configure.pl to customize preferences tab for OptIn, OptOut
  or no selectable option.
Touching a mailbox timestamp file every time the user displays the quarantine.
This file can be used in a report_quarantine script.
Removed 'sortby=Rating' from all templates as it renders SORT_DEFAULT useless.
In quarantine, keep selected sort method after processing mails.
In GetPrefs, read default prefs first and overlay them with user prefs.

[20060418.0830] jonz: dspam.cgi to use MAX_COL_LEN

MAX_COL_LEN used for calculating column length in WebUI

[20060418.0830] jonz: dspam_admin patch

corrects the output of "dspam_admin aggr pref"

[20060418.1435] jonz: bugfix for flat preference read

fixed a bug causing writing of flat-file preferences to fail on some systems

[20060418.1435] jonz: fix for segfault on clamav connect error

fixed a bug where certain problems establishing connectivity to clamav can
segfault dspam

[20060412.0900] jonz: fix for segfault on empty username

fixed a bug where a NULL username can sneak in and cause a segfault on strdup

[20060331.0800] jonz: fix for ClamAV

applied patch to fix clamav issues

[20060324.0845] jonz: fragment overwrite bug

fixed a bug where a fragment file is overwritten on retrain

[20060324.0845] jonz: fixed invalid read/segfault

dspam.c:3284

[20060222.0830] jonz: fixed segfault on bad configuration

fixed a segfault which can occur if TrainingMode is not specified in dspam.conf

[20060216.1545] jonz: added syslog and logfile flags

added --disable-syslog function to turn off syslogging
added --with-logfile= function to define a flat file for logging
 
[20060215.1230] jonz: dspam_stats to total all stats displayed with -t

dspam_stats now displays a total of all stats included in the original query
when -t is used

[20060215.1230] jonz: Markovian result used as X-DSPAM-Confidence

X-DSPAM-Confidence is set using markovian result, whenever markovian
pvalues are used. 

[20060215.1200] jonz: bugfix for dspamc and --deliver=summary

fixed a bug causing --deliver=summary to return no output when used in dspamc

[20060215.1200] jonz: support for read/write servers in mysql_drv

added support for separate read/write servers to be used with mysql_drv. see
dspam.conf for more information.

Version 3.6.4
-------------

[20060211.1515] jonz: added index support for dspam_train

added support for training using an index file to define the order of ham/spam
by specifying dspam_train [username] -i [indexname]. format of index file is
"class filename" where class can be spam/nonspam.

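For illustration, an index line in the "class filename" format for
dspam_train could be parsed as below. The function is a hypothetical sketch,
not dspam_train's code, and its 255-character filename limit and lack of
support for spaces in filenames are limits of the sketch only.

```c
#include <stdio.h>
#include <string.h>

/* Parse one index line of the form "class filename".
 * Returns 1 for spam, 0 for nonspam, -1 if the line is not understood.
 * filename receives the path (at most 255 characters in this sketch). */
int parse_index_line(const char *line, char filename[256])
{
    char cls[16];

    if (sscanf(line, "%15s %255s", cls, filename) != 2)
        return -1;
    if (strcmp(cls, "spam") == 0)
        return 1;
    if (strcmp(cls, "nonspam") == 0)
        return 0;
    return -1;
}
```
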
[20060209.1930] jonz: cgi mass retraining patch

applied mass retraining patch submitted by Cove Schneider <cove@wildpackets.com>
 
[20060207.0400] jonz: cgi improvements

- added Undo option to undo retraining
- added support for existing storeFragments option to recall message in history

[20060206.1430] jonz: documented user preference options

documented all available user preferences in 2.5 of README
 
[20060202.1630] jonz: added ClassAlias options

added ClassAlias options to dspam.conf to alias spam/nonspam classes

[20060202.1200] jonz: bugfix for segfault on UIDInSignature with bad UID

fixed a bug which causes a segfault when using UIDInSignature if a bad uid
is specified in the signature
 
[20060131.0830] jonz: bugfix in --classify in client/server mode

fixed a bug causing no output when using --classify in client/server mode

[20060129.0000] jonz: dramatic reduction of token separators

changed token separators in config.h, made noticeable improvement in accuracy
across a few different corpora. old delimiters are still there if we need
to change back.

[20060124.0830] jonz: added dspam_train

a true training and testing mechanism, useful for building pretrained databases
or training a user with their own corpus. also provides a test jig for 
measuring efficiency/accuracy with a corpus over a configuration. 

[20060124.0830] jonz: fixes for dspam_corpus

fixes for dspam_corpus:
- uses default settings for features and training modes, instead of its own
- now requires --spam or --nonspam arguments

[20060124.0830] jonz: removed neural networking functions

experimental, needed a rewrite, no support, and high maintenance

[20060122.1700] jonz: more enhancements to accuracy

- extended range for probabilities from .01/.99 to .0001/.9999
- if a single-corpus token would have a stronger probability with one hit than
none, use the stronger probability
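
The extended range amounts to a wider clamp on each token probability. A
minimal sketch, with a made-up function name:

```c
/* Clamp a token probability to the .0001/.9999 range described above. */
double clamp_probability(double p)
{
    if (p < 0.0001) return 0.0001;
    if (p > 0.9999) return 0.9999;
    return p;
}
```

Compared with the old .01/.99 limits, a strongly one-sided token can now pull
the combined result much further toward 0 or 1.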

[20060121.1835] jonz: code cleanup / performance improvements

cleaned up text preprocessors (decoders and html scrubbers), avoided using
repeated strlen() functions which were consuming around 25% of the total
processing time. renamed _ds_message_block to _ds_message_part (no reason).

[20060120.0500] jonz: packaging problem with preferences-extension support

packagers trying to build all available storage drivers, but using 
preferences-extension support would end up with a bombed build if they included
any drivers that didn't support preferences-extensions. this has been
corrected so that each driver has a stub to the flat-file preferences code,
which will be called if preferences extensions are disabled or unsupported
for that driver. in other words, it should be possible to build all drivers
now with one build, even using preferences-extensions.

Version 3.6.3
-------------

[20060117.0608] jonz: enhancements to accuracy, performance

made several optimizations to enhance accuracy and performance:
- rewrote some routines that were strdup'ing message body repeatedly
- changes to tokenization and probability assignment make a noticeable
  difference in accuracy

[20060113.0400] jonz: change for dspam_stats output

updated dspam_stats "-S" output to use more widely accepted readings:
SHR: Spam Hit Rate (true positive rate)
HSR: Ham Strike Rate (false positive rate)
OCA: Overall Classification Accuracy
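
These three readings follow from the usual true/false positive counts. The
sketch below uses the standard definitions and reports percentages; the
struct and function names are hypothetical, and dspam_stats may round or
present the numbers differently.

```c
/* Rates in percent; 0.0 is reported when a denominator would be zero. */
struct rates { double shr, hsr, oca; };

struct rates classification_rates(unsigned long tp, unsigned long tn,
                                  unsigned long fp, unsigned long fn)
{
    struct rates r = { 0.0, 0.0, 0.0 };
    unsigned long spam = tp + fn;      /* all messages that were spam */
    unsigned long ham = tn + fp;       /* all messages that were innocent */
    unsigned long total = spam + ham;

    if (spam)  r.shr = 100.0 * tp / spam;          /* spam caught */
    if (ham)   r.hsr = 100.0 * fp / ham;           /* ham wrongly caught */
    if (total) r.oca = 100.0 * (tp + tn) / total;  /* everything correct */
    return r;
}
```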

[20060111.0830] jonz: bugfix for commandline agent error

fixed bug causing "no trusted delivery agent configured" error when calling
dspam without an agent configured, but not delivering - or when using
--classify

[20060110.0830] jonz: bugfix for ChangeUserOnParse

fixed minor bug causing ChangeUserOnParse to format incorrectly

[20060110.0830] jonz: patch to support multiple users with logrotate

applied patch by Norman Maurer <nm@byteaction.de> to add large-scale support
to dspam_logrotate

[20051213.1200] jonz: memory leak in bayesian noise reduction

corrected a memory leak generated when using bayesian noise reduction

[20051201.0000] jonz: fix for ldap calls

fix to close connections to ldap after calls
fix to fail database creation on ldap failure

Version 3.6.2
-------------

[20051124.0900] jonz: bugfixes for token value calculations

two bugs in how token values are calculated caused a significant rise in
false positives. this is now fixed, cutting false positives nearly in half. 

[20051123.1905] jonz: fix for get_nexttoken and hash_drv

fixed calloc(0) oddity in get_nexttoken in hash_drv

[20051123.0231] jonz: fix for hash_drv in daemon mode

when hash_drv is used in daemon mode without HashConcurrentUser option, 
segfaults can occur due to a failure to initialize the locking subsystem.

Version 3.6.1
-------------

[20051029.2230] jonz: fix for parsetoheaders

fix for parsetoheaders which could have caused a segfault on malformatted "To"
header

[20051029.2230] jonz: qmail support for tracksources

applied patch contributed by Doug Miller <dnm@prentrom.com> to add support
for parsing qmail headers

[20051025.0130] jonz: added check for strcasestr

for some operating systems that have strcasestr, use the os's version instead
of our own
 
[20051026.0130] jonz: fix for x-dspam-reclassified heading

added fix for x-dspam-reclassified heading, which appears blank after
corrective training

[20051025.0800] jonz: plused-detail to work with domains

plused-detail would previously chop off anything after the +, which presented
a problem for systems using a full email address as a username. this fix
will cause user@domain.com to be used if user+mailbox@domain.com is specified,
and plused-detail support is enabled. 
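
The new behavior can be sketched as an in-place rewrite of the address. The
function name is hypothetical; only a '+' in the local part (before any '@')
is treated as the start of the detail.

```c
#include <string.h>

/* Strip the "+detail" part from the local part of an address, in place.
 * "user+mailbox@domain.com" becomes "user@domain.com";
 * "user+mailbox" (no domain) becomes "user". */
void strip_plus_detail(char *addr)
{
    char *at = strchr(addr, '@');
    char *plus = strchr(addr, '+');

    if (plus == NULL || (at != NULL && plus > at))
        return;                             /* no detail in the local part */
    if (at == NULL)
        *plus = '\0';                       /* no domain: just truncate */
    else
        memmove(plus, at, strlen(at) + 1);  /* keep "@domain", drop detail */
}
```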

[20051025.0800] jonz: fixed 8-byte alignment for hash databases

using hash_drv on 64-bit processors caused crashes due to the structures in
the file not being 8-byte aligned. added cssconvert tool to convert all 3.6.0
databases to aligned format.

[20051022.0945] jonz: added train.pl script

added train.pl script to scripts/ as an example of how to properly train.
updated markovian discrimination documents

[20051020.0800] jonz: added processorBias preference

added preference option to set processorBias

[20051020.0800] jonz: fixed daemon-mode streaming issues

fixed minor bugs causing trailing periods to be outputted after summaries
causing streaming tools to break

[20051020.0800] jonz: fixed document source bug

fixed a bug causing all documents processed with "DataSource document" to
fail

[20051019.0300] jonz: fixed segfault on malformed Content-Type header

fixed a segfault caused by invalid read on malformed Content-Type header

[20051017.0800] jonz: fix for history in dspam.cgi

fixed a typo causing the history to display blank in dspam.cgi

Version 3.6.0
-------------

[20051016.1930] jonz: automatic whitelisting now trusted sender system

Instead of senders having to send you zero spam in order to be whitelisted, 
they will have to send you less than one spam for every fifteen legitimate 
messages. 
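
One possible reading of this rule as a predicate is shown below. The function
name is hypothetical, and any minimum message count or other conditions the
real whitelisting code applies are not modeled.

```c
/* One reading of the rule: a sender stays trusted while it has sent fewer
 * than one spam per fifteen legitimate messages. A sender with no
 * legitimate messages on record is never trusted in this sketch. */
int sender_is_trusted(unsigned long innocent_hits, unsigned long spam_hits)
{
    return innocent_hits > 0 && spam_hits * 15 < innocent_hits;
}
```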

[20051013.0820] jonz: fix for header truncation bug

fixed condition where headers could be truncated if > 4k in length

[20051002.2100] jonz: dynamic storage driver support for linux and freebsd

added fix for linux and freebsd to work with dynamic storage drivers; added
-rdynamic to LDFLAGS

[20051001.1100] jonz: fixed dspam_merge tool

fixed (rewrote) dspam_merge tool to work correctly.

[20050930.0300] jonz: fixed spurious tabs in user log

removed tabs from subject and sender in user log to avoid cgi malformatting

[20050930.0300] jonz: added hash autoextend, csscompress tool

added hash autoextend options to make hashes automatically grow.
added csscompress tool to compress extents.
made hash_drv default driver.

[20050929.0700] jonz: added scan for dlopen/-ldl

added a check to see if -ldl is needed to use dlopen/dlsym/etc

[20050929.0700] jonz: fixed termination boundary bug

fixed a bug causing a termination boundary to be written at end of html
segment instead of on separate line, under certain conditions

[20050928.2100] jonz: fixed bugs in _ds_pref_set dynamic functions

bug causing segfault, was calling the driver inappropriately

[20050928.2100] jonz: tokenizer tweak: don't ignore digits

stopped ignoring digits to further improve accuracy (old, paranoid code)

Version 3.6.rc3
---------------

[20050928.1800] jonz: removed token reassembly from ngram tokenizer

removed token reassembly (individual letters, and chained letters appear to
be 1% more accurate during training)

[20050928.1800] jonz: added signature to deliver=summary

added signature=[sig] when specifying --deliver=summary

[20050925.2200] jonz: added infinite improbability drive

added infinite improbability drive, ImprobabilityDrive on

[20050924.2040] jonz: removed legacy algorithm switches from configure

removed vintage (old) algorithm switches from autoconf, e.g. 
--enable-chi-square as they are now defined in dspam.conf. developers should
now start using the CTX->algorithms context member and the DSA_ and DSP_
selections

[20050924.2030] jonz: renamed SM/IM in dspam_stats

renamed SM/IM in dspam_stats to TP (true positives) and TN (true negatives)

[20050924.1345] jonz: relicensed; bound to GPL v2

after reading rms' latest responses hinting at the GPL3, which to some
degree mandates feature continuity for commercial use and other limits on
use rather than distribution, I thought I would take the opportunity while 
I have it to bind to the GPL v2. should the GPLv3 turn out to be sane, we can 
always open it back up to license under it.  I have removed the "either 
version 2 or later versions" and replaced it with "version 2". ianal, but
since no later versions existed during the period which dspam was available 
under old terms, it is my understanding that previous versions of dspam may 
similarly _not_ be applied to the GPLv3 as it would create a temporal paradox.
even if this is incorrect, 3.6 will be released specifically under the GPLv2.

[20050923.2100] jonz: renamed css_drv to hash_drv

to avoid confusion, the CRM Sparse Spectra driver is now known as simply the
hash driver, and is configured as hash_drv.

[20050923.0800] jonz: added bias mode for markovian discrimination

added support for ProcessorBias to markovian discrimination calculations.
NOTE: there is a significant change in results, and you may wish to leave bias
      turned off (the way Bill intended it). See these sample results:

Without Bias:

sample          TP:   956 TN:   992 FN:    49 FP:    13 SC:     0 IC:     0

With Bias:

sample          TP:   863 TN:  1004 FN:   140 FP:     3 SC:     8 IC:     0

[20050923.0800] jonz: added ProcessorBias and TestConditionalTraining

added these options to dspam.conf, instead of using as configure options.
be sure to add these to your existing dspam.conf to avoid changes in dspam's
behavior (see UPGRADING).

Version 3.6.rc2
---------------

[20050922.0400] jonz: added MySQLSupressQuote for MySQL 4.1 quoting bug

documented MySQL quoting bug in some versions of 4.1 (see doc/mysql.txt), and
added MySQLSupressQuote option to compensate, or alternatively you could just
upgrade to a better version of MySQL

[20050920.1800] jonz: added persistent mode support for css_drv

NOTE: css_drv has since been renamed to hash_drv

added daemon mode support for css_drv when using CSSConcurrentUser (permanently
mmap's user's database into memory). no need for daemon mode when NOT using this
feature so don't bother, but is very fast if you are using a single global 
css file.

[20050916.0800] jonz: added override for css_rec_max to dspam.conf

CSSRecMax can now be configured in dspam.conf

[20050914.2215] jonz: added DataSource and ProcessorWordOccurence options

if "DataSource document" is used in dspam.conf, all input will be treated as
a message "body" (e.g. a document) rather than split up between headers and
body. this is useful if classifying things other than email.

if "ProcessorWordOccurence" is set to 'occurrence', all word counts are
based on occurrence rather than per-message. this may be useful when
classifying large documents.

Version 3.6.rc1
---------------

[20050914.0700] jonz: applied history paging patch

applied patch contributed by Norman Maurer <nm@byteaction.de> to add paging
to webui history

[20050913.0700] jonz: added css tools

in tools.css_drv:
  cssgrow - grow (or shrink) the capacity of a css file
  cssstat - report css file statistics (records free, used, max, etc)
  cssclean - free all tokens which have only been seen once
    (there is no date in css files, so only run this once every 30-90 days)

[20050910.2145] jonz: css_drv using single .css file

moved counters for both spam and nonspam into single .css file, saving
considerable disk space by removing extra set of keys. also added header
to css files containing record count, for cssgrow tool.

NOTE: new format is incompatible with older development versions

[20050910.1400] jonz: minor dspam_stats output changes

SM and IM changed to FN and FP, respectively

[20050910.1400] jonz: commandline enhancements

added --class=nonspam which is the same as --class=innocent, depending on how
users like to specify

added --deliver=stdout which is a shortcut for --deliver=innocent,spam --stdout

[20050910.1400] jonz: completed markovian discrimination

Completed markovian discrimination algorithms and implemented 'naive' 
combination. Accuracy tests show significant improvement.

[20050830.1500] jonz: RCPT TO and Broken Case bugfix

fixed a bug where Broken Case wasn't working when in LMTP RCPT mode

Version 3.6.beta.2
------------------

[20050827.2130] jonz: dynamic storage driver library support

support for dynamic storage driver library builds (including multiple driver
builds for packagers) is now supported through the existing 
--with-storage-driver function. specifying a single storage driver, such as:

--with-storage-driver=mysql_drv

will build a statically linked storage driver (the same behavior as previous
versions of dspam). specifying multiple drivers, however, will build dynamic
libraries - one of which can then be dynamically loaded at run-time by
setting the StorageDriver parameter in dspam.conf (see the new dspam.conf for
more information). for example:

--with-storage-driver=mysql_drv,pgsql_drv,ora_drv

NOTE: required parameters for all activated drivers must be provided

users wishing to build only one storage driver, but dynamically loaded instead
of statically linked, may supply the same driver name twice to enable 
dynamic loading:

--with-storage-driver=mysql_drv,mysql_drv 

[20050822.2350] jonz: added new X-DSPAM-Result / X-DSPAM-Reclassified values

The following values may now appear under X-DSPAM-Result / X-DSPAM-Reclassified
Spam, Innocent, Whitelisted, Blacklisted, Blocklisted, Virus

This is made possible by the addition of CTX->class (char[32]), whose constants
are made available in language.h. Previous CTX->result options specifying 
DSR_ISWHITELISTED and DSR_ISVIRUS values are now gone, leaving only 
DSR_ISINNOCENT and DSR_ISSPAM for backward-compatibility (whitelisted
messages will be marked as DSR_ISINNOCENT). Future applications should use
CTX->class instead of CTX->result for more specific results info.

[20050822.1000] jonz: added optOutClamAV option to opt out of virus scanning

User-configurable preference optOutClamAV added (set to 'on') to opt out of
A/V, if active.

[20050815.0800] jonz: added custom tags to dspam.conf for virtual uids

added tags to dspam.conf to use custom table/field names for dspam_virtual_uids
allowing you to use a postfix table or some other table for username/uid
lookups.

Version 3.5.3
-------------

[20050803.0700] jonz: changes to pgsql driver for 8.0+

changes to the pgsql driver have been made increasing performance by three
times on versions 8.0+. the driver will auto-detect which version you are
running in and take advantage of a new lookup_tokens function. 

IMPORTANT: If you're upgrading dspam on your pgsql 8.0+ system, you must
create the lookup_tokens function added to the pgsql_objects.sql script or
everything will break miserably.
 
[20050801.0600] jonz: updated postfix documentation

added model using postfix's content filter

[20050801.0600] jonz: fixed bug in mysql driver

fixed a bug causing a segfault in mysql_drv when certain preferences are
null or have null values
  
Version 3.5.2
-------------

[20050714.0715] jonz: added extern "C" for C++ compilers

added extern "C" to libdspam.h for C++ compilers

[20050714.0715] jonz: more work on css_drv

more work on css_drv; functionality added to make dspam_stats work

[20050714.0715] jonz: added WEB_ROOT to CGI

added WEB_ROOT to configure.pl, location to webui's htdocs contents

[20050714.0715] jonz: sqlite 3 purge script changes

added a sqlite 3 specific purge script in tools.sqlite_drv

[20050702.0030] jonz: cgi review and fixes

reviewed cgi, made minor fixes to history display and admin graphs

Version 3.5.1
-------------
  
[20050608.0330] jonz: fixed socket file descriptor leak on delivery failure

fixed a file descriptor leak which occurs on socket delivery connect failure,
leading to the daemon exiting when it has exceeded its file descriptor limit

[20050607.2200] jonz: added preliminary ldap support

added ldap verification function which will verify existence of user on an ldap
server before adding to virtual uid table. dspam links to openldap libraries and
is also compatible with os x's openldap libs. use --enable-ldap to enable. see
dspam.conf for more information.

STATUS:
  functionality is presently limited to lookup of dspam username only. user and
  domain as separate variables will come later. additional ldap mode to support
  actual uid lookups on ldap server will be added later as well.

  authentication also coming later

[20050521.1600] jonz: smtp/lmtp delivery to accept all 2xx delivery responses

DBmail appears to be the only MTA that sends a 215 response to a successful 
delivery, instead of 250. After reviewing RFC, the deliver_socket() code has 
been modified to accept any 2xx message in response to a final delivery 
instruction.
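
A check of this kind can be written against the first four characters of the
reply line. This is an illustrative sketch with a made-up function name, not
the code in deliver_socket().

```c
#include <ctype.h>

/* Return 1 if an SMTP/LMTP reply line starts with a 2xx status code,
 * 0 otherwise. The code must be followed by a space, a '-' (continuation
 * line) or the end of the line. */
int reply_is_success(const char *line)
{
    return line[0] == '2'
        && isdigit((unsigned char)line[1])
        && isdigit((unsigned char)line[2])
        && (line[3] == ' ' || line[3] == '-' || line[3] == '\0'
            || line[3] == '\r' || line[3] == '\n');
}
```

Both "250 2.1.5 Ok" and a 215 reply pass this test, while 4xx and 5xx replies
are still treated as failures.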

[20050521.0730] jonz: improved performance of css_drv

made additional performance improvements to css_drv. plan to finish up driver
later this weekend.

Version 3.5.0
-------------

[20050519.0600] jonz: made preferences case-insensitive

user preferences now case insensitive

[20050512.0400] jonz: added experimental css_drv (CRM114 sparse spectra driver)

added *experimental* css_drv which is a flat-file based driver using Bill
Yerazunis' "sparse spectra" approach to storage. each user will have two 16MB
fixed-size files containing up to a million tokens (this will be adjustable
later). 

[20050511.2045] jonz: added markovian weighting functions

added functions to support markovian weighting (the algorithm used in
crm114). markovian discrimination has been shown to be very effective at 
classifying text, and with the new css_drv driver, performs very fast. 

note: dspam_dump and dspam_clean -p presently do not function with markovian
discrimination, because weights are not stored in the database (they are
computed when a message is processed). dspam_dump will inaccurately show a 
value of 0.5000 for all tokens. do not run dspam_clean -p at all when using
markovian weighting, as the band of interestingness is entirely different
and dspam_clean -p is likely to wipe out /all/ markov-based tokens.

finally, see doc/markov.txt for more information on configuring dspam as a
markovian filter
 
[20050505.0800] jonz: added plused detail support

applied patch submitted by Arnaldo Viegas de Lima <arnaldo@viegasdelima.com>
to add plused-detail support (username+mailbox). 

[20050501.0800] jonz: LMTP error codes to include more descriptive text

LMTP error codes should include relevant text from error where possible

[20050501.0800] jonz: added preliminary support for Clam/AV

added support for clamd virus checking via TCP via ClamAVHost and ClamAVPort
properties (see dspam.conf). added ClamAVResponse to control how DSPAM should
act if a virus is found.

use --enable-clamav to enable

[20050426.2300] jonz: added storeFragments preference

when set to 'on', dspam will store 1k of the message body in user.frag/sig.frag

[20050425.0700] jonz: changed copyright notice

changed copyright notice to reflect my company name change (I am ditching
Network Dweebs Corp. in exchange for Jonathan A. Zdziarski)

[20050422.0700] jonz: got rid of report_error and report_error_printf

all logging now calls LOG(), which writes to syslog, stderr, and the debug log.
LOGDEBUG remains.

updated language.h with more sensible error codes partitioned by service
components.

[20050419.0200] jonz: got rid of retrain.log entirely, reworked logs

reworked the system and user logs to be a little more useful and removed the
retrain.log entirely. it's pretty simple:

1 entry for a message process
1 entry for a retrain using same signature id
1 entry for a delivery error using same signature id

the message id is also stored to detect resends. updated dspam.cgi.

[20050417.1400] jonz: added support for domain-based delivery hosts

you can now configure:

DeliveryHost.domain.com	a.b.c.d

in dspam.conf

[20050415.0700] jonz: added RBLInoculate option

default is now changed to handle RBL spam like all other spam, but when
setting RBLLearnAsSpam to "on", RBL spam will be inoculated

[20050414.1900] jonz: added extended logging to system.log

added recipient and beginnings of a "status" or remarks section for system.log.

[20050413.2005] jonz: added PgSQLUIDInSignature option

applied patch contributed by Alexandre Biancalana to mirror UID support for
postgres

[20050413.2005] jonz: added MySQLUIDInSignature option

for mysql users who want to put the user id in the signature (and effectively
have only one spam address for all users), this feature uses uid,signature
instead of just signature.

[20050412.0800] jonz: fix for cgi during time zone changes

applied patch for cgi losing data during time zone changes

[20050410.2000] jonz: adjusted mysql/pgsql objects/purge scripts for performance

made adjustments based on performance without token counter indexes

[20050409.1400] jonz: added 'date' to quarantine display, reversed ordering

added 'date' field to quarantine display using X-DSPAM-Processed header.
reversed 'Date' ordering to list most recent at top

[20050405.1815] jonz: minor code cleanup

minor cleanup of static -1 values in code, replaced with #define'd values

removed configure help strings and checking notifications for developer-only
features. will phase out in a few versions.

[20050402.1945] jonz: changed domain-based groups to use *@domain

domain-based groups have been changed to require *@domain.tld instead of
@domain.tld in the group file. 

[20050401.0100] jonz: added domain blocklisting

added domain blocklisting via data/user.blocklist file; does not train. useful
for filtering unwanted nonspam. may expand for full support later via cgi.

[20050331.0600] jonz: external default preferences merged

external preferences (default preferences stored in the database or .pref
file) were not previously used when merging system and user preferences. this
now changes behavior so that dspam.conf preferences are only loaded in absence 
of default preferences set in the database.
 
[20050331.0600] jonz: added fallback domain support

fallback domain support is for systems using a user's full email address as
a username, and allows users being processed who have no preferences to
fallback to using @domain.tld as the username (which should be appropriately
trained and set to notrain). this is only useful under limited circumstances,
and would require that all valid system users be given at least one
preference when provisioned. To use:
   1. Add "FallbackDomains on" to dspam.conf
   2. Create wildcard domains (@domain.tld) as users in dspam
   3. Set each wildcard domain with the "fallbackDomain on" preference
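
The lookup described above can be sketched in Python as follows. This is an illustration under assumptions, not DSPAM code: preferences are modeled as a plain dict keyed by username, and resolve_user is a hypothetical name.

```python
def resolve_user(address, prefs, fallback_domains=True):
    """Pick the username whose data should be used for this address.

    prefs maps a username (full email address or '@domain.tld' wildcard)
    to that user's preference dict.
    """
    # A user with at least one preference of their own is used as-is.
    if prefs.get(address):
        return address
    if fallback_domains and "@" in address:
        wildcard = "@" + address.rsplit("@", 1)[1]
        # Only fall back to wildcard users that opted in (step 3 above).
        if prefs.get(wildcard, {}).get("fallbackDomain") == "on":
            return wildcard
    return address
```

This also shows why every valid user needs at least one preference: an address with none is indistinguishable from an unknown user and is routed to the wildcard.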

Version 3.4.9
-------------

[20050808.0000] jonz: upgraded streamlined blacklisting support to RABL

changed configuration options for streamlined blacklisting to support the RABL

[20050714.0700] jonz: add X-DSPAM-User when using localStore

added X-DSPAM-User heading to spam when localStore preference is used, so that
destination address will be right

[20050714.0700] jonz: bugfix for header decoding

fixed a bug where the decoded headers (used internally) were delivered, instead
of the original headers

Version 3.4.8
-------------

[20050612.2200] jonz: fixed null decoding bug

fixed a bug where placing %00 or =00 in an encoded message would decode the NUL character, causing the message to become truncated.
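
A minimal Python sketch of the idea, assuming the decoder simply drops a decoded NUL byte (this entry does not say exactly how the NUL is now handled). decode_qp_without_nul is a hypothetical name, and quopri is Python's standard-library quoted-printable codec, not DSPAM's decoder.

```python
import quopri


def decode_qp_without_nul(data: bytes) -> bytes:
    """Decode quoted-printable data and discard any NUL bytes it yields."""
    decoded = quopri.decodestring(data)
    # A NUL would terminate a C string early, so it must not survive decoding.
    return decoded.replace(b"\x00", b"")
```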

[20050611.1000] jonz: fixed broken managed group functions

fixed many files (such as logs and corpus maker) which were not written to
the correct directory when used with managed groups

[20050610.0530] jonz: message truncation fix

fixed a problem where some mta's would fail to quote a single dot on a line
by itself, leading to message truncation. solution was to check for quoting
inside dspam too.
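
For reference, the quoting in question is the usual SMTP/LMTP "dot-stuffing" of the DATA phase. A short Python sketch of the general technique (hypothetical function names, not the DSPAM implementation):

```python
def dot_stuff(lines):
    """Quote body lines for DATA: prefix any line starting with '.' with another '.'."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines):
    """Reverse dot_stuff on the receiving side by dropping one leading '.'."""
    return [line[1:] if line.startswith(".") else line for line in lines]
```

After stuffing, no body line is a lone ".", so the receiver cannot mistake it for the end-of-data marker and truncate the message.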

[20050606.0420] jonz: fixes for encoding shifts

7bit encoding tag changed to 8bit, due to some 8bit data in decoded messages

[20050606.0420] jonz: fix for multiple-recipient decoding bug

bug addresses situation where messages addressed to multiple recipients may fail
to be completely decoded

Version 3.4.7
-------------

[20050522.2300] jonz: fix for managed groups reclassification

fix for managed groups where delivery of false positives would fail due to the
managed group not being recognized

[20050512.0600] jonz: fix for signature embedding with malformatted boundary

fixed a bug where messages lacking a terminating boundary would fail to
receive a signature in the message body

Version 3.4.6
-------------

[20050503.2000] jonz: fixed obscure malformatted signature crash bug

fixed an obscure bug causing dspam to crash under certain conditions when the
loose signature was provided without the appropriate delimiter

Version 3.4.5
-------------

[20050417.0900] jonz: fixed firstrun/firstspam messages with groups

fixed condition where users in groups would never receive firstrun or firstspam
messages.

IMPORTANT: this fix will cause all users to receive the notifications for the
first time. if you want to avoid this, touch user.firstrun and user.firstspam
files in each user's data directory.

[20050408.0300] jonz: set proper permissions for socket file

set proper permissions (rwxrwxrwx) when creating socket file

[20050419.2200] jonz: applied signature retrieval failure error patch

applied patch to fix some false signature retrievals

[20050415.2300] jonz: LMTP error codes for temporary failures

changed relevant LMTP error codes to report temporary failures instead of
permanent failures, to fix mail loss problem.

for permanent failures, we still send a permanent failure

[20050412.2130] jonz: fix for notspam- aliases and changeuseronparse

corrected parsing error causing notspam- aliases to fail to identify username
when using changeuseronparse

Version 3.4.4
-------------

[20050412.0800] jonz: critical fix for signature-in-body

fixed a critical bug that caused the signature to NOT be written to the
message body on quoted-printable or base64 messages

Version 3.4.3
-------------

[20050407.0800] jonz: fix for LMTP and QuarantineAgent

bugfix for bug causing spam to be delivered via LMTP when a quarantine
agent is specified

[20050406.2300] jonz: fix for domain-scale and admin.cgi

fix for admin.cgi when using domain-scale; could not previously find data dirs

[20050406.0700] jonz: fixed cygwin build errors

fixed cygwin build errors by removing support for ip blacklisting (cygwin
does not have the necessary functions to perform these lookups, at least
nothing standard + mt-safe)

[20050406.0700] jonz: fixed optIn + preferences extension support

fixed a bug where dspam_admin 'ch pref' would fail on nonexistent users,
making it impossible to opt-in users when using preferences extension

[20050404.0700] jonz: applied various cgi patches

applied various cgi patches to fix:
- showFactors preference not setting/showing correctly in admin.cgi
- empty spamSubject raised errors when writing preferences
- global default.prefs file should not be written with pref extensions

[20050402.1000] jonz: changed log status of signature errors

changed logging status of signature-based errors so they show up in the
error and system logs, rather than just debug

[20050402.0900] jonz: fixed system.log formatting bug

fixed a bug causing the system.log to be malformatted when reclassifying a
message

           
Version 3.4.2
-------------

[20050328.0600] jonz: memory leak fix for chained token

fixed a very small memory leak in chained tokens implementation

[20050327.2330] jonz: updates to cgi

added retrain log to keep track of messages retrained, which allows the cgi
to display 'Retrained' in the history next to messages that have already been
retrained. history retraining and quarantine retraining both log this info.

history retraining now forwards false positives and removes them from quarantine

made top and bottom row of buttons symmetric in quarantine view (deliver
checked, delete checked, delete all)

[20050327.2330] jonz: no logs when using --classify

no logs should be written when using --classify

[20050327.2330] jonz: users with signatureLocation=headers should prefer header

users using signatureLocation=headers should prefer the X-DSPAM-Signature
header over any signature in the body (this can apparently cause problems 
when conversing with another dspam user)

[20050327.1400] jonz: fix for history retraining

fix for history based retraining that allows non-admins to use the feature

[20050327.1400] jonz: applied pgsql patch

applied pgsql patch to remove duplicate key warnings and improve performance

[20050326.1730] jonz: added support for non-ascii character sets

added support for non-ascii character sets

[20050325.2345] jonz: fixed avg. processing time reporting

fixed avg. processing time reporting in daemon mode; was previously reporting
time since daemon start

[20050324.0730] jonz: added ChangeUserOnParse 'full' option

setting ChangeUserOnParse to 'full' now uses the entire email address after
the initial {spam,notspam}- identifier allowing the domain to be used. 
using 'on' or 'user' continues to default to just username.
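
The difference between the modes can be sketched in Python. This is an illustration only: user_from_alias is a hypothetical name, and returning None for a non-alias address is an assumption of the sketch.

```python
def user_from_alias(rcpt, mode="on"):
    """Derive the DSPAM username from a spam-/notspam- alias address.

    mode 'full' keeps the whole address after the prefix;
    'on' and 'user' keep only the part before '@'.
    """
    for prefix in ("spam-", "notspam-"):
        if rcpt.startswith(prefix):
            rest = rcpt[len(prefix):]
            return rest if mode == "full" else rest.split("@", 1)[0]
    return None  # not a retraining alias
```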

[20050324.0700] jonz: added long usernames support to dspam stats

if --enable-long-usernames is used, formatting will add additional padding
for username in stats output

[20050322.2330] jonz: fixed deadlocked connection on null characters

fixed a condition where a 300-second deadlock, then timeout occurs when
client sends \0 to server

[20050322.2330] jonz: fixed some minor memory leaks

minor leaks in daemon process 

[20050320.0400] jonz: fixed pgsql_drv connection linger bug

fixed a bug when using pgsql_drv and preferences extension that fails to
close a pgsql database connection on exit, leaving connections lingering.

Version 3.4.1
-------------

[20050320.1700] jonz: fix for exit codes with daemon+stdout

fixed a bug causing Broken returnCodes to fail when using daemon in 
combination with --stdout.

[20050318.1930] jonz: fixed transliteration of ""

fixed transliteration of "" so that empty spaces can be passed to LDA

[20050318.1900] jonz: implemented DSPAMPROCESSMODE service tag

implemented DSPAMPROCESSMODE service tag for proprietary client functions

[20050318.0700] jonz: added RSET functionality

added RSET functionality to daemon 

[20050318.0700] jonz: added localStore preference for storage

added localStore preference to set local storage username, which can override
the DSPAM username (ideal for managing aliases)

[20050318.0700] jonz: fixed bug related to corpusfeeding and signatures

normally you're not supposed to corpus train with email containing dspam
signatures, but that doesn't mean dspam should bomb when you do. fixed a
bug where the signature would cause the process to fail (since the signature
isn't in the database anymore).

[20050318.0700] jonz: fixed bug sending empty parameter via client

fixed a bug where sending an empty parameter via client (for example -a "") 
would break the arg chain, as the empty argument would not be quoted

[20050318.0700] jonz: fixed track sources on report spam bug

fixed a bug causing tracksources to report a spam from the user who forwarded
it to correct the filter

[20050317.0500] jonz: fix for 5.1.0's in daemon mode

fixed condition where 5.1.0 error would occur if TrustedDeliveryAgent was
not specified, even if DeliveryHost was.

fixed condition where 5.1.0 error would be followed by 354 (response to DATA)
instead of failing

[20050316.0700] jonz: fixed crash on long argv list

fixed a crash/potential arg list overflow for authentication DSPAM clients
sending too many arguments.

[20050316.0100] jonz: reclassification failures should not deliver on failure

false positive reclassification failures shouldn't deliver when a signature
isn't found, as a spammer could take advantage of a system using false 
positive aliases and send mail through them.

[20050316.0100] jonz: changed fp- to notspam-

changed fp- to notspam- in parsetoheaders and docs

[20050316.0030] jonz: added admin.cgi opt in/out preferences support

added support for opt in/out to admin.cgi

[20050315.0800] jonz: fixed broken returnCodes when using dspamc

fixed broken return codes when using dspamc, which involved rewriting the
return code reporting pieces of the agent to include a classification. when
running in 'DSPAM' mode, : CLASSIFICATION is sent back with each response.

[20050315.0800] jonz: applied patch to fix file descriptor problems

when listener fails, file descriptor fails to close, causing a possibly
fatal situation in the event of too many failures

fixed invalid file descriptor problem

[20050315.0730] jonz: added LOG_NOWAIT to syslogging

added LOG_NOWAIT to syslogging, to prevent dspam from hanging if syslog is
nonresponsive

[20050313.0700] jonz: fixed bugs handling multiple rcpt to's
 
fixed bugs in the handling of multiple rcpt to's. strange behavior; some email
would get sent to the last recipient only. some email would never be 
processed for a user. fixed a bug where specifying --user without recipients
could cause a crash. fixed it again.

[20050313.0700] jonz: changed lmtp commandline options

--lmtp-rcpt-to and --lmtp-recipient have been changed to --rcpt-to, which
takes a list of users like --user does. For example:

--user spot dick jane --rcpt-to spot@domain.com dick@domain.com jane@domain.com

This allows for each user to be assigned its own unique recipient. If the
recipient is not provided, it will default to the username being processed.
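
A small Python sketch of the pairing rule just described, assuming users and recipients are matched by position (the function name is hypothetical):

```python
def pair_users_with_recipients(users, rcpt_to=()):
    """Pair each --user entry with the --rcpt-to entry at the same position.

    A user without a matching recipient defaults to its own username.
    """
    pairs = []
    for i, user in enumerate(users):
        recipient = rcpt_to[i] if i < len(rcpt_to) else user
        pairs.append((user, recipient))
    return pairs
```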

--lmtp-mail-from has been changed to --mail-from

[20050312.1600] jonz: added SMTP delivery support, changes delivery hosts

added SMTP delivery support via DeliveryProto option in dspam.conf

NOTE: configuration has changed for LMTP host delivery. See new dspam.conf
      * fixed crash bug related to an earlier commit

[20050312.1530] jonz: added virtual_user_aliases.sql for mysql

added virtual_user_aliases.sql, providing an alternative dspam_virtual_users
table example where aliases to a single dspam user can be created. ideal for
large mail systems with automatic provisioning of uids. this does, however,
break dspam's own ability to create new virtual users, so they will need to
be added manually if this aliases approach is used.

[20050312.1035] jonz: added ChangeModeOnParse functionality

added ChangeModeOnParse option to set the proper --class and --source when
ParseToHeaders detects a spam- or fp- address. This allows a system to be
implemented without aliases, as all requests can be passed to dspam.


[20050313.0700] jonz: added \r to socket communications

as per RFC 2822, \r\n should be sent to SMTP/LMTP

Version 3.4.0
-------------

[20050311.0800] jonz: moved supplementary documentation into doc/

moved all supplementary documentation into doc/ including storage driver,
operating system, and mta configuration documentation.

[20050311.0800] jonz: miscellaneous bug fixes

many minor bug fixes

Version 3.4.pr1
---------------
  
[20050308.0800] jonz: fixed header duplication on reclassify

fixed a bug where a new set of headers would be added to the old set upon
reclassification. now, only X-DSPAM-Reclassified header is added. also fixed
duplicate signature being written.

[20050305.0900] jonz: fixed bug in cgi autodetect of extensions

fixed bug causing extensions detection to fail in cgi for users with older
configure.pl scripts lacking AUTODETECT property. also propagated ability to
"not" autodetect to admin.cgi

[20050304.2100] jonz: fixed user duplicate bug in ParseToHeaders+Daemon combo

fixed a bug causing a user to be processed twice when using ParseToHeaders and
server daemon mode

[20050302.0730] jonz: ClientHost to be used for client domain sockets

If connecting to DLMTP client via domain socket, ClientHost is now used as the
path to the socket on the client side, and ServerDomainSocketPath on the 
server side.

[20050302.0730] jonz: minor lmtp protocol tweaks

minor tweaks to LMTP protocol service

[20050302.0730] jonz: fixed a bug with TOE and automatic whitelisting

fixed a bug where the "spam hit" from a false positive wouldn't be reversed
for users using TOE after their training threshold has been exceeded

[20050302.0700] jonz: fixed unknown token value bug

fixed a bug where unknown tokens would be assigned an innocent value rather
than neutral
 
Version 3.4.rc2a
----------------

[20050301.0700] jonz: fix for segfault on malformatted MAIL FROM

fixed a bug causing a segfault when MAIL FROM was malformatted in DLMTP mode

[20050301.0700] jonz: fix for pgsql double-innocent bug

important bug fix for pgsql driver, where new innocent tokens were learned
twice, killing accuracy since RC1

[20050301.0600] jonz: rewrote flat file preference code

rewrote flat file preference code after discovering many bugs with patches
submitted over past few months
 
Version 3.4.rc2
---------------

[20050227.1515] jonz: LMTP enhancements

1.  if --user is specified in ServerParameters when DSPAM is running in 
standard LMTP mode, then RCPT TO will be used only for delivery, and not 
processing.

2.  added 'auto' mode for server, to auto-detect DLMTP or LMTP based on
the LHLO ident
 
3. added extended status codes and initial LMTP extensions list

[20050225.0700] jonz: added support for LMTP delivery via domain socket

specifying a path as LMTPDeliveryHost will cause DSPAM to connect and deliver
via domain socket (instead of TCP).

[20050225.0600] jonz: added %r and %s LDA arguments

For LMTP front-end with LDA delivery, the following conventions may be used
in ServerParameters:

%r - the RCPT TO provided via LMTP
%s - the MAIL FROM provided via LMTP
 
in both cases, only the content between < > is actually used
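
A Python sketch of this substitution, for illustration only. The function names and the LDA path in the usage note are hypothetical; the sketch assumes each placeholder is replaced wherever it appears in an argument.

```python
import re


def angle_content(value):
    """Return the text between < and >, or the value unchanged if there is none."""
    match = re.search(r"<([^>]*)>", value)
    return match.group(1) if match else value


def expand_lda_args(args, rcpt_to, mail_from):
    """Replace %r and %s in each argument with the RCPT TO / MAIL FROM address."""
    subs = {"%r": angle_content(rcpt_to), "%s": angle_content(mail_from)}
    # A single pass per argument avoids re-expanding substituted text.
    return [re.sub(r"%[rs]", lambda m: subs[m.group(0)], arg) for arg in args]
```

For example, expanding ["/path/to/lda", "-f", "%s", "-d", "%r"] with RCPT TO "<bob@example.com>" and MAIL FROM "<alice@example.org>" yields the two bare addresses in place of the placeholders.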

[20050225.0500] jonz: added pass-through of mail from

added pass-through of contents of MAIL FROM inside < >. for commandline
functionality, also added --lmtp-mail-from to define it.
 
[20050224.0700] jonz: added "standard" LMTP inbound support

added a ServerMode option to dspam.conf allowing DSPAM to function in either
"dspam" proprietary LMTP mode or "standard" LMTP server mode. see README 
section 2.4 for more information. support for multiple RCPT TO's added.  
Fixed pass-through of error messages, to show failure instead of "accepted"
when a failure actually did occur.

[20050224.0700] jonz: fixed dspamc hanging bug

applied fix contributed by Peter Santiago to prevent dspamc from hanging 

[20050224.0700] jonz: added lmtp delivery support

added support to deliver via LMTP instead of LDA. although this doesn't require
operating in daemon mode, you must compile with --enable-daemon as the code
uses some of the daemon's socket components.

Version 3.4.rc1
---------------

[20050214.0800] jonz: miscellaneous minor patches

miscellaneous minor patches to CGI and some small subroutines

Version 3.4.beta.3
------------------

[20050208.0330] jonz: added CGI spam functionality to history

added a retrain link in the history portion of the CGI allowing users to
retrain spam misses without having to forward messages.

[20050205.1800] jonz: added fast user switching to CGI for admins

added the ability for admins to change users dynamically in the cgi

[20050204.0800] jonz: applied patch to improve performance in pgsql migration

patched for performance improvement in pgsql migration tool

[20050203.0800] jonz: signature to write to all text segments

reverted back to signature writing to all text segments of an email, and not
just the top level parts.

[20050202.0530] jonz: miscellaneous cgi patches

applied many miscellaneous cgi patches submitted by Ben Reser <ben@reser.org>

Version 3.4.beta.2
------------------

[20050129.1945] jonz: applied cgi patch for delivery failure

applied patch submitted by Martin Forssen <maf@appgate.com> to avoid losing
mail in the event of a FP delivery failure within the CGI

[20050129.0000] jonz: added doc for osx builds

added doc for building dspam on osx

[20050128.0245] jonz: fixed segfault on post-signature failure

fixed a segfault that only occurs when dspam_process() fails after loading a
signature

[20050126.0100] jonz: fixed whitelist training bug with notrain

when in notrain mode, whitelists were still trained

[20050122.2115] jonz: applied bugfixes to sbl lookup

fixed sbl lookups to use mt-safe functions, also fixed crashing

[20050122.0900] jonz: applied latest postgresql patches

applied latest patches to upgrade postgresql drivers. significant change is the 
use of BIGINT to improve performance. while backward compatible, a migration
tool has been provided (see UPGRADING).

[20050119.0300] jonz: set default training buffer to off

sedation should not be instantiated unless specifically specified

[20050119.0245] jonz: applied patch adding aggregation to dspam_admin

applied a patch by Philip Champon <flah@hell.com> adding a preference
aggregation function to dspam_admin, which combined system preferences with
user preferences before outputting.
 
[20050119.0230] jonz: fixed bug in statisticalSedation preference

statisticalSedation preference has been ignored for 2 versions. this bug
has been fixed.

[20050118.0730] jonz: removed requirement for dspam_home in dspam_create()

some storage drivers don't require dspam_home

[20050118.0710] jonz: fixed negative values in web stats

fixed a bug causing negative values to be written in web stats
 
[20050118.0700] jonz: applied patch adding full preferences functionality

applied a patch by Philip Champon <flah@hell.com> adding full preferences
functionality for installs without storage driver extensions. this allows
preferences to be set through dspam_admin for flat file-based systems as well.

[20050116.2330] jonz: tweak to statistical sedation

added a tweak to statistical sedation so that only spammy tokens are
sedated

[20050116.2300] jonz: added makeCorpus preference

added makeCorpus preference which records all messages processed for the user
to dspam_home/data/username/username.corpus/[spam|nonspam]. when an error is
corrected, the file is moved to the appropriate corpus.

[20050116.1800] jonz: change to TUM training mode

TUM training mode to train also when confidence is low

[20050115.1200] jonz: fixed token malalignment bug

fixed a bug causing token names to be malaligned with their values; should
not have affected accuracy, only debug messages 

[20050115.1000] jonz: added dspam_logrotate

added Steve Pellegrin's log rotate script as a tool
 
[20050115.0900] jonz: fixed preference overrides

fixed a bug causing preferences to fail to override

Version 3.4.beta.1
------------------

[20050113.0500] jonz: fixed sqlite shared group bug

fixed a bug where sqlite was ignoring shared groups

[20050112.0600] jonz: added stats functionality to dspam_stats

added the following functionality to dspam_stats:

- specifying -S will display accuracy levels as percentages
- specifying -s will use totals "since last reset" (s = snapshot)
- specifying -r will reset the totals in the snapshot
 
to measure accuracy for each month, use dspam_stats -r [username] at the
beginning of the month, and dspam_stats -s -S [username] at the end of the
month.

NOTE: dspam_stats -r uses the same .rstats file as the CGI, so resetting
stats in either will affect the other. this is the desired behavior at
the moment, rather than have two snapshots running around.

[20050111.0700] jonz: fixed preferences to aggregate

fixed preferences so that system prefs and user prefs are aggregated, instead
of either-or.
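
A minimal Python sketch of aggregation as opposed to either-or selection. It assumes user values take precedence over system values, and that an optional allow-list restricts which preferences a user may set; both are assumptions of the sketch, and the names are hypothetical.

```python
def aggregate_preferences(system_prefs, user_prefs, allow_overrides=None):
    """Merge system and user preferences into one dict.

    allow_overrides, if given, is the set of preference names a user may set;
    other user values are ignored and the system value (if any) is kept.
    """
    merged = dict(system_prefs)
    for name, value in user_prefs.items():
        if allow_overrides is None or name in allow_overrides:
            merged[name] = value
    return merged
```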

also fixed a bug causing AllowOverrides to not work in preferences extension

[20050106.1800] jonz: added SBLQueue dspam.conf option

removed hard-coded sbl queue path (/var/spool/sbl) and added to config an
option to specify the path

[20050106.0000] jonz: added manual override for CGI options

for implementations running the cgi as an untrusted user (e.g. cpanel),
the cgi needs to manually override autodetection of certain options like
filesystem scale and preferences extension detection. added option for this
to configure.pl in the cgi.

[20050105.2035] jonz: trainingMode preference to be case insensitive

made training mode preference (toe, tum, etc) to be case insensitive

[20050102.1800] jonz: added --client flag for client operations

added --client flag for client operations. failure to use --client causes
dspam to operate in independent mode

IMPORTANT: --client should be added to the client arguments on daemon-based
           setups

[20050102.2100] jonz: rewrite hash-code into 'diction' and 'term' structures

rewrote the lht code (long-hash-token) into a more specialized 'diction'
structure, which has terms. a diction represents a subset of lexical data
within the user's dataset (namely, the data representing a message).

[20041230.0800] jonz: memory leak and thread-safe cleanup

- fixed memory leak related to bayesian noise reduction calls
- fixed use of inet_ntoa, which is not thread-safe, replaced with inet_ntoa_r
- inet_ntoa_r should never be called if using domain sockets

[20041228.2300] jonz: integrated libbnr

integrated bayesian noise reduction in DSPAM with libbnr sources

[20041225.1400] jonz: added support for trainPristine as user preference

Set trainPristine to 'on' in a user's preference to enable for that user. If
master configuration has TrainPristine on, this will override the preference
option.

[20041224.1400] jonz: implemented daemon SIGHUP reload

upon receiving SIGHUP, daemon will:
- stop listening for new requests
- allow all threads to finish processing
- terminate all connections to the database
- reload dspam.conf
- reestablish all connections to the database
- start listening for new requests

[20041223.2106] jonz: fixed segfault on unreadable dspam.conf

fixed a bug causing a segfault when dspam.conf is unreadable

[20041223.1504] jonz: made url and tag scans case insensitive

url and tag scans now case-insensitive, so HTTP:// and HREF= should also be
detected and tokenized
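
As an illustration, a case-insensitive scan in Python. The pattern is a simplified assumption covering only the two markers named above, not the tokenizer's actual rules.

```python
import re

# Hypothetical pattern: matches URL schemes and href attributes in any case.
URL_OR_TAG = re.compile(r"https?://|href=", re.IGNORECASE)


def find_markers(text):
    """Return every URL or tag marker found in text, in order of appearance."""
    return [m.group(0) for m in URL_OR_TAG.finditer(text)]
```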

[20041221.2230] jonz: rewrote signature embedding code

rewrote signature embedding code
- fixed malformatting bug with signed messages
- made body and html tag searches case insensitive

[20041221.0600] jonz: added streamlined blackhole list "learn as spam" support

added functionality to automatically perform streamlined blackhole list
lookups and learn as spam if blacklisted. for more information on the
streamlined blackhole list, see http://www.nuclearelephant.com/projects/sbl/

[20041221.0600] jonz: fixed preferences extensions and storage profiles

fixed preferences extensions so that they will work with storage profiles

[20041217.0700] jonz: replaced btree sort with heap sort, major speedup

replaced btree sort of tokens by delta with a heap window-size sort, major
speedup in performance; a 200k text message down to 2.2s from 12s.
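
The general technique can be sketched in Python with the standard heapq module. This is an illustration, not the DSPAM code: it assumes "delta" means a token's distance from the neutral probability 0.5, and the default window size of 15 is a hypothetical value.

```python
import heapq


def most_interesting(tokens, window=15):
    """Return the `window` tokens whose probability is furthest from 0.5.

    tokens maps token name -> spam probability. A bounded heap avoids
    fully sorting every token in a large message.
    """
    return heapq.nlargest(window, tokens.items(), key=lambda kv: abs(kv[1] - 0.5))
```

Selecting k items from n this way costs roughly O(n log k) rather than the O(n log n) of a full sort, which matters for messages with many tokens.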

[20041215.1845] jonz: added sqlite3_drv 

added sqlite3 storage driver, use --with-storage-driver=sqlite3_drv

[20041213.0600] jonz: added build hooks for NodalCore(tm) C-Series Accelerator

added build hooks for the NodalCore(tm) C-Series Accelerator Card, the
hardware platform used for Accelerated DSPAM(R). actual adapter code remains
proprietary, so build hooks are only useful if you've licensed the
accelerator.

[20041209.0800] jonz: added profile failover

added failover support for storage profiles

[20041203.0800] jonz: added patch to sort by subject or from in CGI

applied patch adding hotlinks to subject/from fields in dspam.cgi for
sorting.

[20041202.2024] jonz: added dspamc thin client

added dspamc thin client for those who would prefer something lighter than
running DSPAM in client mode (linked in with libdspam and other unnecessary
code).
 
[20041130.0800] jonz: dspam daemonized LMTP server and client

added --daemon functionality to put DSPAM into daemonized LMTP server mode;
configure client in dspam.conf to talk to daemon, or speak LMTP without the
client. authentication configurable in dspam.conf. implemented stateful
database connections. --stdout and --classify supported. 

NOTE: Daemon mode is multithreaded, and therefore requires a multithreaded
      driver: mysql_drv or pgsql_drv.

[20041127.1700] jonz: normalized process_message()

normalized process_message() by breaking up code into many smaller subroutines

[20041121.1530] jonz: implemented bayesian noise reduction 2.0

see http://bnr.nuclearelephant.com

[20041101.0700] jonz: added MaxMessageSize option to dspam.conf

added a max message size configuration option to specify a maximum message
size to process

[20041024.1700] jonz: moved sources to src/

moved all build sources to src/ for easier management

[20041024.1545] jonz: repeat linestripping as necessary

changed broken linestripping code to repeat as necessary, for some dos
systems with multiple ^M's.
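
A tiny Python sketch of stripping that repeats until no line-ending character is left (hypothetical function name, not the DSPAM code):

```python
def strip_line_endings(line: str) -> str:
    """Remove every trailing CR or LF, however many there are."""
    # Loop instead of removing a single character, so "\r\r\n" is fully stripped.
    while line.endswith(("\r", "\n")):
        line = line[:-1]
    return line
```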

[20041020.2345] jonz: enhanced TUM mode training

greatly enhanced TUM-mode training by setting a dirty bit for each token in
memory; instead of writing all tokens and then adding a conditional where
clause for TUM (where total hits < 50), only tokens whose total hits are
below 50 are included in the sql statement, the extra where clause is no
longer necessary. this helps tum outperform teft in resources.
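
The dirty-bit idea can be sketched in Python as follows. The Token class and function names are hypothetical; the only details taken from this entry are the threshold of 50 total hits and the fact that the filtering happens in memory rather than in the SQL where clause.

```python
from dataclasses import dataclass


@dataclass
class Token:
    name: str
    spam_hits: int
    innocent_hits: int
    dirty: bool = False


def mark_dirty_for_tum(tokens, limit=50):
    """Flag only tokens whose total hits are still below the TUM limit."""
    for token in tokens:
        token.dirty = (token.spam_hits + token.innocent_hits) < limit


def tokens_to_write(tokens):
    """Names of the tokens that belong in the update statement."""
    return [token.name for token in tokens if token.dirty]
```

Mature tokens never reach the database layer, so the statement is smaller and needs no extra condition.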

[20041020.1400] jonz: normalized main()

normalized main() by breaking up code into smaller subroutines

Version 3.2.4
-------------

[20041229.1925] jonz: fix for broken boundary rfc

added fix for intentionally broken mime boundaries

[20041203.0800] jonz: performance fixes for pgsql_drv

minor performance fixes for pgsql_drv that may have a big effect on some
implementations

[20041203.0800] jonz: applied patch to fix build fail when CFLAGS defined

fixed a bug causing tools to fail to build when CFLAGS was specified

[20041203.0745] jonz: fixed addition of spurious colons after delimiter

fixed bug causing a colon to be added to lines after -- delimiters that
were not actual boundary delimiters. this also caused certain encoded portions
of the body to not be decoded, giving the appearance of equal signs (=) in
message bodies.

Version 3.2.3
-------------

[20041125.0925] jonz: rewrote boundary extraction in decode.c

rewrote boundary extraction in decode.c to fix a bug where messages could
get mangled if boundary was specified without quotes, but other tags used
quotes

[20041123.1830] jonz: fixed multipart blocks with no content-type

fixed bug causing the DSPAM signature to NOT be written to multipart
blocks without a content-type (broken RFC)

[20041123.0800] jonz: fixed bug in _ds_get_spamrecord broken on mysql 4.1

changed token = '' to token in('') in _ds_get_spamrecord, to fix bug in
mysql 4.1 with respect to numeric fields and quoted conditionals. in('') seems
to work without problem.

[20041117.0800] jonz: fixed critical bug in Bayesian Noise Reduction

fixed a critical bug in Bayesian Noise Reduction causing the algorithm to
never instantiate

Version 3.2.2
-------------

[20041114.2245] jonz: fixed optOut preferences option

fixed a bug causing optOut preference to be ignored

[20041112.0800] jonz: fixed source address tracking bug w/TOE

fixed a bug causing source address tracking to fail when TOE used

[20041112.0800] jonz: fixed LocalMX bug

fixed a bug causing LocalMX to be ignored in dspam.conf

[20041110.0800] jonz: set permissions on dspam.conf to 640

permissions on dspam.conf were defaulted to 750, changed to 640

[20041101.0700] jonz: changed loose signature matching

changed loose signature to X-DSPAM-Signature from the ever useless DSPAM: to
allow the use of signature headers in forwarded attachments.

[20041109.0800] jonz: fixed source address tracking

fixed source address tracking by removing an old #ifdef that never got
defined in 3.2. also changed 'ham' to 'nonspam' in dspam.conf.

[20041109.0800] jonz: adjusted chi-square cutoff

changed chi-square cutoff from 0.5000 to 0.5010 to avoid erroneous 
classifications when there is no data

[20041108.0800] jonz: fixed multiline token bug

fixed a bug where tokens on a multiline header would be ignored past the first
line

[20041103.0745] jonz: fixed segfault on signature scan

fixed a bug causing segfault during scanning of some messages for a signature

[20041103.0745] jonz: fixed signature encoding bug in sqlite_drv

fixed a bug causing signature inserts to fail in sqlite.

Version 3.2.1
-------------

[20041029.0800] jonz: added needed c/r at end of pgp messages

added needed c/r at end of pgp messages

[20041029.0800] jonz: fixed invalid read of free()'d memory

fixed invalid read of free()'d memory caused when parsing multi-line 
header tokens

[20041029.0800] jonz: fixed pragma bug in sqlite_drv

fixed a bug in sqlite_drv causing pragmas in dspam.conf to be ignored

[20041028.0700] jonz: support for mysql 4.1's ON DUPLICATE KEY

added support for mysql 4.1's ON DUPLICATE KEY functionality, so that compiling
with 4.1+ will perform a single insert query without causing duplicate key
failures

[20041025.0600] jonz: memory leaks in dspam_clean

fixed minor memory leaks in dspam_clean

[20041025.0600] jonz: sqlite fixes

fixes to sqlite driver; started using sqlite_[encode|decode]_binary and fixed
calls to sqlite_finalize causing segfaults.

[20041024.2000] jonz: added patch for parsing signature from body

added a patch to parse leading whitespace from signature keys found in
messages with malformed signature lines

[20041024.1845] jonz: added patch for pgsql and PQfreemem()

added patch to search for PQfreemem() and use free() as an alternative

[20041024.0650] jonz: fixed bug with mysql_drv and duplicate key entries

fixed a bug caused by performing multiple inserts simultaneously on the
database

[20041023.0923] jonz: fixed memory malpractice in pgsql_drv

fixed some bugs in pgsql_drv with memory mishandling

[20041021.0730] jonz: fixed attachment for PGP signed messages

fixed a bug in the dspam.txt attachment added to PGP signed messages, causing
the attachment to have an invalid boundary delimiter

[20041021.0700] jonz: put --with-delivery-agent back, minor config fixes

put --with-delivery-agent flag back (formerly, would try and just autodetect)
and made some fixes to escape commas.

[20041020.1400] jonz: changed default logdir to dspam_home/log

default logdir has been changed to dspam_home/log; this prevents confusion
around permissions on /var/log

[20041020.1400] jonz: applied patch for man page install

applied patch adding $(DESTDIR) to man page install

[20041020.1200] jonz: applied patch for URL parsing bug

applied a patch fixing an invalid memory read when an email ends with
http://

Version 3.2.0
-------------

[20041020.0100] jonz: fixed mysql performance bottleneck

fixed mysql performance bottleneck with inserts by using multi-row inserts
instead of hundreds of individual inserts
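
The difference between per-row and multi-row inserts can be sketched as below. This is an illustration under stated assumptions, not the mysql_drv code: it uses Python's built-in sqlite3 module so that it runs without a MySQL server, and the token_data table and its columns are hypothetical names:

```python
import sqlite3


def insert_tokens(conn, rows):
    """Insert all (token, spam_hits, innocent_hits) rows in one statement."""
    if not rows:
        return
    # One "(?, ?, ?)" group per row, joined into a single VALUES list.
    placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
    sql = ("INSERT INTO token_data (token, spam_hits, innocent_hits) "
           "VALUES " + placeholders)
    params = [value for row in rows for value in row]
    conn.execute(sql, params)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE token_data (token INTEGER PRIMARY KEY, "
             "spam_hits INTEGER NOT NULL, innocent_hits INTEGER NOT NULL)")
insert_tokens(conn, [(1, 3, 0), (2, 0, 7), (3, 1, 1)])
```

A single statement carrying many rows saves one round trip and one parse per token, which is where the time goes when a message produces hundreds of tokens.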

[20041019.2315] jonz: changes to mysql 4.1 purge script

changed IN() to left-join query for faster purging
rewrote all fields as 'not null' 

[20041019.2200] jonz: made all rows not null in mysql

made all rows not null in mysql scripts; this conserves 1-2 bytes per row and
speeds things up just a hair

[20041018.0800] jonz: added qmail/vpopmail instructions

added qmail/vpopmail instructions contributed by John Peacock 
<jpeacock@rowman.com>
 
[20041018.0800] jonz: split up MTA configuration into multiple README files

split up the MTA configuration section of the README into multiple files.

[20041018.0730] jonz: fix for write of .stats files on notrain

.stats files shouldn't be getting written when in notrain mode

[20041018.0700] jonz: memory leak fixes for pgsql

many memory leak fixes for postgresql driver

[20041017.1725] jonz: added mysql4-initialization configure option

added an option to disable mysql client library initialization and
cleanup. this is only really useful if you're using libdspam with a third
party application that requires this (e.g. the application accesses
libmysqlclient itself, and therefore needs to manage startup and shutdown
of the library).

[20041016.1935] jonz: fixed massive number of memory leaks

fixed a massive number of memory leaks in libdspam and the agent and
incorrect memory management practices.

[20041015.1700] jonz: fixed sedation deactivation

fixed a bug causing statistical sedation to _not_ deactivate even when the
training buffer level was set to 0.

[20041015.0800] jonz: fixed dspam_admin segfaults on invalid syntax

fixed bugs in dspam_admin causing a segfault (instead of print of usage
information) when too few arguments for a function were specified

[20041015.0800] jonz: fixed preferences extensions in admin.cgi

fixed bugs in admin.cgi causing server errors when preferences extensions
was used.

[20041014.1130] jonz: bugfix for dspam_dump

applied bugfix for dspam_dump to handle the username correctly when
commandline options are specified.
 
Version 3.2.pr1
---------------

[20041014.0800] jonz: added WITHOUT OIDS to all pgsql tables

turned off OIDS for all pgsql tables, speeding up table access significantly

[20041014.0000] jonz: added DSPAM_BIN to path in CGI

added DSPAM_BIN to the path in configure.pl; some CGIs weren't finding
the DSPAM binaries

[20041013.1800] jonz: added mysql 4.1 objects script, renamed .sql files

did some minor renaming of .sql files. added a mysql 4.1 object script which
uses bigint/unsigned instead of char(20) for tokens. put neural networking
in its own file.

[20041013.0900] jonz: added mysql/postgres purge scripts for TOE and TUM

added purge-pe.sql for mysql/postgres with preferences extensions.
additional logic skips certain purges for TOE and TUM-mode users.

[20041013.0830] jonz: consolidated error messages in language.h

consolidated error messages and other important output in language.h to
centralize most commonly used output, and to make translation easier.

[20041012.2330] jonz: dspam_clean fix for toe training mode

added patch to dspam_clean to skip certain unused token operations for users 
with toe training mode, since their last_hit value is never updated. left token
probability operation, as it will use the date the token was first hit which
is still useful.

[20041012.0300] jonz: removed !DSPAM tag from X-DSPAM-Signature header

when using signatureLocation=headers, the !DSPAM tag is no longer written to
the header, just the signature. backward-compatible.

[20041012.0300] jonz: changed location of dspam_home and dspam.conf

dspam.conf now defaulted to sysconfdir (default: /usr/local/etc)

dspam home now defaulted to prefix/var/dspam (default /usr/local/var/dspam)

can still override dspam home using --with-dspam-home

[20041011.0300] jonz: added --signature= functionality

for commandline signature correction where the admin would prefer to just
specify the signature on the commandline, --signature=[signature] can be
used. only the signature itself should be provided, and not the !DSPAM tag.

[20041009.1830] jonz: bugfixes for inline decoding

fixed a bug which caused a segfault on malformed inline encoding blocks

added better support for inline encoding; now encodes all blocks in a header
and not just the first block.

[20041009.1345] jonz: fix for sqlite permissions

added fix for sqlite permissions to create database as 0660

[20041009.1100] jonz: added storage profile support

Implemented this from my blog:

5. Distributed database configurations. I'd like to add a commandline
   option or environment variable to set a storage "profile". This
   profile would refer to a MySQL or PgSQL server config in dspam.conf.
   For example:
                                                                                
     MySQLServer.Sun420R 10.0.0.5
     MySQLPort.Sun420R   3306
     ...
                                                                                
     MySQLServer.DECAlpha 10.0.0.6
     MySQLPort.DECAlpha   3306
     ...
                                                                                
     Providing --profile=DECAlpha on the commandline would cause DSPAM to
     use that particular storage profile. This is especially useful in
     distributed environments where a user might be mapped to a particular
     server.
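
How a profile suffix might select one group of settings can be sketched as follows. This is a Python illustration, not the parser used by DSPAM; the load_profile helper and the treatment of '#' lines as comments are assumptions, while the attribute names and addresses are the ones from the example above:

```python
def load_profile(config_text, profile):
    """Collect 'Name.<profile> value' lines into a {Name: value} dict."""
    settings = {}
    suffix = "." + profile
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and (assumed) comment lines
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # not an "attribute value" pair
        key, value = parts
        if key.endswith(suffix):
            settings[key[: -len(suffix)]] = value.strip()
    return settings


EXAMPLE_CONF = """
MySQLServer.Sun420R 10.0.0.5
MySQLPort.Sun420R   3306
MySQLServer.DECAlpha 10.0.0.6
MySQLPort.DECAlpha   3306
"""
```

With this sketch, load_profile(EXAMPLE_CONF, "DECAlpha") yields the server and port for the DECAlpha host only, mirroring what --profile=DECAlpha is described as doing.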

[20041008.0420] jonz: added new logo

added new logo for cgi

[20041008.0420] jonz: added index to dspam_signature_data for created_on dates

added an index on created_on to dspam_signature_data and updated purge.sql to
use it, which greatly improves purge speed.

Version 3.2.rc2
---------------

[20041007.2300] jonz: made LARGE_SCALE and DOMAIN_SCALE autodetect in CGI

made filesystem scaling auto-detect based on dspam --version

[20041007.1950] jonz: added preferences verbose output

added verbose debugging of preference attributes and values loaded

[20041007.0400] jonz: added autogen support for freebsd

added support for freebsd to autogen script, so freebsd users can build
from cvs.

[20041006.2245] jonz: merged group cgi bugfix

fixed a bug in the cgi where merged groups would be added to the user totals
when displayed under statistics.

[20041006.0430] jonz: added algorithm and pvalue choice to dspam.conf

added support for selecting the combination algorithm(s) and pvalue
technique to dspam.conf. for third-party agent compatibility, configure
options have remained active, but the agent will override these if it finds
options in dspam.conf.

[20041005.1930] jonz: added debug option to dspam.conf

added DebugOpt option to specify which types of messages to route to debug

[20041005.0400] jonz: syslogging of more error messages

added the syslogging of more types of failures

[20041005.0355] jonz: libdspam debug: all calculations only on verbose

changed libdspam's debugging output so that all calculations are made only
when verbose is active; this will allow users to run with standard debug
enabled without using as many resources

[20041005.0102] jonz: applied pgsql patches for performance/bugfixes

applied pgsql patch submitted by Rustam Aliyev to improve performance by
an estimated 30% and fix some minor issues.

[20041003.0515] jonz: --version to return configure parms only when trusted

--version should print configuration parameters only when running as a
trusted user

[20041002.1730] jonz: added return codes on quarantine failure

the agent now returns a failure code if it was unable to quarantine an
incoming spam. 

[20041002.1700] jonz: added unlearning functionality

added unlearning functionality which can be triggered in one of two ways:

1. using --mode=unlearn on the commandline will unlearn the message passed in;
   useful for some cases of error and such. Use --source=error and set --class
   to the original classification that the message was LEARNED WITH (e.g.
   if it was originally classified as spam, set --class=spam to unlearn it
   as spam).

2. by setting OnFail to unlearn in dspam.conf, DSPAM will unlearn a message
   on delivery or quarantine failure. this will fix problems on some servers 
   where the message is requeued, and then reprocessed.

[20041002.1040] jonz: minor bugfix for inoculation

fixed a minor bug which may have caused message inoculations to be overly
inoculated (5 hits instead of 2).

[20041001.2145] jonz: algorithm cleanup

cleaned up algorithm definitions in configure:

1. --disable-traditional-bayesian is now --disable-graham-bayesian
2. --disable-alternative-bayesian is now --disable-burton-bayesian
3. configure will no longer allow you to enable chi-square without disabling
   both bayesian calculations; this is to avoid rainstorms of false positives
   and accidental configurations by users who don't realize you need to
   disable one to enable the other

Version 3.2.rc1
---------------

[20041001.0035] jonz: signatureLocation "headers" preserves encoding

when signatureLocation is set to "headers", the message is treated as if
signed and the original encoded body is preserved.

[20041001.0030] jonz: a few features removed/changed

a few features have been removed from the agent and/or libdspam to improve
functionality and/or to restore libdspam's function as a text classifier and
not an encoding/decoding engine. some functionality has simply been "moved"
into the agent and out of libdspam. changes are very minor and shouldn't
affect a majority of third party applications or many end-users.

1. dropped "attachment" signatureLocation

signatureLocation = 'attachment' has been officially dropped. I realize one or
two people were using it, but the amount of black magic that had to be
used to maintain this function was just too time consuming, and nobody
liked having paper clips on every message. 

2. dropped DSF_COPYBACK; libdspam no longer permanently decodes anything

copyback feature to copy back decoded message no longer used by DSPAM agent,
not very useful for any other applications. applications looking to decode
should consider either self-actualizing the message (as the dspam agent does)
or using a different approach to decoding.

3. libdspam to treat all messages as "signed"

libdspam now preserves the original message body and transfer-encoding,
treating all messages as signed. the agent is the piece responsible now for
modifying the message and appending signatures.

[20040930.2215] jonz: added parse-to-headers to dspam.conf

setting "ParseToHeaders on" in dspam.conf is now used to parse the To: 
line for extracting a username when forwarding spam/fp's to catchall
domains (see README)

[20040930.1900] jonz: changes to homedir option

1. --enable-homedir-dotfiles is now --enable-homedir
2. all .nodspam and .dspam files are now .optout and .optin, respectively
3. when --enable-homedir is used at configure time, not only are opt-in/out
   files looked for in the user's home directory, but all data files are stored
   in ~/.dspam including the .inoc file previously stored in ~.

NOTE: This option requires dspam to run as setuid root (automatically
      configured) and is incompatible with the DSPAM CGI (can't read mailboxes).
      If you require users to be able to opt themselves in/out and use the CGI,
      use the CGI's opt-in/out preference or configure a small tool to manage
      them from DSPAM Home.
 
[20040930.1845] jonz: added TrackSources attribute

moved source address tracking into dspam.conf

[20040930.1830] jonz: added Opt attribute

Opt can be set to in or out in dspam.conf to specify whether the system is 
opt-in or opt-out.

[20040930.1815] jonz: added TrainPristine attribute

TrainPristine replaces --enable-webmail and is used to put the DSPAM agent in
a training mode where it assumes the original message is provided for 
retraining. This ceases the writing of any signatures, and is ideal for
webmail or imap systems where the original message is preserved on the server
and can be used to retrain.

[20040930.0129] jonz: implemented thread-safe functionality

implemented thread-safe functionality in libdspam and two storage drivers
(mysql_drv and pgsql_drv). each thread will require its own context, however
if you check out the libdspam man page, you'll see it's possible to set
up multiple contexts with the same database handle. 

[20040929.0710] jonz: added libdspam man page

added libdspam man page, API reference. symlinked to:
dspam_init, dspam_create, dspam_attach, dspam_addattribute, dspam_process,
dspam_getsource, dspam_destroy

[20040929.0700] jonz: fixed bugs in preferences extension/dspam.cgi

applied fixes submitted by Marty Pauley to fix bugs with dspam.cgi's handling
of preferences extension calls

[20040928.1900] jonz: added new API functions

added new API functions to support the libdspam attribute API. dspam_init has
remained in the code for backward compatibility with other applications,
however a new set of create/attach functions have been added with the
attribute API so that storage attributes (such as server information) can be
set prior to connecting to storage. see the updated example.c's example 4 for
a more thorough explanation.

to retain backward-compatibility, contexts instantiated using dspam_init
will revert to their legacy behavior of looking for a [driver].data file in
the dspam home. dspam_init() has been slightly tweaked, however, to require
the path to the dspam home as an argument. DSPAM_HOME can be passed in by
legacy applications that already have it defined.

This functionality also allows multithreaded applications (which must have
a separate context) to share a single database handle.

[20040928.0501] jonz: added IgnoreHeader option to dspam.conf

added IgnoreHeader option to dspam.conf, allowing specific headers from other
virus tools/spam filters on the network to be ignored.

[20040928.0210] jonz: made dspam.conf operational

dspam.conf now operational; will copy at install time into prefix/etc if a
copy does not already exist. 

NOTE: See the UPGRADING section of the README for a full explanation of changes
and be sure to read before attempting an upgrade 

[20040928.0200] jonz: moved show/hide factors as a preference

showFactors is now a preference (and suitable as an option for each user or
globally); set to "on" to enable factors in the message headers. added to cgi. 

[20040925.0817] jonz: fixed empty spamSubject bug

fixed bug causing subject of spam to be truncated if spamSubject left blank

[20040923.0300] jonz: rating sort default quarantine view

made "rating sort" the default quarantine view in cgi. if a delete all fails,
will revert to a chronological sort so users can see the last spam to be
quarantined.

Version 3.2.beta-1
------------------

[20040922.0500] jonz: fixed toe/zero signature bug

fixed a bug created in 3.1.2 causing toe-mode training to write zeros for
signatures (causing toe users to cease all learning)

[20040922.0400] jonz: fixed mysql buffer overrun bug

fixed a bug in the mysql driver where escaping a large number of
characters caused an unexploitable overrun.

Version 3.1.2
-------------

[20040906.0939] jonz: implemented sparse binary polynomial hashing

implemented Bill Yerazunis' sparse binary polynomial hashing (tokenizer
method only). use --feature=sbph to generate an SBPH-based token set.

[20040901.0800] jonz: bugfix for --debug

fixed --debug so it doesn't get passed along to MTA

[20040830.0800] jonz: pgsql fixes

Applied Rustam Aliyev's patch to fix the following issues with pgsql_drv:

- Added support for Preferences Extensions.
- BUGFIX: 'length' field's type changed from 'smallint' to 'int';
  'smallint' is not enough for big signatures.
- All values passed to columns with 'smallint' type now are quoted. 
  This will enable casting and make indexes on these columns available. 
- Added new index on dspam_token_data (token) which helps speed up 
  some operations. 
- Number of fixes to keep memory cleaner.

[20040826.2100] jonz: fixed bugs in classify and inoculation

fixed a bug where noise reduction and chained tokens weren't applied to
user classification and message inoculation

[20040826.0600] jonz: tweaked mysql where clauses

tweaked mysql where clauses for better indexing

[20040825.0225] jonz: added --disable-factors option

added option to disable factors in message headers

Version 3.1.1
-------------

[20040819.0800] jonz: minor CGI template changes

minor changes to CGI templates
 
[20040819.0800] jonz: added X-DSPAM-Factors

added determining factors header to emails containing a list of tokens that
played a role in the decision. if multiple algorithms are defined, only one
is used. if the message is spam, the factor set from an algorithm returning
a spam result will be used. 

[20040818.1900] jonz: cast smallints in postgres

cast all smallints in postgres, so indexes should be used now (major 
performance increase)

[20040818.1845] jonz: fixed memory leaks

fixed some miscellaneous memory leaks

[20040818.1845] jonz: added optIn / optOut preference

added optIn and optOut preference support; whichever one is used depends on
whether dspam is configured for opt-in or opt-out.

[20040811.0900] jonz: fixed totals bug with merged groups

fixed a small bug, preventing totals from dropping below 0 when using merged
groups

Version 3.1.0
-------------

[20040724.2000] jonz: fixes to Bayesian Noise Reduction

made fixes to Bayesian Noise Reduction to fix bugs related to 3.1 Beta

[20040723.0630] jonz: added --deliver=summary option

added 'summary' delivery option which will deliver (to stdout) a summary
identical to the output of message classification:

X-DSPAM-Result: user; result="Innocent"; probability=0.0023; confidence=1.00

Obviously, should not be used with --stdout
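
A consumer of the summary line could split it into fields along these lines. This is a Python sketch, not part of DSPAM; parse_summary is a hypothetical helper and the field layout is assumed to match the single sample line shown above:

```python
def parse_summary(line):
    """Split an X-DSPAM-Result summary line into a dict of fields."""
    name, _, rest = line.partition(":")
    fields = [part.strip() for part in rest.split(";")]
    # The first field is the bare username; the rest are key=value pairs.
    summary = {"header": name.strip(), "user": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        summary[key.strip()] = value.strip().strip('"')
    return summary
```

Values are kept as strings; a caller that needs the probability as a number would convert it with float().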

Version 3.1.0-beta-2
--------------------

[20040721.0800] jonz: added single spam hit purge to purge.sql

added purge of tokens with single spam hit to purge.sql. adjusted purge times

[20040721.0800] jonz: place signature before /HTML tags

in spams without a </body> tag, signatures are now placed before /html tags
in order to ensure they are passed on with some email clients (such as
outlook). this might explain users who receive the same spam over and over.

[20040720.2130] jonz: rewrites to bayesian noise reduction

rewrite of BNR algorithm with minor tweaks, code cleanup

[20040720.0800] jonz: applied patches to CGI

applied patches to CGI submitted by Craig Hockenberry to add configure.pl
functionality for configuring the CGIs.

[20040716.0800] jonz: removed 2500 message threshold for TOE

TOE-mode training now kicks in immediately after 100 learned innocent
messages, rather than waiting for 2500 messages. as a result, more initial 
errors are likely to occur (just as with any other filter implementing TOE) 
but final accuracy should be better.

[20040711.2300] jonz: fixed field names in dspam_2sql

updated field names in dspam_2sql to reflect present-day database field names.

Version 3.1.0.beta.1.1
----------------------

[20040711.2200] jonz: fixed --disable-trusted-user-security compile errors

fixed compiler errors when users disabled trusted user security

[20040711.2200] jonz: removed debug output

removed a line of debug output causing problems with implementations using
stdout 

Version 3.1.0.beta.1
--------------------

[20040709.0700] jonz: fixed bug with subject encoding and spam tags

fixed a bug where spam tags would not be added to encoded subjects

[20040709.0700] jonz: added --debug commandline argument

if --enable-debug is specified, --debug can be passed on the commandline to
activate debugging. alternatively, dropping a .debug file in DSPAM_HOME or
user.debug file in the user's DSPAM_HOME data directory will still work.

[20040709.0700] jonz: fixed error bug with snprintf

fixed a bug in error reporting where not using vsnprintf as required
caused crashing on some systems.

[20040709.0700] jonz: added header support for automatic whitelisting

instead of X-DSPAM-Probability: -2 to identify automatic whitelisted emails,
a header of X-DSPAM-Result: Whitelisted will be used, and the original 
probability (even if guilty) will be provided in each message.

[20040707.0730] jonz: added dynamic noise reduction extension support

added support for dynamic noise reduction extension; designed to track SNR
in emails for each user to dynamically determine noise thresholds and perform
calibration. the extension is supported in libdspam, but is still experimental
and only used for tracking noise margins at the moment.

[20040707.0700] jonz: added whitelistThreshold preference

the whitelistThreshold preference will set the threshold for innocent hits
before automatically whitelisting a recipient. the default value is 10. do
not set this value too low!

[20040707.0500] jonz: added NOTRAIN preference for trainingMode

added NOTRAIN preference for trainingMode, which will result in messages being
processed but not trained.

[20040707.0408] jonz: signature location now a preference

signature location (headers, message, attachment) now moved to 
signatureLocation preference and added to CGI. configure-time arguments
will set a default preference if user hasn't overridden.

[20040706.2000] jonz: applied win32 patches

applied patch portion of win32 build supplement; win32/README updated. 
visual c++ project updated. initial testing shows all systems go =) 

[20040706.0800] jonz: added ignoreGroups preference

ignoreGroups, when set to 'on', will ignore any group memberships the user
should belong to (including system-wide). useful to allow some users to remove
themselves from any memberships. 

[20040705.2000] jonz: utilities to require trusted user permissions

utilities modified to require the caller be a trusted user. this is normally
done with groups, but as an extra security measure is also done with trusted
users.

[20040705.2000] jonz: rewrote preferences, added preferences extension support

preference functions entirely rewritten. added preferences extension support to
dspam, added first extension to mysql_drv, and added preference administration 
to dspam_admin.

[20040705.0800] jonz: added sort option to cgi quarantine 

added ability to sort by rating or date to cgi's quarantine

[20040630.0800] jonz: added preliminary win32 support files

added Vadim Zeitlin's preliminary win32 files into win32/ directory 

[20040630.0800] jonz: added transactional blocks to postgres driver

applied rustam's patch to add transactional blocks to pgsql_drv for
performance increase

[20040629.1945] jonz: untrusted user error to report username

untrusted user error (specifying --user) should report active username

[20040629.0800] jonz: fixed domain scale in dspam.cgi

domain scale pathname was missing /data/ in dspam.cgi

[20040629.0800] jonz: fixed segfault on empty body

fixed a bug causing libdspam to segfault with some email having an empty body.

[20040628.1945] jonz: added removal option for merged groups

added removal option for merged groups, by specifying -username, username is
removed from the group. This is useful if you want system-wide merged groups
but have a few users who want to unsubscribe

[20040628.0700] jonz: fixed bug in spam-subject

fixed a bug in spam-subject causing:
  1. the last character of the subject to be truncated
  2. spam tags to be repeated for each local recipient

[20040627.1330] jonz: added sql-formatted output support to dspam_dump

added support for sql-formatted output in dspam_dump using the -d [driver]
command. only driver supported is sqlite_drv. use dspam_2sql for all other
drivers (dspam_dump dumps one user at a time, so is only useful for sqlite
at the moment).

[20040625.2300] jonz: rewrote locking in bdb drivers

rewrote locking in bdb drivers to use fcntl locking instead of db env
locking. kernel-level locking works over nfs and automatically removes
stale locks if a process should crash or the system fail.
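
The kind of fcntl-based locking described above can be sketched as follows. This is a Python illustration for a POSIX system, not the bdb driver code; append_locked and the temporary file are hypothetical:

```python
import fcntl
import os
import tempfile


def append_locked(path, data):
    """Append data to path while holding an exclusive fcntl lock."""
    with open(path, "a") as handle:
        # Without LOCK_NB this call waits until the lock is granted.
        fcntl.lockf(handle, fcntl.LOCK_EX)
        try:
            handle.write(data)
            handle.flush()
        finally:
            fcntl.lockf(handle, fcntl.LOCK_UN)


fd, demo_path = tempfile.mkstemp()
os.close(fd)
append_locked(demo_path, "entry one\n")
append_locked(demo_path, "entry two\n")
```

The lock belongs to the process, so the kernel drops it when the process exits or crashes; no lock file is left behind to clean up.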

[20040625.2200] jonz: fixed a locking bug with fcntllocking/quarantine

fixed a quarantine locking bug where fcntl locking was not waiting for a lock,
but returning a failure immediately if already locked

[20040625.0800] jonz: added configure arguments to --version output

added a list of arguments DSPAM was configured with to --version output

[20040624.0425] jonz: applied CGI facelift

applied CGI facelift submitted by Craig Hockenberry <craig@iconfactory.com>

[20040623.0700] jonz: bugfix for encoded multiline header mangling

fixed a bug that caused encoded, multiline headers to lose any lines of text
after the first.

[20040621.2135] jonz: made sqlite_drv default storage driver

made sqlite_drv the default storage driver 

[20040621.2135] jonz: added SQLite storage driver
                                                                                
added SQLite storage driver. see tools.sqlite_drv/README for more information

[20040621.0245] jonz: committed minor patch for Solaris builds

another patch for declaring u_int32_t's on Solaris

[20040617.0220] jonz: fixed configure help text for --enable-webmail

fixed configure help text for --enable-webmail, which was mangled

[20040617.0211] jonz: fixed typo in admin.cgi for $CONFIG{'LARGE_SCALE'}

fixed a typo in admin.cgi where $CONFIG{'LARGE_SCLAE'} = 0;

Version 3.0.0
-------------

[20040614.0700] jonz: fixed 14-day user graphs

fixed a bug causing the 14-day user graphs to appear empty

[20040612.0018] jonz: oracle storage driver fixes

made several bugfixes to oracle storage driver
added --with-oracle-version[=10] configure flag for linking to 10g libraries

[20040609.0205] jonz: fixed a bug in --enable-signature-attachments

fixed two bugs using --enable-signature-attachments; 1 compiler error and 1
segfault (uninitialized value)

[20040608.0715] jonz: fixed compile bug with --enable-webmail

fixed compile errors resulting from --enable-webmail

[20040607.1800] jonz: replaced quarantine locking with fcntl locking

replaced quarantine .lock'ing with fcntl locking and also applied it to
locking .log files. fcntl should work over NFS.

[20040607.0730] jonz: fixed rare segfault (strlen on NULL)

fixed a rare segfault in decode.c

[20040607.0730] jonz: minor aesthetic changes to cgi

minor aesthetic changes to cgi

[20040606.1445] jonz: added training left option to dspam_stats -H

modified dspam_stats to display # of training messages left when using -H 
command

[20040606.1441] jonz: fixed bug in training threshold

fixed a bug in the training threshold, which miscalculated the mail left to
train.

[20040605.1521] jonz: added statistical sedation to cgi

added level of sensitivity-during-training to cgi preferences

[20040605.1450] jonz: added ability to edit user preferences from admin suite

added the ability to edit user preferences (and the default preferences)
from the admin suite.

[20040605.1100] jonz: fixed a bug with user processing flag

fixed a bug where some parameters may be added as users instead of parameters.
this was particularly the case if no mailer flags prepended %u.

[20040604.0525] jonz: fixed blank dspam signature on reclassification

fixed a problem where reclassified messages would receive:

X-DSPAM-Signature: !DSPAM!

fixed this by NOT stripping the old X-DSPAM-Signature header, since a new one
is not created upon reclassification

[20040604.0525] jonz: fixed untrusted.mailer_args

fixed a bug where the last argument of untrusted.mailer_args was ignored.

Version 3.0.0.rc2
-----------------

[20040603.2215] jonz: added user-logging option

added --disable-user-logging option to disable user logging

[20040603.0500] jonz: auto-whitelisting now works with toe-mode training

added code to cause automatic whitelisting to function with toe-mode training

[20040602.0030] jonz: added administration suite cgi

added administration suite cgi

[20040602.0030] jonz: added system logging of execution time

added system logging of execution time

[20040602.0025] jonz: fixed spam subject

fixed spam subject headings to support variable length titles

[20040601.2230] jonz: added system logging

added system logging to DSPAM_HOME/system.log for future sysadmin interface

[20040601.1822] jonz: removed mysql delay_key_write

removed mysql's delay_key_write feature from the sql scripts, because of a
bug in mysql that leads to database corruption when using it.

[20040601.0330] jonz: added To: header parsing

added --enable-parse-to-header, which will parse spam-username and fp-username
from the To: header of a message to determine the username. This can be
used in lieu of using spam/fp aliases by creating a wildcard subdomain
(such as spam.yourdomain.com) and piping all email into dspam without a
--user flag, for example:

wildcard: "|/usr/local/bin/dspam --mode=toe --class=spam --source=error"
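
A rough Python sketch of the idea, not DSPAM's C code: the username and the requested class are recovered from the local part of the To: address. The function name is hypothetical, and mapping the fp- prefix to "innocent" is an assumption made for this example.

```python
from email.utils import parseaddr


def parse_retrain_address(to_header):
    """Return (classification, username) from a To: header, or None.

    Assumes local parts of the form spam-USERNAME or fp-USERNAME.
    """
    _, address = parseaddr(to_header)
    local_part = address.split("@", 1)[0]
    prefix, sep, username = local_part.partition("-")
    if not sep or not username:
        return None
    if prefix == "spam":
        return ("spam", username)
    if prefix == "fp":
        # A false positive report means the message was really innocent.
        return ("innocent", username)
    return None
```

For example, a message sent to spam-bob@spam.yourdomain.com would be attributed to user bob as a spam report.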

[20040531.2245] jonz: added pkgconfig files

added installation of pkgconfig files submitted by Ronald Hummelink
<ronald@hummelink.xs4all.nl>

[20040531.2120] jonz: added --enable-broken-return-codes

added --enable-broken-return-codes configure option which causes DSPAM to 
return an exit code of 99 if the message being processed is believed to be
spam, 0 if not, and any other code to suggest an error has occurred. this is
useful for some MTAs such as qmail.
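
A small sketch of how a calling script might interpret these exit codes. The function name and the returned labels are hypothetical; only the 99 / 0 / other convention comes from the entry above.

```python
def interpret_exit_code(code):
    """Map a DSPAM exit code (broken-return-codes build) to a label."""
    if code == 99:
        return "spam"
    if code == 0:
        return "innocent"
    # Any other value suggests an error occurred during processing.
    return "error"
```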

[20040531.2100] jonz: fixed error.h overwrite bug

fixed a bug where libc's error.h would be overwritten if --prefix=/usr. DSPAM
headers are now written to includedir/dspam.

[20040531.1915] jonz: added man pages

added man pages to distribution 

[20040531.0830] jonz: fixed header signature stripping

signatures no longer stripped if --enable-signature-headers is used; to allow
for re-re-training

[20040531.0830] jonz: fixed cgi graphs falling below zero

minor fix to cgi graphs preventing data points from falling below zero

Version 3.0.0.rc1
-----------------

[20040528.0100] jonz: added logging support

added support for message logging (enabled by default). logs all classification 
calls to $DSPAM_HOME/data/user/user.log. disable with --disable-logging.

[20040527.2200] jonz: added new CGI

added new CGI

[20040527.0730] jonz: added support for profiling

added support for profiling using gmon output. this allows developers to use
profiling tools such as gprof to analyze the performance of the software.

[20040527.0730] jonz: applied patch submitted by Mark Femal

applied a patch submitted by Mark Femal <mark@beantree.com> which:
1. Includes select *.h files and incorporates them into the installation
2. Fixes some issues in compiling with Sun's Pro C compiler
3. Makes some minor changes to header files to avoid conflicts

Version 3.0.0.beta.3.1
----------------------

[20040525.0830] jonz: fixed compiler error on verbose debug

fixed compiler errors when verbose debug was enabled

Version 3.0.0.beta.3
--------------------

[20040524.2024] jonz: bugfix for null bodies

applied bugfix for a segfault that occurred when the message body of some
parts was null. rare occurrence.

[20040524.1903] jonz: implemented Robinson's technique for combining p-values

added support for using Robinson's technique for combining p-values, as
described at http://www.linuxjournal.com/article.php?sid=6467. This technique
is presently used for chi-square calculations, but using 
--enable-robinson-pvalues will use this technique for *all* calculations in 
place of Graham's approach. Appears to provide slightly better results
(on the order of 1 message per thousand).

[20040524.0529] jonz: implemented *real* chi-square

implemented Fisher-Robinson's Inverse Chi-Square algorithm...the real stuff.
use --enable-chi-square to use.
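
For readers unfamiliar with the technique, a Python sketch of the commonly published form of Fisher-Robinson inverse chi-square combining. This follows the general formulation from the literature and is not taken from DSPAM's source; function names are hypothetical, and every token probability is assumed to lie strictly between 0 and 1.

```python
import math


def chi2q(x2, v):
    """Probability that a chi-square variable with v (even) degrees of
    freedom exceeds x2."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_robinson(probs):
    """Combine per-token spam probabilities into one score in [0, 1]."""
    n = len(probs)
    ln_spam = sum(math.log(1.0 - p) for p in probs)
    ln_ham = sum(math.log(p) for p in probs)
    s = 1.0 - chi2q(-2.0 * ln_spam, 2 * n)  # evidence for spam
    h = 1.0 - chi2q(-2.0 * ln_ham, 2 * n)   # evidence for innocent
    return (s - h + 1.0) / 2.0
```

Scores near 1 indicate spam, scores near 0 indicate innocent mail, and scores near 0.5 indicate that the evidence is balanced or weak.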

[20040522.2350] jonz: renamed chi-square to robinson's naive bayesian

renamed chi-square because it really isn't chi-square, but robinson's first
algorithm for naive bayesian combination. use --enable-robinson to use.

[20040520.0800] jonz: bugfix for attachments

fixed a bug that caused message headers in attachment sections to be ignored

Version 3.0.0.beta.2.1
----------------------

[20040518.0630] jonz: bugfix: seg faults on rare occasions

fixed a strlen(NULL) bug that caused an occasional segfault

[20040514.1130] jonz: applied dspam_genaliases patch

applied dspam_genaliases patch supplied by Scott Moorhouse 
<smoorhouse@ae-solutions.com> which adds the following functionality:

--exclude NAME     Do not generate an alias for username / usernames.
--excludeuid NUM   Do not generate an alias for UID / UIDS.
--minuid NUM       Minimum UID for which to generate an alias.
--maxuid NUM       Maximum UID for which to generate an alias.

It also uses setpwent/getpwent to get passwd information instead
of /etc/passwd. This allows the tool to be used with any default system
authentication.
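
The filtering described above can be sketched in Python as follows. This is an illustration rather than the dspam_genaliases code: the function names and sample entries are hypothetical, and pwd.getpwall() stands in for the C setpwent/getpwent loop.

```python
import pwd
from collections import namedtuple


def select_alias_users(entries, exclude=(), exclude_uids=(),
                       min_uid=None, max_uid=None):
    """Return usernames that pass the exclude and UID range filters."""
    selected = []
    for entry in entries:
        if entry.pw_name in exclude or entry.pw_uid in exclude_uids:
            continue
        if min_uid is not None and entry.pw_uid < min_uid:
            continue
        if max_uid is not None and entry.pw_uid > max_uid:
            continue
        selected.append(entry.pw_name)
    return selected


def system_alias_users(**filters):
    """Apply the filters to the system passwd database, whatever
    authentication backend the system uses to provide it."""
    return select_alias_users(pwd.getpwall(), **filters)


# Hypothetical entries so the example runs the same on any machine.
Entry = namedtuple("Entry", "pw_name pw_uid")
sample = [Entry("root", 0), Entry("alice", 1000),
          Entry("bob", 1001), Entry("nobody", 65534)]
```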

[20040514.0830] jonz: modified mode=notrain to ignore signature

when setting mode=notrain, the signature is NOT stored, and not appended to
an email.

Version 3.0.0.beta.2
--------------------

[20040513.1845] jonz: updated configure.ac

updated configure.ac to work with newer versions of autoconf (with warnings)

[20040513.0157] jonz: segfault patch for sql drivers

applied patch to prevent segfaults in mysql and pgsql drivers under certain
conditions

[20040512.0830] jonz: user directories moved to $DSPAM_HOME/data

user directories have been moved to $DSPAM_HOME/data. it will be necessary to
move all user directories into this folder when upgrading

[20040512.0830] jonz: default $DSPAM_HOME changed

default dspam home has been changed from /etc/mail/dspam to /var/dspam. use
--with-dspam-home to change this.

[20040512.0830] jonz: patch for sql drivers

applied patch for mysql and pgsql drivers to prevent errors in sql due to 
lack of commas

Version 3.0.0.beta.1.2
----------------------

[20040504.1835] jonz: bugfix for signed message signature

corrected a bug where the boundary for a signed message would be missing
a carriage return.

[20040504.0548] jonz: bugfix for token storage bug

fixed a token storage bug, where some tokens would not be stored if they
were preceded by a token that was found in the database

[20040503.0830] jonz: bugfix for corpus spam delivery

fixed a bug where corpusfed messages would be delivered if a quarantine agent
was specified at configure time.

[20040501.1052] jonz: added spam-subject feature

added a spam-subject feature which can be activated with --enable-spam-subject.
when enabled, DSPAM will prepend [SPAM] to the subject headers of all messages
suspected to be spam.

Version 3.0.0.beta.1.1
----------------------

[20040501.0630] jonz: fixed critical problems with pgsql_drv driver

fixed a critical problem with the postgres storage driver to correct sql errors
in processing

Version 3.0.0.beta.1
--------------------

[20040430.0800] jonz: fix for sql driver subtractions

implemented GREATEST(0, [Argument] ) functions for subtractions, which fixes a
problem in which error corrections are not made to tokens where there are
zero hits for the classification being subtracted from.  should also
definitively prevent negative values in hit totals.
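
The effect of the GREATEST(0, ...) wrapper can be expressed in Python terms as a clamped subtraction. The function name is hypothetical and this is only a sketch of the arithmetic, not of the SQL the drivers issue.

```python
def subtract_hits(current, amount):
    """Subtract amount from a hit counter without going below zero."""
    return max(0, current - amount)
```

A token with zero hits for the class being subtracted from therefore stays at zero instead of failing or turning negative.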

[20040430.0800] jonz: bugfix: corpus feeding invoked test-conditional training

fixed a bug where corpus feeding would invoke test-conditional training.
 
[20040430.0800] jonz: test-conditional training to subtract only once

test-conditional training modified to subtract from misclassified corpus only
once, and corpus feed for all other iterations

[20040430.0800] jonz: fixed bug in sql-drivers/test-conditional training

fixed a bug in the sql drivers where test condition training would make
exponential changes instead of incremental.  this was due to not resetting
the control token on every call to _ds_getall_spamrecords.
 
[20040430.0745] jonz: fixed bug in web stats

fixed bug where merged group web stats wouldn't get written

[20040430.0730] jonz: fixed bug in TOE totals

fixed a bug where spam/innocent classified wasn't updated when TOE was used

[20040427.0433] jonz: fixed bug in mysql and pgsql drivers

fixed a bug in mysql and pgsql drivers where dspam_merge was functioning
incorrectly, due to the token count on record insertion being set to 1 or 0,
and not the actual token value.

[20040427.0155] jonz: merged groups shouldn't merge with themselves

corrected a situation where the actual user in a merged group could be merged
with themselves, if they were the target user.

[20040427.0119] jonz: applied bdb patch for solaris

applied a patch for building on Solaris 9 with BDB drivers

[20040425.0757] jonz: updated pgsql drivers

applied pgsql_drv storage driver updates submitted by Rustam Aliyev

Version 3.0.0.alpha.6
---------------------

[20040424.2235] jonz: fixed header tokenization

fixed header tokenization from previous alpha; was suddenly leaving out
heading from token names.

[20040424.1427] jonz: added merged groups

merged groups are similar to global groups, only instead of the global user
being used in lieu of per-user statistics, the global user in a merged group
is merged with the user's own training data.  this allows immediate correction
to take place and no training loop.

NOTE: merged groups are storage driver dependent.  presently they have only
been implemented for the mysql driver.

[20040422.1900] jonz: messages with empty bodies should still be processed

fixed bug where messages with empty bodies failed into delivery 

[20040422.1829] jonz: added encoding strip patch

added patch to fix the stripping of the content-transfer-encoding

[20040421.1809] jonz: added training mode 'notrain'

added training mode 'notrain' which will process the message, but not train any
user data; this is ideal for implementations where a global dictionary is
used, but the administrator doesn't want to accumulate training data for each
user.

[20040421.0310] jonz: fixed TOE-mode totals updating

fixed bug where TOE-mode would update totals when it shouldn't

Version 3.0.0.alpha.5
---------------------

[20040421.0100] jonz: fixed totaling problems with classification groups

fixed totaling problems with global users and classification groups, where
spams wouldn't get counted, and some innocents

[20040421.0100] jonz: fix for dspam_stats

fix for dspam_stats, identifying individual users

[20040420.0734] jonz: fix for builds on Solaris w/BDB

fixed compiler error when building on Solaris w/BDB drivers

[20040419.0758] jonz: fix for X-DSPAM-Result header problem with TOE

TOE resulted in the X-DSPAM-Result being sent to stdout, which broke all
implementations of TOE where --stdout was used.  bug fixed.

[20040419.0700] jonz: added support for multipart/encrypted messages

added the same support for multipart/encrypted messages as is provided
for multipart/signed

[20040418.1840] jonz: changes to pgsql objects

changes to pgsql objects to fix performance issues

[20040417.1105] jonz: more global user tweaks

if the global user thinks the message was innocent, but the user thinks it was
spam, retrain the message as a false positive into the user's dictionary
automatically, but don't update FP totals (internal function)

[20040417.1050] jonz: implemented totals checking

implemented totals checking to ensure no totals travel below 0

[20040417.1045] jonz: don't retrain some classification catches

patch added not to retrain some spams in a global user catch if the user's
own dictionary already learned it as spam

[20040417.1037] jonz: patch for non-user creation

patch made to sql-based drivers to avoid creating virtual users in cases where
a message isn't being directly processed (e.g. tools, error correction, etc.)

[20040417.2006] jonz: added human-readable patch to dspam_stats

added patch for human-readable format to dspam_stats, submitted by Alan
Shields

Version 3.0.0.alpha.4
---------------------

[20040416.0000] jonz: fix for global users to prevent FPs

applied bugfix for global users code where false positives were getting
generated because the user's dictionary wasn't completely ignored.  

[20040416.0000] jonz: applied dspam_corpus division by zero patch

applied div by zero patch for dspam_corpus submitted by Nick Burnett

[20040415.0010] jonz: added end-of-token truncated symbols

added support for end-of-token symbols, such as exclamation point.  slight
boost in accuracy in testing.

[20040414.0052] jonz: added abbreviated feature references

the first two letters of a feature can be used alternatively instead of the
whole feature name; for example --feature=ch,no,wh
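
A sketch of how two-letter abbreviations might be expanded. The feature list below is an assumption inferred from the example above (chained, noise, whitelist) and may not match the real set; the function name is hypothetical.

```python
# Assumed feature names; the actual list may differ.
KNOWN_FEATURES = ("chained", "noise", "whitelist")


def expand_features(spec):
    """Expand a comma-separated feature list, accepting either the full
    name or its first two letters."""
    expanded = []
    for item in spec.split(","):
        item = item.strip()
        matches = [name for name in KNOWN_FEATURES
                   if item == name or item == name[:2]]
        if not matches:
            raise ValueError("unknown feature: %r" % item)
        expanded.append(matches[0])
    return expanded
```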

[20040411.0100] jonz: added X-DSPAM-Confidence header

added X-DSPAM-Confidence header to all processed messages to identify the
confidence level of the decision made.

[20040410.0930] jonz: tum maturity level increased to 50 hits

train-until-mature level increased from 25 hits to 50; doesn't appear to work
well in classification groups.

[20040409.0201] jonz: added support for domain scale

added support for domain scale applying patches submitted by 
Patrick Tudor <ptudor@ptudor.net>

[20040409.0153] jonz: applied pgsql patches

applied more pgsql patches

[20040409.0129] jonz: fixed headers to preserve original encoding

headers are now delivered with original encodings

[20040407.2254] jonz: added mass false positive button to CGI

added a button to reverse multiple false positives by clicking on checkboxes.

[20040407.2248] jonz: fixed bug in classification groups

fixed a bug in classification groups, where a "classify catch" would cause
the DSPAM signature to be empty, and thus irreversible.

[20040407.0255] jonz: tweaks to postgres m4

tweaks to postgres m4 to test headers and library on configure

Version 3.0.0.alpha.3
---------------------

[20040406.0124] jonz: suppress extra newline in message body

corrected message reassembly behavior by suppressing newline characters at the
end of the message body.

[20040405.0524] jonz: added postgresql driver to project

added pgsql_drv (PostgreSQL) submitted by Rustam Aliyev <rustam@azernews.com>
to project, added to configure with its own set of configuration commands.
see tools.pgsql/README for more information.  Applied recent SQL fixes.
 
[20040405.0330] jonz: virtual users should not be created on reclassification

if a message is being submitted for reclassification, a virtual user should not
be created, but fail instead - e.g. spam could be getting sent to the alias,
and shouldn't create new uids.

[20040405.0233] jonz: fixed SQL-driver hits-below-zero bug

fixed a bug causing some tokens to drop below zero hits using the mysql
driver.

[20040405.0149] jonz: fixed BNR bug

fixed a bug caused by Bayesian Noise Reduction which caused some messages
never to get learned if the control token was filtered; or caused filtered
tokens never to be learned.

[20040403.1745] jonz: rewrite of libdspam API

rewrite of libdspam's API.  in short:

- Operating modes DSM_ADDSPAM and DSM_FALSEPOSITIVE dropped
- CTX->classification added: DSR_ISSPAM | DSR_ISINNOCENT | DSR_NONE
- CTX->source added: DSS_ERROR | DSS_INOCULATION | DSS_CORPUS | DSS_NONE

provides a much cleaner and less ambiguous interface

[20040403.1215] jonz: removed signature deletion

removed signature deletion from agent, so messages can be re-re-classified.
also prevents mysql errors.

[20040403.1125] jonz: added dotfile debugging support

--enable-debug and --enable-verbose-debug flags now require a .debug file
to be dropped in order to log debug messages, providing you with the ability
to dynamically activate/deactivate debug messages for some or all users.  A 
.debug file can either be dropped in DSPAM_HOME to activate debugging for all 
users, or a username.debug file can be dropped in DSPAM_HOME/userpath/ to 
activate debugging for a subset of users.  
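
The lookup order described above could be sketched like this in Python. The function name is hypothetical, and because the entry does not spell out "userpath", the per-user directory is passed in as a parameter; the demo directories are temporary and exist only for the example.

```python
import os
import tempfile


def debug_enabled(dspam_home, user_dir, username):
    """Return True if debugging is switched on for username."""
    # A .debug file in DSPAM_HOME enables debugging for all users.
    if os.path.exists(os.path.join(dspam_home, ".debug")):
        return True
    # Otherwise look for username.debug in the user's directory.
    return os.path.exists(os.path.join(user_dir, username + ".debug"))


# Demo layout in a temp directory: debugging enabled for alice only.
demo_home = tempfile.mkdtemp()
demo_user_dir = os.path.join(demo_home, "data")
os.makedirs(demo_user_dir)
open(os.path.join(demo_user_dir, "alice.debug"), "w").close()
```

Removing the dotfile turns debugging off again without rebuilding or restarting anything.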

[20040402.1839] jonz: added support for domain-name groups

added support for groups based on domain name

Version 3.0.0.alpha.2
---------------------

[20040402.0730] jonz: improved agent classification output

agent classification output improved to include username, result, probability,
and confidence level in MIME format for easy parsing

[20040402.0730] jonz: added broken MTA support

--enable-broken-mta
You should enable this if your MTA is broken and passes messages into DSPAM
with CTRL-M's (^M) in them.

[20040402.0730] jonz: added training loop buffering feature

Training loop buffering is the amount of statistical sedation performed to
water down statistics and avoid false positives during the user's training loop.
The training buffer sets the buffer sensitivity, and should be a number 
between 0 (no buffering whatsoever) to 10 (heavy buffering).  The default is 5,
half of what previous versions of DSPAM used.  To avoid dulling down 
statistics at all during the training loop, set this to 0.

The training buffer can be set using tb=N as a feature, where N is the level of
buffering (0-10).  For example:

--feature=chained,noise,tb=10

Causes the buffer level to be set to 10, the highest level of safety, whereas

--feature=chained,noise,tb=0

Removes all buffering constraints
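
A minimal sketch of reading the buffer level out of a --feature value, using the tb=N form shown in the examples. The function and constant names are hypothetical; the default of 5 and the 0-10 range are taken from the text above.

```python
DEFAULT_TRAINING_BUFFER = 5


def training_buffer(feature_spec):
    """Return the training buffer level from a --feature value."""
    level = DEFAULT_TRAINING_BUFFER
    for item in feature_spec.split(","):
        name, sep, value = item.strip().partition("=")
        if name == "tb" and sep:
            level = int(value)
            if not 0 <= level <= 10:
                raise ValueError("tb must be between 0 and 10")
    return level
```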

[20040402.0723] jonz: fixed bug in dspam_dump

fixed a bug in dspam_dump causing unknown tokens to be displayed with 
uninitialized values

[20040402.0720] jonz: fixed bug in agent for signature dropping

when a signature can't be found, the message is dropped; unfortunately the
agent forgot to shut down the dspam context which caused BDB to lock up.
 
[20040402.0700] jonz: added switch for webmail

   --enable-webmail
   The webmail switch is designed for systems where the original message
   remains server side and can therefore be presented in pristine format for
   retraining.  This option will cause DSPAM to cease all writing of
   signatures and DSPAM headers to the message, and deliver the message in as
   pristine format as possible.  This mode REQUIRES that the original message
   in its pristine format (as of delivery) be presented for retraining, as in
   the case of webmail or other applications where the message is actually
   kept server-side during reading, and is preserved.  DO NOT use this switch
   unless the original message can be presented for retraining with the
   ORIGINAL HEADERS and NO MODIFICATIONS.
 
[20040401.2243] jonz: fix for signature headers

applied patch to fix multipart boundary bug when signature-headers is enabled

Version 3.0.0.alpha.1
---------------------

[20040401.1230] jonz: patches to corpus locking

made patches for corpus locking, to help prevent corruption with BDB drivers.  
DSPAM agent now drops a .corpuslock file upon processing a corpus which in 
turn tells the drivers not to run automatic recovery.  this should prevent 
corruption when an email comes in while you are corpus training with the BDB 
drivers.  this was not an issue with the SQL-based drivers.

[20040401.1230] jonz: deleted libdb4_purge, libdb3_purge

libdb4_purge and libdb3_purge have been obsoleted by the new rewritten 
dspam_clean tool

[20040401.0720] jonz: extended group line length to 10k

extended length of a single group line to 10k, from 1k

[20040401.0720] jonz: new dspam_clean functionality

dspam_clean has been rewritten to support the following different clean
operations:

1. Using the -s flag, dspam_clean will continue to perform stale signature
   purging.  If an age is specified, for example -s14, the age defined as the
   default will be overridden.  Specifying an age of 0 will delete all
   signatures for the users processed.

2. Using the -p flag, dspam_clean will delete all tokens from a user's database
   whose probability is between 0.35 and 0.65 (fairly neutral, useless tokens)
   that fall beyond the default age.  If an age is specified, for example
   -p30, the age defined as the default will be overridden.  It is a good
   idea to use this type of clean with an age of 0 on users after a lot of
   corpus training.  

3. Using the -u flag, dspam_clean will delete all unused tokens from a user's
   database.  There are four different types of unused tokens:

     - Tokens which have not been used for a long time
     - Tokens which have a total hit count below 5
     - Tokens which have only one spam hit
     - Tokens which have only one innocent hit

   Ages may be overridden by specifying a format such as -u30,15,10,10
   where each number represents the respective age.  Specifying an age of
   zero will delete all unused tokens in the category. 

Optionally, usernames may be specified to override the default behavior of
processing all users.

Examples:

Process all users on the system using all clean operations:
  dspam_clean -s -p15 -u90,30,15,15 

Delete all signatures for users 'dick' and 'jane':
  dspam_clean -s0 dick jane

Perform a post-corpus training clean on user 'spot':
  dspam_clean -p0 -u0,0,0,0 spot

Perform nightly maintenance using all default values, for all users, with all
options enabled:
  dspam_clean -p -u -s

NOTE: You may wish to only run certain cleaning modes depending on the type of
storage driver you are using.  For example, the MySQL storage driver
includes a purge.sql script which performs signature and unused operations,
leaving only the probability operation as a useful operation.  If you are 
using a SQL-based storage driver, it is strongly recommended that you use
the maintenance scripts wherever possible.
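
To make the -p and -u semantics concrete, a small Python sketch. The function names and the labels given to the four -u ages are hypothetical, and treating the 0.35 and 0.65 bounds as inclusive is an assumption.

```python
def is_neutral(probability, low=0.35, high=0.65):
    """True for tokens the -p operation would treat as neutral."""
    return low <= probability <= high


def parse_unused_ages(arg):
    """Parse the value given to -u, such as '30,15,10,10'.

    The four ages apply, in order, to the four unused-token types
    listed above.
    """
    labels = ("stale", "low_hits", "single_spam_hit", "single_innocent_hit")
    ages = [int(part) for part in arg.split(",")]
    if len(ages) != len(labels):
        raise ValueError("expected four comma-separated ages")
    return dict(zip(labels, ages))
```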

[20040401.0720] jonz: added _ds_delall_spamrecords and _ds_del_spamrecord

added spamrecord deletion functionality to storage driver, increased version
to 5:0:0

[20040331.2000] jonz: applied some memory leak patches

applied some memory leak patches submitted by 
William Ahern <wahern@barracudanetworks.com>

[20040328.2200] jonz: renamed USERDIR to DSPAM_HOME

all references to USERDIR are now known as DSPAM_HOME, including the 
--with-dspam-home configure flag, and mode settings.

[20040328.2200] jonz: moved several features to commandline

many features have been REMOVED from the configure script and moved to the
commandline including chained tokens, bayesian noise reduction, automatic
whitelisting, and training modes.  please see the documentation for a complete
list of commandline arguments.

configure functions which have changed:

--with-userdir-*			changed all to dspam-home
--with-local-delivery-agent		changed to --with-delivery-agent
--enable/disable-chained-tokens		removed from configure
--enable/disable-bnr			removed from configure
--enable/disable-whitelist		removed from configure
--enable/disable-toe			removed from configure
--enable/disable-tum			removed from configure
--enable/disable-spam-delivery		removed from configure
--enable-deliver-to-stdout		removed from configure

[20040328.1745] jonz: completely reworked commandline arguments 

please see documentation for new commandline arguments. 

[20040328.1745] jonz: removed free-pass of arguments by untrusted users

removed ability to pass in arguments by untrusted users, when the file
untrusted.mailer_args didn't exist

[20040327.2230] jonz: CGI to allow logo-click to return

changed CGI to allow a click on the DSPAM logo to return the user to the
main page

[20040327.2222] jonz: thresholds to include all totals

thresholds changed to include all 3 totals: learned, classified, corpusfed

[20040327.2221] jonz: test-conditional training threshold dropped

test-conditional training threshold dropped to 1000 messages

[20040326.0730] jonz: extended DAF flagset

extended DAF flagset to four bytes

[20040326.0730] jonz: temporarily removed blackbox framework

archived and removed blackbox framework from cvs; not likely i'll be working
on it any time soon

[20040325.2129] jonz: extended context flags to u_int32_t

extended context flags to 4 bytes, to add additional commandline features

[20040325.2129] jonz: compatibility fixes for TOE

compatibility fixes for TOE for web client and stats

[20040325.1939] jonz: code cleanup

commented headers, cleaned up code

[20040325.1930] jonz: converted total_spam, total_innocent

converted total_spam, total_innocent to spam_learned, innocent_learned, and
added spam_classified, innocent_classified for stats use with TOE.  

NOTE: changes are required to SQL-based drivers for this version

MySQL Example:

alter table dspam_stats add spam_learned int;
alter table dspam_stats add innocent_learned int;
alter table dspam_stats add spam_classified int;
alter table dspam_stats add innocent_classified int;
update dspam_stats set spam_learned = total_spam;
update dspam_stats set innocent_learned = total_innocent;
update dspam_stats set spam_classified = 0;
update dspam_stats set innocent_classified = 0;
alter table dspam_stats drop column total_spam;
alter table dspam_stats drop column total_innocent;
alter table dspam_stats add spam_misclassified int;
alter table dspam_stats add innocent_misclassified int;
update dspam_stats set spam_misclassified = spam_misses;
update dspam_stats set innocent_misclassified = false_positives;
alter table dspam_stats drop column spam_misses;
alter table dspam_stats drop column false_positives;

[20040325.1930] jonz: addspam to fail on failed signature retrieval

due to a lot of misconfigurations of dspam, addspam will now fail if a 
signature cannot be retrieved.  this should help pinpoint problem installs
and clients, and prevent poor accuracy. 

Version 2.11.1
--------------

[20040325.0757] jonz: added --help

added --help commandline argument

[20040325.0757] jonz: fixed division by zero bug in dspam.cgi

small chance of division by zero bug fixed

[20040325.0740] jonz: fixed toe

fixed toe, which had been accidentally disabled in testing

[20040325.0740] jonz: provided runtime arguments for training mode

added run-time arguments --toe --tum --teft to specify training mode.  the
default is based on configure-time options.

also added training_mode variable to dspam context, should not affect
compatibility.

Version 2.10.2
--------------

[20040319.2138] jonz: added shell quoting of special characters

special characters are now quoted, instead of filtered, when calling the LDA.

Version 2.11.0 / Version 2.10.2
-------------------------------

[20040319.1845] jonz: fixed bash special characters problem

fixed special characters problem in bash by encapsulating all arguments in
quotes
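
The same idea, sketched in Python with the standard library's shlex.quote. This is not DSPAM's C implementation; the helper name and the delivery agent path are hypothetical.

```python
import shlex


def build_lda_command(agent, args):
    """Build a shell command line with every argument safely quoted."""
    return " ".join([agent] + [shlex.quote(arg) for arg in args])


# Hypothetical delivery agent and a hostile-looking recipient argument.
command = build_lda_command("/usr/bin/procmail", ["-d", "bob; rm -rf /"])
```

Quoting keeps the argument intact as a single word, whereas filtering would silently change what is passed to the delivery agent.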

[20040319.0730] jonz: added train-on-mature training option

--enable-tum
train-on-mature (TuM) is a hybrid of train-everything and train-on-error.  
all tokens are candidates for training as in train-everything, but only tokens
whose total number of "hits" don't exceed 100 are trained.  on error, all
tokens are trained.  this provides a good balance between the volatility of
train-everything and the lack of behavioral learning in train-on-error.  it 
also has the added benefit of not breaking the things that toe presently
breaks in dspam (whitelists, stats, etc).
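
The per-token rule described above can be sketched as follows. The function and parameter names are hypothetical; the limit of 100 hits comes from this entry.

```python
def should_train_token(total_hits, is_error, maturity=100):
    """Decide whether a token is trained under the TuM rule."""
    if is_error:
        # On a misclassification every token is trained.
        return True
    # Otherwise only tokens that have not exceeded the limit are trained.
    return total_hits <= maturity
```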

[20040319.0700] jonz: fixed source address bug

fixed a bug in source address tracking where messages were reported as innocent
even if they were guilty, if the user had < 2500 messages in corpus

[20040318.1932] jonz: fixed compile-time warning in dspam_tools.c

fixed warning for uninitialized crc variable

[20040318.0259] jonz: post-training features dropped to 2500

post-training features such as TOE and BNR have had their prerequisite ham count
dropped from 4000 to 2500.

[20040318.0241] jonz: fixed up headers so developers only need libdspam.h

fixed up header dependencies so developers only need include libdspam.h to
use libdspam.

[20040318.0124] jonz: added support for header-based signatures

for implementations where a signature in the body is unacceptable, using
--enable-signature-headers will place the signature in the header, and not
in the body.

IMPORTANT: This will -require- that the headers be forwarded with the message
when being reported as spam.  This usually requires bouncing the message,
forwarding it as an attachment, or using a macro.  The header will otherwise
be lost with standard forwarding.

[20040316.2315] jonz: added support for userlist termination

userlist can now be terminated using --

Version 2.10.1
--------------

[20040314.0128] jonz: bugfix for segfaults in dspam.c

segfaults can occur on some systems (predominantly Solaris) when mail is sent
to multiple local recipients.  bugfix required the header insert pointer to
be reset.

Version 2.10.0
--------------

[20040307.1828] jonz: new dspam_corpus tool by Gary Funck

replaced old dspam_corpus tool with a better one contributed by Gary Funck 
<gary@intrepid.com>

[20040305.0320] jonz: added postfix documentation

added documentation for postfix local delivery

[20040305.0320] jonz: added support for domain filesystem structure

use of --enable-domain-scale configures filesystem for domain-based
support.  when used, username@domain should be passed in as the userid and
$USERDIR/domain/username/ will be used instead of $USERDIR/username or
$USERDIR/u/us/username as done with large scale
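
The three directory layouts mentioned above can be compared with a short Python sketch. The function name and the layout labels are hypothetical; only the resulting path shapes come from the entry.

```python
import posixpath


def user_directory(userdir, userid, layout="plain"):
    """Return the per-user directory for the given layout."""
    if layout == "domain":
        # userid is expected in the form username@domain
        username, _, domain = userid.partition("@")
        return posixpath.join(userdir, domain, username)
    if layout == "large":
        return posixpath.join(userdir, userid[:1], userid[:2], userid)
    return posixpath.join(userdir, userid)
```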

[20040303.2208] jonz: applied bugfix patch by dennis pedersen

applied a bugfix to libdb3 and libdb4 fixing a bug that was introduced in rc2
causing loop hangs.  submitted by dennis pedersen <dennis@moellegaard.dk>

[20040303.0243] jonz: added long username support

by default, the username length uses the same limits as the operating system.
if --enable-long-usernames is specified, however, the limit will be set to
256.

Version 2.10-rc2
----------------

[20040302.0007] jonz: implemented auto-whitelisting

implemented auto-whitelisting using --enable-whitelist function.  automatic
whitelisting will automatically whitelist any full 'From' addresses (including
the name) that have appeared in at least 10 innocent messages and zero spams.
when a message is forwarded as a spam, any automatic whitelisting for that
address is permanently deactivated.
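
The whitelisting condition can be written as a simple predicate. The function name is hypothetical; the threshold of 10 innocent messages and zero spams comes from this entry.

```python
def is_auto_whitelisted(innocent_count, spam_count, threshold=10):
    """True if a From address qualifies for automatic whitelisting."""
    return innocent_count >= threshold and spam_count == 0
```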

[20040301.2339] jonz: fixed purge.sql

fixed some bugs in MySQL's purge.sql, optimized for speed thanks to another
patch submitted by bob glamm.

[20040229.1245] jonz: applied patch submitted by Sascha Blank

applied patch submitted by Sascha Blank for dspam_dump to allow lookup of
individual tokens.

[20040228.1618] jonz: train-on-error to perform source address tracking

train-on-error mode fixed to perform source address tracking

[20040224.2008] jonz: fixed high cpu utilization on large messages

fixed an iteration problem which caused high cpu utilization on large (2MB+)
text messages

[20040223.0350] jonz: fixed compile error in libdspam.c

fixed compile error in libdspam.c when HAVE_ISO_VARARGS isn't defined

Version 2.10-rc1
----------------

[20040222.1606] jonz: added support for global groups

global groups allow DSPAM to provide a "SpamAssassin type out-of-the-box
filtering" for all new users until they have built their own useful
dictionaries.  to create a global classification group, add something like
this to $USERDIR/group:

groupname:classification:*globaluser

This will automatically add globaluser as a classification peer to all users.
Any user who has less than 1000 innocent messages or 250 spam messages in 
their corpus, or whose filter is uncertain about a particular message will 
consult the global dictionary for an answer.

global groups will need to be trained using corpus or other means, or by
using the dspam_merge tool.  the global user (in this case 'globaluser') is
treated just as any other user on the system.
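
A sketch of parsing such a group line and applying the consultation rule. The function names are hypothetical, and the comma as a separator between multiple members is an assumption, since the entry only shows a single member.

```python
def parse_group_line(line):
    """Split 'groupname:type:members' into its parts."""
    name, group_type, members = line.strip().split(":", 2)
    member_list = [m for m in members.split(",") if m]
    return {
        "name": name,
        "type": group_type,
        # A leading * marks a global user, added as a peer to all users.
        "global_users": [m[1:] for m in member_list if m.startswith("*")],
        "members": [m for m in member_list if not m.startswith("*")],
    }


def consults_global(innocent_count, spam_count, uncertain=False):
    """True if the user's filter should ask the global dictionary."""
    return innocent_count < 1000 or spam_count < 250 or uncertain
```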

[20040221.2155] jonz: format changes to dspam_dump

dspam_dump formatting changes + display of token probability

[20040220.1700] jonz: added quick fix for \r stripping in dspam_corpus

added a quick fix to strip \r's in mailboxes when using dspam_corpus

[20040220.1700] jonz: fixed segfault bug

fixed a bug that caused DSPAM to segfault on empty MIME delimiters.  This
generally only occurred with spams, as legitimate messages have RFC-compliant
delimiters.

[20040219.0150] jonz: added support for neural networking

see README for more details

[20040218.2300] jonz: added tweaking to BNR for small text samples

added tweaking of thresholds to BNR for small text samples < 3.5k

[20040217.0724] jonz: fixed some miscellaneous compile warnings

fixed some miscellaneous compile warnings.  2 for when trusted user security
is disabled, 1 for dspam_2mysql.c:126

Version 2.10-beta-2
-------------------

[20040214.1632] jonz: added TOE support

added TOE (Train on Error) support using the --enable-toe configure function.
see the README file for more details.

[20040213.1549] jonz: fixed X-DSPAM header duplication bug

fixed a bug which caused X-DSPAM headers to be cumulatively appended when
a single message addresses multiple local users.

[20040214.1327] jonz: added --enable-client-compression configure flag

added option --enable-client-compression to use compression option between
data source and its clients (where available).  presently only available with
the mysql_drv storage driver.  you should enable this if the data source
is on a separate machine from the DSPAM agent(s), as it conserves bandwidth
at the expense of a few CPU cycles.

[20040214.1258] jonz: created speed and space optimized MySQL scripts

created both speed and space optimized mysql_objects.sql scripts.

[20040214.1235] jonz: added new stats to CGI

added FP stats + overall accuracy to CGI

[20040214.1235] jonz: added debug output for noise filtering

added noise level, spammy tokens, and eliminations to debug output

Version 2.10-beta-1
-------------------

[20040212.2208] jonz: added stale data purge / PURGE_ANY

added stale data purge to libdb3 and libdb4 purge tools.  based on PURGE_ANY,
defined in config.h, any stale data is removed after six months.

[20040212.2205] jonz: added DSF_NOISE flag

added DSF_NOISE flag to libdspam interface for activating Bayesian Noise 
Reduction. 

[20040211.0158] jonz: disabled mysql_drv _ds_delete_signature

disabled _ds_delete_signature in mysql_drv due to errors; added signature
purge to purge.sql script.  no longer necessary to run dspam_clean if using
the mysql storage driver.

[20040211.0155] jonz: mysql_drv get_one update

check to ensure there was at least one token to be loaded, otherwise do not
perform query

Version 2.9.6
-------------
[20040208.1906] jonz: bugfix for BNR

BUGFIX: when BNR is activated on users with < 4000 innocent
messages, the filter forgets to load token stats for the user and marks
all messages as innocent.

Version 2.9.5
-------------
[20040204.0413] jonz: implemented Bayesian Dolby

implemented Bayesian Noise Reduction
(see http://www.nuclearelephant.com/projects/dspam/bnr.html)

[20040202.2216] jonz: added multipart frequency thresholds

body tokens in multipart messages now require a minimum frequency of 2 to be
included in the calculation.
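
A minimal sketch of the frequency threshold, assuming tokens arrive as a flat
list of strings.  The function name and the `multipart` parameter are
hypothetical; only the minimum frequency of 2 comes from this entry.

```python
from collections import Counter


def filter_body_tokens(tokens, multipart, min_frequency=2):
    """Count body tokens and, for multipart messages only, drop rare ones."""
    counts = Counter(tokens)
    if not multipart:
        # Single-part messages keep every token (assumption: unchanged behavior).
        return dict(counts)
    # Multipart messages: keep tokens seen at least min_frequency times.
    return {tok: n for tok, n in counts.items() if n >= min_frequency}
```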

[20040128.2021] jonz: only report source-addresses in mature corpuses

only report source-addresses when the user has >4000 innocent messages in
their corpus.

Version 2.9.4
-------------

[20030128.0334] jonz: added DSPAM SBL dropfile support

added support to source address tracking to drop SBL files to /var/spool/sbl
if it exists, where a client in directory watch mode can read them.

Version 2.9.3 
-------------

[20040122.0700] jonz: hex decoding
                                                                                
a small piece of code to perform hex-decoding on 8bit encodings.  very useful, 
although hex encoding is still somewhat rare.
                                                                                
[20040121.0805] jonz: new stats watering-down code for high-spam users
                                                                                
implemented new code for watering down statistics during the learning phase to
compensate for users with a high percentage of spam.  this should only affect
accuracy of normal (average spam) users for the first 1000 messages.
significant watering down takes place up to 1000 spams.  limited watering
down takes place up to 2500 spams if the user has more spam in their corpus
than innocent mail.
                                                                                
[20040121.0805] jonz: priority given to complex tokens
                                                                                
slight code tweak to give priority to more complex tokens (e.g. chained
tokens) to help improve accuracy.
                                                                                
[20030121.0805] jonz: signature should not be stored when using --corpus
                                                                                
signatures are no longer stored when using the --corpus flag

Version 2.9.1
-------------

[20031220.1442] jonz: added notification emails

three different notification emails can be configured to get sent:

- to a user the first time they receive a message through dspam (first run)
- to a user the first time a spam is caught through dspam on their behalf
- to a user when their quarantine box is > 2MB in size

to use notification emails, copy the txt/ directory from the distribution
into USERDIR and configure the emails accordingly.  more information is
available in the README.

Version 2.8.1
-------------

[20031205.0821] jonz: html preformatting only for html parts

html preformatting to be done only to html parts; html comments in
plain text parts should not be filtered out.

[20031205.0156] jonz: high-byte tokens not ignored

fixed a small bug causing tokens consisting of all high-bytes to be
ignored.
 
[20031205.0122] jonz: tweaked cgi spam ratio

tweaked cgi spam ratio to include misclassifications

[20031130.1016] jonz: dspam_merge to corpusfy totals

dspam_merge now moves all totals to corpusfed, so that a merged user can
easily start with fresh stats.

[20031129.1619] jonz: fixed quarantine agent arg skip bug

fixed minor bug which caused some arguments to be skipped when using a custom
quarantine agent
 
[20031129.1443] jonz: implemented opt-in/opt-out storage directory

moved all user.dspam and user.nodspam files to USERDIR/opt-in and
USERDIR/opt-out, respectively.  this saves from needing to have and set up
a directory for each user.
 
Version 2.8
-----------

[20031126.1633] jonz: stepped down insert query error to debug info

stepped down the query error on insert down to debug info, as it is a common
occurrence on busy servers.

[20031124.0523] jonz: corrected buffer overrun in BDB drivers

corrected buffer overrun vulnerability in BDB drivers dealing with copying
tokens into memory.  discovered when working with corrupt dictionaries which
caused segfaults.  the dictionary would have to be manipulated in order to 
exploit, so risk was minimal.

[20031124.0459] jonz: fixed bug in dspam_2mysql

dspam_2mysql failed to place quotes around token value.

[20031123.1351] jonz: fixed libdb4,libdb3 shared group bug

fixed a bug that caused shared groups to fail with the following error:

DB_ENV->open failed: No such file or directory

[20031120.0405] jonz: fixed HTML boundary corruption with signature removal

fixed a bug that caused boundary corruption after an HTML part where a DSPAM
signature from a previous reply was removed by the agent.

[20031120.0405] jonz: do not remove old signatures from signed messages

corrected the dspam agent so that older signatures from signed messages were
not parsed out.  this caused the message to fail to authenticate.

Version 2.8-rc-1
----------------

[20031115.2042] jonz: fixed minor memory leak on initialization failure

minor memory leak caused in libdspam when dspam_init fails.  does not affect
DSPAM agent, only library.

[20031115.2042] jonz: DSM_CLASSIFY generated truncated signatures

fixed a bug where DSM_CLASSIFY generated truncated signatures 

[20031115.1540] jonz: corrected multipart analysis bug

corrected a bug that caused parts of a multipart message that were not
specifically marked as text with the "Content-Type" header to be ignored from
analysis.

[20031114.1949] jonz: corrected DSM_CLASSIFY in-memory totals bug

corrected a bug that changed in-memory totals when DSM_CLASSIFY was used

[20031113.1938] jonz: corrected DSM_CLASSIFY bug in libdspam

corrected two bugs in libdspam regarding the DSM_CLASSIFY mode:

1. CTX->signature would overwrite the provided signature with a new signature
   resulting in a potential memory leak

2. If no signature was provided, DSM_CLASSIFY would segfault instead of create
   a new signature

Version 2.8-beta-2
------------------

[20031103.1119] awn: libdspam version changed to the '4:0:0'

libdspam version changed to '4:0:0' because introducing and requiring
dspam_init_driver() at start and dspam_shutdown_driver() at end is a
backward-incompatible change.

[20031031.0402] jonz: fixed web stats for shared groups

shared group webstats fixed

[20031031.0340] jonz: added commandline options

added --stdout commandline option to deliver messages to stdout
added --deliver-spam commandline option to deliver spams to user's mailbox
changed --deliver flag to --deliver-fp, although --deliver still supported
  for backward compatibility.  option still only necessary when configuring
  with --enable-spam-delivery

[20031031.0324] jonz: changed default configure options

enabled the following as defaults in configure:

alternative-bayesian	(alternative Bayesian algorithm)
test-conditional	(test-conditional, iterative based training)

[20031030.1120] jonz: fixed caching bug

fixed caching bug in mysql_drv driver and ora_drv drivers causing dspam_stats
to return stats for first user, as stats for all users

[20031029.0538] jonz: added --classify commandline flag

the --classify commandline flag will classify the input message and output
to stdout "SPAM" or "HAM" depending on the result.  No changes will be made
to the user's tokens or totals.

[20031029.0538] jonz: changed totals mechanism

the following changes have been made to the totals mechanism:

- spam_misses has been changed to spam_misclassified
- false_positives has been changed to innocent_misclassified
- spam_corpusfed and innocent_corpusfed have been added

IMPORTANT UPGRADE NOTE: Please see the README for information on updating your
SQL databases to accept these changes if you are using a SQL-based driver.  If
you are using a BDB-based driver, these changes will automatically be 
implemented.

[20031028.2000] jonz: corrected CLASSIFY bug in mysql_drv and ora_drv

corrected a significant bug in mysql_drv and ora_drv which caused tokens and
totals to be incremented on all CLASSIFY calls.

[20031028.2000] jonz: changed DSF_CLASSIFY (flag) to DSM_CLASSIFY (mode)

the DSF_CLASSIFY flag is now a mode called DSM_CLASSIFY.

Version 2.8-beta-1
------------------

[20031028.0531] jonz: added customizable header for cgi

cgi spam account now has customizable header

[20031028.0448] jonz: classification catches to add as spam

spam catches by a member of a classification group should result in the
message being added as spam, as opposed to innocent.  this has been corrected.

[20031028.0204] jonz: X-DSPAM-User header only considered in managed groups

the X-DSPAM-User header field is only paid attention to when the user is
a member of a managed group (the only time where the original user is
necessary).

the parsing of the X-DSPAM-User header has also been corrected to chomp the
newline character, which was resulting in some systems including the character
in the username.

[20031028.0116] jonz: corrected a critical error in classification groups

corrected a critical error in classification groups causing DSPAM to crash
(and the message get delivered by the MTA's failsafe in most cases) when a
user in a classification group resulted in a spam being caught.

[20031027.0137] jonz: added mta whitelists for source address tracking

file USERDIR/mta.whitelist may now contain a list of internal MTA ip addresses,
which will cause DSPAM to skip to the next 'Received' header when processing
the source address.  each IP should be on a newline.
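
The whitelist behavior described above might look roughly like this.  It is a
sketch only: the function names are hypothetical, and the assumption that the
relay IP appears in square brackets in each 'Received' header is ours, not
something this entry states.

```python
import re

# Assumption: the relaying host's IPv4 address appears as [a.b.c.d].
_IP_RE = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def load_whitelist(text):
    """Parse mta.whitelist-style content: one IP address per line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def source_address(received_headers, whitelist):
    """Return the first non-whitelisted IP, walking headers top to bottom."""
    for header in received_headers:
        match = _IP_RE.search(header)
        if match is None:
            continue
        ip = match.group(1)
        if ip in whitelist:
            # Internal MTA: skip to the next 'Received' header.
            continue
        return ip
    return None
```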

[20031026.1706] jonz: added signal handling to tools

added signal handling to tools, to unlock databases upon SIGINT, SIGPIPE or 
SIGTERM to avoid stale locks.

[20031025.1111] jonz: added rolling filter accuracy stats to cgi

rolling filter accuracy stats allows the user to measure their filtering
accuracy over a period of time (usually monthly or quarterly).  stats should
be reset after a good learning period (approximately 4000 spams and nonspams)
to measure accuracy accurately =)

[20031024.0007] jonz: libdb drivers reworked

libdb drivers reworked for better:
- locking (exclusive)
- recovery (simple recovery run on open)
- environment management (individual user environments)

IMPORTANT UPGRADE NOTE:

run the script 'dspam_movefiles [userdir]' in the tools directory to upgrade to
this new directory storage format.  after running, make sure you chown the
correct file ownership to the newly created directories.  this should be done
with the MTA shut down and no dspam processes running.

you will also need to reinstall/reconfigure the CGI

[20031023.1949] jonz: update to cgi to avoid missed messages

cgi now tracks the size of the quarantine between viewing and deleting all
messages, to avoid deleting messages that came in while reviewing the
quarantine.

[20031023.1727] jonz: compensated for converged boundaries

compensated for a slight break of RFC where two boundaries in a nested 
message appear without a blank space in-between, leading to message corruption.
fortunately, this type of behavior is extremely scarce.

[20031023.0900] jonz: fixed classification group bug

fixed a bug that caused classification groups never to fire; datatype
CTX->confidence should be float, not int.

[20031022.2229] jonz: added "-d %u" to default cgi flags

added "-d %u" to default dspam cgi flags to assist new users

[20031022.0930] jonz: fixed bug preventing multiple group subscriptions

fixed a bug that caused a user to not be able to be subscribed to multiple
groups

Version 2.7.6.10
----------------

[20031022.0930] jonz: added support for managed shared groups

the group type 'shared' can be appended with ',managed' to convert the shared
group into a managed shared group.  a managed shared group is the same as a
shared group, only the managed version will share the quarantine box as well,
enabling one user (named after the group) to manage the handling of all
quarantine functions (false positive reporting, etc.).

this is generally not what users want, as personal information could potentially
be shared with the administrator of the group, however there are some
circumstances where this would be appropriate.

a regular shared group:

groupname:shared:user1,user2,userN

a managed shared group:

groupname:shared,managed:user1,user2,userN

[20031022.0930] jonz: corrected long-time stdin bug

corrected a long-time, just-discovered bug that caused stdin to be read in very
small chunks (32 bytes each).  correcting this bug has caused DSPAM to read
in messages much quicker.

[20031022.0930] jonz: cgi to use X-DSPAM-Signature

when message-id is not present, the cgi will now use the X-DSPAM-Signature
field to uniquely identify each message.

[20031022.0930] jonz: extended header assembly buffer to 4k

header assembly buffer extended to 4k; was truncating some longer fields at 1k.

[20031022.0930] jonz: minor crash bugfix

an obscure bug has been corrected which caused dspam to crash if the word
"boundary" was placed on a line in the message body, and that line began
with a space or tab.

[20031022.0900] jonz: false positives not delivered when spam-delivery enabled

false positives shouldn't be delivered when --enable-spam-delivery is enabled,
since they will be mailed in (or otherwise processed) directly from the user's
inbox.

to force false positives to be delivered, use the --deliver commandline
argument

Version 2.7.6.9
---------------

[20031021.1300] jonz: significant changes to mysql driver

the data type for the 'token' field in the dspam_token_data table has been
changed from BIGINT to VARCHAR.  This is due to a bug in MySQL being unable to
handle some of the large numeric values used for tokens.  

BEFORE UPGRADING, SHUT DOWN YOUR MTA AND ISSUE THE FOLLOWING MYSQL QUERY:

alter table dspam_token_data modify token varchar(32);

[20031021.1206] awn: Convenience symlinks for libdb{3,4}_deadlock

Convenience symlinks dspam_deadlock.libdb4 (in case of libdb4_drv),
dspam_deadlock.libdb3 (in case of libdb3_drv) and dspam_deadlock (in
case of both libdb*_drv) are added and pointed to the appropriate
libdb{3,4}_deadlock binary.

[20031021.1016] awn: configure: mysql and network-related libraries

-lnsl and -lsocket are added to the mysql client library check where
needed (e.g. on Solaris).

[20031021.0000] jonz: changed signature format to include frequency

WARNING: You should delete all your temporary signature information before
upgrading to this version, as the signature format has changed.  You can do
this by deleting all your .sig files or issuing a 
"delete from dspam_signature_data" query if using a SQL-based driver.

RATIONALE: When performing classification queries with signatures, the
frequency is necessary to ensure an identical calculation.

[20031021.0000] jonz: added support for 'CLASSIFICATION' group

A 'CLASSIFICATION' group type has been added.  Classify groups are groups of 
users who share the results of spams against their own personal dictionaries.  
This means that for every message that comes in for any user in the group, 
dspam classifies that message for every user and if any user believes the 
message to be spam, it is marked as spam for the destination user.

To avoid false positives, external classification is only used when there is
a confidence level of 0.30 or higher of spam.  The confidence level is
calculated with Chi-Square.

Members of this type of group should only join after their initial training
period.  Members may also be part of an inoculation group, but users can
not be a part of both a classify group and a shared group.
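
As a rough illustration of the rule above, the following sketch combines a
user's own result with those of the group's peers.  The function name and the
shape of `peer_results` are hypothetical; only the 0.30 threshold comes from
this entry, and the confidence values are assumed to be computed elsewhere.

```python
CONFIDENCE_THRESHOLD = 0.30  # from the changelog entry


def classify_with_group(own_is_spam, peer_results):
    """Decide spam/innocent using classification-group peers.

    peer_results is an iterable of (is_spam, confidence) pairs, one per peer.
    """
    if own_is_spam:
        return True
    # A peer's spam verdict only counts at or above the confidence threshold.
    return any(
        is_spam and confidence >= CONFIDENCE_THRESHOLD
        for is_spam, confidence in peer_results
    )
```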

[20031021.0000] jonz: changed default probability for single-corpus tokens

changed the probability for tokens that appear only in one corpus:

TYPE			FROM		TO
Appears +10 in Spam	.9901		.9999
Appears <10 in Spam	.9900		.9998
Appears +10 in Innocent	.0099		.0001
Appears <10 in Innocent	.0100		.0002
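
The new values in the table could be expressed as a small lookup.  This is a
sketch under one assumption: "+10" is read here as "ten or more appearances".
The function name is hypothetical.

```python
def single_corpus_probability(spam_hits, innocent_hits):
    """Return the default probability for a token seen in only one corpus.

    Returns None when the token appears in both corpora or in neither,
    since the table above does not cover those cases.
    """
    if (spam_hits > 0) == (innocent_hits > 0):
        return None
    if spam_hits > 0:
        # Assumption: "+10" means 10 or more appearances.
        return 0.9999 if spam_hits >= 10 else 0.9998
    return 0.0001 if innocent_hits >= 10 else 0.0002
```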

[20031019.2200] jonz: added test-conditional training support

added configure flag --enable-test-conditional which will enable
test-conditional training.  test-conditional training will automatically
re-train the user's dictionary on spam or false positive until the message
condition is met (e.g. until the user's dictionary no longer results in
misclassification of the message being retrained).  this training has a
maximum number of 5 iterations, and will only invoke when:

- The user has > 4000 innocent messages in their corpus, and is reporting
  a spam

- The user is reporting a false positive (regardless of the number of
messages in their corpus)
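
The retraining loop described above can be sketched as follows.  The callables
`train_once` and `classify` are hypothetical stand-ins for the real training
and classification steps; the 5-iteration cap and the 4000-message condition
come from this entry.

```python
MAX_ITERATIONS = 5            # from the changelog entry
MATURE_INNOCENT_COUNT = 4000  # from the changelog entry


def should_retrain(reporting_spam, innocent_count):
    """Decide whether test-conditional training applies to this report."""
    if reporting_spam:
        return innocent_count > MATURE_INNOCENT_COUNT
    # False positives are always retrained, regardless of corpus size.
    return True


def retrain_until_correct(train_once, classify, expected,
                          max_iterations=MAX_ITERATIONS):
    """Train repeatedly until classify() returns expected, or the cap is hit.

    Returns the number of training iterations performed.
    """
    iterations = 0
    while iterations < max_iterations:
        train_once()
        iterations += 1
        if classify() == expected:
            break
    return iterations
```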

[20031019.2016] jonz: added support for shared groups in mysql_drv driver

support has been added for shared groups using the mysql_drv driver, but with
one caveat: if you will NOT be enabling "virtual users" support, you will need
to create a user on your system for each group you add.  This is because the
mysql_drv driver maps user ids in the database to users on the system.  this
is not an issue when "virtual users" support is enabled.

Version 2.7.6.8
---------------

[20031019.1722] jonz: added mysql.sock functionality

added functionality for connecting via mysql.sock instead of TCP.  specify
pathname to socket in lieu of hostname to implement.

[20031019.1700] jonz: eliminated false-positive retrain headers

eliminated the additional X-DSPAM headers added when reclassifying a 
false positive.  the headers from the original classification are
preserved.

[20031019.1530] jonz: centralized syslog logging of mysql query errors

centralized/standardized syslog logging of all mysql query errors

[20031019.1530] jonz: corrected bug in virtual users w/mysql

corrected a bug causing some tools to fail when virtual users is enabled while
using the mysql_drv driver.

[20031018.1050] jonz: corrected type-o in dspam_corpus.in

fixed close(PIPIE) type-o in dspam_corpus.in

Version 2.7.6.7
---------------

[20031017.2230] jonz: enhanced overall inoculation processing

code cleanup of inoculation processing; one central subroutine.  fixed some
minor related bugs.

[20031017.2129] jonz: corrected external inoculation processing

external inoculations (--corpus --inoculate --addspam combination) resulted in
an error causing the user to never be inoculated, however all users in the
inoculation group were.  corrected this bug so that the destination user would
also be inoculated. 

Version 2.7.6.6
---------------

[20031017.1930] jonz: fixed bugs in CGI 'From' line reporting

fixed a bug that caused malformatting in the 'From' line when placing in spam
quarantine

[20031017.1930] jonz: fixed bugs in false positive processing

fixed a bug, which now strips out any quarantine message 'From' line added by
DSPAM prior to processing.

[20031017.1930] jonz: fixed variable definition problems with experimental code

fixed bugs in experimental code; should not affect normal users, but broke
the build anyway.

Version 2.7.6.5
---------------

[20031017.1730] jonz: added --enable-experimental

added --enable-experimental flag which activates experimental code, moved
the following code bases to experimental:

- Versatile Language Message Inoculation Format
  (standard for sending/receiving inoculations across multiple anti-spam
   platforms and systems)

- Counting of unknown tokens in messages

[20031017.1700] jonz: only inoculate users who require inoculation

inoculation now only inoculates users who would otherwise have misclassified
the message being presented
 
[20031017.1600] jonz: changed all /tmp files to USERDIR

all /tmp files now outputted to USERDIR to avoid a race condition.

[20031016.2207] awn: libdb detection is changed again (sigh)

Probing for -ldb-<major> and -ldb<major> is resurrected again (needed
for some version of Debian with libdb v3.2.9).  Difference from previous
one is using libtool for linking the test program at the "header vs.
library version" check stage.

[20031016.1837] jonz: changed high characters to 'z' instead of ignored

changed all high characters to z's; previously ignored them.  effective way to
improve filter rate on spams using wide characters.  credit for this technique
given to Brian Burton.
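
A minimal sketch of the technique, assuming "high characters" means bytes with
the eighth bit set (0x80 and above).  The function name is hypothetical.

```python
def fold_high_bytes(data: bytes) -> bytes:
    """Replace every byte >= 0x80 with 'z' instead of dropping it."""
    return bytes(ord("z") if b >= 0x80 else b for b in data)
```

With this folding, a run of wide characters becomes a run of z's that can
still be tokenized, rather than disappearing from the message.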

[20031016.1400] jonz: added warning about MySQL bug to README

added information about the bug in MySQL versions < 4.0.15.stable to the
MySQL README.

[20031016.1227] jonz: compensated for mysql_drv insert bug

compensated for mysql_drv insert bug; made better code in both mysql_drv and
ora_drv to handle insert failures with more grace

[20031016.1142] jonz: corrected token insert debug output

corrected debug output for token inserts to display correct query and disk
state.

Version 2.7.6.4
---------------

[20031016.0946] jonz: switched to MyISAM MySQL tables

InnoDB turned out to be much slower than MyISAM, so all MySQL objects have
been changed to be of type "MyISAM".

[20031015.1434] jonz: added exit code mirroring of LDA

added exit code mirroring of LDA; if any calls to LDA fail, dspam will return
the last failed exit code
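
The mirroring rule can be illustrated as below.  The function name is
hypothetical, and treating any nonzero exit code as a failure is an
assumption.

```python
def mirrored_exit_code(lda_exit_codes):
    """Return the last nonzero exit code from a series of LDA calls, else 0."""
    result = 0
    for code in lda_exit_codes:
        if code != 0:
            # Remember the most recent failure; later successes do not clear it.
            result = code
    return result
```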

[20031015.1045] jonz: added caching of getpwnam() and getpwuid() information

added caching of getpwnam() and getpwuid() information for non-virtual users
(already caches for virtual users).  this was added to keep some tools from
hammering on LDAP or other local authentication mechanisms.

Version 2.7.6.3
---------------

[20031014.2211] jonz: fixed 100% cpu utilization bug in libdbX_deadlock

fixed a bug in libdbX_deadlock causing 100% cpu utilization on linux
 
[20031014.1935] jonz: fixed auto-recovery in libdb drivers

fixed bugs in auto-recovery mechanism in libdb drivers

[20031014.1545] jonz: added support for accepting inoculation messages

Added support for "Inoculation Message Format", a new standard which
is currently in the form of an Internet-Draft, to allow inoculation
via email and trusted checksums.

[20031014.0824] jonz: added X-DSPAM-Signature

X-DSPAM-Signature is NOT a replacement for having in-line signatures
but is useful for debugging purposes

[20031014.0842] jonz: enhanced boundary recognition

enhanced boundary recognition to catch boundaries with malformatted 
definition lines

[20031013.2217] jonz: fixed bug in dspam_2mysql

fixed type-o in 'false-positives' field to false_positives

[20031013.1949] jonz: better html filtering

implemented better filtering of some useless html tag data, focus more on
content; resulted in the catching of a few more spams

[20031013.1832] jonz: added --inoculate flag

added support for inoculation using --inoculate flag.  this can be used in
conjunction with external inoculation as described in the README file.

Version 2.7.6.2
---------------

[20031013.1443] jonz: fixed algorithm initialization bug

fixed a bug in the initialization of algorithm data, which caused some
miscalculations whenever the first token was very innocent.

[20031013.1413] jonz: changed token sorting algorithm

token sorting now sorts by delta first, then by frequency; this means 
tiebreakers will be based in part on token frequency
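
One way to picture the new ordering is a two-part sort key.  This is a sketch:
"delta" is assumed here to mean the distance of a token's probability from the
neutral value 0.5, and the tuple layout is hypothetical.

```python
def sort_tokens(tokens):
    """Sort (name, probability, frequency) tuples, most interesting first.

    Primary key: delta from 0.5 (assumption).  Tiebreaker: frequency.
    """
    return sorted(tokens, key=lambda t: (abs(t[1] - 0.5), t[2]), reverse=True)
```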

[20031013.1329] jonz: added deadlock detection tool

for large-volume implementations, added a deadlock detection tool, 
libdb3_deadlock or libdb4_deadlock.  this tool can be run at system start and
will continue to perform deadlock operations in the background.
 
[20031013.1317] jonz: implemented deadlock detection

Implemented calls to libdb's deadlock detection mechanism

[20031013.1250] jonz: modified Chi-Square algorithm for better performance

Chi-Square algorithm changed to use 25 tokens, ignoring mid-range

[20031012.1831] jonz: changed group file format, added inoculation type

changed group format to:

groupname:grouptype:user1,user2,userN

BE SURE TO UPDATE IN YOUR GROUP FILE

there are now two types of groups: shared and inoculation.  the shared group
is the group everyone is used to, sharing dictionaries and signature dbs.

the inoculation group allows each member of the group to maintain their own
private dictionary and signature database, but members of the group will
automatically train each other's dictionaries with spams they manually forward in
which will help 'inoculate' all other group members from new spams going out.

examples:

development:shared:bob,tom,bill

company:inoculation:jim,ted,robert

a user can be a member of multiple inoculation groups, but cannot be a member
of both a shared group and an inoculation group.
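
The group file format and the membership rule above can be sketched with a
small parser.  The function names are hypothetical and error handling is
omitted; only the groupname:grouptype:members layout comes from this entry.

```python
def parse_group_line(line):
    """Split 'groupname:grouptype:user1,user2,userN' into its parts."""
    name, group_type, members = line.strip().split(":", 2)
    return name, group_type, [m for m in members.split(",") if m]


def find_conflicts(lines):
    """Return users listed in both a shared and an inoculation group."""
    shared, inoculation = set(), set()
    for line in lines:
        if not line.strip():
            continue
        _, group_type, members = parse_group_line(line)
        if group_type == "shared":
            shared.update(members)
        elif group_type == "inoculation":
            inoculation.update(members)
    return sorted(shared & inoculation)
```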

[20031012.0009] jonz: fixed freed-memory bug in decode.c

fixed freed-memory bug in decode.c, which caused an occasional crash when
decoding encoded headers.

Version 2.7.6.1
---------------

[20031011.1236] jonz: added support for multiple algorithms

added support for multiple algorithms; e.g. if any of the enabled algorithms
suspect the message is spam, it is spam.  you can use the following flags:

--enable-chi-square
--enable-alternative-bayesian
--disable-traditional-bayesian

traditional bayesian is enabled by default
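
The combination rule amounts to a logical OR over the enabled algorithms.  A
tiny sketch, with a hypothetical function name and a dict of per-algorithm
verdicts as an assumed input shape:

```python
def is_spam(verdicts):
    """verdicts maps an enabled algorithm's name to its True/False result."""
    # If any enabled algorithm suspects spam, the message is spam.
    return any(verdicts.values())
```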

[20031011.1034] jonz: added Chi-Square specific per-token calculations

when using Chi-Square, added Chi-Square's expanded per-token calculations

[20031011.0923] jonz: fixed alternative bayesian calculations

fixed problem with the wrong definition names being used, which caused
alternative bayesian never to get invoked

[20031011.0923] jonz: fixed a bug in all calculations

a bug in 2.7.6 was fixed which caused spams to be missed if there were
fewer than 15 tokens available for calculation.  this could only occur in the
rarest of circumstances, so it should not have affected much.

Version 2.7.6
-------------

[20031008.2200] jonz: added alternative calculation modes

added --enable-alternative-bayesian flag which invokes Brian Burton's 
alternative Bayesian algorithm 

added --enable-chi-square flag which invokes Chi-Square algorithm

only one or neither (for default bayesian) flags should be used.  debug
information for all three calculations is generated regardless.

[20031008.2029] jonz: fixed bug in libdb drivers

fixed a bug which used memory that had already been freed causing
some occasional unpredictable behavior.
 
[20031008.1431] jonz: added support for multipart/signed messages

added support for multipart/signed messages without altering message body.
signature is appended as a text attachment.

[20031007.1904] jonz: fixed bug in boundary detection

fixed a bug in boundary detection where boundary would fail to be detected if
it wasn't the first definition on the Content-Type heading.  For example:

Content-Type: multipart/signed; protocol="application/x-pkcs7-signature"; 
  boundary="------------ms010307080208090601090900"

would have failed.  this bug fix also improves overall boundary detection. 
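
To illustrate the case above, here is a sketch that finds the boundary
parameter wherever it appears in a Content-Type value, including on a folded
continuation line.  It is not the parser DSPAM uses; the function name and the
regular expression are assumptions.

```python
import re

# Matches boundary="quoted value" or boundary=bare-token, case-insensitively.
_BOUNDARY_RE = re.compile(r'boundary\s*=\s*(?:"([^"]+)"|([^;\s]+))',
                          re.IGNORECASE)


def find_boundary(content_type_value):
    """Return the boundary string from a Content-Type value, or None."""
    # Unfold continuation lines (newline followed by space or tab).
    unfolded = re.sub(r"\r?\n[ \t]+", " ", content_type_value)
    match = _BOUNDARY_RE.search(unfolded)
    if match is None:
        return None
    return match.group(1) or match.group(2)
```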

[20031007.1724] jonz: added source address reporting

the source address for all messages is now reported via syslog. this uses 
the new dspam_getsource() function added to the API.  depending on whether the
message is spam or innocent, the message will be reported either to MAIL.INFO
or MAIL.DEBUG.  for example:

dspam[30965]: spam detected from X.X.X.X 

dspam[30414]: innocent message from X.X.X.X 

this can be used for creating automatic blacklists.  more to come.

[20031007.1557] awn: configure script changes

Configure script now detects version of libdb headers and guesses
appropriate library name from this version.  Probed libraries are:

    -ldb-<major>.<minor>
    -ldb<major><minor>

As a consequence, for example, symlinking libdb41.so to libdb-4.so is no
longer required on FreeBSD.

Version 2.7.5
-------------

[20031007.0930] jonz: date field no longer ignored

date field is no longer ignored; time of day can sometimes play an effective
role in identifying spam or preventing false positives.

[20031006.1911] jonz: Oracle storage driver

first release of ora_drv; storage driver for Oracle.  please see README file
for more information.

[20031004.1423] awn: support for program-name transformation.

Configure options `--program-prefix', `--program-suffix' and
`--program-transform-name' are fully supported now except CGI.
(Was: dspam_corpus and dspam_genaliases don't honor transformed name of
dspam binary).

[20031003.1832] jonz: fix for base64-encoded binary messages 

bug fixed which caused corruption in some base64-encoded single-part
messages in which the only component was a binary file.

[20031003.0031] jonz: automatic recovery for libdb drivers

automatic recovery has been implemented for libdb drivers 

[20031003.0031] jonz: DB_ENV implemented for libdb drivers

DB_ENV locking has been implemented for libdb drivers.  This obsoletes 
storage driver dot-lock file locking, which is no longer used.  quarantine 
dot-lockfile locking is still used when writing to the quarantine.

Version 2.7.4
-------------

[20031002.1728] jonz: modified corpus flag to force results

use of corpus flag now forces results to match commandline flags, meaning
innocent messages no longer need to be fed in first.
 
[20031002.0800] jonz: added unique id to dspam_ngstats

for systems without a static public ip address, a unique id can be configured
in dspam_ngstats.c (NGSTATS_UID) comprised of alphanumeric characters, periods,
and underscores.  any invalid characters will cause stats to be ignored.

[20031002.0800] jonz: removed broken sanity checks

some sanity checks were firing off erroneous messages in 2.7.3; these have
been removed

[20031001.0800] jonz: fixed --enable-large-scale with mysql_drv

modified all drivers to add support for --enable-large-scale with mysql_drv

[20031001.0800] jonz: added dspam_ngstats

added dspam_ngstats, a global stats reporting tool designed for global
stats tracking for dspam

[20030930.1547] awn: Convenience symlinks for libdb{3,4}_purge

IMHO, `libdb3_purge' and `libdb4_purge' are not very descriptive names.
Therefore, 2 convenience symlinks are added:
  o  dspam_purge.libdb4  (dspam_purge.libdb3 in case of libdb3 driver), and
  o  dspam_purge
both pointing to the appropriate libdb{3,4}_purge.

[20030930.1517] jonz: fixed problem with trailing commas in update command

Version 2.7.3
-------------

[20030929.1450] jonz: fixed problem with groups

groups have been repaired; apparently a line of code was inadvertently deleted
from the source tree causing it to fail in 2.7.2.

[20030928.0253] awn: New scheme for conditional compilation of storage drivers

The following applies to `configure.ac' and the resulting `configure' script:

    configure no longer assumes that storage driver sources are named
    `${storage_drv}.c' and `${storage_drv}.h'.

    You need to list the resulting .lo files in the `${storage_drv_objects}'
    variable instead.

    Storage-driver-specific subdirectories should also be listed in the
    `${storage_drv_subdirs}' variable.

This allows any number (including zero) of driver-specific sources and
subdirectories, automatically builds driver-specific tools in these
directories (like `libdb4_purge'), and should work properly in a VPATH
environment.

[20030928.0248] awn: configure.ac bug fix

Fix CPPFLAGS related bugs in the storage drivers sections of
`configure.ac'.

All three storage sections in configure.ac had code like
    CPPFLAGS="$DB_LIBS $CPPFLAGS"
instead of
    CPPFLAGS="$DB_CPPFLAGS $CPPFLAGS"
(replace DB with MYSQL for the mysql case).

This was my bug, I know.

[20030927.1600] jonz: added docs for Courier MTA

added documentation for configuring Courier MTA with DSPAM.  contributed by
Michael Greb.

Version 2.7.2
-------------

[20030925.2231] jonz: added --disable-trusted-user-security

added configure flag --disable-trusted-user-security to disable trusted user
security, rather than trying to maintain two different versions of dspam.

[20030925.1103] jonz: added support for RedHat's built-in libdb4.0

added support for RedHat's built-in libdb-4.0.  This should also provide
compatibility with any other libdb-4.0.  An alias will still be necessary:

ln -s /usr/lib/libdb-4.0.so /usr/lib/libdb-4.so

[20030925.1103] jonz: removed -d $u from default LDA configuration

-d $u coming first in the argument list caused some problems; -d %u should now
be used instead in the MTA configuration.
 
[20030925.1103] jonz: patch to compensate for yahoo broken RFC bug

implemented patch to compensate for a bug in the yahoo client where yahoo
breaks RFC and writes an end boundary prematurely, causing the real boundary
to get corrupted.

[20030925.0855] jonz: changed compile flag --enable-virtual-uids

changed compile flag --enable-virtual-uids to --enable-virtual-users

[20030925.0852] jonz: fixed plain text html signature placement bug

fixed a small bug that caused DSPAM to place the signature in html code samples
in plain text.  

[20030924.0000] jonz: added support for virtual users

added support for virtual users in mysql_drv.  this is necessary when the
users don't actually exist on the system.  use --enable-virtual-users to
enable.  only necessary when using the mysql storage driver.

[20030923.2043] jonz: fix for multiple user bug

restored %u and adjusted docs for multiple local user bug with sendmail

Version 2.7.1
-------------

[20030923.0050] jonz: fixes for libdb tools

several small fixes to issues with compiling libdb tools

[20030923.0045] jonz: bug fix for header decoding

fixed a bug causing some headers to decode incorrectly

[20030923.0030] jonz: bug fix for attachments and signature

added code to specifically NOT append a signature to any segments that have
"Content-Disposition" of type attachment.

[20030922.1900] jonz: added more debug output 

added more debug output (on error) to mysql driver and libdspam

[20030920.0840] jonz: mysql_drv to use -lm -lz 

switched mysql_drv to use -lm -lz in place of -lcrypto.  both apparently have
compress/uncompress functions

Version 2.7
-----------

[20030919.0900] jonz: added dspam_merge tool

Version 2.7.beta.3
------------------

[20030915.0000] jonz: added mysql_drv storage driver

mysql_drv storage driver added for MySQL functionality.  please see README
and tools.mysql_drv for more information.

[20030914.1410] jonz: fixed bug in innocent_hits

fixed bug where some tokens received 2 innocent hits instead of 1 (apparently
this is an old bug, but it did not dramatically affect effectiveness)

[20030913.0956] jonz: implemented quarantine locking

implemented quarantine locking mechanism independent of driver locking

[20030913.0900] jonz: internalized API locking

all API locking performed internally (driver-specific).  no external locking
calls exist; part of _ds_init_storage and _ds_shutdown_storage.  reason:
not all drivers will require context locking (and hopefully someday neither
will libdb3/libdb4 drivers).

[20030912.0000] jonz: locks to use USERDIR

for driver compatibility, all .lock file locking takes place in USERDIR, even
for large-scale implementations

[20030911.0000] jonz: driver config script management

implemented driver configure script management and tools.[driver] for
driver-specific tools.

Version 2.7.beta.2
------------------

[20030910.0054] jonz: message header decoding

added message header decoding per RFC 2047

[20030909.1830] jonz: implemented standardized return codes

implemented standardized return codes for the major api functions:
EINVAL, EFAILURE, ELOCK, EFILE, EUNKNOWN

[20030909.1730] jonz: ported all tools to new driver API

ported all tools to new driver API.  dspam_purge has been replaced with
a driver-specific purge mechanism (default: libdb4_purge), due to the fact
that not all drivers will need to purge, and recreating datafiles is a very
specific function...still uses the storage driver api's locking mechanism.

[20030909.0051] jonz: removed dspam_convert

removed dspam_convert tool for 2.5->2.6 upgrades

[20030909.0051] awn: configure script changes

`--enable-gcc-warnings' configure option is added.

[20030908.2000] jonz: implemented storage driver API

implemented storage driver api.  default driver is libdb4_drv

[20030907.1627] awn: dspam_genaliases changes

dspam_genaliases now generates `nospam-USER' aliases (aliases for false
positive reporting) by explicit request only.  The new `--nospam' command
line option is used for this.

Version 2.7.beta.1
------------------

[20030907.1140] jonz: user identification and passthru changes

the method of user identification and passthru has been changed:

  - DSPAM no longer recognizes -d to identify the user, but instead --user
    must be used.  --user will never be passed onto the local delivery agent.

  - In order to pass the -d flag through to the local delivery agent, it
    must be specified either separately on the commandline, or at configure
    time. 

  - To allow the -d flag to be supported at configure time (and when
    overriding untrusted users), the $u variable has been added to dspam.
    any commandline arguments passed through DSPAM matching $u will be
    replaced with the actual destination username (specified with --user
    or automatically forced for untrusted users).

These changes require some modifications to the mailer configuration.  In the
following example for sendmail, you would change the following line in
the Mlocal block:

A=/usr/local/bin/dspam -d $u

to:

A=/usr/local/bin/dspam --user $u -d $u

--user is not passed through to the LDA, but -d is.  Alternatively, you could
remove '-d $u' from sendmail.cf, and configure dspam with:

--with-local-delivery-agent="/path/to/lda -d \$u"

NOTE: be sure to escape the $ in $u ONLY when specifying it on the commandline.
This will prevent $u from being overwritten with the shell's environment
variable 'u'.

Specifying this at configure time is especially useful if you plan on running 
dspam via commandline and do not want to have to specify -d [username] in 
addition to your --user [username] arguments.
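
The $u replacement described above can be sketched roughly as follows.  This
is an illustration only: the function name substitute_user is hypothetical,
and the sketch assumes that only arguments consisting of exactly "$u" are
replaced, which the entry above does not spell out.

```c
#include <string.h>

/* Hypothetical helper (not DSPAM's actual code): replaces, in place, every
 * passthru argument that is exactly "$u" with the destination username.
 * Returns the number of arguments replaced.
 * Assumption: only whole-argument matches are substituted. */
int substitute_user(int argc, const char **argv, const char *username)
{
    int i;
    int replaced = 0;

    for (i = 0; i < argc; i++) {
        if (argv[i] != NULL && strcmp(argv[i], "$u") == 0) {
            argv[i] = username;
            replaced++;
        }
    }
    return replaced;
}
```

With the arguments "/path/to/lda", "-d", "$u" and the username "alice", the
sketch leaves the first two arguments alone and replaces the third.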

[20030907.1440] jonz: removed --deliver-cmd and --quarantine-cmd

removed runtime --deliver-cmd and --quarantine-cmd functions; added configure
time --with-quarantine-agent="/path/to/agent" to override default quarantine
function.

[20030906.0000] jonz: fix for boundary definition identification

fix to detect non-lowercase multipart boundary definitions

[20030906.0000] jonz: partial rewrite of internal sorting routines

partial rewrite of tbt sort routines to drop recursion and potential stack
problems to follow.  problems only experienced when using API with
multithreaded code.  original patch submitted by Stuart Gathman 
<stuart@bmsi.com>

[20030906.0000] jonz: forced --deliver-cmd and --quarantine-cmd to require
trusted user permissions.  dspam also must be compiled with 
--enable-insecure-functions for them to be available.

[20030906.0000] jonz: trusted user implementation

implemented trusted user approach with user and passthru overrides for the
untrusted users.  see README for more information

Version 2.6.5.2
---------------

[20030906.0000] jonz: insecure parameter check

insecure parameter check; checks parameters for insecure characters:
| ; < > ` 
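
a check of this kind can be sketched in a few lines.  the function name
has_insecure_chars is hypothetical; only the character list comes from the
entry above.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if param contains any of the characters
 * listed in the entry above (| ; < > `), 0 otherwise.
 * Assumption: a NULL parameter is treated as containing nothing insecure. */
int has_insecure_chars(const char *param)
{
    if (param == NULL)
        return 0;

    /* strpbrk returns a pointer to the first matching character, or NULL */
    return strpbrk(param, "|;<>`") != NULL;
}
```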

Version 2.6.5.1
---------------

[20030905.1105] jonz: partitioned insecure functions

partitioned potentially insecure functions to require the configure flag 
--enable-insecure-functions to be set to activate.  these include:

--deliver-cmd
--quarantine-cmd

special attention needs to be given to the execution permissions of the dspam
agent when enabling these functions to avoid users being able to 
execute arbitrary commands on the server.  it should be understood that these
are potentially insecure functions and could potentially lead to the execution 
of arbitrary code if exploited by a malicious user or CGI.

[20030905.0418] jonz: fixed bug: from header corruption

if MTA is passing in From headers, they were being corrupted by DSPAM's
header parsing.  fixed to specifically parse From headers differently

[20030904.1422] jonz: fixed bug with quoted-printable debugging

fixed a small bug that would fail to decode a quoted character immediately
following a line break

[20030904.1127] awn: c89 compatibility

C89 compatibility patch is applied.  Patch author: Albert Chin-A-Young
<china@thewrittenword.com>

	* configure.ac, base64.c, decode.c, dspam.c, error.c,
	error.h, libdspam.c, localdb.c, lock.c, signature.c,
	tools/dspam_dump.c: Allow building with a C89 compiler
	which does not have ISO varargs.

[20030904.1046] awn: work around Solaris' make

tools/Makefile.am doesn't use the $< automatic variable because Solaris
make (at least some versions) doesn't support it.

[20030904.0700] jonz: segfaulting on _ds_message_destroy

fixed a bug where destroying CTX->message caused a segfault.  fortunately, this
bug would have never been reached by the agent or the api.

[20030904.0700] jonz: nfs locking

modified lock.c to work over nfs mounts, only checking pid when hostname 
matches.  maximum 20-minute stale lock removal.
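
one possible reading of the rule above, written as a small decision function.
this is a sketch, not the code in lock.c: the function name, its parameters,
and the exact ordering of the two checks are all assumptions.

```c
#include <string.h>
#include <time.h>

#define STALE_LOCK_SECONDS (20 * 60)   /* 20-minute maximum from the entry */

/* Hypothetical helper: decides whether an existing lock may be removed.
 *   lock_host  - hostname recorded in the lock
 *   local_host - hostname of the machine doing the check
 *   pid_alive  - nonzero if the lock's pid is still running; only
 *                meaningful when both hostnames match
 *   lock_time  - when the lock was taken
 *   now        - current time
 * Returns 1 if the lock is considered stale, 0 otherwise. */
int lock_is_stale(const char *lock_host, const char *local_host,
                  int pid_alive, time_t lock_time, time_t now)
{
    /* Any lock older than the maximum age is stale, whatever the host. */
    if (difftime(now, lock_time) >= STALE_LOCK_SECONDS)
        return 1;

    /* The pid is only consulted when the lock was taken on this host. */
    if (strcmp(lock_host, local_host) == 0 && !pid_alive)
        return 1;

    return 0;
}
```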
 
[20030903.1716] awn: dspam_corpus and dspam_genaliases update

dspam_corpus and dspam_genaliases now use the real path to the dspam binary
instead of assuming the default /usr/local/bin/dspam.

dspam_genaliases now outputs the aliases table to stdout by default.
Use the new `-o filename' or `--output filename' option to redirect it to
a file.

dspam_genaliases generates `nospam-USER' aliases in addition to the
`spam-USER' aliases now.

[20030903.0145] jonz: fixed memory leak in dspam agent

fixed internal memory leak in dspam agent where CTX->message was not destroyed.
only leaked until dspam agent exited, then memory was reclaimed

[20030903.0145] jonz: updated example.c 

updated example.c to show correct CTX->message destruction

[20030903.0115] jonz: fixed bug in false positive reporting

fixed bug where innocent_hits incremented twice on false positive report

Version 2.6.5
-------------

[20030902.0000] jonz: added --version commandline parameter

added --version commandline parameter to display version; -v is not used as
it could be a passthru parameter to an LDA.

[20030902.0000] awn: dspam_purge changes

minor fixes to dspam_purge tool

[20030901.0000] awn: configure changes

- implemented checks (and use of results) for <sys/time.h> <time.h> 
- checks for math.h and fabs() were added; use -lm where needed
- aesthetic changes

[20030901.0000] awn: removed compiler warnings

removed "no previous prototype" warnings with some compilers

[20030901.0000] awn: compiler warnings

miscellaneous changes to remove some compilation warnings

Version 2.6.5-rc1.1
-------------------

[20030831.0000] jonz: debug output

removed left over debug output

Version 2.6.5-rc1
-----------------

[20030829.0000] jonz: fixed broken rfc attachments

made compensation for broken rfcs with embedded attachments, where original
message should've been message/rfc822 but was instead attached as text/plain.
this caused attachments to be processed/consume large quantities of time.
decode.c modified to accept a new boundary definition from any header.

[20030829.0000] jonz: --corpus flag foregoes message delivery/quarantine

use of the --corpus flag will now prevent the messages fed in as corpus from
being delivered/quarantined

[20030829.0000] jonz: added commandline delivery override

commandline flags --deliver-cmd and --quarantine-cmd added to override the
default behavior for delivery (MLOCAL) and quarantine (either MLOCAL or
quarantine depending on configuration).  syntax:

dspam --deliver-cmd "/path/to/cmd -flags" 
dspam --quarantine-cmd "/path/to/cmd -flags"

(be sure not to use = sign).

when overridden values used, the user id is by default NOT passed through to
the called program.  use --with-passthru to pass ARG_USER %USER through to
the called program.  example:

dspam --deliver-cmd "/bin/cat" --with-passthru

actually calls: /bin/cat -d [username]

dspam --deliver-cmd "/bin/cat"

actually calls: /bin/cat

[20030829.0000] jonz: signature insertion moved inside body tag

dspam signature now inserted (wherever possible) inside HTML body tags to
avoid droppage under certain conditions.

[20030829.0000] jonz: changed dspam signature

dspam signature changed to a visible signature to work with clients that 
reformat only visible data (Eudora).  new signature:

!DSPAM:[SERIAL]!
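
locating a signature of this form in message text can be sketched as below.
the function name extract_signature is hypothetical, and the sketch assumes
the serial never contains a '!' character; the real scan code in dspam.c may
differ.

```c
#include <string.h>

/* Hypothetical helper: finds the first "!DSPAM:<serial>!" in text and
 * copies <serial> into out as a NUL-terminated string.
 * Returns 0 on success, -1 if no complete, non-empty signature is found
 * or the serial does not fit in out.
 * Assumption: the serial itself never contains '!'. */
int extract_signature(const char *text, char *out, size_t outlen)
{
    static const char tag[] = "!DSPAM:";
    const char *start;
    const char *end;
    size_t len;

    start = strstr(text, tag);
    if (start == NULL)
        return -1;
    start += sizeof(tag) - 1;          /* skip past "!DSPAM:" */

    end = strchr(start, '!');          /* closing delimiter */
    if (end == NULL)
        return -1;

    len = (size_t)(end - start);
    if (len == 0 || len >= outlen)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```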

Version 2.6.5-beta-2
--------------------

[20030826.1800] jonz: added --enable-delivery-to-stdout option

added --enable-delivery-to-stdout option which causes all delivered messages
to be printed to stdout rather than piped to an LDA.  if you wish to have spams
printed to stdout as well, use the --enable-spam-delivery option in 
conjunction.

[20030825.0031] jonz: signature attachment mode

coded signature-attachments mode, rewriting messages to include a dspam
signature attachment with full data, instead of writing the server-side
attachment.  use --enable-signature-attachments to enable. 

[20030824.2345] jonz: application/dspam-signature media type

added application/dspam-signature media type recognition

Version 2.6.5-beta-1.1
----------------------

[20030823.2010] jonz: fixed bug for empty headers

fixed a bug where segments with empty headers would be dropped in reassembly 
(currently these only seem to appear in mailer-daemon messages)

Version 2.6.5-beta-1
--------------------

[20030823.1804] jonz: groups now share same signature file

groups now share same signature file enabling them to use a single group alias 
for forwarding spams.

[20030823.1339] jonz: added new configure flags

--enable-homedir-dotfiles
When enabled, instead of checking for $USERDIR/$USER[.nodspam|.dspam],
DSPAM will check for a .nodspam|.dspam file in the user's home directory.
 
--enable-opt-in
Causes DSPAM to filter mail only for users with a .dspam dotfile.  The default
is opt-out, which requires a .nodspam file to exist to bypass filtering.

when using --enable-homedir-dotfiles, dspam installs as setuid root.
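
The opt-in and opt-out behavior described above reduces to a small decision,
sketched here.  The function name should_filter and its parameters are
hypothetical; locating the dotfiles (in $USERDIR or in the home directory)
is left to the caller.

```c
/* Hypothetical helper: returns 1 if mail for a user should be filtered.
 *   opt_in      - nonzero when built with --enable-opt-in
 *   has_dspam   - nonzero if the user's .dspam dotfile exists
 *   has_nodspam - nonzero if the user's .nodspam dotfile exists */
int should_filter(int opt_in, int has_dspam, int has_nodspam)
{
    if (opt_in)
        return has_dspam != 0;   /* opt-in: filter only with .dspam */

    return has_nodspam == 0;     /* opt-out: filter unless .nodspam */
}
```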

[20030823.1100] jonz: fixed segfaulting on signature reversal

[only affected alpha-4-internal]
fixed a bug where dspam segfaulted while reversing a signature making it
impossible to train dspam using signatures with alpha-4-internal.

[20030823.1100] jonz: added support for message/rfc822

[only affected alpha-4-internal]
added support for parsing message/rfc822 components; signature was not being
found in forwarded messages using this media type.

[20030822.0929] jonz: added fp alerts to cgi

added customizable false positive alerts to cgi.  the alerts list is
compared to message headers, and all matching messages are highlighted in
yellow.  alerts are stored as $USERDIR/$USER.alerts.

[20030822.0929] jonz: fixed decoding header bug

fixed a bug in the header decoding where the original encoding type was
reassembled into the message, instead of the decoded type.  fix only
affected alpha-4 (internal). 

[20030822.0929] jonz: moved signature append to process

moved appending of signature out of delivery_message and into the process
function, using the new message structures instead of parsing.  this also 
fixes a problem in that on memory failure, the delivery_message function
will no longer need to allocate memory.

[20030822.0016] jonz: adjusted lock timeout

adjusted lock timeout from 10 to 20 seconds.  depending on the load of your
machine, this could be set higher or lower.  the higher the setting, the less
chance of any failover deliveries being made, and the more chance of multiple
processes lined up waiting for a lock on a user's mailbox.

[20030822.0014] jonz: documentation tweaks

a few miscellaneous tweaks

[20030821.2145] jonz: added --enable-spam-delivery

added configure flag --enable-spam-delivery causing all spams to be delivered
instead of quarantined (for use with X-DSPAM header filtering)

[20030821.1935] jonz: rewrite of message post-processing

Message post-processing rewritten; including appending of signature, 
message re-write, etcetera.  

[20030821.1908] jonz: added header information

X-DSPAM-Result: Spam || Innocent
X-DSPAM-Probability: (Actual Probability)

[20030821.1820] jonz: removed CTX->copyback

CTX->copyback is now obsolete.  All base64 decoding is performed on 
CTX->message, which is available from the context, or via calling
_ds_assemble_message() function using the message structure as a parameter.

[20030821.1730] jonz: changes to DSPAM_CTX

+  struct _ds_message *message;          /* Message Components */

for compatibility with existing API, dspam_process still accepts a const char *,
however tools that already perform message actualization (such as the DSPAM
agent) can set CTX->message to the existing struct _ds_message * to avoid
reprocessing the message, and to carry over any encoding changes.

[20030821.1730] jonz: implemented new decode/actualization functions in sig

implemented use of new actualization and decoding functions [decode.c] in
dspam.c's signature scan code. 

[20030821.1729] jonz: finished block decoding functions

/* Public decode function */
char *                  _ds_decode_block(struct _ds_message_block *block);
                                                                                                                                                                   
/* Private decoding functions */
char *                  _ds_decode_base64(const char *body);
char *                  _ds_decode_quoted(const char *body);

[20030820.0015] jonz: finished preliminary message actualization

decode.c: finished preliminary actualization code (code responsible for
actualizing a message into its individual components).  experiments with
plain messages and non-embedded multipart messages succeeded.  next phase of
testing to include embedded multipart messages, including spams that are
designed to frequently break RFC.  once testing/patching is complete,
decoding routines to follow.

[20030819.0000] jonz: signature embedding changes

signatures are now embedded in every text segment of a message to
ensure they are forwarded properly

[20030818.1350] awn: fix for empty messages

(Submitted by Andrew W. Nosenk  <awn@bcs.zp.ua>)

* added check for empty data to prevent segfault

[20030817.1336] awn: configure script changes

(Submitted by Andrew W. Nosenko  <awn@bcs.zp.ua>)

* configure.ac: Work around versioning issues of some versions of
  db-4.  E.g. db_create() may be not a real function but simple
  forwarding macro to the db_create_4001().

* configure.ac: New configure option `--with-db4-libraries' (as
  pair for `--with-db4-includes')

[20030817.1230] jonz: added --disable-bias configure flag

when configure is run with --disable-bias, dspam no longer biases the
statistics in favor of innocent mail.  This may increase the filter's
effectiveness in catching spam, but could also potentially result in less
false positive protection.  some argue that eliminating bias is more
accurate, not less.

[20030815.0300] jonz: added dspam_genaliases script

a small script to create an aliases table from /etc/passwd

[20030814.1928] jonz: added large-scale directory support to tools

ported tools to support large-scale directories (see below).

[20030814.0005] jonz: added large-scale directory support

when configure is run with --enable-large-scale, dspam stores all its user
files in large-scale mode.  for example, user root's files would be stored in
/etc/mail/dspam/r/ro/root.  directories are created automatically as needed. 
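
judging from the example above, the path is built from the first character
and the first two characters of the username.  a sketch under that
assumption follows; the function name large_scale_path is hypothetical, and
rejecting usernames shorter than two characters is a choice made here
because the entry does not say how they are handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds <home>/<c1>/<c1c2>/<user> into out.
 * Returns 0 on success, -1 on bad input or if out is too small.
 * Assumption: usernames shorter than two characters are rejected. */
int large_scale_path(const char *home, const char *user,
                     char *out, size_t outlen)
{
    int n;

    if (home == NULL || user == NULL || strlen(user) < 2)
        return -1;

    n = snprintf(out, outlen, "%s/%c/%c%c/%s",
                 home, user[0], user[0], user[1], user);
    if (n < 0 || (size_t)n >= outlen)
        return -1;                    /* error or truncated output */

    return 0;
}
```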

Version 2.6.4.1
---------------
                                                                                
[20030816.2352] jonz: parse fix for boundaries with spaces
                                                                                
added fix for multipart emails with spaces in the boundary definition
(e.g. boundary= "blah").  Discovered in some of the newer 'Urgent Response'
type spams.

Version 2.6.4
-------------

[20030809.1115] jonz: corpus spams marked as misses

spams learned through dspam_corpus are now marked as misses instead of 
caught spam.

[20030808.1945] jonz: changes to header processing

Message-ID is now considered for useful information.  Received header is now
considered, but parsed in a different manner preserving IP addresses and
other useful information.

[20030808.1945] jonz: blank signatures will no longer get written

blank signatures are a result of a failover passthrough for a particular
user.  dspam has been changed to not write a signature if the signature
itself is blank, preventing <!DSPAM:> from appearing in an email.

[20030808.1945] jonz: added .nodspam file functionality

in an attempt to conserve disk space, a username.nodspam file may be
touched in the /etc/mail/dspam directory, which will cause all messages
for that user to be passed through dspam and not processed.  this will
prevent a dictionary or signature file from being built and save disk
space.  users wishing not to use dspam can still simply not use it,
but dropping a .nodspam file will prevent any files from being created. 

[20030805.1630] jonz: fixed multiple header destroy calls

fixed bug where the header nodetree was destroyed a second time in some errors
that cleaned up and returned, causing a segmentation fault.

[20030805.1400] jonz: added quoted-printable decoding

added quoted-printable decoding; decodes hex codes into actual characters.
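
for readers unfamiliar with the encoding, a minimal in-place decoder is
sketched below.  it is not the decoder added in this release: the function
names are hypothetical, and the handling of soft line breaks and malformed
sequences follows the general quoted-printable rules rather than anything
stated in this entry.

```c
#include <stddef.h>

/* Returns the value of a hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: decodes quoted-printable text in place and returns
 * the decoded length.  "=XX" becomes the byte with hex value XX, soft line
 * breaks ("=\n" and "=\r\n") are removed, and malformed sequences are
 * copied through unchanged. */
size_t qp_decode(char *s)
{
    char *r = s;   /* read position */
    char *w = s;   /* write position, never ahead of r */

    while (*r != '\0') {
        if (r[0] == '=') {
            int hi, lo;

            if (r[1] == '\n') { r += 2; continue; }
            if (r[1] == '\r' && r[2] == '\n') { r += 3; continue; }

            hi = hex_value((unsigned char)r[1]);
            if (hi >= 0) {
                lo = hex_value((unsigned char)r[2]);
                if (lo >= 0) {
                    *w++ = (char)(hi * 16 + lo);
                    r += 3;
                    continue;
                }
            }
        }
        *w++ = *r++;
    }
    *w = '\0';
    return (size_t)(w - s);
}
```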

[20030805.1230] jonz: documentation correction for dspam_corpus

dspam_corpus uses --addspam flag, not -a anymore

[20030805.1200] jonz: added verbose debugging option

added --enable-verbose-debug for verbose debugging information to be written
to /tmp/dspam.debug

[20030805.1200] jonz: new line unbreaking code

new line unbreaking code to unbreak only quoted-printable lines

Version 2.6.3
-------------

[20030801.0930] jonz: debug after context destruction

fixed a bug in dspam.c that reported debug information for a context
after it had been destroyed.

[20030801.0930] jonz: dspam_clean to create new databases

dspam_clean tool rewritten to create new databases when called in the same 
fashion as dspam_purge.  this helps keep the databases in good health and
smaller filesize.
 
[20030801.0900] jonz: fix for PGP signatures

fixed formatting bug causing PGP signatures to be corrupted.  fix required
removing line unbreaking from message which could potentially cause dspam to
lose one or two signatures when messages are being forwarded from Microsoft
Outlook.  does not appear to be a significant issue.

[20030801.0900] jonz: fix for unchecked malloc calls

fixed two unchecked malloc calls
=> struct nt *nt_create(int nodetype)
=> struct nt_node *nt_add(struct nt *nt, void *data)

submitted by Thomas Lussing <lussnig@smcc.net>

[20030731.0852] jonz: added syslog logging 

added syslog logging using mail facility

[20030730.2323] jonz: documentation addition for username case

  added this to the README:

  NOTE: Some authentication mechanisms are case insensitive and will
   authenticate the user regardless of the case they type it in.  DSPAM,
   on the other hand, is case sensitive and the case of the username used
   will need to match the case on the system.  If you suffer from this
   authentication problem, and are certain all of your users' usernames are
   in lowercase, you can add the following line of code to the CGI right
   after the call to &ReadParse...

   $ENV{'REMOTE_USER'} = lc($ENV{'REMOTE_USER'});

[20030730.2311] jonz: fixed bug in dspam_stats

fixed formatting bug in dspam_stats causing problem with usernames > 16 
characters.  submitted by Stuart Gathman <stuart@bmsi.com>

Version 2.6.2.03
----------------

[20030729.2205] jonz: fixed more line parsing bugs

fixed some additional bugs in line parsing which may have caused some emails
to appear blank in Microsoft Outlook

Version 2.6.2.02
----------------

[20030729.0225] jonz: internal cleanup

removed unused variables and added prototypes for some functions lacking them

[20030729.0225] jonz: implemented strsep to fix processing snag

large messages resulted in significant processor consumption due to previous
method of splitting up messages line-by-line.  strsep now implemented to remove
this bottleneck.

Version 2.6.2.01
----------------

[20030710.1000] jonz: fixed bug in dspam_stats

dspam_stats now reports TS (total spams) as total spams minus spam misses.

[20030710.1000] jonz: fixed bug in false positives

fixed a bug where false positives reported without a signature would fail to
decrease the total number of spams.  this event should never occur using
dspam, and only addresses this as an issue for any third party software using
the dspam library.

[20030710.1000] jonz: added support for reusable contexts

added support for reusable contexts, enabling a context to be processed 
multiple times.

[20030704.1827] jonz: fixed condition in chomp

fixed a condition in chomp where it could potentially cause a segmentation fault if
called with a NULL pointer, or a string with zero length.  this should never
occur anyway considering the calling code.
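
a guarded version of such a function might look like the sketch below.  it
is named safe_chomp to make clear it is not DSPAM's own chomp, and stripping
both CR and LF characters is an assumption.

```c
#include <string.h>

/* Hypothetical helper: strips trailing CR and LF characters in place.
 * Does nothing for a NULL pointer or a zero-length string, the two cases
 * named in the entry above. */
void safe_chomp(char *s)
{
    size_t len;

    if (s == NULL)
        return;

    len = strlen(s);
    /* len > 0 is checked first, so s[len - 1] is never read out of range */
    while (len > 0 && (s[len - 1] == '\n' || s[len - 1] == '\r'))
        s[--len] = '\0';
}
```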

Version 2.6.2
-------------

[20030701.0000] jonz: added DSF_CLASSIFY flag

added DSF_CLASSIFY flag to libdspam.  use of this flag causes libdspam _not_ to
record statistics for a specific operation, but only to evaluate and return
the operation's result.
 
[20030701.0000] jonz: fixed bit assignment bug

fixed a bit assignment bug resulting in clearing of all flags when headers
ignored
submitted by Stuart D. Gathman [stuart@bsmred.dmsi.com]

[20030701.0000] jonz: fixed bugs related to corpus mail

fixed a bug causing corpus mail's headers to be ignored
submitted by Stuart D. Gathman [stuart@bsmred.dmsi.com]

Version 2.6.1.01
----------------

[20030627.1924] jonz: fixed memory free of copyback buffer

copyback buffer is now freed in dspam.c when context is destroyed

Version 2.6.1.00
----------------

[20030622.0000] jonz: added ` as delimiter

[20030620.0000] jonz: added support for group dictionaries

Group dictionaries enable a group of users with similar email behavior to
share the same dictionary while still maintaining a private quarantine box.
Please see README for more information.

[20030620.0000] jonz: added dspam_stats tool

The dspam_stats tool can be used to display the statistics for one or all
users on the system.  Please see README for more information.

Version 2.6.0.69
----------------

[20030618.0000] jonz: line unbreaking correction

correction made to line unbreaking to sanity check for consecutive
equal signs

Version 2.6.0.68
----------------

[20030612.0000] jonz: change to configure tool

changed configure tool to look for db_strerror instead of
db_env_create in the event that libdb was built without
environmental functions

Version 2.6.0.67
----------------

[20030609.0021] jonz: bugfix in line unbreaking

fixed a bug in line unbreaking (where clients use an equal sign
followed by a carriage return to break up long lines) causing
some attachments to be unreadable by some mail clients.  lines
are now only unbroken in text segments.

[20030607.1020] jonz: bugfix in attachment boundaries

fixed a small bug that wrote the boundary twice at the end of
an attachment

Version 2.6.0.66
----------------

[20030603.1900] jonz: bugfix in line unbreaking

fixed a bug in line unbreaking (where clients use an equal sign 
followed by a carriage return to break up long lines) causing 
unquoted signatures ending with an equal sign to be malparsed,
causing the email to become slightly jumbled.

[20030603.1800] jonz: DSF_CORPUS flag

added DSF_CORPUS flag for processing messages that are from corpus; 
prevents innocent totals/hits from being subtracted when spam corpuses
are fed in. 

Version 2.6.0.65 
----------------

[20030601.0000] jonz: bugfix for locking

a bug in the locking mechanism for tools fixed; occasionally could cause
a corrupt dictionary

Version 2.6.0.64
----------------

[20030525.2300] jonz: bugfix for boundaries

fixed a bug causing boundaries ending in == to be parsed incorrectly
fixed a bug in parsing boundaries that used = without quotes

[20030523.2300] jonz: bugfix for attachments

fixed bug causing attachments to be dropped

[20030523.2300] jonz: optimizations for large databases

increased database cache to 4MB and implemented alternative btree
sorting routine to greatly speed up database functions

[20030523.2000] jonz: addition of libtool/shared libs

libtool is now implemented to build a shared libdspam library.

[20030523.1830] jonz: bugfixes

bugfix for multipart messages that caused message to be truncated
bugfixes to signature management causing some segfaults
bugfixes to crc64 calls, some calls returned a different crc every time

[20030523.0100] jonz: partial rewrite

Rewrote dspam engine into libdspam, enabling developers to link in libdspam
to provide "drop-in" spam filtering for their projects.

Migrated to 64-bit tokens; previous 2.6-Beta databases using 32-bit tokens
will not work with this new version.

Server-side-signature presently the only signature storage method; looking
into a different method of incorporating signature in emails.

Implemented tracking of spam misses and false positives.  Reported in CGI

[20030521.2315] jonz: url tokens ignored outside of urls

tokens found inside urls are ignored as individual tokens, and only 
represented as Url*token.

[20030520.0200] jonz: bugfix for base64 decoding

fixed a bug that failed to decode non-multipart base64 messages

[20030519.0000] jonz: ignore all html tags without spaces

ignore all html tags without spaces; frequently used to separate tokens

[20030519.0000] jonz: ignored collapsible html tags 

collapsed (rather than overwrote) html tags to join together tokens that
some spammers use such tags to separate.  
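
the effect of collapsing can be sketched as follows: a tag with no spaces in
it is removed outright, so the text on either side joins into one token.
the function name collapse_simple_tags is hypothetical, and leaving tags
that contain a space untouched is an assumption about how the two entries
above fit together.

```c
#include <stddef.h>

/* Hypothetical helper: removes, in place, every <...> sequence that
 * contains no space, joining the text on either side.  Tags containing a
 * space and unterminated '<' characters are copied through unchanged.
 * Returns the new length of the string. */
size_t collapse_simple_tags(char *s)
{
    char *r = s;   /* read position */
    char *w = s;   /* write position, never ahead of r */

    while (*r != '\0') {
        if (*r == '<') {
            char *p = r + 1;

            /* scan to the end of the tag, stopping early at a space */
            while (*p != '\0' && *p != '>' && *p != ' ')
                p++;

            if (*p == '>') {       /* space-free tag: drop it entirely */
                r = p + 1;
                continue;
            }
        }
        *w++ = *r++;
    }
    *w = '\0';
    return (size_t)(w - s);
}
```

for example, the input "vi<b></b>agra" collapses to the single token
"viagra".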

[20030518.1500] jonz: addition of dspam_crc tool

dspam_crc tool converts a string into the numeric crc used for storage in
the dspam dictionary; makes it easier to use dspam_dump and grep for a 
particular token

[20030517.1930] jonz: bugfix for as_spam signature

fixed a bug causing the signature not to be displayed
on messages marked as spams

[20030517.1300] jonz: bugfixes 

fixed bugs in signature storage (delete .sig files to fix)
fixed bugs in dspam_purge
fixed bugs causing segfault under some circumstances

[20030516.0052] jonz: exim documentation corrections by Jerome Alet

Exim configuration belongs under directors, not routers

[20030516.0020] jonz: massive rewrite and optimizations

addition of tbt and lht dynamic data structures
rewrite of debugging functions
rewrite of database functions
conversion to crc32 long integers for token management
addition of dspam_convert to convert old databases
renamed dbdump to dspam_dump, removed dbset/dbdelete

these rewrites/optimizations convert all tokens to numeric (long)
values, making processing and sorting much faster.  tbt implements
a binary tree sorting mechanism eliminating qsort.  storing tokens
in numeric format also removes the necessity for the zlib compression
library.

[20030514.1500] jonz: bugfix in content identification

small bugfix in content identification that led some emails to miss a
dspam signature

[20030514.1500] jonz: error message output added to debug

error messages previously only made it to stderr.  when --enable-debug
option is used, errors are also printed to debug

Version 2.5.4 - May 14 2003
---------------------------

[20030514.0240] jonz: added autoconf support contributed by Andrew W. Nosenko

thanks to Andrew W. Nosenko for contributing the files/patches to provide
autoconf support to dspam.  please read the README file for instructions.

[20030514.0200] jonz: changed hash to support ints

hash.c modified to support ints or character pointers.  makes tracking
token frequency much faster.

[20030513.2345] jonz: bug in dspam_clean corrected

corrected a bug in dspam_clean causing it to fail

[20030513.2300] jonz: experimental tokenized rules

playing with a few experimental tokenized rules

[20030513.2300] jonz: freebsd makefile setuid root

modified the freebsd makefile to install as setuid root.  this is due to 
freebsd's mail.local requiring the ability to change its uid.  dspam will
not work correctly on the commandline (for example when reporting false 
positives)

[20030513.0325] jonz: changed probabilities for single-corpus tokens

probabilities of 0.0100 and 0.0101 were previously assigned to tokens
appearing only in the innocent corpus.  this has been changed to
0.0099 and 0.0100 to balance out the 0.9900 and 0.9901 used for tokens
that appear only in the spam corpus.  this very small change corrected
3 false positives that appeared.

[20030513.0250] jonz: added documentation for exim

documentation thanks to David Shirley 

[20030512.1930] jonz: applied changes submitted by Andrew W. Nosenko

(DELIMITERS): Plain `^M' character is replaced by appropriate
	escape sequence `\r' for avoiding gcc-3.2.2 warning "multi-line
	string literals are deprecated"

(MAX_FILENAME_LENGTH, MAX_USERNAME_LENGTH): Use system-defined
	limits when available (for example max. filename length under
	Linux is not 128 as hardcoded, but 4096).

(USERDIR): Define USERDIR only if not defined somewhere else
	(e.g. from command line).  Very convenient for building binary
	package.

Version 2.5.3 - May 12, 2003
----------------------------

[20030512.1430] jonz: bugfix for ignored headers

a bug was fixed that caused all headers to be ignored if a message was stored
as a raw message in the signature database.

[20030512.1400] jonz: embedded boundary recognition

added embedded boundary recognition to recognize emails with embedded boundaries,
such as those sent by Eudora when special formatting is enabled.
 
[20030512.1200] jonz: documentation

added better documentation for the correct permissions of the dspam 
directories and the correct group memberships for the MTA user. 

[20030512.1200] jonz: locking bugfix

fixed bug in locking that caused a loop if a lockfile could not be created 
(due to file permissions).  also increased lock debugging verbosity.

[20030511.2025] jonz: false positives adjustment

false positives reported now hit a token 3 times innocent instead of 2,
for faster re-learning.

[20030511.2010] jonz: header parsing bug

fixed a header parsing bug that did not carry the original header name
across multiple lines, for example the Received header.
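
a minimal sketch of the idea, not dspam's actual parser: a line that starts
with a space or tab is treated as a continuation of the previous header, so
the original header name stays attached to it.  the function name
unfold_headers and the sample Received header are made up for illustration.

```python
def unfold_headers(raw_lines):
    """Attach continuation lines to the header they belong to."""
    headers = []  # [name, value] pairs in order of appearance
    for line in raw_lines:
        if line[:1] in (" ", "\t") and headers:
            # continuation line: keep the name of the most recent header
            headers[-1][1] += " " + line.strip()
        elif ":" in line:
            name, value = line.split(":", 1)
            headers.append([name.strip(), value.strip()])
    return [(name, value) for name, value in headers]


sample_lines = [
    "Received: from mail.example.com",
    "\tby mx.example.org; Sun, 11 May 2003",
    "Subject: hello",
]
print(unfold_headers(sample_lines))
```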

[20030511.1945] jonz: dspam_purge complete

dspam_purge completed and expanded to delete old non-qualifying tokens
and defragment/shrink user dictionaries

[20030511.1945] jonz: rewrite of dspam tools

dspam tools rewritten to support new spam_record structure. 

[20030511.1945] jonz: implementation of struct spam_record

new spam_record structure implemented for database storage; include last
hit date for new purge tool.  subroutines backward compatible to work
with old databases.

[20030511.1827] jonz: bugfix for lock sleep

fixed a bug that caused all dspam processes to sleep for 1 second, even
if a lock was successfully acquired on the first try.

[20030511.1719] jonz: addition of probability information to spams

messages marked as spams now include the tokens and probabilities used in
the message

[20030511.1600] jonz: body tag filtering

now ignoring body tags.  the only frequently used tags that are being 
considered are font, img, and meta

Version 2.5.2 - May 11, 2003
----------------------------

[20030510.1615] jonz: token word joins with punctuation

token word joins modified to include dollar signs and exclamation points. for
example:

$S A V E$

previously would result in 3 tokens: $S, AV, E$ but now results in one: $SAVE$

[20030510.1500] jonz: bugfix for multipart boundary

a bug that kept multipart boundaries from being detected when they were defined
without using quotes has been corrected.  it resulted in the dspam signature
(or identifier) never making it into the message.  for example:

Content-Type: multipart/alternative; 
  boundary='~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'

is now detected correctly
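
a rough sketch of boundary extraction that accepts both a double-quoted
value and a bare one.  this is not dspam's parser; find_boundary and the
sample values are hypothetical.  only double quotes are stripped here, and
how dspam itself treats single quotes is not stated in this entry.

```python
import re

# matches boundary="value" or boundary=value (bare value ends at space or ;)
_BOUNDARY_RE = re.compile(r'boundary\s*=\s*(?:"([^"]*)"|([^\s;]+))',
                          re.IGNORECASE)


def find_boundary(content_type):
    """Return the boundary parameter of a Content-Type value, or None."""
    match = _BOUNDARY_RE.search(content_type)
    if match is None:
        return None
    quoted, bare = match.groups()
    return quoted if quoted is not None else bare


print(find_boundary('multipart/alternative; boundary="abc123"'))
print(find_boundary("multipart/mixed;\n  boundary=----=_Next=="))
```

the bare form keeps any trailing = characters, since they are part of the
value rather than a separator.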

[20030510.0035] jonz: additional filtering

added additional filtering to ignore words with control characters, 
numbers that are not prefixed with $ or end with %, and any tokens that
do not begin with an alphanumeric character, with the exception of $ and #.
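
the three rules can be sketched as a single predicate.  this is an
illustration under assumptions, not the shipped code: keep_token is a
hypothetical name, "number" is taken to mean a token made only of digits,
and "control character" is taken to mean ASCII codes below 32 plus 127.

```python
def keep_token(token):
    """Return True if the token survives the filtering rules."""
    if not token:
        return False
    # rule 1: drop words containing control characters
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in token):
        return False
    # rule 3: must begin with an alphanumeric character, $ or #
    first = token[0]
    if not (first.isalnum() or first in "$#"):
        return False
    # rule 2: drop plain numbers; "$100" and "50%" are not all digits,
    # so they pass this check
    if token.isdigit():
        return False
    return True


for sample in ["free", "100", "$100", "50%", "#1", "-free"]:
    print(sample, keep_token(sample))
```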

[20030510.0020] jonz: bug fix for lock failures

a bug has been fixed that caused dspam to loop, sending multiple emails
in the event of a lock failure

[20030509.2100] jonz: Makefile for FreeBSD

added makefile for freebsd

[20030509.2015] jonz: procmail fix

added small fix to accommodate some procmail implementations 
that require an empty argument after -a

[20030509.0130] jonz: addition of dspam_purge

please see README for more details

[20030509.0130] jonz: tools to output to stderr

dspam tools to output to stderr

[20030509.0130] jonz: removed probability from db storage

removed the 13-character probability from the hash databases; was 
taking up considerable space and wasn't necessary for the calculation.
is backwards compatible, so there is no need to delete any db's.

[20030509.0040] jonz: ! is now treated as a delimiter

the ! character has been added to the delimiter list

[20030508.2330] jonz: added .lock locking mechanism 

added a .lock locking mechanism to prevent database corruption and/or
quarantine mailbox corruption.
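
a minimal sketch of a lockfile scheme of this kind, assuming exclusive file
creation is what marks the lock as held.  the names acquire_lock,
release_lock and user.lock are hypothetical, and the retry count and delay
are arbitrary.  the sketch sleeps only after a failed attempt, and any error
other than "file exists" (for example a permissions problem) is raised
rather than retried.

```python
import os
import tempfile
import time


def acquire_lock(path, attempts=5, delay=1.0):
    """Try to create the lockfile exclusively; True on success."""
    for attempt in range(attempts):
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            # someone else holds the lock: wait, unless this was the last try
            if attempt < attempts - 1:
                time.sleep(delay)
            continue
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return True
    return False


def release_lock(path):
    """Remove the lockfile so other processes can proceed."""
    os.remove(path)


lock_path = os.path.join(tempfile.mkdtemp(), "user.lock")
first = acquire_lock(lock_path)                            # lock is free
second = acquire_lock(lock_path, attempts=2, delay=0.01)   # already held
release_lock(lock_path)
third = acquire_lock(lock_path)                            # free again
release_lock(lock_path)
print(first, second, third)
```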

[20030508.1915] jonz: filtering of boundaries

multipart boundaries are now filtered

[20030508.1800] jonz: token word joins

if a token is only one character long, and is adjacent to other similar
tokens, each token will be joined to create a single token.  for example

V I A G R A

will be tokenized as "VIAGRA"
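
a short sketch of the join, assuming the input has already been split into
tokens on whitespace.  join_single_chars is a hypothetical name and this is
not the dspam tokenizer itself.

```python
def join_single_chars(tokens):
    """Merge runs of adjacent one-character tokens into single tokens."""
    joined = []
    run = []  # one-character tokens collected so far
    for token in tokens:
        if len(token) == 1:
            run.append(token)
            continue
        if run:
            joined.append("".join(run))
            run = []
        joined.append(token)
    if run:
        joined.append("".join(run))
    return joined


print(join_single_chars("buy V I A G R A now".split()))
# ['buy', 'VIAGRA', 'now']
```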

[20030508.1800] jonz: header array abolished

the array holding each header line has been replaced with a nodetree
(dynamic data storage)

[20030508.0800] jonz: bugfix for dspam_clean

dspam_clean segfaulted after processing the first user signature file.  this
was due to an invalid database handle being closed.  the correct handle is
now used

Version 2.5.1; May 8 2003:
--------------------------

[20030508.0045] jonz: bugfix for inline comments

inline comments normally used to break up guilty spam words such as
S<!1234>E<!1234>X<!1234>

were only partially filtered, leaving gaps between the letters and causing 
DSPAM to miss the whole word.  this has been corrected to eliminate the space
the comments previously used, bringing the words together for calculation.
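
the effect can be sketched with a regular expression that removes anything
of the form <!...> without leaving a space behind.  this is an illustration
only; strip_inline_comments is a hypothetical name, and a real html comment
containing a > character would be cut short by this pattern.

```python
import re

# "<!" followed by anything up to the next ">"
_INLINE_COMMENT_RE = re.compile(r"<![^>]*>")


def strip_inline_comments(text):
    """Remove <!...> sequences so the surrounding letters join up."""
    return _INLINE_COMMENT_RE.sub("", text)


print(strip_inline_comments("S<!1234>E<!1234>X<!1234>"))
# SEX
```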

[20030508.0025] jonz: strdup() overusage

if only one destination user is specified, strdup() is not used to duplicate 
the original header/body pairs to pass to process_user()

[20030507.1130] jonz: bugfix for multiple users

when multiple users are specified in the local mailer parameters, the first
user process, due to a bug in setting ADD_AS_SPAM, determined whether the
message was spam for all other users.  ADD_AS_SPAM is now reset to its original
value prior to each user's calculation.

[20030507.2200] jonz: increased html filtering

<div and <p html tags are now ignored

Version 2.5; May 7 2003:
------------------------

[20030507.0500] jonz: increased html filtering

td, tr, and table tags are now ignored

[20030507.0500] jonz: increased bare corpus safeguards

the following safeguards have been implemented to prevent false positives
in immature corpuses:

- the minimum number of hits for a token to register at anything above .40
  has been raised from 5 to 20 if the user has fewer than 500 innocent
  messages
- if the user has fewer than 1000 messages, the minimum number of hits
  is equal to 5 + (the spam ratio / 2)

[20030507.0500] jonz: commandline multiple user support 

multiple users on the same commandline (e.g. -d user1 user2 user3) are now 
processed individually.  prior to this, only the first user was processed 
(even though the message was delivered to all users).  this results in each
user having their own unique record of the message in their dictionary and 
signature.

[20030507.0500] jonz: libdb1 -> libdb4 migration

libdb 4 has been implemented after running into some problems with db1 
segmentation faults on large record insertions. as a result, to upgrade to 
this and all newer versions, it will be necessary to delete all existing user 
databases on the system. libdb4 can be found at www.sleepycat.com. it should 
be relatively easy to re-code the db functions for db2 or db3, if the 
administrator doesn't want to use db4. 

[20030506.0400] jonz: buffer.c memcpy implementation

modified buffer.c to use memcpy() instead of strcat(), resulting in a 
_significant_ speed increase. the delay caused by strcat() on messages 
with large attachments pushed message parse times past 20 seconds. 
using memcpy(), parse time is down to a fraction of a second. 
this fix addresses issues with dspam on low-end machines.

[20030506.0400] jonz: server-side storage options

if a token string is longer than the original message, the original message
is stored on the server instead and re-parsed.

[20030506.0400] jonz: zlib compression library

zlib (-lz) is now used to compress server-side signatures. zlib can be found 
at http://www.gzip.org/zlib/.  if you will not be using server-side
signatures, remove the -lz library flag from the makefile.
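
to give a feel for the saving, here is a small round trip through python's
zlib module.  it only illustrates the compression step; pack_signature,
unpack_signature and the sample token text are hypothetical and do not
reflect dspam's on-disk signature format.

```python
import zlib


def pack_signature(token_text):
    """Compress a token string for storage."""
    return zlib.compress(token_text.encode("utf-8"))


def unpack_signature(blob):
    """Recover the original token string from a stored blob."""
    return zlib.decompress(blob).decode("utf-8")


sample = "FREE FOR ALL " * 50
packed = pack_signature(sample)
print(len(sample), "->", len(packed), "bytes")
```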

[20030504.0400] jonz: server-side signatures 

server-side token signatures (SSTS) have been implemented with an optional 
compile flag (set by default). using SSTS will eliminate long, annoying 
DSPAM signatures at the expense of server disk space. the signature appended 
to each email is replaced with a single comment to include a reference token. 
this also enables the complete set of tokens from a message to be recorded 
(although only the top 15 are used in actual calculation).   

compiling without SSTS mode enabled will only record 15 or 60 tokens from a
message, depending on whether more than 5 tokens are recognized.  SSTS mode
will record all tokens.  in either mode, only the most interesting 15 tokens
are used in the calculation.

[20030504.0400] jonz: chained tokens

chained tokens have been implemented providing several new analysis features. 
for example the text 'FREE FOR ALL' will parse into five tokens: 

FREE
FOR
ALL
FREE FOR
FOR ALL

this parsing is not specific to just words, but any type of valid token. 

...for more information.
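
the 'FREE FOR ALL' example can be reproduced with a few lines: every token
is kept, and each adjacent pair is added as one extra token.  chained_tokens
is a hypothetical name, and splitting on whitespace is a simplification of
the real tokenizer.

```python
def chained_tokens(text):
    """Return the single tokens followed by each adjacent pair."""
    words = text.split()
    pairs = [f"{left} {right}" for left, right in zip(words, words[1:])]
    return words + pairs


print(chained_tokens("FREE FOR ALL"))
# ['FREE', 'FOR', 'ALL', 'FREE FOR', 'FOR ALL']
```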

[20030504.0400] jonz: token precedence

words not appearing in the opposite corpus were previously assigned a 
probability of .99 or .01. now, priority is given to a token that appears 
more than ten times in a single corpus.  

[20030504.0400] jonz: token case

previously, tokens were case insensitive unless they were in all caps. now,
all tokens are case sensitive. 

[20030504.0400] jonz: short html tags

short HTML tags (less than 15 characters) are filtered out. this helps 
prevent false positives that could be caused by a lack of HTML-based email 
in an innocent corpus. it is normally not desirable behavior to assign a 
higher probability of spam to a message simply because it's in HTML, but we 
don't want to filter out all HTML so longer tags will still be tokenized. 

[20030503.0400] jonz: special tokens for urls

URLs are broken down into URL-specific tokens. for example, 
http://www.nuclearelephant.com/projects/dspam/ will be broken down into: 

Url*www
Url*nuclearelephant
Url*com
Url*projects
Url*dspam

this should help separate emails with suspicious URLs from emails with the 
same tokens outside of a URL.  
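
the breakdown above can be approximated as follows.  this is a sketch, not
the dspam tokenizer: url_tokens is a hypothetical name, only the host and
path are split (on dots and slashes), and query strings and port numbers
are not handled.

```python
import re
from urllib.parse import urlsplit


def url_tokens(url):
    """Split a URL's host and path into Url*-prefixed tokens."""
    parts = urlsplit(url)
    pieces = re.split(r"[./]", parts.netloc + parts.path)
    return ["Url*" + piece for piece in pieces if piece]


print(url_tokens("http://www.nuclearelephant.com/projects/dspam/"))
# ['Url*www', 'Url*nuclearelephant', 'Url*com', 'Url*projects', 'Url*dspam']
```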

[20030503.0400] jonz: misreported number of messages in quarantine

due to a small bug, the number of messages in a quarantine box could be 
misreported. this has been fixed. 

[20030503.0400] jonz: dspam signature change

the DSPAM signature of previous versions is unfortunately rewritten 
incorrectly by some email clients such as Microsoft Outlook. The signature 
has been modified, and the signature retrieval tool has been coded with more 
of a wildcard approach, to help avoid missing reversal information. 
this only applies to administrators running DSPAM outside of its default 
SSTS mode. 

[20030503.0400] jonz: closing html tags
 
some spams fail to close their /html tag in an attempt to evade some spam 
tools. DSPAM now closes the tag to avoid the dspam signature being ignored.

[20030503.0400] jonz: ignoring of useless header information

the 'Message-ID', 'Received' and 'Date' headers are now ignored; they 
seemed to be filling up more than half the tokens with useless information 

[20030503.0400] jonz: high ascii characters

tokens with high ASCII characters are now ignored 

[20030503.0400] jonz: forwarded message headers

dspam now ignores message headers for messages forwarded by user as spam with 
no identifiable signature.  this prevents irrelevant information from being
recorded, which could lead to any reply to the message being marked as a
false positive.
 
[20030503.0400] jonz: minor code cleanup for linux build

made some minor changes to code to build without warnings on linux

[20030503.0400] jonz: required use of long --addspam flag

the shortened flag for --addspam (-a) has been removed for compatibility 
with procmail (procmail uses -a). in order to use this latest build, 
all spam-box aliases (e.g. spam-bob) must be changed to --addspam. 

[20030503.0400] jonz: flag for chained tokens

added -DCHAINED_TOKENS (enabled by default) switch; those who don't have 
the extra disk space for chained tokens can now turn them off by removing
this compile flag.

[20030503.0400] jonz: debug rework

-DDEBUG now results in debug going to /tmp/dspam.debug 

Version 2.4.1; April 29 2003
----------------------------

[20030429.0000] jonz: dspam_signature tool addition

Added dspam_signature tool for decoding dspam signatures via commandline 

Version 2.4; April 27 2003
--------------------------

[20030427.0000] jonz: signature change

changed the signature to a base64-encoded, BEGIN/END delimited signature. 
people seem to feel more comfortable with it, as it resembles the signatures 
used with PGP, Server Certs, and other encrypted signatures...it's also 
less messy. 

[20030427.0000] jonz: false positive recall mechanism

in the unlikely event of a false positive, a mechanism is now available to 
reverse the information from the false positive and email the message to the 
user. this is made possible via a button while viewing a message in the 
user's quarantine box. 

[20030427.0000] jonz: base64 decoding

new code to Base64 Decode any encoded text segments. some SPAMs being sent 
out today are encoded in an attempt to bypass any filtering.  they are
now decoded prior to analysis and delivery.  this only applies to text 
segments (text/plain, text/html, etc.) and should not affect attachments. 
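
a sketch of the same idea using python's standard email package rather than
dspam's own decoder: only text/* parts are decoded, and other parts
(attachments) are skipped.  decoded_text_parts and the sample message are
hypothetical.

```python
import base64
from email import message_from_string


def decoded_text_parts(raw_message):
    """Return the decoded body of every text/* part in the message."""
    message = message_from_string(raw_message)
    texts = []
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue  # leave attachments and multipart containers alone
        # decode=True undoes the Content-Transfer-Encoding (e.g. base64)
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "utf-8"
        texts.append(payload.decode(charset, errors="replace"))
    return texts


encoded_body = base64.b64encode(b"Buy now!").decode("ascii")
raw = (
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + encoded_body + "\n"
)
print(decoded_text_parts(raw))
```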

Version 2.35; April 24 2003
---------------------------

[20030424.0000] jonz: makefile correction

Makefile.linux: -ldb -> -ldb1

[20030424.0000] jonz: prefixed from line

prefixed messages headed to quarantine with a 'From' header to make mailbox
format compliant.

[20030424.0000] jonz: quarantine box showing no spams

fixed a bug that kept caught spams from showing up in the quarantine box

Version 2.3; April 20 2003
--------------------------

[20030420.0000] jonz: token insertion bug

fixed a bug that occurs when inserting token information on some
multipart emails, which inserts it into the text/plain segment instead of
the text/html segment

Version 2.2; April 17 2003
--------------------------

[20030417.0000] jonz: reversal information

reversal information is now used in spams to reverse the original 15 tokens
(unlearn and relearn as spam).

Version 2.1; April 14 2003
--------------------------

[20030414.0000] jonz: production changes

applied 0.40 value to words with less than 5 hits
changed spam threshold from .8 to .9

[20030414.0000] jonz: attachments

repaired minor bug in filtering out attachments and html comments

Version 2.0; April 11 2003
--------------------------

Version 2 Initial release