Some random notes and considerations on providing a SNMP agent for amavisd-new. Two tasks need to be addressed: - devising and specifying a content-filtering MIB - providing access to the MIB (a SNMP agent) Devising and specifying a content-filtering MIB ----------------------------------------------- While designing a MIB, the following should be taken as a reference: - collect ideas from Mail Monitoring MIB (RFC 2789, aka MTA-MIB) As amavisd-new behaves in many respects as a true MTA, a subquestion is whether a SNMP agent should provide two subtrees: one a pure RFC 2789, the other a specialized amavisd-new MIB, or would a non-standard specialized amavisd-new MIB suffice? Would clients benefit from a MTA-MIB view of amavisd? - see what variables are currently collected by amavisd-new * by running amavisd-agent, or * by directly browsing a database by: db_dump -p /var/amavis/db/snmp.db * by checking the source for variables which only rarely show; - a MIB description must be compliant to the SMIv2 (which can be automatically converted to SMIv1 if needed). - consider potential future needs and broader use when designing a content-filtering MIB, so that it might be useful to other content filtering applications; The currently provided variables collected and provided by amavisd are mostly of types Counter32 and octet string. It is quite likely that the following types will be used in the future: integer, Gauge32, TimeTicks, and IpAddress. It it unlikely that Counter64 would be needed. The type of each variable is already provided in a database. Most variables are simple non-aggregate objects. Besides these there is a set of counters to be represented as a table, i.e. a table of encountered virus names and associated counts. Furthermore there is a set of counters of a form OpsDecType-xxx, where xxx could be one or more 'short content types', e.g. OpsDecType-audio.mpa.mp3. It is not obvious what would be a best representation of these. Some aggregate representation would probaby be desired. There are currently no variables of type Gauge32, although it is not hard to imagine that such could be useful in the near future. An example may be a current size of a quarantine or tempdir area, It is also easy to imagine that variables of a TimeInterval type would be useful (e.g. LastOutboundActivity, LastConnectToSQL). It is unlikely that traps would be added in a near future, and even less likely that write or create access would be needed for a variable. It is unclear whether a concept of 'groups' from MTA-MIB would be useful in the context of amavisd-new. Perhaps considering groups like (spam, virus, etc.) or groups according to a policy bank path selected? Would that be too far from the spirit of MTA-MIB groups? The view of amavisd-new variables should represent a running instance of amavisd-new as a single entity, aggregating information from all child processes into one set of variables. This aggregation is already provided by using a Berkeley db, so this has already been taken care of. This need for aggregation is also one of the reasons why a SNMP agent or a subagent can not be implemented in each child process. When amavisd is restarted, all counters in a database are reset, and the sysUpTime timeticks variable is also reset. This may be shown to SNMP clients unchanged, or a SNMP agent may decide to implement cumulative counters which survive across daemon restarts (proper unsigned wraparounds would need to be done correctly). If the agent also provides other MIBs of the system and supplies a 'system' MIB reflecting the machine uptime, it may be more logical to hide amavisd restarts to clients unless when this specific information is being sought. A SNMP agent - providing SNMP access to the MIB ----------------------------------------------- Writing a SNMP agent from scratch would be a nightmare. SNMP is a can of worms when one looks behind simple variable associations and descends into details. As far as I'm aware there isn't any Perl implementation or significant support for writing a SNMP agent (although there are several modules to support client-side). It also appears that the only noteworthy community efforts are concentrated around the Net-SNMP project, which has good and active tradition and offers a support for a range of Unix platforms. Net-SNMP http://www.net-snmp.org/ http://www.net-snmp.org/docs/FAQ.html It supports the SNMPv1, SNMPv2c and SNMPv3, which is what we need. It provides a library, a set of tools, a client and an extensible agent. Extending an existing agent (compared to writing one from scratch) hides us from the intricacies of the protocol and differences in its versions, access controls, MIB views, BER encodings, ... When deciding what mechanism should be used to extent the Net-SNMP agent a need to decouple Berkeley db accesses from SNMP client requests must be kept in view. Access to amavisd-new Berkeley database snmp.db requires careful attention to locking: on accessing a database a cursor lock must be requested, then whatever access is needed must be done fairly quickly and in bulk, then lock must be released. Locks must be relatively infrequent and not at a mercy of a frequency of SNMP client requests. Failing to do so can result in lock contentions between amavisd-new child processes and the agent and could slow down or block mail processing. Example of a proper way to access the database should be seen in the 'amavisd-agent' sample utility, including a proper way of masking signals while holding a lock. Net-SNMP offers several mechanisms to extend its SNMP agent: - calling external scripts; - adding static C code; - loading dynamic modules (dlmod); - embedded perl; - SMUX (RFC 1227) - Agent Extensibility (AgentX) Protocol (RFC 2741) Calling scripts is only useful for simple tasks. Adding static C code and recompiling is too impractical. Loading dynamic modules is more promising, although I see a drawback with ensuring the above mentioned decoupling between db access and SNMP requests. Embedded perl sounds nice but suffers from a similar problem, and is at the moment rather experimental. Nevertheless, the decoupling could be achieved by forking a subprocess and separating the two tasks. Multi-threading in Net-SNMP is not recommended. It seems that separating SNMP agent from the process which is accessing the amavisd-new database is needed one way or another. Besides forking a subprocess and implementing some kind of IPC, there are already two standard mechanisms provided: SMUX (RFC 1227) and AgentX (RFC 2741), of which the RFC 1227 has a status of historic, so what remains is AgentX. The idea is to split SNMP agent into a master agent, dealing with clients and the SNMP protocol, and one or more subagents, which are only concerned with implementing a MIB (binds OIDs within registered MIB to amavisd-new variables accessed through Berkeley database) and talking a simpler protocol to the master agent, while being shielded from dangers and intricacies of the 'real world'. A subagent is not concerned with BER encodings, SNMP protocol, views, access controls, etc. There is a wealth of documentation and examples on the subject. Here are the more relevant documents: RFC 2741 (includes introductory material) http://www.net-snmp.org/tutorial/tutorial-5/ http://www.net-snmp.org/tutorial/tutorial-5/demon/ http://www.net-snmp.org/tutorial/tutorial-5/toolkit/demon/ AGENT.txt README.agentx http://www.scguild.com/agentx/ - IETF Agentx Working Group Caching, consistent view of variables ------------------------------------- When deciding when to access the amavisd-new snmp.d database the following should be considered: SNMP requests for reading variables are often independent, possibly in short bursts. The agent has no control on how often and in what manner clients poll the MIB. There may also be one or several independent management stations, each doing its polling schedule. It could be unacceptable if each SNMP read request would require its independent lock/read/unlock access to a database. It is more efficient to access the database on controlled intervals and collect the whole set of variables within a single lock/unlock. This also insures that all variable values are consistent, as the amavisd daemon goes to great lengths to only do atomic updates and always maintain the database in a consistent state, reflecting state before/after each message processing, and never let clients view an intermediate state. The following strategy seems appropriate: - when short interval (say ten or twenty seconds) has passed since the last access to the database, agent supplies the requester a cached view; - when cached data expires, the next requests causes the database reading to be done, refreshing the whole cached MIB. This also gives a SNMP client a possibility to obtain a consistent snapshot of a databases despite using a stateless protocol and sending independent stream of get and get-next requests. The Net-SMTP does provide support for caching, and it remains to be checked whether the supplied mechanism is appropriate for our needs or has to be re-implemented.