Corrupt Database Files
======================

Author: Herbert Straub
e-mail: herbert@linuxhacker.at
Date: 2004-09-26

Situation:
----------
An attempt to fix database corruption errors, such as this Debian bug:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=52583
or this one (reported via the mailing list):

Aug 24 08:51:06 mail NewsCache[10564]: NewsCache Server Start
Aug 24 08:51:11 mail NewsCache[14658]: nnrpd: access entry name matched: 0.0.0.0/0.0.0.0
Aug 24 08:51:11 mail NewsCache[14658]: mail.thriftinternet.com [127.0.0.1] connect
Aug 24 08:51:11 mail NewsCache[14658]: NServer::NServer() hostname set to: mail.thriftinternet.com
Aug 24 08:51:11 mail NewsCache[14658]: nnrpd caught Error Exception!
    File: NVcontainer.cc
    Function: virtual void NVcontainer::make_current()
    Line: 135
    Desc: NVConatiner(19603): info-record shrunk
Aug 24 08:51:11 mail NewsCache[10564]: receiving signal SIGCHLD: 17

Improved Error Messages
-----------------------
Starting with NewsCache 1.2rc6, NVcontainer reports the exact reason for a
failure, i.e. which sanity check failed. At the moment, the improved error
message is only written to the log file if NewsCache was configured with
--enable-debug. All exceptions are caught in NewsCache.cc, but they are not
logged! -> This should be changed in the future.

Such a log entry in news.debug looks like:

Sep 26 23:35:25 karte-faultier NewsCache[31040]: Exception!
    File: NVcontainer.cc
    Function: virtual void NVcontainer::make_current()
    Line: 186
    Desc: NVConatiner::make_current(): wrong header version
    mem_fn: /var/cache/newscache/at/test/.db
    mem_hdr->version: 842215985

In news.err you will find:

Sep 26 23:35:25 karte-faultier NewsCache[31040]: NVcontainer::rename_to_bad_and_close: Error: bad database file: /var/cache/newscache/at/test/.db.bad

Automatic Failure Recovery
--------------------------
If NVcontainer detects a corrupt database, NewsCache tries to rename the
corrupt file to *.bad. Then the normal program flow continues.
The next user request for the newsgroup whose database was corrupted can then
start with a fresh data download from the remote server.

NewsCacheClean:
---------------
At the moment, newscacheclean stops with "Aborted" (SIGABRT signal). Starting
with NewsCache 1.2rc6, the bad database file is renamed to *.bad, so the next
run of newscacheclean will work.

In such a situation, the following sequence is logged to news.err:

Sep 26 23:25:07 ka NewsCacheClean[30955]: NVcontainer::rename_to_bad_and_close: Error: bad database file: /var/cache/newscache/at/test/.db.bad
Sep 26 23:25:07 ka NewsCacheClean[30955]: Error, please report bug! caught Error Exception!
    File: NVcontainer.cc
    Function: virtual void NVcontainer::make_current()
    Line: 186
    Desc: NVConatiner::make_current(): wrong header version
    mem_fn: /var/cache/newscache/at/test/.db
    mem_hdr->version: 842215985

Constraints and Future
----------------------
1) What happens with local newsgroups? (Renaming to *.bad is no solution.)
2) The improved error message is only logged if configured with --enable-debug.
3) newscacheclean: don't abort if the database file was renamed to *.bad.
4) NewsCache should log database failures to news.err.

Contribute
----------
You can help to improve this situation by:
o) reporting database corruption errors (mailing list)
o) sending patches to improve the source code.

Programming
-----------
Currently only the class NVcontainer is analyzed for sanity check failures,
and only in the method make_current. The new method rename_to_bad_and_close
is called only from make_current.
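
The check-then-rename idea described above can be sketched roughly as follows.
This is not the actual NVcontainer code: the header layout (DbHeader), the
expected version value (kExpectedVersion) and both function names are
hypothetical and chosen only for illustration. The sketch assumes the version
is stored as a 32-bit integer at the start of the file.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical on-disk header; the real NVcontainer layout is not shown here.
struct DbHeader {
    std::uint32_t version;
};

// Assumed expected version, used only for this illustration.
constexpr std::uint32_t kExpectedVersion = 1;

// Returns true if the file starts with a header carrying the expected version.
bool header_is_sane(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    DbHeader hdr{};
    if (!in.read(reinterpret_cast<char*>(&hdr), sizeof hdr)) {
        return false;  // file missing or too short to hold a header
    }
    return hdr.version == kExpectedVersion;
}

// Leaves a sane file untouched and returns true. Otherwise moves the file
// aside to "<path>.bad" and returns false, so that a later request can
// rebuild the database from scratch.
bool check_or_quarantine(const std::string& path) {
    if (header_is_sane(path)) {
        return true;
    }
    const std::string bad = path + ".bad";
    std::remove(bad.c_str());                // drop an older *.bad copy, if any
    std::rename(path.c_str(), bad.c_str());  // may fail if the file is missing
    return false;
}
```

A caller would run check_or_quarantine() before using the database file and,
on a false result, treat the newsgroup as not yet cached. As noted under
"Constraints and Future", this approach does not help for local newsgroups,
where the data cannot be downloaded again.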