Sophie

Sophie

distrib > Fedora > 15 > i386 > by-pkgid > ade7210802baa7dbca216b600ec3dfba > files > 16

newscache-1.2-0.12.rc6.fc15.i686.rpm

			     Corrupt Database Files
			     ======================

							  Author: Herbert Straub
						  e-mail: herbert@linuxhacker.at
								Date: 2004-09-26

Situation:
----------

Try to fixing database corruption errors, such as this Debian bug:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=52583

or this one (reported via mailinglist):
Aug 24 08:51:06 mail NewsCache[10564]: NewsCache Server Start Aug 24
08:51:11 mail NewsCache[14658]: nnrpd: access entry name matched:
0.0.0.0/0.0.0.0 Aug 24 08:51:11 mail NewsCache[14658]:
mail.thriftinternet.com [127.0.0.1] connect Aug 24 08:51:11 mail
NewsCache[14658]: NServer::NServer() hostname set to:
mail.thriftinternet.com Aug 24 08:51:11 mail NewsCache[14658]: nnrpd caught
Error Exception! File: NVcontainer.cc Function: virtual void
NVcontainer::make current() Line: 135
Desc: NVConatiner(19603): info-record shrunk
Aug 24 08:51:11 mail NewsCache[10564]: receiving signal SIGCHLD: 17



Improved Error Messages
-----------------------

Starting with NewsCache 1.2rc6. NVcontainer reporting the exact reason, which
sanity check fails. At the moment, the improved error message will only be
protocolled to the logfile, if configured with --enable-debug. All Exception
will be catched in NewsCache.cc, but not be protocolled! -> This should be
changed in the future. Such a protocol entry in news.debug looks like:

Sep 26 23:35:25 karte-faultier NewsCache[31040]: Exception! File:
NVcontainer.cc Function: virtual void NVcontainer::make_current() Line: 186
Desc: NVConatiner::make_current(): wrong header version mem_fn: /v
ar/cache/newscache/at/test/.db mem_hdr->version: 842215985

In news.err you will find:
Sep 26 23:35:25 karte-faultier NewsCache[31040]:
NVcontainer::rename_to_bad_and_close: Error: bad databa se file:
/var/cache/newscache/at/test/.db.bad



Automatic Failure Recovery
--------------------------

If NVcontainer detect a corrupt database, then NewsCache try to rename the
corrupt file to *.bad. Then the old programm flow continues. The next User
request to the newsgroup, which was curropted, can start with a now
datadownload from the remote server.



NewsCacheClean:
---------------

At the moment, newscacheclean stops with "Aborted" (SIGABRT signal). Starting
with NewsCache 1.2rc6, the bad database file will be renamed to *.bad. With
the next run of newscacheclean, it will work.

The following sequenz will be protocolled in such a situation in news.err:

Sep 26 23:25:07 ka NewsCacheClean[30955]: NVcontainer::rename_to_bad_and_close: Error: bad d
atabase file: /var/cache/newscache/at/test/.db.bad
Sep 26 23:25:07 ka NewsCacheClean[30955]: Error, please report bug! caught Error Exception!
File: NVcontainer.cc Function: virtual void NVcontainer::make_current() Line: 186 Desc: NVConatiner::mak
e_current(): wrong header version mem_fn: /var/cache/newscache/at/test/.db mem_hdr->version: 842215985



Constraints and Future
----------------------

1) What happens with local newsgroups. (rename to bad is no solution).
2) Improved Error Message only be protocolled, if configured with
   --enable-debug
3) newscacheclean: don't abort, if the database file renamed to *.bad
4) NewsCache should be protocolled database failure to news.err



Contribute
----------

You can help to improve this situation, with:
o) report database curroption errors (mailinglist)
o) sending patches, to improve the source code.



Programming
-----------
Currently only the class NVcontainer is be analyzed for sanity check failures;
method make_current. The new method rename_to_bad_and_close will be called
only from make_current.