CHANGELOG for smartmontools

$Id: CHANGELOG,v 1.103 2003/03/13 15:34:37 ballen4705 Exp $

Copyright (C) 2002-3 Bruce Allen <smartmontools-support@lists.sourceforge.net>

Home page of code is: http://smartmontools.sourceforge.net

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.

You should have received a copy of the GNU General Public License (for
example COPYING); if not, write to the Free Software Foundation, Inc., 675
Mass Ave, Cambridge, MA 02139, USA.

This code was originally developed as a Senior Thesis by Michael Cornwell
at the Concurrent Systems Laboratory (now part of the Storage Systems
Research Center), Jack Baskin School of Engineering, University of
California, Santa Cruz. http://ssrc.soe.ucsc.edu/

Maintainers/Developers Key:
[BA] Bruce Allen <smartmontools-support@lists.sourceforge.net>
[EB] Erik Inge Bolsø <knan@mo.himolde.no>
[SB] Stanislav Brabec <sbrabec@suse.cz>
[PC] Peter Cassidy <pcassidy@mac.com>
[FM] Frederic L. W. Meunier <0@pervalidus.net>
[PW] Phil Williams <phil@subbacultcha.demon.co.uk>
[DG] Douglas Gilbert <dougg@torque.net>

NOTES FOR FUTURE RELEASES: see TODO file.

CURRENT RELEASE (see VERSION file in this directory):

<ADDITIONS TO THE CHANGE LOG SHOULD BE ADDED HERE, PLEASE>

smartmontools-5.1-9

  [BA] smartctl: if the HDIO_DRIVE_TASK ioctl() is not implemented (no
       kernel support), then try to assess drive health by examining
       Attribute values/thresholds directly.

  [BA] smartd/smartctl: added -v 200,writeerrorcount option/Directive
       for Fujitsu disks.

  [BA] smartd: Now send email if any of the SMART commands fails, or if
       open()ing the device fails. This is often noted as a common disk
       failure mode.

  [BA] smartd/smartctl: Added -v N,raw8 -v N,raw16 and -v N,raw48
       Directives/Options for printing Raw Attributes in different
       Formats.
  [BA] smartd: Added -r ID and -R ID for reporting/tracking Raw values
       of Attributes.

  [BA] smartd/smartctl: Changed printing of spin-up-time attribute raw
       value to reflect current/average as per IBM standard.

  [BA] smartd/smartctl: Added -v 9,seconds option for disks which use
       Attribute 9 for power-on lifetime in seconds.

  [BA] smartctl: Added a warning message so that users of some IBM
       disks are warned to update their firmware. Note: we may want to
       add a command-line flag to disable the warning messages. I have
       done this in a general way, using regexp, so that we can add
       warnings about any type of disk that we wish...

smartmontools-5.1-7

  [BA] smartd: Created a subdirectory examplescripts/ of the source
       directory that contains executable scripts for the -M exec PATH
       Directive of smartd.

smartmontools-5.1-5

  [BA] smartd: DEVICESCAN in /etc/smartd.conf can now be followed by
       all the same Directives as a regular device name like /dev/hda
       takes. This allows one to use (for example):
         DEVICESCAN -m root@yoyodyne.com
       in the /etc/smartd.conf file.

  [BA] smartd: Added -c (--checkonce) command-line option. This checks
       all devices once, then exits. The exit status can be used to
       learn if devices were detected, and if smartd is functioning
       correctly. This is primarily for Distribution scripters.

  [BA] smartd: Implemented -M exec Directive for smartd.conf. This
       makes it possible to run an arbitrary script or mailing program
       with the -m option.

  [PW] smartd: Modified -M Directive so that it can be given multiple
       times. Added -M exec Directive.

smartmontools-5.1-4

  [BA] Fixed bug in smartctl pointed out by Pierre Gentile. -d scsi
       didn't work because tryata and tryscsi were reversed -- now
       works on /devfs SCSI devices.

  [BA] Fixed bug in smartctl pointed out by Gregory Goddard
       <ggoddard@ufl.edu>. The manual says that bit 6 of the return
       value is turned on if errors are found in the SMART error log,
       but this wasn't implemented.
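The bit-6 fix above concerns the smartctl return value, which is treated as a bitmask. As a rough illustration only, a calling program might test that bit along the following lines. The names ERRLOG_BIT and status_bit_set are hypothetical and do not come from the smartmontools source; only the bit position (6) is taken from the entry above.

```c
/* Hypothetical name: bit position of the "errors found in SMART error
 * log" flag, as described in the 5.1-4 entry above. */
#define ERRLOG_BIT 6

/* Return 1 if the given bit is set in an exit status value, else 0.
 * This is a generic bitmask test, not code from smartctl itself. */
int status_bit_set(int exit_status, int bit)
{
    return (exit_status >> bit) & 1;
}
```

For instance, an exit status of 0x40 has only bit 6 set, so status_bit_set(0x40, ERRLOG_BIT) yields 1, while 0x3f yields 0.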
smartmontools-5.1-3

  [BA] Modified printing format for 9,minutes to read Xh+Ym not
       X h + Y m, so that fields are fixed width.

  [BA] Added Attribute 240 "head flying hours"

smartmontools-5.1.1

  [BA] As requested, local time/date now printed by smartctl -i

  [PW] Added "help" argument to -v for smartctl

  [PW] Added -D, --showdirectives option to smartd

  [DG] add '-l selftest' capability for SCSI devices (update smartctl.8)

  [BA] smartd,smartctl: added additional Attribute modification option
       -v 220,temp and -v 9,temp.

  [PW] Renamed smartd option -X to -d

START OF SMARTMONTOOLS 5.1 series

smartmontools-5.0.50

  [PW] Changed smartd.conf Directives -- see man page

  [BA/DG] Fixed uncommented comment in smartd.conf

  [DG] Correct 'Recommended start stop count' for SCSI devices

  [PW] Replaced smartd.conf directive -C with smartd option -i

  [PW] Changed options for smartctl -- see man page.

  [BA] Use strerror() to generate system call error messages.

  [BA] smartd: fflush() all open streams before fork().

  [BA] smartctl, smartd: simplified internal handling of checksums for
       simpler porting and less code.

smartmontools-5.0.49

  [PW] smartd --debugmode changed to --debug

  [BA] smartd/smartctl added attribute 230 Head Amplitude from
       IBM DPTA-353750.

  [PW] Added list of proposed new options for smartctl to README.

  [PW] smartd: ParseOpts() now uses getopt_long() if HAVE_GETOPT_LONG
       is defined and uses getopt() otherwise. This is controlled by
       CPPFLAGS in the Makefile.

  [BA] smartd: Fixed a couple of error messages done with perror() to
       redirect them as needed.

smartmontools-5.0.48

  [BA] smartctl: The -O option to enable an Immediate off-line test did
       not print out the correct time that the test would take to
       complete. This is because the test timer is volatile and not
       fixed. This has been fixed, and the smartctl.8 man page has been
       updated to explain how to track the Immediate offline test as it
       progresses, and to further emphasize the differences between the
       off-line immediate test and the self-tests.
  [BA] smartd/smartctl: Added new attribute (200) Multi_Zone_Error_Rate

  [BA] smartctl: modified so that arguments could have either a single
       - as in -ea or multiple ones as in -e -a. Improved warning
       message for device not opened, and fixed error in redirection of
       error output of HD identity command.

  [PW] smartd: added support for long options. All short options are
       still supported; see manpage for available long options.

  [BA] smartctl: when a raw Attribute value was 2^31 or larger, it did
       not print correctly.

smartmontools-5.0.46

  [BA] smartd: added smartd.conf Directives -T and -s. The -T Directive
       enables/disables Automatic Offline Testing. The -s Directive
       enables/disables Attribute Autosave. Documentation and example
       configuration file updated to agree.

  [BA] smartd: user can make smartd check the disks at any time (i.e.,
       interrupt sleep) by sending signal SIGUSR1 to smartd. This can
       be done for example with:
         kill -USR1 <pid>
       where <pid> is the process ID number of smartd.

  [EB] scsi: don't trust the data we receive from the drive too much.
       It very well might have errors (like zero response length). Seen
       on Megaraid logical drive, and verified in the driver source.

  [BA] smartd: added Directive -m for sending test email and for
       modifying email reminder behavior. Updated manual, and sample
       configuration file to illustrate & explain this.

  [BA] smartd: increased size of a continued smartd.conf line to 1023
       characters.

  [BA] Simplified Directive parsers and improved warning/error messages.

smartmontools-5.0.45

  [EB] Fixed bug in smartd where inverted testunitready logic prevented
       functioning on scsi devices. The bug in question only affects
       smartd users with scsi devices. To see if your version of smartd
       has the testunitready() bug, do
         smartd -V
       If the version of the module smartd.c in a line like:
         Module: smartd.c revision: 1.66 date: 2002/11/17
       has a revision greater than or equal to 1.30, and less than or
       equal to 1.64, then your version of the code has this problem.
       This problem affected releases starting with RELEASE_5_0_16 up
       to and including RELEASE_5_0_43.

  [BA] Added testunitnotready to smartctl for symmetry with smartd.

  [SB] added Czech descriptions to .spec file

  [SB] corrected comment in smartd.conf example

  [BA] Changed way that entries in the ATA error log are printed, to
       make it clearer which is the most recent error and which is the
       oldest one.

NOTE: All changes made prior to this point were done by Bruce Allen [BA]
although several of them had been suggested by earlier postings by
Stanislav Brabec [SB].

smartmontools-5.0.43

  Changed Temperature_Centigrade to Temperature_Celsius. The term
  "Centigrade" ceased to exist in 1948.
  (cf. http://www.bartleby.com/64/C004/016.html)

smartmontools-5.0.42

  Modified SCSI device check to also send warning emails if requested
  in directives file.

  Added a new smartd configuration file Directive: -M ADDRESS. This
  sends a single warning email to ADDRESS for failures or errors
  detected with the -c, -L, -l, or -f Directives.

smartmontools-5.0.38

  Modified perror() statements in atacmds.c so that printout for SMART
  command errors is properly suppressed or queued depending upon the
  user's choices for error reporting modes.

  Added Italian descriptions to smartmontools.spec file.

  Started implementing send-mail-on-error for smartd; not yet enabled.

  Added -P (Permissive) Directive to smartd.conf file to allow SMART
  monitoring of pre-ATA-3 Rev 4 disks that have SMART but do not have a
  SMART capability bit.

  Removed charset encodings from smartmontools.spec file for
  non-English fields.

smartmontools-5.0.32

  Added manual page smartd.conf.5 for configuration file.

  smartctl: Missing ANSI prototype in failuretest(); fixed.

  smartctl: Checksum warnings now printed on stdout, or are silent,
  depending upon -q and -Q settings.

smartmontools-5.0.31

  Changed Makefile so that the -V option does not reflect file state
  before commit!
  smartctl: added new options -W, -U, and -P to control whether and how
  smartctl exits if an error is detected in a SMART data structure
  checksum, or if a SMART command returns an error.

  modified manual page to break options into slightly more logical
  categories.

  reformatted 'usage' message order to agree with man page ordering

  modified .spec file so that locale information now contains character
  set definition. Changed pt_BR to pt since we do not use any aspect
  other than language. See man setlocale.

smartmontools-5.0.30

  smartctl: added new options -n and -N to force device to be ATA or
  SCSI

  smartctl: no longer dies silently if device path does not start with
  /dev/X

  smartctl: now handles arbitrary device paths

smartmontools-5.0.29

  Modified .spec file and Makefile to make them more compliant with the
  "right" way of doing things.

smartmontools-5.0.26

  Fixed typesetting error in man page smartd.8

  Removed redundant variable (harmless) from smartd.c

smartmontools-5.0.25

  Added a new directive for the configuration file. If the word
  DEVICESCAN appears before any non-commented material in the
  configuration file, then the config file will be ignored and the
  devices will be scanned.

smartmontools-5.0.24

  Note: it has now been confirmed that the code modifications between
  5.0.23 and 5.0.24 have eliminated the GCC 3.2 problems. Note that
  there is a GCC bug however, see #848 at
  http://gcc.gnu.org/cgi-bin/gnatsweb.pl?database=gcc&cmd=query

  Added new Directive for Configuration file: -C <N>
  This sets the time in between disk checks to be <N> seconds apart.
  Note that although you can give this Directive multiple times on
  different lines of the configuration file, only the final value that
  is given has an effect, and applies to all the disks. The default
  value of <N> is 1800 sec, and the minimum allowed value is ten
  seconds.

  Problem wasn't the print format. F.L.W. Meunier <0@pervalidus.net>
  sent me a gcc 3.2 build and I ran it under a debugger.
  The problem seems to be with passing the very large (2x512+4) byte
  data structures as arguments. I never liked this anyway; it was
  inherited from smartsuite. So I've changed all the heavyweight
  functions (ATA ones, anyway) to just passing pointers, not hideous
  kB size structures on the stack. Hopefully this will now build OK
  under gcc 3.2 with any sensible compilation options.

smartmontools-5.0.23

  Because of reported problems with GCC 3.2 compile, I have gone
  through the code and explicitly changed all print format parameters
  to correspond EXACTLY to int unless they have to be promoted to long
  longs. To quote from the glibc bible: [From GLIBC Manual: Since the
  prototype doesn't specify types for optional arguments, in a call to
  a variadic function the default argument promotions are performed on
  the optional argument values. This means the objects of type char or
  short int (whether signed or not) are promoted to either int or
  unsigned int, as appropriate.]

smartmontools-5.0.22

  smartd, smartctl now warn if they find an attribute whose ID number
  does not match between Data and Threshold structures.

  Fixed nasty bug which led to wrong number of arguments for a varargs
  statement, with attendant stack corruption. Sheesh! Have added script
  to CVS attic to help find such nasties in the future.

smartmontools-5.0.21

  Eliminated some global variables out of header files and other minor
  cleanup of smartd.

smartmontools-5.0.20

  Did some revision of the man page for smartd and made the usage
  messages for Directives 100% consistent.

smartmontools-5.0-19

  smartd: prints warning message when it gets SIGHUP, saying that it is
  NOT re-reading the config file.

  smartctl: updated man page to say self-test commands -O,x,X,s,S,A
  appear to be supported in the code. [I can't test these, can anyone
  report?]

smartmontools-5.0-18

  smartctl: smartctl would previously print the LBA of a self-test if
  it completed, and the LBA was not 0 or 0xff...f. However, according
  to the specs this is not correct.
  According to the specs, if the self-test completed without error then
  LBA is undefined. This version fixes that. LBA value only printed if
  self-test encountered an error.

smartmontools-5.0-17

  smartd has changed significantly. This is the first CVS checkin of
  code that extends the options available for smartd. The following
  options can be placed into the /etc/smartd.conf file, and control the
  behavior of smartd.

  Configuration file Directives (following device name):
    -A     Device is an ATA device
    -S     Device is a SCSI device
    -c     Monitor SMART Health Status
    -l     Monitor SMART Error Log for changes
    -L     Monitor SMART Self-Test Log for new errors
    -f     Monitor for failure of any 'Usage' Attributes
    -p     Report changes in 'Prefailure' Attributes
    -u     Report changes in 'Usage' Attributes
    -t     Equivalent to -p and -u Directives
    -a     Equivalent to -c -l -L -f -t Directives
    -i ID  Ignore Attribute ID for -f Directive
    -I ID  Ignore Attribute ID for -p, -u or -t Directive
    #      Comment: text after a hash sign is ignored
    \      Line continuation character

  cleaned up functions used for printing CVS IDs. Now use string
  library, as it should be.

  modified length of device name string in smartd internal structure to
  accommodate max length device name strings

  removed un-implemented (-e = Email notification) option from command
  line arg list. We'll put it back on when implemented.

  smartd now logs serious (fatal) conditions in its operation at
  loglevel LOG_CRIT rather than LOG_INFO before exiting with error.

  smartd used to open a file descriptor for each SMART enabled device,
  and then keep it open the entire time smartd was running. This meant
  that some commands, like IOREADBLKPART did not work, since the fd to
  the device was open. smartd now opens the device when it needs to
  read values, then closes it. Also, if one time around it can't open
  the device, it simply prints a warning message but does not give up.
  Have eliminated the .fd field from data structures -- no longer gets
  used.
  smartd now opens SCSI devices as well using O_RDONLY rather than
  O_RDWR. If someone can no longer monitor a SCSI device that used to
  be readable, this may well be the reason why.

  smartd never checked if the number of ata or scsi devices detected
  was greater than the max number it could monitor. Now it does.

smartmontools-5.0-16

  smartd on startup now looks in the configuration file
  /etc/smartd.conf for a list of devices to include in its monitoring
  list. See man page (man smartd) for syntax.

  smartd: close file descriptors of SCSI device if not SMART capable.
  Closes ALL file descriptors after forking to daemon.

  added new temperature attribute (231, temperature)

  smartd: now open ATA disks using O_RDONLY

smartmontools-5.0-11

  smartd now prints the name of a failed or changed attribute into
  logfile, not just ID number

  Changed name of -p (print version) option to -V

  Minor change in philosophy: if a SMART command fails or the device
  appears incapable of a SMART command that the user has asked for,
  complain by printing an error message, but go ahead and try anyway.
  Since unimplemented SMART commands should just return an error but
  not cause disk problems, this shouldn't cause any difficulty.

  Added two new flags: q and Q. q is quiet mode - only print:
    For the -l option, errors recorded in the SMART error log;
    For the -L option, errors recorded in the device self-test log;
    For the -c option, SMART "disk failing" status or device attributes
      (pre-failure or usage) which failed either now or in the past;
    For the -v option, device attributes (pre-failure or usage) which
      failed either now or in the past.
  Q is Very Quiet mode: Print no output. The only way to learn about
  what was found is to use the exit status of smartctl.

  smartctl now returns sensible values (bitmask). See smartctl.h for
  the values, and the man page for documentation.

  The SMART status check now uses the correct ATA call. If failure is
  detected we search through attributes to list the failed ones.
  If the SMART status check shows GOOD, we then look to see if there
  are any usage attributes or prefail attributes that have failed at
  any time. If so we print them.

  Modified function that prints vendor attributes to say if the
  attribute has currently failed or has ever failed.

  -p option now prints out license info and CVS strings for all modules
  in the code, nicely formatted.

  Previous versions of this code (and Smartsuite) only generate SMART
  failure errors if the value of an attribute is below the threshold
  and the prefailure bit is set. However the ATA Spec (ATA4 <=Rev 4)
  says that it is a SMART failure if the value of an attribute is LESS
  THAN OR EQUAL to the threshold and the prefailure bit is set. This is
  now fixed in both smartctl and smartd. Note that this is a troubled
  subject -- the original SFF 8035i specification defining SMART was
  inconsistent about this. One section says that Attribute==Threshold
  is pass, and another section says it is fail. However the ATA specs
  are consistent and say Attribute==Threshold is a fail.

  smartd did not print the correct value of any failing SMART
  attribute. It printed the index in the attribute table, not the
  attribute ID. This is fixed.

  when starting self-tests in captive mode ioctl returns EIO because
  the drive has been busied out. Detect this and don't return an error
  in this case. Check that this is correct (or how to fix it?)

  fixed possible error in how to determine ATA standard support for
  devices with no ATA minor revision number.

  device opened only in read-only not read-write mode. Don't need R/W
  access to get smart data. Check this with Andre.

  smartctl now handles all possible choices of "multiple options"
  gracefully. It goes through the following phases of operation, in
  order: INFORMATION, ENABLE/DISABLE, DISPLAY DATA, RUN/ABORT TESTS.
  Documentation has been updated to explain the different phases of
  operation. Control flow through ataPrintMain() simplified.
  If reading device identity information fails, try seeing if the info
  can be accessed using a "DEVICE PACKET" command. This way we can at
  least get device info.

  Modified Makefile to automatically tag CVS archive on issuance of a
  release

  Modified drive detection so minor device ID code showing ATA-3 rev 0
  (no SMART) is known to not be SMART capable.

  Now verify the checksum of the device ID data structure, and of the
  attributes threshold structure. Before neither of these structures
  had their checksums verified.

  New behavior vis-a-vis checksums. If they are wrong, we log warning
  messages to stdout, stderr, and syslog, but carry on anyway. All
  functions now call a checksumwarning routine if the checksum doesn't
  vanish as it should.

  Changed Read Hard Disk Identity function to get fresh info from the
  disk on each call rather than to use the values that were read upon
  boot-up into the BIOS. This is the biggest change in this release.
  The ioctl(device, HDIO_GET_IDENTITY, buf ) call should be avoided in
  such code. Note that if people get garbled strings for the model,
  serial no and firmware versions of their drives, then blame goes here
  (the BIOS does the byte swapping for you, apparently!)

  Function ataSmartSupport now looks at correct bits in drive identity
  structure to verify first that these bits are valid, before using
  them.

  Function ataIsSmartEnabled() written which uses the Drive ID state
  information to tell if SMART is enabled or not. We'll carry this
  along for the moment without using it.

  Function ataDoesSmartWork() guaranteed to work if the device supports
  SMART.

  Replace some numbers by #define MACROS

  Wrote Function TestTime to return test time associated with each
  different type of test.

  Thinking of the future, have added a new function called
  ataSmartStatus2(). Eventually when I understand how to use the
  TASKFILE API and am sure that this works correctly, it will replace
  ataSmartStatus().
  This queries the drive directly to see if the SMART status is OK,
  rather than comparing thresholds to attribute values ourselves. But I
  need to get some drives that fail their SMART status to check it.

smartmontools-5.0-10

  Removed extraneous space before printing in some error messages

  Fixed additional typos in documentation

  Fixed some character buffers that were too short for their contents.

smartmontools-5.0-9

  Put project home path into all source files near the top

  Corrected typos in the documentation

  Modified Makefile so that Mandrake Cooker won't increment version
  number (unless they happen to be working on my machine, which I
  doubt!)

smartmontools-5.0-8:

  For IBM disks whose raw temp data includes three temps, print all
  three

  print timestamps for error log to msec precision

  added -m option for Hitachi disks that store power on life in minutes

  added -L option for printing self-test error logs

  in -l option, now print power on lifetime, so that one can see when
  the error took place

  updated SMART structure definitions to ATA-5 spec

  added -p option

  added -f and -F options to enable/disable autosave threshold
  parameters

  changed argv parsing to use getopt -- eliminate buffer overflow
  vulnerability

  expanded and corrected documentation

  fixed problem with smartd. It did not actually call
  ataSmartEnable()! Since the argument was left out, the test always
  succeeded because it evaluated to a pointer to the function.

  smartd: closed open file descriptors if device does not support
  smart. Note: this still needs to be fixed for SCSI devices

smartmontools-5.0-0

  STARTED with smartsuite-2.1-2