Difference between revisions of "S.M.A.R.T."

From ArchWiki
(Test the device health)
(19 intermediate revisions by 9 users not shown)
Line 1: Line 1:
 
[[Category:Storage]]
 
[[Category:Storage]]
{{i18n|SMART}}
+
S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) is a supplementary component built into many modern storage devices through which devices monitor, store, and analyze the health of their operation. Statistics are collected (temperature, number of reallocated sectors, seek errors...) which software can use to measure the health of a device, predict possible device failure, and provide notifications on unsafe values.
Self-Monitoring, Analysis, and Reporting Technology, or [http://en.wikipedia.org/wiki/S.M.A.R.T. S.M.A.R.T.], is a monitoring system built on some computer hard disks that stores (and on some hard drives detects) various indicators (SMART Attributes) for the hard disks reliability. It does so in the attempt to detect/prevent anticipatory failures.
+
  
SMART disks may either need software to update SMART Attributes on the hard drive or will include some built-in self tests to detect and protect the hard drive (see manufacturer details).
+
== Smartmontools ==
  
This article describes how to install the package smartmontools and use its programs to monitor your hard disk(s).
+
The smartmontools package contains two utility programs ({{ic|smartctl}} and {{ic|smartd}}) to analyze and monitor storage devices.  Install {{Pkg|smartmontools}} from the [[Official Repositories|official repositories]].
  
==Installing==
+
=== Detect if device has SMART support ===
For the command line version:
+
{{bc|# pacman -S smartmontools}}
+
  
For the GUI version:
+
To check if the device has SMART capability (it may be necessary to add {{ic|-d ata}} to specify it is an ATA derived device):
{{bc|# pacman -S gsmartcontrol}}
+
  
==Disk support==
+
# smartctl -i /dev/<device>
Check if your disk(s) support SMART.
+
  
IDE-disks:
+
(where <device> is {{ic|sda, hda,...}}).  This will give general information about the device, the last two lines will show if it is supported:
{{bc|# smartctl -i /dev/hda}}
+
  
SATA-disks:
+
SMART support is: Available - device has SMART capability.
{{bc|# smartctl -i -d ata /dev/sda}}
+
SMART support is: Enabled
  
If it is, you will see:
+
If SMART is not enabled, it can be enabled by doing:
"SMART support is: Available - device has SMART capability."
+
"SMART support is: Enabled"
+
  
If SMART is not enabled you can enable it by:
+
# smartctl -s on /dev/<device>
  
IDE-disks:
+
=== Test the device health ===
{{bc|# smartctl -s on /dev/hda}}
+
  
SATA-discs:
+
Three types of health tests can be performed on the device (all are safe for user data):
{{bc|# smartctl -s on -d ata /dev/sda}}
+
  
==Test the hard drive==
+
# Short (runs tests that have a high probability of detecting device problems)
Your SMART hard disk may have built-in self-tests that do some checks to record the state of the hard disk and may optionally protect it from common problems (i.e. bad blocks). If you do not know for certain, have smartctl run tests to update the SMART Attributes on your hard disk.
+
# Extended (or Long; a short check with complete disk surface examination)
 +
# Conveyance (identifies if damage incurred during transportation of the device)
  
To know first which tests the device supports:
+
To view the device's available tests and the time it will take to perform each test do:
{{bc|# smartctl -c /dev/<your-hard-disk>}}
+
  
To run the test:
+
# smartctl -c /dev/<device>
{{bc|# smartctl -t offline /dev/<your-hard-disk>}}
+
  
smartctl will tell you how long the test will take. When it is finished you can check the health status to see if there are any problems.
+
To run the tests do:
  
{{Note| Most smartctl commands require the -d ata or -d sat options to identify drives that are not IDE.}}
+
# smartctl -t short /dev/<device>
 +
# smartctl -t long /dev/<device>
 +
# smartctl -t conveyance /dev/<device>
  
==Health status check==
+
==== Results ====
SMART hard disks keep a record of the hard disks health status that can be checked with:
+
{{bc|# smartctl -H /dev/sda}}
+
  
If the hard drive status is healthy it will return the status as: 'PASSED'.
+
To view the test's overall health status (compiled from all tests):
  
If the device reports a failing health status, it means that the device has either already failed, or is predicting its own failure within the next 24 hours. Append the "-a" option to get more information.
+
# smartctl -H /dev/<device>
  
To see if the SMART sensor has detected any errors, look at the SMART Error Log:
+
To view the test's result errors:
{{bc|# smartctl -l error /dev/sda}}
+
  
If "No Errors Logged" is printed, your hard drive is likely healthy. If there are a few errors this may or may not indicate a problem and you should investigate the matter further. Generally when a drive starts to fail it is best practice to backup its data and replace the hard drive.
+
# smartctl -l selftest /dev/<device>
  
{{Note| See {{Ic|man smartctl}} for other tests and more information.}}
+
To view the test's detailed results:
  
==Automatically monitor your drives==
+
# smartctl -a /dev/<device>
smartmontools includes a daemon that will check and update your hard disks status and can optionally mail you of any potential problems. The smart daemon can be edited for more exact configuration in {{ic|/etc/smartd.conf}}.
+
  
If the configuration is not edited, smartd will run tests periodically on ''all'' possible SMART Attributes on all devices it detects. The first non-commented entry in the configuration file (DEVICESCAN) will have smartd ignore the remaining lines in the configuration file, and will scan for devices. For devices with an [http://en.wikipedia.org/wiki/S.M.A.R.T.#S.M.A.R.T._information ATA Attachment], if no options are configured, the daemon will use the '-a' option by default (monitor all SMART properties) on all hard drives.
+
If no errors are reported, the device is likely healthy. A few errors may or may not indicate a problem and should be investigated further.  When a device starts to fail, it is recommended to back up the data and replace it.
  
In order to have the drives checked only when they are not in standby (hence avoid them to spin up unnecessarily), you may add the following options:
+
=== Monitor devices ===
  DEVICESCAN -n standby,q -a
+
 
 +
Devices can be monitored in the background with use of the smartmontools daemon that will check devices periodically and optionally email any potential problems.  To have devices monitored on boot, enable smartd service:
 +
 
 +
  systemctl enable smartd.service
 +
 
 +
The smart daemon can be edited for more exact configuration in {{ic|/etc/smartd.conf}} (the configuration is well commented) otherwise all tests are run on all devices. Or, each device can be specified and all tests run by doing (uuid's and device ID can be used for more exact matching):
  
To make a configuration for individual devices, you have to comment the line with DEVICESCAN and add a configuration line for every device. It is recommended to reference devices by the ID (which is derived from the device's model and serial number) rather than a udev name, since the latter is not guaranteed to refer to the same physical device across reboots. Here is an example:
 
 
  #DEVICESCAN
 
  #DEVICESCAN
  /dev/disk/by-id/scsi-SATA_Hitachi_HTS5432090609FB22015CCNRD6A -a -m root@localhost
+
  /dev/<device> -a
 +
 
 +
Other options include:
 +
 
 +
* {{ic|-n standby,q}} to run diagnostics only when device is spun-up.
 +
* Details about smartd operations can be found in: {{ic|/var/log/daemon.log}}.
  
This will monitor all attributes and send an email to root@localhost if a failure or new error occurs. To be able to send internal mail, you need a mail sender (like [[SSMTP]] or [[Msmtp]]) installed, or a mail server (MTA Message Transfer Agent) like sendmail or [[Postfix Local Mail]]. More examples are given in the configuration file.
+
==== Email potential problems ====
  
Once you can send mails out, you can change the root@localhost by your actual email address:
+
To have an email sent when a failure or new error occurs, use the {{ic|-m}} option:
DEVICESCAN -n standby,q -a -m myuser@gmail.com
+
  
[[Daemon#Performing daemon actions manually|Start the smartd daemon]] and add smartd to your [[Daemons#Starting on Boot|DAEMONS array]] so it starts automatically on boot.
+
DEVICESCAN -m address@domain.com
  
Last but not least, if you used the -m option to get mail notifications, you should test that the mail alert works fine. To do so, simply add the '-M test' option to the configuration line and restart smartd daemon:
+
To send the email externally (i.e. not to the root mail account), a mail sender such as [[msmtp|MSMTP]] or [[SSMTP]], or a full MTA (Mail Transport Agent) such as sendmail or [[Postfix]], needs to be installed and configured.
  DEVICESCAN -n standby,q -a -m myuser@gmail.com -M test
+
  
To restart the daemon:
+
Once the mail agent is set up, the {{ic|-M test}} option can be used to check that an email is actually sent (restart the daemon to trigger the test message):
{{bc|# /etc/rc.d/smartd restart}}
+
  
The mail test result can be seen in your mailbox (be a bit patient) but also in {{ic|/var/log/daemon.log}} :
+
  DEVICESCAN -m address@domain.com -M test
  Feb  1 20:07:11 localhost smartd[2306]: Monitoring 3 ATA and 0 SCSI devices
+
Feb  1 20:07:11 localhost smartd[2306]: Executing test of mail to myuser@gmail.com ...
+
Feb  1 20:07:14 localhost smartd[2306]: Test of mail to myuser@gmail.com: successful
+
  
Once the test succeeded, do not forget to remove the '-M test' option.
+
==== Power management ====
  
Smartd uses "mail" (mailx) to send messages, which expects sendmail to be installed. If you use [[Msmtp]], tell mail to use it:
+
If you use a computer under control of power management, you should instruct smartd how to handle disks in low power mode. Usually, in response to SMART commands issued by smartd, the disk platters are spun up. So if this option is not used, then a disk which is in a low-power mode may be spun up and put into a higher-power mode when it is periodically polled by smartd.
{{hc|/etc/mail.rc|2=set sendmail=/usr/bin/msmtp}}
+
  
==Other tips==
+
DEVICESCAN -n standby,15,q
Other tips that can help setup smartmontool programs
+
  
===Get SMART details===
+
More info on [http://sourceforge.net/apps/trac/smartmontools/wiki/Powermode smartmontools wiki].
You can get all SMART details of your drive with:
+
  
IDE-disks:
+
=== GUI Applications ===
{{bc|# smartctl -a /dev/hda}}
+
  
SATA-discs:
+
* {{App|Gsmartcontrol|A GNOME frontend for the smartctl hard disk drive health inspection tool|http://gsmartcontrol.berlios.de/home/index.php/en/Home|{{Pkg|gsmartcontrol}}}}
{{bc|# smartctl -a -d ata /dev/sda}}
+
  
== External links ==
+
== Resources ==
  
 
* [http://smartmontools.sourceforge.net/ Smartmontools Homepage]
 
* [http://smartmontools.sourceforge.net/ Smartmontools Homepage]
 
* [https://help.ubuntu.com/community/Smartmontools Smartmontools on Ubuntu Wiki]
 
* [https://help.ubuntu.com/community/Smartmontools Smartmontools on Ubuntu Wiki]

Revision as of 04:13, 28 April 2013

S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) is a supplementary component built into many modern storage devices through which devices monitor, store, and analyze the health of their operation. Statistics are collected (temperature, number of reallocated sectors, seek errors...) which software can use to measure the health of a device, predict possible device failure, and provide notifications on unsafe values.

Smartmontools

The smartmontools package contains two utility programs (smartctl and smartd) to analyze and monitor storage devices. Install smartmontools from the official repositories.

Detect if device has SMART support

To check whether the device has SMART capability (it may be necessary to add -d ata to specify that it is an ATA-derived device):

# smartctl -i /dev/<device>

(where <device> is sda, hda, and so on). This prints general information about the device; the last two lines show whether SMART is supported and enabled:

SMART support is: Available - device has SMART capability.
SMART support is: Enabled
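
The check above can be scripted. The following is a minimal sketch, not part of smartmontools: the function name smart_ready is made up for illustration, and it assumes the output of smartctl -i contains the two lines exactly as quoted above.

```shell
# Reads `smartctl -i` output on stdin.
# Succeeds only if SMART is reported as both available and enabled.
# smart_ready is a hypothetical helper name, not a smartmontools command.
smart_ready() {
    info=$(cat)
    printf '%s\n' "$info" | grep -q 'SMART support is: Available' &&
        printf '%s\n' "$info" | grep -q 'SMART support is: Enabled'
}

# Possible usage (needs root and a real device, so it is left commented out):
#   smartctl -i /dev/sda | smart_ready || smartctl -s on /dev/sda
```

Reading from stdin keeps the helper independent of the device name and of any -d option that a particular drive may need.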

If SMART is not enabled, enable it with:

# smartctl -s on /dev/<device>

Test the device health

Three types of health tests can be performed on the device (all are safe for user data):

  1. Short (runs tests that have a high probability of detecting device problems)
  2. Extended (or Long; a short check with complete disk surface examination)
  3. Conveyance (identifies whether damage was incurred during transportation of the device)

To view the tests the device supports and the time each one takes, run:

# smartctl -c /dev/<device>

To run the tests do:

# smartctl -t short /dev/<device>
# smartctl -t long /dev/<device>
# smartctl -t conveyance /dev/<device>
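
Since a test runs in the background on the drive, it helps to know how long to wait before reading the results. The sketch below extracts the recommended polling time from smartctl -c output. It assumes the layout commonly printed for ATA drives, where a line such as "Short self-test routine" is followed by a line "recommended polling time: ( 2) minutes."; the function name polling_minutes is hypothetical.

```shell
# Reads `smartctl -c` output on stdin and prints the recommended polling
# time, in minutes, for the routine named in $1 (Short, Extended, Conveyance).
# Prints nothing if the routine is not listed.
polling_minutes() {
    awk -v kind="$1" '
        # Remember that the heading for the requested routine was seen.
        index($0, kind " self-test routine") > 0 { found = 1; next }
        # The first polling-time line after that heading holds the value.
        found && /recommended polling time/ {
            gsub(/[^0-9]/, "")   # keep only the digits
            print
            exit
        }
    '
}

# Possible usage (needs root and a real device, so it is left commented out):
#   smartctl -t short /dev/sda
#   sleep "$(( $(smartctl -c /dev/sda | polling_minutes Short) * 60 ))"
#   smartctl -l selftest /dev/sda
```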

Results

To view the device's overall health status:

# smartctl -H /dev/<device>

To view the self-test log, including any errors the tests found:

# smartctl -l selftest /dev/<device>

To view all SMART information for the device, including the detailed test results:

# smartctl -a /dev/<device>

If no errors are reported, the device is likely healthy. A few errors may or may not indicate a problem and should be investigated further. When a device starts to fail, it is recommended to back up the data and replace it.
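
For unattended checks, the health verdict can be turned into a simple decision. This is a rough sketch under the assumption that a healthy ATA drive reports a result line containing the word PASSED in smartctl -H output; other device types may word it differently. The function name health_verdict is made up for illustration.

```shell
# Reads `smartctl -H` output on stdin and prints a one-word verdict.
# Anything that does not contain PASSED is flagged for a closer look.
health_verdict() {
    if grep -q 'PASSED'; then
        echo healthy
    else
        echo investigate
    fi
}

# Possible usage (needs root and a real device, so it is left commented out):
#   [ "$(smartctl -H /dev/sda | health_verdict)" = healthy ] ||
#       smartctl -a /dev/sda
```

Treating every non-PASSED result as "investigate" is deliberately conservative: a missing or unexpected result line leads to a manual check rather than a false sense of safety.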

Monitor devices

Devices can be monitored in the background by the smartmontools daemon, smartd, which checks devices periodically and can optionally send email about potential problems. To have devices monitored from boot, enable the smartd service:

# systemctl enable smartd.service

smartd is configured in /etc/smartd.conf (the file is well commented). If the file is left unedited, all tests are run on all devices that smartd detects. Alternatively, comment out DEVICESCAN and list each device on its own line with -a to run all tests on it (UUIDs and device IDs can be used for more exact matching):

#DEVICESCAN
/dev/<device> -a
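
Per-device lines that use stable device IDs can be drafted with a short script. The sketch below is only a starting point: the function name smartd_lines is hypothetical, it assumes persistent names live in /dev/disk/by-id, that partition links end in -partN, and that wwn-* links duplicate other entries. Its output should be reviewed before it is pasted into /etc/smartd.conf, since one disk may still appear under more than one name.

```shell
# Prints one candidate smartd.conf line ("<path> -a") per whole-disk link
# found in the given directory (default: /dev/disk/by-id).
smartd_lines() {
    dir=${1:-/dev/disk/by-id}
    for path in "$dir"/*; do
        [ -e "$path" ] || continue        # directory empty or missing
        case $path in
            *-part[0-9]*) continue ;;     # partition link, not a whole disk
            */wwn-*)      continue ;;     # assumed duplicate of another link
        esac
        printf '%s -a\n' "$path"
    done
}

# Possible usage: review the output, then copy the wanted lines by hand.
#   smartd_lines
```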

Other options include:

  • -n standby,q to run diagnostics only when the device is already spun up.
  • Details about smartd operations are logged to /var/log/daemon.log.

Email potential problems

To have an email sent when a failure or new error occurs, use the -m option:

DEVICESCAN -m address@domain.com

To send the email externally (i.e. not to the root mail account), a mail sender such as MSMTP or SSMTP, or a full MTA (Mail Transport Agent) such as sendmail or Postfix, needs to be installed and configured.

Once the mail agent is set up, the -M test option can be used to check that an email is actually sent (restart the daemon to trigger the test message):

DEVICESCAN -m address@domain.com -M test
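
Besides waiting for the message to arrive, the result of the mail test can be looked up in the log. The sketch below assumes smartd logs to /var/log/daemon.log and writes a line of the form "smartd[PID]: Test of mail to ADDRESS: successful" when the test message is handed off; the function name mail_test_ok is hypothetical.

```shell
# Reads log text on stdin.
# Succeeds if it records a successful smartd mail test.
mail_test_ok() {
    grep -q 'smartd.*Test of mail to .*: successful'
}

# Possible usage after restarting smartd (left commented out):
#   mail_test_ok < /var/log/daemon.log &&
#       echo 'Mail test succeeded; remove -M test from /etc/smartd.conf'
```

Once the test has succeeded, the -M test option should be removed again so that a test message is not sent on every restart of the daemon.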

Power management

If the computer uses power management, tell smartd how to handle disks in low-power mode. Disks usually spin up in response to SMART commands issued by smartd, so without the -n option a disk in a low-power mode may be spun up every time smartd polls it. The following line skips the check while a disk is in sleep or standby mode, forces a check after 15 consecutive skips, and (with q) suppresses the log message for skipped checks:

DEVICESCAN -n standby,15,q

More information is available on the smartmontools wiki.

GUI Applications

  • Gsmartcontrol — A GNOME frontend for the smartctl hard disk drive health inspection tool
http://gsmartcontrol.berlios.de/home/index.php/en/Home || gsmartcontrol

Resources