S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) is a supplementary component built into many modern storage devices through which devices monitor, store, and analyze the health of their operation. Statistics are collected (temperature, number of reallocated sectors, seek errors...) which software can use to measure the health of a device, predict possible device failure, and provide notifications on unsafe values.
- 1 Smartmontools
- 2 GUI Applications
- 3 Resources
The smartmontools package contains two utility programs for analyzing and monitoring storage devices: smartctl and smartd. Install smartmontools from the official repositories to use these tools.
SMART support must be available and enabled on each storage device to use these tools effectively. You can use smartctl to check for and enable SMART support. That done, you can manually run a test and view test results, or you can use smartd to automatically run tests and email notifications.
smartctl is a command-line tool that "controls the Self-Monitoring, Analysis and Reporting Technology (SMART) system built into most ATA/SATA and SCSI/SAS hard drives and solid-state drives."
The --info (or -i) option prints a variety of information about a device, including whether SMART is available and enabled:
# smartctl --info /dev/sda | grep 'SMART support is:'
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
If SMART is available but not enabled, you can enable it:
# smartctl --smart=on /dev/<device>
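For scripting, the check above can be wrapped in a small helper. The following is a minimal sketch, not part of smartmontools: the function name smart_support_state is hypothetical, and it assumes the "SMART support is:" lines look like the sample output above, with "Disabled" and "Unavailable" as the other possible values.

```shell
#!/bin/sh
# Hypothetical helper: classify SMART support from `smartctl --info` output
# read on standard input. Assumes "SMART support is:" lines as shown above.
smart_support_state() {
    awk '
        /^SMART support is:/ {
            if ($0 ~ /Unavailable/)   state = "unavailable"
            else if ($0 ~ /Disabled/) state = "disabled"
            else if ($0 ~ /Enabled/)  state = "enabled"
            else if ($0 ~ /Available/ && state == "") state = "available"
        }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Example use (as root):
#   smartctl --info /dev/sda | smart_support_state
```

A result of "disabled" suggests running smartctl --smart=on on the device, while "unknown" means no "SMART support is:" line was found at all.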
You may need to specify a device type. For example, specifying --device=ata tells smartctl that the device type is ATA, which prevents smartctl from issuing SCSI commands to that device.
Run a test
There are three types of self-tests that a device can execute (all are safe to user data):
- Short (runs tests that have a high probability of detecting device problems)
- Extended (or Long; a short check with complete disk surface examination)
- Conveyance (identifies whether damage was incurred during transportation of the device)
The -c (or --capabilities) flag prints which tests a device supports and the approximate execution time of each test. For example:
# smartctl -c /dev/sda
[…]
Short self-test routine recommended polling time: ( 1) minutes.
Extended self-test routine recommended polling time: ( 74) minutes.
Conveyance self-test routine recommended polling time: ( 2) minutes.
[…]
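When deciding how long to wait before looking at test results, the polling times can be pulled out of the capabilities output. This is a rough sketch under stated assumptions: selftest_minutes is a hypothetical name, and the parsing assumes each "recommended polling time" entry follows a "... self-test routine" label, either on the same line or on the line before it.

```shell
#!/bin/sh
# Hypothetical helper: print "<test name> <minutes>" for each self-test
# listed in `smartctl -c` output read on standard input.
selftest_minutes() {
    awk '
        # Remember which test the next polling time belongs to.
        /self-test routine/ { name = $1 }
        /recommended polling time:/ {
            # Take the text between the parentheses, e.g. "(  74)".
            if (match($0, /\([^)]*\)/)) {
                n = substr($0, RSTART + 1, RLENGTH - 2)
                gsub(/[^0-9]/, "", n)   # keep only the digits
                print name, n
            }
        }
    '
}

# Example use (as root):
#   smartctl -c /dev/sda | selftest_minutes
```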
Use the -t (or --test=<test_name>) flag to run a test:
# smartctl -t short /dev/<device>
# smartctl -t long /dev/<device>
# smartctl -t conveyance /dev/<device>
View test results
You can view a device's overall health with the -H flag. "If the device reports failing health status, this means either that the device has already failed, or that it is predicting its own failure within the next 24 hours. If this happens […] get your data off the disk and to someplace safe as soon as you can."
# smartctl -H /dev/<device>
You can also view a list of recent test results and detailed information about a device:
# smartctl -l selftest /dev/<device>
# smartctl -a /dev/<device>
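In scripts, it can be more convenient to act on the exit status of smartctl than to parse its text output. The exit status is a bit mask; the sketch below decodes it using descriptions paraphrased from the smartctl(8) man page. The function name decode_smartctl_status is hypothetical, and the wording of each message is an approximation, so check the man page of your installed version.

```shell
#!/bin/sh
# Hypothetical helper: explain a smartctl exit status (a bit mask).
# Messages are paraphrased from smartctl(8), not copied from it.
decode_smartctl_status() (
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems reported"
    else
        bit=0
        for msg in \
            "command line did not parse" \
            "device open failed" \
            "a SMART command failed or returned a checksum error" \
            "SMART status check returned DISK FAILING" \
            "prefail attributes at or below threshold" \
            "attributes at or below threshold at some time in the past" \
            "device error log contains records of errors" \
            "self-test log contains records of errors"
        do
            # Report every bit that is set in the status value.
            if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
                echo "bit $bit: $msg"
            fi
            bit=$((bit + 1))
        done
    fi
)

# Example use (as root):
#   smartctl -H /dev/sda >/dev/null; decode_smartctl_status $?
```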
The smartd daemon monitors SMART statuses and emits notifications when something goes wrong. It can be managed with systemd and configured using the /etc/smartd.conf configuration file. The configuration file syntax is esoteric, and this wiki page provides only a quick reference. For more complete information, read the examples and comments within the configuration file, or read the smartd.conf(5) man page (execute man 5 smartd.conf).
To start the daemon, check its status, make it auto-start on system boot and read recent log file entries:
# systemctl start smartd
# systemctl status smartd
# systemctl enable smartd
# journalctl -u smartd
Define the devices to monitor
To monitor for all possible SMART errors on all disks:
DEVICESCAN -a
To monitor for all possible SMART errors on /dev/sda and /dev/sdb, and ignore all other devices:
/dev/sda -a
/dev/sdb -a
To monitor for all possible SMART errors on externally connected disks (USB backup disks, for example), it is prudent to give smartd the UUID of the device, since the /dev/sdX name of the drive might change between reboots.
First, get the UUID of the disk to monitor. List /dev/disk/by-uuid/ and look for the disk you want to monitor:
ls -lah /dev/disk/by-uuid/
lrwxrwxrwx 1 root root 9 Nov 5 22:41 820cdd8a-866a-444d-833c-1edb0f4becac -> ../../sde
lrwxrwxrwx 1 root root 10 Nov 5 22:41 b51b87f3-425e-4fe7-883f-f4ff1689189e -> ../../sdf2
lrwxrwxrwx 1 root root 9 Nov 5 22:42 ea2199dd-8f9f-4065-a7ba-71bde11a462c -> ../../sda
lrwxrwxrwx 1 root root 10 Nov 5 22:41 fe9e886a-8031-439f-a909-ad06c494fadb -> ../../sdf1
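In this example, the USB disk was attached as /dev/sde during boot. To tell smartd to monitor that disk, use its /dev/disk/by-uuid/ path instead of the /dev/sdX name:
/dev/disk/by-uuid/820cdd8a-866a-444d-833c-1edb0f4becac -a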
Now your USB disk will be monitored even if the /dev/sdX path changes during reboot.
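Looking up the UUID link by eye works for one disk; for several, the lookup can be scripted. The following is a minimal sketch: uuid_path_for is a hypothetical helper, and it assumes a readlink that supports -f (GNU coreutils and BusyBox do). The optional second argument exists only so the function can be tried against a directory other than /dev/disk/by-uuid.

```shell
#!/bin/sh
# Hypothetical helper: print the /dev/disk/by-uuid link that points at a
# given device node. Usage: uuid_path_for DEVICE [LINK_DIRECTORY]
uuid_path_for() (
    dir=${2:-/dev/disk/by-uuid}
    target=$(readlink -f "$1") || exit 1
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        # Compare fully resolved paths so relative links such as ../../sde match.
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
            exit 0
        fi
    done
    exit 1   # no link points at the device
)

# Example use: print a smartd.conf line for the disk currently at /dev/sde
#   path=$(uuid_path_for /dev/sde) && printf '%s -a\n' "$path"
```

The function runs in a subshell, so its exit calls end only the lookup, not the calling script.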
Email potential problems
To have an email sent when a failure or new error occurs, use the -m option:
DEVICESCAN -m email@example.com
To send the email externally (i.e. not to the root mail account), an MTA (Mail Transport Agent) or an MUA (Mail User Agent) needs to be installed and configured. Common MTAs are sendmail and Postfix; msmtp and SSMTP are lightweight send-only alternatives that relay mail to an external server. It is enough to simply configure S-nail if you do not want anything else.
The -M test option causes a test email to be sent each time the smartd daemon starts:
DEVICESCAN -m firstname.lastname@example.org -M test
Email can take a long time to be delivered, but when a hard drive fails you want to be informed immediately so that you can take appropriate action. It is therefore better to have smartd execute a script rather than only email the problem:
DEVICESCAN -m email@example.com -M exec /usr/local/bin/smartdnotify
To send an email and a system notification, put something like this into /usr/local/bin/smartdnotify:
#!/bin/sh
# Send mail
echo "$SMARTD_MESSAGE" | mail -s "$SMARTD_FAILTYPE" "$SMARTD_ADDRESS"
# Notify user
wall "$SMARTD_MESSAGE"
If the computer is under the control of power management, use the -n directive to tell smartd how to handle disks in low-power mode. In response to the SMART commands issued by smartd, disk platters are usually spun up, so without this directive a disk in a low-power mode may be spun up and put into a higher-power mode each time smartd polls it. The following skips the check while a disk is in sleep or standby mode, but never skips more than 15 checks in a row, and suppresses the log message about skipped checks (q):
DEVICESCAN -n standby,15,q
More info on smartmontools wiki.
smartd can tell disks to perform self-tests on a schedule. The following /etc/smartd.conf configuration will start a short self-test every day between 2 and 3 am, and an extended self-test weekly on Saturdays between 3 and 4 am:
DEVICESCAN -s (S/../.././02|L/../../6/03)
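The argument to -s is an extended regular expression that, according to smartd.conf(5), is compared against strings of the form T/MM/DD/d/HH: test type, month, day of month, day of week (1 is Monday, 7 is Sunday) and hour. A pattern can be tried out before it goes into the configuration file. The sketch below is only an approximation of what smartd does: tests_due is a hypothetical helper, it anchors the pattern to the whole string, and it checks only the L, S, C and O test types.

```shell
#!/bin/sh
# Hypothetical helper: list the self-test types whose T/MM/DD/d/HH string
# matches a smartd -s regular expression at a given time.
# Arguments: REGEX MM DD WEEKDAY(1=Monday..7=Sunday) HH
tests_due() {
    for type in L S C O; do
        if printf '%s\n' "$type/$2/$3/$4/$5" | grep -Eq "^($1)\$"; then
            printf '%s\n' "$type"
        fi
    done
}

# Example use: which tests would the schedule above start right now?
#   tests_due '(S/../.././02|L/../../6/03)' \
#       "$(date +%m)" "$(date +%d)" "$(date +%u)" "$(date +%H)"
```

For the schedule above, Saturday 5 November at 03:00 gives L, any day at 02:00 gives S, and a Monday at 03:00 gives no output.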
Alert on temperature changes
smartd can track disk temperatures and alert if they rise too quickly or hit a high limit. The following will log changes of 4 degrees Celsius or more, log when the temperature reaches 35 degrees, and log and email a warning when the temperature reaches 40 degrees:
DEVICESCAN -W 4,35,40
To check the current temperature of a disk:
# smartctl -A /dev/<device> | grep Temperature_Celsius
If disks need different limits, replace DEVICESCAN and define a separate configuration for each device with appropriate temperature settings.
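To choose sensible limits, it helps to see where a disk's current temperature falls relative to candidate INFO and CRIT values. This is a minimal sketch with hypothetical helper names (current_temp, classify_temp). It assumes the ATA attribute table layout in which the attribute name is in column 2 and the raw value in column 10, which does not hold for every device.

```shell
#!/bin/sh
# Hypothetical helpers for comparing a disk temperature with -W limits.

# Print the raw Temperature_Celsius value from `smartctl -A` output on stdin.
# Assumes the attribute name is in column 2 and the raw value in column 10.
current_temp() {
    awk '$2 == "Temperature_Celsius" { print $10; exit }'
}

# classify_temp TEMP INFO CRIT
# Mirrors the INFO and CRIT limits of -W DIFF,INFO,CRIT: a limit counts as
# reached when the temperature is greater than or equal to it.
classify_temp() {
    if [ "$1" -ge "$3" ]; then
        echo critical
    elif [ "$1" -ge "$2" ]; then
        echo info
    else
        echo ok
    fi
}

# Example use (as root), with the limits from the example above:
#   classify_temp "$(smartctl -A /dev/sda | current_temp)" 35 40
```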
Putting together all of the above gives the following example configuration:
- DEVICESCAN (smartd scans for disks and monitors all it finds)
- -a (monitor all attributes)
- -o on (enable automatic online data collection)
- -S on (enable automatic attribute autosave)
- -n standby,q (do not check if disk is in standby, and suppress log message to that effect so as not to cause a write to disk)
- -s ... (schedule short and long self-tests)
- -W ... (monitor temperature)
- -m ... (mail alerts)
DEVICESCAN -a -o on -S on -n standby,q -s (S/../.././02|L/../../6/03) -W 4,35,40 -m <username or email>
- Gsmartcontrol: a GNOME frontend for the smartctl hard disk drive health inspection tool
- http://gsmartcontrol.sourceforge.net (also packaged in the AUR)