Difference between revisions of "S.M.A.R.T."

From ArchWiki
(updated man page links (interactive))
(Tag: wiki-scripts)
 
(87 intermediate revisions by 34 users not shown)
Line 1: Line 1:
[[Category:HOWTOs (English)]]
+
[[Category:Storage]]
[[Category:File systems (English)]]
+
[[ja:S.M.A.R.T.]]
{{i18n|SMART}}
+
S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) is a supplementary component built into many modern storage devices through which devices monitor, store, and analyze the health of their operation.  Statistics are collected (temperature, number of reallocated sectors, seek errors...) which software can use to measure the health of a device, predict possible device failure, and provide notifications on unsafe values.
Self-Monitoring, Analysis, and Reporting Technology, or [http://en.wikipedia.org/wiki/S.M.A.R.T. S.M.A.R.T.], is a monitoring system built on some computer hard disks that stores (and on some hard drives detects) various indicators (SMART Attributes) for the hard disks reliability. It does so in the attempt to detect/prevent anticipatory failures.
 
  
SMART disks may either need software to update SMART Attributes on the hard drive or will include some built-in self tests to detect and protect the hard drive (see manufacturer details).
+
== Smartmontools ==
  
This article describes how to install the package smartmontools and use its programs to monitor your hard disk(s).
+
The smartmontools package contains two utility programs for analyzing and monitoring storage devices: {{ic|smartctl}} and {{ic|smartd}}. [[Install]] the {{Pkg|smartmontools}} package to use these tools.
  
==Installing==
+
SMART support must be available and enabled on each storage device to effectively use these tools. You can use [[#smartctl]] to check for and enable SMART support. That done, you can manually [[#Run a test]] and [[#View test results]], or you can use [[#smartd]] to automatically run tests and send email notifications.
For the command line version:
 
pacman -S smartmontools
 
  
For the GUI version:
+
=== smartctl ===
pacman -S gsmartcontrol
 
  
==Disk support==
+
smartctl is a command-line tool that "controls the  Self-Monitoring, Analysis and Reporting Technology (SMART) system built into most ATA/SATA and SCSI/SAS hard drives and solid-state drives."
Check if your disk(s) support SMART.
 
  
IDE-disks:
+
The {{ic|-i}}/{{ic|--info}} option prints a variety of information about a device, including whether SMART is available and enabled:
smartctl -i /dev/hda
 
  
SATA-disks:
+
  # smartctl --info /dev/sda | grep 'SMART support is:'
  smartctl -i -d ata /dev/sda
+
SMART support is: Available - device has SMART capability.
 +
SMART support is: Enabled
  
If it is, you will see:
+
If SMART is available but not enabled, you can enable it:
"SMART support is: Available - device has SMART capability."
 
"SMART support is: Enabled"
 
  
If SMART is not enabled you can enable it by:
+
# smartctl --smart=on /dev/<device>
  
IDE-disks:
+
You may need to specify a device type. For example, specifying {{ic|1=--device=ata}} tells smartctl that the device type is ATA, and this prevents smartctl from issuing SCSI commands to that device.
smartctl -s on /dev/hda
 
  
SATA-discs:
+
==== Run a test ====
smartctl -s on -d ata /dev/sda
 
  
==Test the hard drive==
+
There are three types of self-tests that a device can execute (all are safe to user data):
Your SMART hard disk may have built-in self-tests that do some checks to record the state of the hard disk and may optionally protect it from common problems (i.e. bad blocks). If you do not know for certain, have smartctl run tests to update the SMART Attributes on your hard disk.
 
  
To know first which tests the device supports:
+
* Short (runs tests that have a high probability of detecting device problems)
smartctl -c /dev/<your-hard-disk>
+
* Extended (or Long; a short check with complete disk surface examination)
 +
* Conveyance (identifies damage incurred during transportation of the device)
  
To run the test:
+
The {{ic|-c}}/{{ic|--capabilities}} flag prints which tests a device supports and the approximate execution time of each test. For example:
smartctl -t offline /dev/<your-hard-disk>
 
  
smartctl will tell you how long the test will take. When it is finished you can check the health status to see if there are any problems.
+
# smartctl -c /dev/sda
 +
[…]
 +
Short self-test routine
 +
recommended polling time:        (  1) minutes.
 +
Extended self-test routine
 +
recommended polling time:        (  74) minutes.
 +
Conveyance self-test routine
 +
recommended polling time:        (  2) minutes.
 +
[…]
  
==Health status check==
+
Use the {{ic|-t}}/{{ic|1=--test=<test_name>}} flag to run a test:
SMART hard disks keep a record of the hard disks health status that can be checked with:
 
smartctl -H /dev/sda
 
  
If the hard drive status is healthy it will return the status as: 'PASSED'.
+
# smartctl -t short /dev/<device>
 +
# smartctl -t long /dev/<device>
 +
# smartctl -t conveyance /dev/<device>
  
If the device reports a failing health status, it means that the device has either already failed, or is predicting its own failure within the next 24 hours. Append the "-a" option to get more information.
+
==== View test results ====
  
To see if the SMART sensor has detected any errors, look at the SMART Error Log:
+
You can view a device's overall health with the {{ic|-H}} flag. "If the device reports failing health status, this means either that the device has already failed, or that it is predicting its own failure within the next 24 hours. If this happens […] get your data off the disk and to someplace safe as soon as you can."
smartctl -l error /dev/sda
 
  
If "No Errors Logged" is printed, your hard drive is likely healthy. If there are a few errors this may or may not indicate a problem and you should investigate the matter further. Generally when a drive starts to fail it is best practice to backup its data and replace the hard drive.
+
# smartctl -H /dev/<device>
  
See the man page for other tests and more information.
+
You can also view a list of recent test results and detailed information about a device:
  
==Automatically monitor your drives==
+
# smartctl -l selftest /dev/<device>
smartmontools includes a daemon that will check and update your hard disks status and can optionally mail you of any potential problems. The smart daemon can be edited for more exact configuration in {{Filename|/etc/smartd.conf}}.
+
# smartctl -a /dev/<device>
  
If the configuration is not edited, smartd will run tests periodically on ''all'' possible SMART Attributes on all devices it detects. The first non-commented entry in the configuration file (DEVICESCAN) will have smartd ignore the remaining lines in the configuration file, and will scan for devices. For devices with an [http://en.wikipedia.org/wiki/S.M.A.R.T.#S.M.A.R.T._information ATA Attachment], if no options are configured, the daemon will use the '-a' option by default (monitor all SMART properties) on all hard drives.
+
=== smartd ===
  
In order to have the drives checked only when they are not in standby (hence avoid them to spin up unnecessarily), you may add the following options:
+
The smartd daemon monitors SMART statuses and emits notifications when something goes wrong. It can be managed with systemd and configured using the {{ic|/etc/smartd.conf}} configuration file. The configuration file syntax is esoteric, and this wiki page provides only a quick reference. For more complete information, read the examples and comments within the configuration file, or read {{man|5|smartd.conf}}.
DEVICESCAN -n standby,q -a
 
  
To make a configuration for individual devices, you have to comment the line with DEVICESCAN and add a configuration line for every device. Here is an example:
+
==== Daemon management ====
#DEVICESCAN
 
/dev/hda -a -m root@localhost
 
  
This will monitor all attributes and send an email to root@localhost if a failure or new error occurs. To be able to send internal mail, you need a mail sender (like [[SSMTP]] or [[Msmtp]]) installed, or a mail server (MTA Message Transfer Agent) like sendmail or [[Postfix Local Mail]]. More examples are given in the configuration file.
+
To run the daemon and have it start on boot, [[start/enable]] the {{ic|smartd.service}} systemd unit; its status and recent log entries are then available through systemctl and journalctl.
  
Once you can send mails out, you can change the root@localhost by your actual email address:
+
smartd respects all the usual systemctl and journalctl commands. For more information on using systemctl and journalctl, see [[Systemd#Using units]] and [[Systemd#Journal]].
DEVICESCAN -n standby,q -a -m myuser@gmail.com
 
  
To start the daemon:
+
==== Define the devices to monitor ====
/etc/rc.d/smartd start
 
  
If everything is working as it should, you can add smartd to your DAEMONS array in {{Filename|/etc/rc.conf}}.
+
To monitor for all possible SMART errors on all disks:
  
Last but not least, if you used the -m option to get mail notifications, you should test that the mail alert works fine. To do so, simply add the '-M test' option to the configuration line and restart smartd daemon:
+
{{hc|/etc/smartd.conf|DEVICESCAN -a}}
DEVICESCAN -n standby,q -a -m myuser@gmail.com -M test
 
  
To restart the daemon:
+
To monitor for all possible SMART errors on {{ic|/dev/sda}} and {{ic|/dev/sdb}}, and ignore all other devices:
/etc/rc.d/smartd restart
 
  
The mail test result can be seen in your mailbox (be a bit patient) but also in {{Filename|/var/log/daemon.log}} :
+
{{hc|/etc/smartd.conf|
Feb  1 20:07:11 localhost smartd[2306]: Monitoring 3 ATA and 0 SCSI devices
+
/dev/sda -a
Feb  1 20:07:11 localhost smartd[2306]: Executing test of mail to myuser@gmail.com ...
+
/dev/sdb -a
Feb  1 20:07:14 localhost smartd[2306]: Test of mail to myuser@gmail.com: successful
+
}}
  
Once the test succeeded, do not forget to remove the '-M test' option.
+
To monitor for all possible SMART errors on externally connected disks (USB backup disks spring to mind), it is prudent to identify the device to smartd by UUID, since the drive's /dev/sdX name might change across reboots.
  
==Other tips==
+
First, find the UUID of the disk to monitor by running {{ic|ls -lah /dev/disk/by-uuid/}} and looking for the disk you want to monitor:
Other tips that can help setup smartmontool programs
+
{{hc|ls -lah /dev/disk/by-uuid/|
 +
lrwxrwxrwx 1 root root  9 Nov  5 22:41 820cdd8a-866a-444d-833c-1edb0f4becac -> ../../sde
 +
lrwxrwxrwx 1 root root  10 Nov  5 22:41 b51b87f3-425e-4fe7-883f-f4ff1689189e -> ../../sdf2
 +
lrwxrwxrwx 1 root root  9 Nov  5 22:42 ea2199dd-8f9f-4065-a7ba-71bde11a462c -> ../../sda
 +
lrwxrwxrwx 1 root root  10 Nov  5 22:41 fe9e886a-8031-439f-a909-ad06c494fadb -> ../../sdf1
 +
}}
  
===Get SMART details===
+
In this example the USB disk is attached as /dev/sde. To tell smartd to monitor that disk, use its {{ic|/dev/disk/by-uuid/}} path:
You can get all SMART details of your drive with:
 
  
IDE-disks:
+
{{hc|/etc/smartd.conf|
smartctl -a /dev/hda
+
/dev/disk/by-uuid/820cdd8a-866a-444d-833c-1edb0f4becac -a
 +
}}
  
SATA-discs:
+
Now your USB disk will be monitored even if the /dev/sdX path changes during reboot.
smartctl -a -d ata /dev/sda
 
  
== External links ==
+
==== Email potential problems ====
  
* [http://smartmontools.sourceforge.net/ Smartmontools Homepage]
+
To have an email sent when a failure or new error occurs, use the {{ic|-m}} option:
 +
 
 +
{{hc|/etc/smartd.conf|DEVICESCAN -m address@domain.com}}
 +
 
 +
To be able to send the email externally (i.e. not to the root mail account), an MTA (Mail Transport Agent) or an MUA (Mail User Agent) needs to be installed and configured. Common lightweight send-only clients are [[Msmtp]] and [[SSMTP]]. Common MTAs are sendmail and [[Postfix]]. It is enough to simply configure [[S-nail]] if you do not want anything else, but you will need to follow [//dominicm.com/configure-email-notifications-on-arch-linux/ these instructions].
 +
 
 +
The {{ic|-M test}} option causes a test email to be sent each time the smartd daemon starts:
 +
 
 +
{{hc|/etc/smartd.conf|DEVICESCAN -m address@domain.com -M test}}
 +
 
 +
E-Mail can take quite a long time to be delivered, but when your hard drive fails you want to be informed immediately to take the appropriate actions. Hence you should rather define a script to be executed instead of only emailing the problem:
 +
 
 +
{{hc|/etc/smartd.conf|DEVICESCAN -m address@domain.com -M exec /usr/local/bin/smartdnotify}}
 +
 
 +
To send an e-mail and a system notification, put something like this into {{ic|/usr/local/bin/smartdnotify}}:
 +
 
 +
#! /bin/sh
 +
 +
# Send mail
 +
echo "$SMARTD_MESSAGE" | mail -s "$SMARTD_FAILTYPE" "$SMARTD_ADDRESS"
 +
 +
# Notify user
 +
wall "$SMARTD_MESSAGE"
 +
 
 +
If you are running a desktop environment, you might prefer to have a popup appear on your desktop. In this case, you can use this script (replace {{ic|''X_user''}} and {{ic|''X_userid''}} with the user and user ID running X, respectively):
 +
{{hc|/usr/local/bin/smartdnotify|2=
 +
#!/bin/sh
 +
 
 +
sudo -u ''X_user'' DISPLAY=:0 DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/''X_userid''/bus notify-send "S.M.A.R.T Error ($SMARTD_FAILTYPE)" "$SMARTD_MESSAGE" --icon=dialog-warning
 +
}}
 +
This requires {{Pkg|libnotify}} and a compatible desktop environment. See [[Desktop notifications]] for more details.
 +
 
 +
==== Power management ====
 +
 
 +
If your computer uses power management, you should instruct smartd how to handle disks in low-power mode. Disk platters usually spin up in response to SMART commands issued by smartd, so without the {{ic|-n}} directive a disk in a low-power mode may be spun up and put into a higher-power mode each time smartd polls it.
 +
 
 +
{{hc|/etc/smartd.conf|DEVICESCAN -n standby,15,q}}
 +
 
 +
More info on [http://www.smartmontools.org/wiki/Powermode smartmontools wiki].
 +
 
 +
On some devices the -n directive does not work. In that case, the following error message appears in the log:
 +
 
 +
{{
 +
hc|journalctl -u smartd|
 +
CHECK POWER MODE: incomplete response, ATA output registers missing
 +
Device: /dev/sdb [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
 +
}}
 +
 
 +
As an alternative, you can use the -i option of smartd. It controls how often smartd spins the disks up to check their status. The default is 30 minutes. To change it, create and edit {{ic|/etc/default/smartmontools}}.
 +
 
 +
{{
 +
hc|
 +
head=/etc/default/smartmontools|
 +
output=SMARTD_ARGS="-i 21600"  Check status every 21600 seconds (6 hours)
 +
}}
 +
 
 +
For more info see {{man|8|smartd}}.
 +
 
 +
==== Schedule self-tests ====
 +
 
 +
smartd can tell disks to perform self-tests on a schedule. The following {{ic|/etc/smartd.conf}} configuration will start a short self-test every day between 2-3am, and an extended self-test weekly on Saturdays between 3-4am:
 +
 
 +
{{hc|/etc/smartd.conf|DEVICESCAN -s (S/../.././02&#124;L/../../6/03)}}
 +
 
 +
==== Alert on temperature changes ====
 +
 
 +
smartd can track disk temperatures and alert if they rise too quickly or hit a high limit. The following will log changes of 4 degrees or more, log when temp reaches 35 degrees, and log/email a warning when temp reaches 40:
 +
 
 +
{{hc|/etc/smartd.conf|DEVICESCAN -W 4,35,40}}
 +
 
 +
{{Tip|You can determine the current disk temperature with the command {{ic|smartctl -A /dev/<device> &#124; grep Temperature_Celsius}}}}
 +
 
 +
{{Tip|If you have some disks that run a lot hotter/cooler than others, remove {{ic|DEVICESCAN}} and define a separate configuration for each device with appropriate temperature settings.}}
 +
 
 +
==== Complete smartd.conf example ====
 +
 
 +
Putting together all of the above gives the following example configuration:
 +
 
 +
* {{ic|DEVICESCAN}} (smartd scans for disks and monitors all it finds)
 +
* {{ic|-a}} (monitor all attributes)
 +
* {{ic|-o on}} (enable automatic online data collection)
 +
* {{ic|-S on}} (enable automatic attribute autosave)
 +
* {{ic|-n standby,q}} (do not check if disk is in standby, and suppress log message to that effect so as not to cause a write to disk)
 +
* {{ic|-s ...}} (schedule short and long self-tests)
 +
* {{ic|-W ...}} (monitor temperature)
 +
* {{ic|-m ...}} (mail alerts)
 +
 
 +
{{hc|/etc/smartd.conf|DEVICESCAN -a -o on -S on -n standby,q -s (S/../.././02&#124;L/../../6/03) -W 4,35,40 -m <username or email>}}
 +
 
 +
== GUI Applications ==
 +
 
 +
* {{App|Gsmartcontrol|A GNOME frontend for the smartctl hard disk drive health inspection tool|http://gsmartcontrol.sourceforge.net|{{Pkg|gsmartcontrol}} or {{AUR|gsmartcontrol-svn}}}}
 +
 
 +
== See also ==
 +
 
 +
* [https://www.smartmontools.org/ Smartmontools Homepage]
 
* [https://help.ubuntu.com/community/Smartmontools Smartmontools on Ubuntu Wiki]
 
* [https://help.ubuntu.com/community/Smartmontools Smartmontools on Ubuntu Wiki]
 +
* [[Gentoo: smartmontools]]

Latest revision as of 21:06, 6 September 2017

S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) is a supplementary component built into many modern storage devices through which devices monitor, store, and analyze the health of their operation. Statistics are collected (temperature, number of reallocated sectors, seek errors...) which software can use to measure the health of a device, predict possible device failure, and provide notifications on unsafe values.

Smartmontools

The smartmontools package contains two utility programs for analyzing and monitoring storage devices: smartctl and smartd. Install the smartmontools package to use these tools.

SMART support must be available and enabled on each storage device to effectively use these tools. You can use #smartctl to check for and enable SMART support. That done, you can manually #Run a test and #View test results, or you can use #smartd to automatically run tests and send email notifications.

smartctl

smartctl is a command-line tool that "controls the Self-Monitoring, Analysis and Reporting Technology (SMART) system built into most ATA/SATA and SCSI/SAS hard drives and solid-state drives."

The -i/--info option prints a variety of information about a device, including whether SMART is available and enabled:

# smartctl --info /dev/sda | grep 'SMART support is:'
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

If SMART is available but not enabled, you can enable it:

# smartctl --smart=on /dev/<device>

You may need to specify a device type. For example, specifying --device=ata tells smartctl that the device type is ATA, and this prevents smartctl from issuing SCSI commands to that device.
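
The check above can be scripted. The following is a minimal sketch, assuming the two "SMART support is:" lines shown earlier are the only ones of interest; the function name smart_support_state is hypothetical and not part of smartmontools.

```shell
# Classify the "SMART support is:" lines printed by `smartctl --info`.
# Reads smartctl output on stdin and prints one of:
#   enabled     - an "Enabled" line was found
#   disabled    - only the "Available" line was found
#   unavailable - neither line was found
smart_support_state() {
    # "|| true" keeps the function usable under "set -e" when grep finds nothing
    info=$(grep 'SMART support is:' || true)
    case $info in
        *Enabled*)   echo enabled ;;
        *Available*) echo disabled ;;
        *)           echo unavailable ;;
    esac
}

# Usage (as root), kept as a comment so the sketch has no side effects:
#   smartctl --info --device=ata /dev/sda | smart_support_state
```

If the function prints disabled, running smartctl --smart=on with the same --device option should enable SMART on that device.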

Run a test

There are three types of self-tests that a device can execute (all are safe to user data):

  • Short (runs tests that have a high probability of detecting device problems)
  • Extended (or Long; a short check with complete disk surface examination)
  • Conveyance (identifies damage incurred during transportation of the device)

The -c/--capabilities flag prints which tests a device supports and the approximate execution time of each test. For example:

# smartctl -c /dev/sda
[…]
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  74) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
[…]

Use the -t/--test=<test_name> flag to run a test:

# smartctl -t short /dev/<device>
# smartctl -t long /dev/<device>
# smartctl -t conveyance /dev/<device>
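
A test runs in the background on the device, so it helps to know how long to wait before looking at the results. The sketch below pulls the recommended polling time for one test type out of smartctl -c output; it assumes the output layout shown in the example above, and polling_minutes is a hypothetical helper name.

```shell
# Print the recommended polling time (in minutes) for one self-test type.
# $1 is the name as printed by `smartctl -c`: Short, Extended or Conveyance.
# The `smartctl -c` output is read from stdin.
polling_minutes() {
    awk -v name="$1" '
        # Remember that the wanted routine header has been seen
        index($0, name " self-test routine") { want = 1; next }
        # The next polling-time line belongs to that routine
        want && /recommended polling time/ {
            gsub(/[^0-9]/, "")   # keep only the digits
            print
            exit
        }
    '
}

# Usage (as root), kept as a comment so the sketch has no side effects:
#   smartctl -t long /dev/sda
#   smartctl -c /dev/sda | polling_minutes Extended
```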

View test results

You can view a device's overall health with the -H flag. "If the device reports failing health status, this means either that the device has already failed, or that it is predicting its own failure within the next 24 hours. If this happens […] get your data off the disk and to someplace safe as soon as you can."

# smartctl -H /dev/<device>

You can also view a list of recent test results and detailed information about a device:

# smartctl -l selftest /dev/<device>
# smartctl -a /dev/<device>
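
For scripting, the health check can also be read from the exit status of smartctl instead of its text output. The sketch below relies on the exit status bit mask described in smartctl(8), where bit 3 (value 8) means the SMART status check returned "DISK FAILING"; the function name disk_failing is hypothetical.

```shell
# Succeed when a smartctl exit status has bit 3 (value 8) set, which
# smartctl(8) documents as: SMART status check returned "DISK FAILING".
# $1 is the numeric exit status of a previous smartctl run.
disk_failing() {
    [ $(( $1 & 8 )) -ne 0 ]
}

# Usage (as root), kept as a comment so the sketch has no side effects:
#   smartctl -H /dev/sda >/dev/null; status=$?
#   if disk_failing "$status"; then echo "back up /dev/sda now"; fi
```

Other bits in the status report different conditions, so comparing the whole status against zero is not equivalent to this check.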

smartd

The smartd daemon monitors SMART statuses and emits notifications when something goes wrong. It can be managed with systemd and configured using the /etc/smartd.conf configuration file. The configuration file syntax is esoteric, and this wiki page provides only a quick reference. For more complete information, read the examples and comments within the configuration file, or read smartd.conf(5).

Daemon management

To run the daemon and have it start on boot, start/enable the smartd.service systemd unit; its status and recent log entries are then available through systemctl and journalctl.

smartd respects all the usual systemctl and journalctl commands. For more information on using systemctl and journalctl, see Systemd#Using units and Systemd#Journal.

Define the devices to monitor

To monitor for all possible SMART errors on all disks:

/etc/smartd.conf
DEVICESCAN -a

To monitor for all possible SMART errors on /dev/sda and /dev/sdb, and ignore all other devices:

/etc/smartd.conf
/dev/sda -a
/dev/sdb -a

To monitor for all possible SMART errors on externally connected disks (USB backup disks spring to mind), it is prudent to identify the device to smartd by UUID, since the drive's /dev/sdX name might change across reboots.

First, find the UUID of the disk to monitor by listing /dev/disk/by-uuid/ and looking for the disk you want to monitor:

ls -lah /dev/disk/by-uuid/
lrwxrwxrwx 1 root root   9 Nov  5 22:41 820cdd8a-866a-444d-833c-1edb0f4becac -> ../../sde
lrwxrwxrwx 1 root root  10 Nov  5 22:41 b51b87f3-425e-4fe7-883f-f4ff1689189e -> ../../sdf2
lrwxrwxrwx 1 root root   9 Nov  5 22:42 ea2199dd-8f9f-4065-a7ba-71bde11a462c -> ../../sda
lrwxrwxrwx 1 root root  10 Nov  5 22:41 fe9e886a-8031-439f-a909-ad06c494fadb -> ../../sdf1

In this example the USB disk is attached as /dev/sde. To tell smartd to monitor that disk, use its /dev/disk/by-uuid/ path:

/etc/smartd.conf
/dev/disk/by-uuid/820cdd8a-866a-444d-833c-1edb0f4becac -a

Now your USB disk will be monitored even if the /dev/sdX path changes during reboot.
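
Instead of reading the listing by eye, the lookup can be scripted. This is a minimal sketch, assuming readlink -f is available (as it is with GNU coreutils); links_for_device is a hypothetical helper name.

```shell
# Print every entry in a directory of symlinks (for example
# /dev/disk/by-uuid) that resolves to the given device node.
# $1: directory of symlinks, $2: device node such as /dev/sde
links_for_device() {
    dir=$1
    target=$(readlink -f "$2") || return 1
    for link in "$dir"/*; do
        [ -L "$link" ] || continue          # skip anything that is not a symlink
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Usage, kept as a comment so the sketch has no side effects:
#   links_for_device /dev/disk/by-uuid /dev/sde
```

The printed path can be pasted into /etc/smartd.conf in place of the /dev/sdX name.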

Email potential problems

To have an email sent when a failure or new error occurs, use the -m option:

/etc/smartd.conf
DEVICESCAN -m address@domain.com

To be able to send the email externally (i.e. not to the root mail account), an MTA (Mail Transport Agent) or an MUA (Mail User Agent) needs to be installed and configured. Common lightweight send-only clients are Msmtp and SSMTP. Common MTAs are sendmail and Postfix. It is enough to simply configure S-nail if you do not want anything else, but you will need to follow these instructions.

The -M test option causes a test email to be sent each time the smartd daemon starts:

/etc/smartd.conf
DEVICESCAN -m address@domain.com -M test

E-Mail can take quite a long time to be delivered, but when your hard drive fails you want to be informed immediately to take the appropriate actions. Hence you should rather define a script to be executed instead of only emailing the problem:

/etc/smartd.conf
DEVICESCAN -m address@domain.com -M exec /usr/local/bin/smartdnotify

To send an e-mail and a system notification, put something like this into /usr/local/bin/smartdnotify:

#! /bin/sh

# Send mail
echo "$SMARTD_MESSAGE" | mail -s "$SMARTD_FAILTYPE" "$SMARTD_ADDRESS"

# Notify user
wall "$SMARTD_MESSAGE"

If you are running a desktop environment, you might prefer to have a popup appear on your desktop. In this case, you can use this script (replace X_user and X_userid with the user and user ID running X, respectively):

/usr/local/bin/smartdnotify
#!/bin/sh

sudo -u X_user DISPLAY=:0 DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/X_userid/bus notify-send "S.M.A.R.T Error ($SMARTD_FAILTYPE)" "$SMARTD_MESSAGE" --icon=dialog-warning

This requires libnotify and a compatible desktop environment. See Desktop notifications for more details.

Power management

If your computer uses power management, you should instruct smartd how to handle disks in low-power mode. Disk platters usually spin up in response to SMART commands issued by smartd, so without the -n directive a disk in a low-power mode may be spun up and put into a higher-power mode each time smartd polls it. The following example skips a check when the disk is in standby, forces a check after 15 consecutive skips, and (with q) suppresses the log message about the skip:

/etc/smartd.conf
DEVICESCAN -n standby,15,q

More info on smartmontools wiki.

On some devices the -n directive does not work. In that case, the following error message appears in the log:

journalctl -u smartd
CHECK POWER MODE: incomplete response, ATA output registers missing
Device: /dev/sdb [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive

As an alternative, you can use the -i option of smartd. It controls how often smartd spins the disks up to check their status. The default is 30 minutes. To change it, create and edit /etc/default/smartmontools.

/etc/default/smartmontools
# Check status every 21600 seconds (6 hours)
SMARTD_ARGS="-i 21600"

For more info see smartd(8).
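
The -i value is given in seconds, so it is easy to get the conversion wrong when thinking in hours. A small sketch of the arithmetic follows; hours_to_seconds is a hypothetical helper name.

```shell
# Convert a whole number of hours into the seconds value expected by
# smartd's -i option.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

# Example: build the SMARTD_ARGS line for a 6 hour interval.
#   printf 'SMARTD_ARGS="-i %s"\n' "$(hours_to_seconds 6)"
```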

Schedule self-tests

smartd can tell disks to perform self-tests on a schedule. The following /etc/smartd.conf configuration will start a short self-test every day between 2-3am, and an extended self-test weekly on Saturdays between 3-4am:

/etc/smartd.conf
DEVICESCAN -s (S/../.././02|L/../../6/03)
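
The -s argument is a regular expression that smartd compares against a string of the form T/MM/DD/d/HH (test type, month, day of month, day of week with Monday as 1, and hour), as described in smartd.conf(5). A rough way to sanity-check a schedule before deploying it is to match candidate strings with grep. The sketch below assumes the whole string must match the expression; schedule_matches is a hypothetical helper name.

```shell
# Succeed when a candidate T/MM/DD/d/HH string matches a smartd -s regex.
# $1: the regular expression from the -s directive
# $2: candidate string, e.g. S/11/05/7/02
schedule_matches() {
    printf '%s\n' "$2" | grep -Eq "^($1)\$"
}

# Usage: would a long test start right now?
#   schedule_matches '(S/../.././02|L/../../6/03)' "$(date +'L/%m/%d/%u/%H')"
```

With the schedule above, any date at hour 02 matches the short test, while the long test matches only when the day-of-week field is 6 and the hour is 03.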

Alert on temperature changes

smartd can track disk temperatures and alert if they rise too quickly or hit a high limit. The following will log changes of 4 degrees or more, log when temp reaches 35 degrees, and log/email a warning when temp reaches 40:

/etc/smartd.conf
DEVICESCAN -W 4,35,40
Tip: You can determine the current disk temperature with the command smartctl -A /dev/<device> | grep Temperature_Celsius
Tip: If you have some disks that run a lot hotter/cooler than others, remove DEVICESCAN and define a separate configuration for each device with appropriate temperature settings.
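
To choose limits for a particular disk, it can help to see how a reading would be classified under the second and third -W values. This is only an illustration of the thresholds described above, not smartd's own logic, and temp_level is a hypothetical helper name.

```shell
# Classify a temperature against the INFO and CRIT limits of -W DIFF,INFO,CRIT.
# $1: current temperature, $2: INFO limit, $3: CRIT limit (whole degrees)
temp_level() {
    if [ "$1" -ge "$3" ]; then
        echo crit    # would be logged and mailed as a warning
    elif [ "$1" -ge "$2" ]; then
        echo info    # would be logged only
    else
        echo ok
    fi
}

# Usage with the limits from the example above:
#   temp_level 37 35 40
```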

Complete smartd.conf example

Putting together all of the above gives the following example configuration:

  • DEVICESCAN (smartd scans for disks and monitors all it finds)
  • -a (monitor all attributes)
  • -o on (enable automatic online data collection)
  • -S on (enable automatic attribute autosave)
  • -n standby,q (do not check if disk is in standby, and suppress log message to that effect so as not to cause a write to disk)
  • -s ... (schedule short and long self-tests)
  • -W ... (monitor temperature)
  • -m ... (mail alerts)
/etc/smartd.conf
DEVICESCAN -a -o on -S on -n standby,q -s (S/../.././02|L/../../6/03) -W 4,35,40 -m <username or email>

GUI Applications

  • Gsmartcontrol — A GNOME frontend for the smartctl hard disk drive health inspection tool
http://gsmartcontrol.sourceforge.net || gsmartcontrol or gsmartcontrol-svn (AUR)

See also