Difference between revisions of "Solid state drive"
Revision as of 11:15, 26 March 2016
Solid State Drives (SSDs) are not PnP devices. Special considerations such as partition alignment, choice of file system, TRIM support, etc. are needed to set up SSDs for optimal performance. This article attempts to capture referenced, key learnings to enable users to get the most out of SSDs under Linux. Users are encouraged to read this article in its entirety before acting on recommendations.
Overview
Advantages over HDDs
- Fast read speeds - 2-3x faster than modern desktop HDDs (7,200 RPM using SATA2 interface).
- Sustained read speeds - no decrease in read speed across the entirety of the device. HDD performance tapers off as the drive heads move from the outer edges to the center of HDD platters.
- Minimal access time - approximately 100x faster than an HDD. For example, 0.1 ms (100 us) vs. 12-20 ms (12,000-20,000 us) for desktop HDDs.
- High degree of reliability.
- No moving parts.
- Minimal heat production.
- Minimal power consumption - fractions of a W at idle and 1-2 W while reading/writing vs. 10-30 W for an HDD depending on RPMs.
- Light-weight - ideal for laptops.
Limitations
- Per-storage cost (about a third of a dollar per GB, vs. around a dime or two per GB for rotating media).
- Capacity of marketed models is lower than that of HDDs.
- Large cells require different filesystem optimizations than rotating media. The flash translation layer hides the raw flash access which a modern OS could use to optimize access.
- Partitions and filesystems need some SSD-specific tuning. Page size and erase page size are not autodetected.
- Cells wear out. Consumer MLC cells at mature 50nm processes can handle 10000 writes each; 35nm generally handles 5000 writes, and 25nm 3000 (smaller being higher density and cheaper). If writes are properly spread out, are not too small, and align well with cells, this translates into a lifetime write volume for the SSD that is a multiple of its capacity. Daily write volumes have to be balanced against life expectancy. However, tests [1][2][3][4] performed on recent hardware suggest that SSD wear is negligible, with the lifetime expectancy of SSDs comparable to those of HDDs even with artificially high write-volumes.
- Firmwares and controllers are complex. They occasionally have bugs. Modern ones consume power comparable with HDDs. They implement the equivalent of a log-structured filesystem with garbage collection. They translate SATA commands traditionally intended for rotating media. Some of them do on the fly compression. They spread out repeated writes across the entire area of the flash, to prevent wearing out some cells prematurely. They also coalesce writes together so that small writes are not amplified into as many erase cycles of large cells. Finally they move cells containing data so that the cell does not lose its contents over time.
- Performance can drop as the disk gets filled. Garbage collection is not universally well implemented, meaning freed space is not always collected into entirely free cells.
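The wear figures in the list above can be turned into a rough endurance estimate. The following is a minimal sketch, not a vendor formula: it assumes that lifetime host writes are roughly capacity × writes per cell ÷ write amplification. The 120 GB capacity, the 20 GB/day workload and the write amplification factor of 3 are hypothetical example values, and the function name is made up for illustration.

```sh
# Rough SSD endurance estimate (a sketch under the assumptions stated above).
# usage: ssd_lifetime_years CAPACITY_GB PE_CYCLES DAILY_WRITES_GB WRITE_AMPLIFICATION
ssd_lifetime_years() {
    awk -v cap="$1" -v pe="$2" -v daily="$3" -v wa="$4" 'BEGIN {
        total_gb = cap * pe / wa          # host data writable before cells wear out
        printf "%.1f\n", total_gb / daily / 365
    }'
}

# Hypothetical 120 GB drive with 25nm MLC cells (3000 writes per cell),
# 20 GB written per day, assumed write amplification of 3:
ssd_lifetime_years 120 3000 20 3
```

With these example inputs the estimate comes out at roughly 16 years, which is in line with the tests cited above suggesting that wear is rarely the limiting factor.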
Pre-purchase considerations
There are several key features to look for prior to purchasing a contemporary SSD.
- Native TRIM support is a vital feature that both prolongs SSD lifetime and reduces loss of performance for write operations over time.
- Buying the right sized SSD is key. As with all filesystems, target <75 % occupancy for all SSD partitions to ensure efficient use by the kernel.
Choice of filesystem
This section describes optimized filesystems to use on an SSD.
It's still possible/required to use other filesystems, e.g. FAT32 for the EFI System Partition.
Btrfs
Btrfs support has been included with the mainline 2.6.29 release of the Linux kernel, and since August 2014 it has been marked as stable. However, some features are experimental, and users are encouraged to read the Btrfs article for more info.
Ext4
Ext4 is another filesystem that supports SSDs. It has been considered stable since Linux 2.6.28 and is mature enough for daily use. ext4 users can enable TRIM support using the discard mount option in fstab (or with tune2fs -o discard /dev/sdaX).
See the official in kernel tree documentation for further information on ext4.
XFS
Many users do not realize that in addition to ext4 and Btrfs, XFS supports TRIM as well. It can be enabled in the usual ways: either with the discard mount option mentioned above, or by running the fstrim command. More information can be found on the XFS wiki.
JFS
As of Linux kernel version 3.7, JFS has proper TRIM support. There is not yet a great wealth of information on the topic, but it has been picked up by Linux news sites. It can be enabled via the discard mount option, or by using batch TRIMs with fstrim.
Other filesystems
There are other filesystems specifically designed for SSD, for example F2FS.
Tips for maximizing SSD performance
Partition alignment
See Partitioning#Partition alignment.
TRIM
Most SSDs support the ATA_TRIM command for sustained long-term performance and wear-leveling. For more information, including some before-and-after benchmarks, see this tutorial.
From Linux kernel version 3.8 onwards, the following filesystems support TRIM: Ext4, Btrfs, JFS, VFAT, XFS, F2FS.
As of ntfs-3g version 2015.3.14, TRIM is supported for NTFS filesystem too [5].
VFAT only supports TRIM by the mount option discard, not manually with fstrim.
The Choice of Filesystem section of this article offers more details.
Verify TRIM support
# hdparm -I /dev/sda | grep TRIM
        *    Data Set Management TRIM supported (limit 1 block)
Note that there are different types of TRIM support defined by the specification. Hence, the output may differ depending what the drive supports. See wikipedia:TRIM#ATA for more information.
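The check above can be wrapped for use in scripts. This is a minimal sketch that assumes the drive reports the "Data Set Management TRIM supported" line shown above; the function name has_trim is hypothetical. As another view, lsblk --discard from util-linux shows the discard granularity and maximum size as seen by the kernel, where non-zero values indicate discard support.

```sh
# Reads `hdparm -I` output on stdin; succeeds if the TRIM capability line is present.
has_trim() {
    grep -q 'Data Set Management TRIM supported'
}

# Example (as root):
#   hdparm -I /dev/sda | has_trim && echo "TRIM supported"
# Alternative view of discard support as seen by the kernel:
#   lsblk --discard /dev/sda
```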
Apply periodic TRIM via fstrim
The util-linux package (part of base and base-devel) provides fstrim.service and fstrim.timer systemd unit files. Enabling the timer will activate the service weekly, which will then trim all mounted filesystems on devices that support the discard operation.
The timer relies on the timestamp of /var/lib/systemd/timers/stamp-fstrim.timer (which it will create upon first invocation) to know whether a week has elapsed since it last ran. Therefore there is no need to worry about too frequent invocations, in an anacron-like fashion.
It is also possible to query the unit's activity and status using standard journalctl and systemctl status commands:
# journalctl -u fstrim
...
<shows several log entries if enabled>
...
# systemctl status fstrim
● fstrim.service - Discard unused blocks
   Loaded: loaded (/usr/lib/systemd/system/fstrim.service; static; vendor preset: disabled)
   Active: inactive (dead) since lun. 2015-06-08 00:00:18 CEST; 2 days ago
  Process: 18152 ExecStart=/sbin/fstrim -a (code=exited, status=0/SUCCESS)
 Main PID: 18152 (code=exited, status=0/SUCCESS)

juin 08 00:00:16 arch-clevo systemd[1]: Starting Discard unused blocks...
juin 08 00:00:18 arch-clevo systemd[1]: Started Discard unused blocks.
Note: Use the .timer suffix if you specifically want to inquire about the timer unit.
If you wish to change the periodicity of the timer or the command run, simply edit the provided unit files.
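Enabling the timer and adjusting its schedule might look as follows. This is a sketch only: instead of editing the packaged unit files directly, it uses a systemd drop-in, a swapped-in technique that survives package upgrades. The helper name, the daily schedule and the file name override.conf are assumptions chosen for illustration.

```sh
# Write a drop-in that replaces the weekly schedule of fstrim.timer.
# $1: systemd unit directory (normally /etc/systemd/system)
# $2: new OnCalendar= expression, e.g. "daily"
write_fstrim_schedule() {
    dropin_dir="$1/fstrim.timer.d"
    mkdir -p "$dropin_dir" || return 1
    # The empty OnCalendar= line resets the packaged schedule before setting the new one.
    printf '[Timer]\nOnCalendar=\nOnCalendar=%s\n' "$2" > "$dropin_dir/override.conf"
}

# Example (as root):
#   systemctl enable fstrim.timer
#   systemctl start fstrim.timer
#   write_fstrim_schedule /etc/systemd/system daily
#   systemctl daemon-reload
```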
Enable continuous TRIM by mount flag
Warning: Users need to be certain that their SSD supports TRIM before attempting to mount a partition with the discard flag. Data loss can occur otherwise! Unfortunately, SSD firmware varies widely in how well it handles continuous TRIM, which is also why filesystem developer Theodore Ts'o generally recommends against using the discard mount flag. If in doubt about your hardware, #Apply periodic TRIM via fstrim instead.
Also be aware of other shortcomings, most importantly that "TRIM commands have been linked to serious data corruption in several devices, most notably Samsung 8* series." After the data corruption had been confirmed, the Linux kernel blacklisted queued TRIM command execution for a number of popular devices as of March 28, 2016. Read Samsung Finds, Fixes Bug In Linux Trim Code on Slashdot for more recent updates.
Using the discard option for a mount in /etc/fstab enables continuous TRIM in device operations:
/dev/sda2  /boot      ext4  defaults,noatime,discard  0  2
/dev/sda1  /boot/efi  vfat  defaults,noatime,discard  0  2
/dev/sda3  /          ext4  defaults,noatime,discard  0  2
The main benefit of continuous TRIM is speed; an SSD can perform more efficient garbage collection. However, results vary and particularly earlier SSD generations may also show just the opposite effect. Also for this reason, some distributions decided against using it (e.g. Ubuntu: see this article and the related blueprint).
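To see which entries already request continuous TRIM, the fstab can be scanned for the discard option. This is a minimal sketch; the function name is hypothetical and the parsing assumes the standard whitespace-separated fstab layout with the mount options in the fourth field.

```sh
# Print the mount point of every fstab entry whose options include "discard".
# $1: path to the fstab file (defaults to /etc/fstab)
list_discard_mounts() {
    awk '
        /^[[:space:]]*#/ || NF < 4 { next }      # skip comments and short lines
        {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "discard") { print $2; break }
        }
    ' "${1:-/etc/fstab}"
}

# Example:
#   list_discard_mounts /etc/fstab
```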
- There is no need for the discard flag if you run fstrim periodically.
- Using the discard flag for an ext3 root partition will result in it being mounted read-only.
- Before SATA 3.1, TRIM commands are synchronous and will block all I/O while running. This may cause short freezes, for example during a filesystem sync. You may not want to use discard in that case but #Apply periodic TRIM via fstrim instead. One way to check your SATA version is with smartctl --info /dev/sdX.
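The SATA version check mentioned above can be scripted. The following sketch assumes that smartctl prints a line of the form "SATA Version is: SATA 3.1, 6.0 Gb/s ..."; that format and the function name are assumptions, and drives that are not SATA will simply produce no output.

```sh
# Reads `smartctl --info` output on stdin and prints the SATA version number.
sata_version() {
    sed -n 's/^SATA Version is:[[:space:]]*SATA \([0-9.]*\).*/\1/p'
}

# Example (as root):
#   smartctl --info /dev/sdX | sata_version
```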
On the ext4 filesystem, the discard flag can also be set as a default mount option using tune2fs:
# tune2fs -o discard /dev/sdXY
Using the default mount options instead of an entry in /etc/fstab is useful for external drives, because such a partition will be mounted with the default options on other machines as well. There is no need to edit /etc/fstab on every machine.
Note: The default mount options are not listed in /proc/mounts.
Enable TRIM for LVM
Change the value of the issue_discards option from 0 to 1 in /etc/lvm/lvm.conf.
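The change can also be scripted. A minimal sketch, assuming GNU sed and an lvm.conf in which the option appears as an uncommented "issue_discards = 0" line; the function name is hypothetical, and keeping a backup of the file beforehand is advisable.

```sh
# Flip issue_discards from 0 to 1 in the given lvm.conf, leaving commented lines alone.
# $1: path to lvm.conf (normally /etc/lvm/lvm.conf)
enable_issue_discards() {
    sed -i 's/^\([[:space:]]*issue_discards[[:space:]]*=[[:space:]]*\)0/\11/' "$1"
}

# Example (as root):
#   cp /etc/lvm/lvm.conf /etc/lvm/lvm.conf.bak
#   enable_issue_discards /etc/lvm/lvm.conf
#   grep issue_discards /etc/lvm/lvm.conf
```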
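Note: Enabling this option makes LVM issue discards to a logical volume's underlying physical volumes when the logical volume no longer uses their space, e.g. after lvremove or lvreduce (see man lvm.conf and/or inline comments in /etc/lvm/lvm.conf). As such it does not seem to be required for "regular" TRIM requests (file deletions inside a filesystem) to be functional.
Enable TRIM for dm-crypt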
For non-root filesystems, configure /etc/crypttab to include discard in the list of options for encrypted block devices located on an SSD (see Dm-crypt/System configuration#crypttab).
For the root filesystem, follow the instructions from Dm-crypt/TRIM support for SSD to add the right kernel parameter to the bootloader configuration.
I/O scheduler
See Maximizing performance#Tuning IO schedulers.
Swap space on SSDs
One can place a swap partition on an SSD. A recommended tweak for SSDs using a swap partition is to reduce the swappiness of the system to some very low value (for example 1), thus avoiding writes to swap.
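Lowering swappiness might be done as sketched below. The value 1 comes from the text above; the helper name and the file name 99-swappiness.conf are assumptions chosen for illustration.

```sh
# Write a sysctl drop-in that sets vm.swappiness persistently.
# $1: swappiness value, $2: target file
write_swappiness_conf() {
    printf 'vm.swappiness=%s\n' "$1" > "$2"
}

# Example (as root):
#   cat /proc/sys/vm/swappiness          # show the current value
#   sysctl vm.swappiness=1               # change it for the running system
#   write_swappiness_conf 1 /etc/sysctl.d/99-swappiness.conf
```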
Tips for SSD security
Hdparm shows "frozen" state
Some motherboard BIOSes issue a "security freeze" command to attached storage devices on initialization. Likewise, the firmware of some SSDs (and HDDs) is set to "security freeze" in the factory already. Both result in the device's password security settings being set to frozen, as shown in the output below:
# hdparm -I /dev/sda
Security:
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
                frozen
        not     expired: security count
                supported: enhanced erase
        4min for SECURITY ERASE UNIT. 2min for ENHANCED SECURITY ERASE UNIT.
Operations like formatting the device or installing operating systems are not affected by the "security freeze".
The above output shows that the device is not locked by an HDD password on boot; the frozen state safeguards the device against malware which may try to lock it by setting a password at runtime.
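The frozen flag can be extracted from the hdparm output in a script. This is a sketch that assumes the Security section is laid out as in the output above, with a "frozen" line that is prefixed by "not" when the device is not frozen; the function name is hypothetical.

```sh
# Reads `hdparm -I` output on stdin and prints "frozen" or "not frozen".
# Exits with status 2 if no frozen line was found in a Security section.
security_frozen() {
    awk '
        /^Security:/ { in_sec = 1; next }
        in_sec && $NF == "frozen" {
            found = 1
            state = ($1 == "not") ? "not frozen" : "frozen"
        }
        END { if (!found) exit 2; print state }
    '
}

# Example (as root):
#   hdparm -I /dev/sda | security_frozen
```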
If you intend to set a password to a "frozen" device yourself, a motherboard BIOS with support for it is required. A lot of notebooks have support, because it is required for hardware encryption, but support may not be trivial for a desktop/server board. For the Intel DH67CL/BL motherboard, for example, the motherboard has to be set to "maintenance mode" by a physical jumper to access the settings (see [6], [7]).
Warning: Do not try to change the above lock security settings with hdparm unless you know exactly what you are doing.
If you intend to erase the SSD, see Securely wipe disk#hdparm and below.
SSD memory cell clearing
On occasion, users may wish to completely reset an SSD's cells to the same virgin state they were in when the device was installed, thus restoring its factory default write performance. Write performance is known to degrade over time even on SSDs with native TRIM support: TRIM only safeguards against file deletes, not replacements such as an incremental save.
The reset is easily accomplished with the three-step procedure described in the SSD memory cell clearing wiki article. If the reason for the reset is to wipe data, you may not want to rely on the SSD firmware to perform it securely. See Securely wipe disk#Flash memory for further information and examples to perform a wipe.
Tips for minimizing disk reads/writes
An overarching theme for SSD usage should be 'simplicity' in terms of locating high-read/write operations either in RAM (Random Access Memory) or on a physical HDD rather than on an SSD. Doing so will add longevity to an SSD. This is primarily due to the large erase block size (512 KiB in some cases); a lot of small writes result in huge effective writes.
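The effect of the erase block size can be illustrated with simple arithmetic. In the worst case, a small write forces the rewrite of a whole erase block, so the effective write is larger by the ratio of the two sizes. The 4 KiB write size below is an assumed example, and the function name is hypothetical; real controllers coalesce writes, so actual amplification is usually lower.

```sh
# Worst-case write amplification when a small write forces a whole erase block rewrite.
# usage: worst_case_amplification ERASE_BLOCK_KIB WRITE_KIB
worst_case_amplification() {
    echo $(( $1 / $2 ))
}

# 512 KiB erase block, 4 KiB write:
worst_case_amplification 512 4
```

For these example sizes, each 4 KiB write could cost up to 128 times its size in flash writes.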
Use iotop and sort by disk writes to see how much and how frequently programs are writing to the disk.
Tip: iotop can be run in batch mode instead of the default interactive mode using the -b option. -o is used to show only processes actually doing I/O, and -qqq is to suppress column names and I/O summary. See man iotop for more options.
# iotop -boqqq
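As a complement to iotop, which shows per-process activity, the kernel keeps a per-device counter of sectors written since boot. The sketch below reads it from the sysfs stat file of a block device; it is a different technique from iotop, and the function name is hypothetical.

```sh
# Print the MiB written to a block device since boot, from its sysfs stat file.
# $1: path to the stat file, e.g. /sys/block/sda/stat
mib_written() {
    # Field 7 is the number of sectors written; these sectors are always 512 bytes.
    awk '{ printf "%d\n", $7 * 512 / 1048576 }' "$1"
}

# Example:
#   mib_written /sys/block/sda/stat
```

Comparing two readings taken some hours apart gives a daily write volume to weigh against the drive's life expectancy.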
Intelligent partition scheme
- For systems with both an SSD and an HDD, consider relocating the /var partition to a magnetic disk on the system rather than on the SSD itself to avoid read/write wear.
noatime mount option
Mounting with the fstab atime option noatime eliminates, and relatime greatly reduces, the need for the system to write to the file system for files which are simply being read. Since writes can be somewhat expensive for Solid State Drives, this can result in measurable performance gains. See Fstab for detail.
Locate frequently used files to RAM
Browser profiles
One can easily mount browser profile(s) such as chromium, firefox, opera, etc. into RAM via tmpfs and also use rsync to keep them synced with HDD-based backups. In addition to the obvious speed enhancements, users will also save read/write cycles on their SSD by doing so.
The AUR contains several packages to automate this process, for example profile-sync-daemonAUR.
Others
For the same reasons a browser's profile can be relocated to RAM, so can highly used directories such as /srv/http (if running a web server). A sister project to profile-sync-daemonAUR is anything-sync-daemonAUR, which allows users to define any directory to sync to RAM using the same underlying logic and safeguards.
Compiling in tmpfs
Intentionally compiling in tmpfs is great to minimize disk reads/writes. For more information, refer to Makepkg#Improving compile times.
Disabling journaling on the filesystem
Running a journaling filesystem such as ext4 without its journal on an SSD is one way to reduce reads/writes. The obvious drawback is the risk of data loss after an unclean unmount (e.g. power failure or kernel lockup). According to Theodore Ts'o, with modern SSDs journaling can stay enabled at the cost of only minimal extra read/write cycles under most circumstances:
Amount of data written (in megabytes) on an ext4 file system mounted with noatime.
operation | journal | w/o journal | percent change |
---|---|---|---|
git clone | 367.0 | 353.0 | 3.81 % |
make | 207.6 | 199.4 | 3.95 % |
make clean | 6.45 | 3.73 | 42.17 % |
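The percent change column above can be reproduced as the reduction in data written relative to the journaled figure. A small sketch; the function name is hypothetical.

```sh
# Percent reduction in data written when the journal is disabled.
# usage: percent_change WITH_JOURNAL_MB WITHOUT_JOURNAL_MB
percent_change() {
    awk -v j="$1" -v n="$2" 'BEGIN { printf "%.2f\n", (j - n) / j * 100 }'
}

percent_change 367.0 353.0    # git clone
percent_change 207.6 199.4    # make
percent_change 6.45 3.73      # make clean
```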
"What the results show is that metadata-heavy workloads, such as make clean, do result in almost twice the amount data written to disk. This is to be expected, since all changes to metadata blocks are first written to the journal and the journal transaction committed before the metadata is written to their final location on disk. However, for more common workloads where we are writing data as well as modifying filesystem metadata blocks, the difference is much smaller."
Firmware updates
ADATA
ADATA has a utility available for Linux (i686) on their support page here. The link to the latest firmware will appear after selecting the model. The latest Linux update utility is packed with firmware and needs to be run as root. One may need to set the correct permissions for the binary file first.
Crucial
Crucial provides an option for updating the firmware with an ISO image. These images can be found after selecting the product here and downloading the "Manual Boot File." Owners of an M4 Crucial model may check if a firmware upgrade is needed with smartctl.
$ smartctl --all /dev/sdX
==> WARNING: This drive may hang after 5184 hours of power-on time:
http://www.tomshardware.com/news/Crucial-m4-Firmware-BSOD,14544.html
See the following web pages for firmware updates:
http://www.crucial.com/support/firmware.aspx
http://www.micron.com/products/solid-state-storage/client-ssd#software
Users seeing this warning are advised to back up all important data and consider upgrading immediately.
Intel
Intel has a Linux live system based Firmware Update Tool for operating systems that are not compatible with its Intel® Solid-State Drive Toolbox software.
Kingston
Kingston has a Linux utility to update the firmware of Sandforce controller based drives: SSD support page. Click the images on the page to go to a support page for your SSD model. Support specifically for, e.g. the SH100S3 SSD, can be found here: support page.
Mushkin
The lesser known Mushkin brand Solid State drives also use Sandforce controllers, and have a Linux utility (nearly identical to Kingston's) to update the firmware.
OCZ
OCZ has a command line utility available for Linux (i686 and x86_64) on their forum here.
Samsung
Samsung notes that update methods other than using their Magician Software are "not supported," but it is possible. The Magician Software can be used to make a USB drive bootable with the firmware update. Samsung provides pre-made bootable ISO images that can be used to update the firmware. Another option is to use Samsung's samsung_magicianAUR, which is available in the AUR. Magician only supports Samsung-branded SSDs; those manufactured by Samsung for OEMs (e.g., Lenovo) are not supported.
Users preferring to run the firmware update from a live USB created under Linux (without using Samsung's "Magician" software under Microsoft Windows) can refer to this post for reference.
Native upgrade
Alternatively, the firmware can be upgraded natively, without making a bootable USB stick, as shown below.
First visit the Samsung downloads page and download the latest firmware for Windows, which is available as a disk image. In the following, Samsung_SSD_840_EVO_EXT0DB6Q.iso is used as an example file name; adjust it accordingly.
Set up the disk image:
$ udisksctl loop-setup -r -f Samsung_SSD_840_EVO_EXT0DB6Q.iso
This will make the ISO available as a loop device, and display the device path. Assuming it was /dev/loop0:
$ udisksctl mount -b /dev/loop0
Get the contents of the disk:
$ mkdir Samsung_SSD_840_EVO_EXT0DB6Q
$ cp -r /run/media/$USER/CDROM/isolinux/ Samsung_SSD_840_EVO_EXT0DB6Q
Unmount the iso:
$ udisksctl unmount -b /dev/loop0
$ cd Samsung_SSD_840_EVO_EXT0DB6Q/isolinux
There is a FreeDOS image here that contains the firmware. Mount the image as before:
$ udisksctl loop-setup -r -f btdsk.img
$ udisksctl mount -b /dev/loop1
$ cp -r /run/media/$USER/C04D-1342/ Samsung_SSD_840_EVO_EXT0DB6Q
$ cd Samsung_SSD_840_EVO_EXT0DB6Q/C04D-1342/samsung
Get the disk number from magician:
# magician -L
Assuming it was 0:
# magician --disk 0 -F -p DSRD
Verify that the latest firmware has been installed:
# magician -L
Finally reboot.
SanDisk
SanDisk provides ISO firmware images to allow SSD firmware updates on operating systems that are unsupported by their SanDisk SSD Toolkit. One must choose the firmware for the right SSD model and capacity (e.g. 60GB or 256GB). After burning the appropriate ISO firmware image, simply restart the PC and boot from the newly created CD/DVD (this may also work from a USB stick).
The ISO images just contain a Linux kernel and an initrd. Extract them to the /boot partition and boot them with GRUB or Syslinux to update the firmware.
There does not appear to be a single page listing the firmware updates, but here are some relevant links:
SanDisk Extreme SSD Firmware Release notes and Manual Firmware update version R211
SanDisk Ultra SSD Firmware release notes and Manual Firmware update version 365A13F0
Troubleshooting
It is possible that the issue you are encountering is a firmware bug which is not Linux specific, so before trying to troubleshoot an issue affecting the SSD device, you should first check whether firmware updates are available.
Even if it is a firmware bug, it might be possible to avoid it, so if there are no firmware updates or you are hesitant to update the firmware, the following might help.
Resolving NCQ errors
Some SSDs and SATA chipsets do not work properly with Linux Native Command Queueing (NCQ). The tell-tale dmesg errors look like this:
[ 9.115544] ata9: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x10 frozen
[ 9.115550] ata9.00: failed command: READ FPDMA QUEUED
[ 9.115556] ata9.00: cmd 60/04:00:d4:82:85/00:00:1f:00:00/40 tag 0 ncq 2048 in
[ 9.115557]          res 40/00:18:d3:82:85/00:00:1f:00:00/40 Emask 0x4 (timeout)
To disable NCQ on boot, add libata.force=noncq to the kernel command line in the bootloader configuration. To disable NCQ only for disk 0 on port 1 use: libata.force=1.00:noncq
Alternatively, you may disable NCQ for a specific drive without rebooting via sysfs:
# echo 1 > /sys/block/sdX/device/queue_depth
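Whether the sysfs change took effect can be checked by reading the queue depth back. A minimal sketch, assuming that a queue depth of 1 means NCQ is effectively disabled for the drive; the function name is hypothetical.

```sh
# Report whether NCQ is effectively disabled, based on a queue_depth file.
# $1: path to the file, e.g. /sys/block/sdX/device/queue_depth
ncq_status() {
    depth=$(cat "$1") || return 1
    if [ "$depth" -le 1 ]; then
        echo "disabled (queue_depth=$depth)"
    else
        echo "enabled (queue_depth=$depth)"
    fi
}

# Example:
#   ncq_status /sys/block/sdX/device/queue_depth
```

Note that a value written through sysfs does not persist across reboots, unlike the kernel parameter above.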
If this (and also updating the firmware) does not resolve the problem or causes other issues, then file a bug report.
Some SSDs (e.g. Transcend MTS400) are failing when SATA Active Link Power Management, ALPM, is enabled. ALPM is disabled by default and enabled by a power saving daemon (e.g. TLP, Laptop Mode Tools).
If you start to encounter SATA related errors when using such a daemon, try disabling ALPM by setting its state to max_performance for both battery and AC powered profiles.
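For a quick test without reconfiguring the daemon, the policy can be set directly through sysfs. This is a sketch, assuming the kernel exposes link_power_management_policy files under /sys/class/scsi_host; the function name is hypothetical, and a running power saving daemon may overwrite the value again, so the daemon's own configuration remains the persistent fix.

```sh
# Set the SATA link power management policy of every host below a sysfs root.
# $1: directory containing host*/ entries (normally /sys/class/scsi_host)
# $2: policy, e.g. max_performance
set_alpm_policy() {
    for f in "$1"/host*/link_power_management_policy; do
        [ -e "$f" ] || continue        # skip hosts without ALPM support
        echo "$2" > "$f"
    done
}

# Example (as root):
#   set_alpm_policy /sys/class/scsi_host max_performance
```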
See also
- Discussion on Reddit about installing Arch on an SSD
- See the Flashcache article for advanced information on using solid-state with rotational drives for top performance.
- Speed Up Your SSD By Correctly Aligning Your Partitions (using GParted)
- Re: Varying Leafsize and Nodesize in Btrfs
- Re: SSD alignment and Btrfs sector size
- Erase Block (Alignment) Misinformation?
- Is alignment to erase block size needed for modern SSD's?
- Btrfs support for efficient SSD operation (data blocks alignment)
- SSD, Erase Block Size & LVM: PV on raw device, Alignment