<div>[[Category:Storage]]<br />
[[it:Solid State Drives]]<br />
[[ru:Solid State Drives]]<br />
[[zh-CN:Solid State Drives]]<br />
{{Article summary start}}<br />
{{Article summary text|This article covers many aspects of SSDs (solid state drives) as they relate to Linux. The underlying principles are general enough to apply to other operating systems such as the Windows family of products and Mac OS X, but Linux users will additionally benefit from the tweaks and optimizations presented herein.}}<br />
{{Article summary heading|Related Articles}}<br />
{{Article summary wiki|SSD Benchmarking}}<br />
{{Article summary wiki|SSD Memory Cell Clearing}}<br />
{{Article summary wiki|profile-sync-daemon}} <br />
{{Article summary end}}<br />
<br />
==Introduction==<br />
Solid State Drives (SSDs) are not plug-and-play devices. Special considerations such as partition alignment, choice of file system, and TRIM support are needed to set up SSDs for optimal performance. This article attempts to capture referenced, key information to enable users to get the most out of SSDs under Linux. Users are encouraged to read this article in its entirety before acting on recommendations, as the content is organized by topic, not necessarily in the order in which steps should be taken.<br />
<br />
{{Note|This article is targeted at users running Linux, but much of the content is also relevant to our friends using other operating systems like BSD, Mac OS X or Windows.}}<br />
===Advantages over HDDs===<br />
*Fast read speeds - 2-3x faster than modern desktop HDDs (7,200 RPM using SATA2 interface).<br />
*Sustained read speeds - no decrease in read speed across the entirety of the device. HDD performance tapers off as the drive heads move from the outer edges to the center of HDD platters.<br />
*Minimal access time - approximately 100x faster than an HDD. For example, 0.1 ms (100 µs) vs. 12-20 ms (12,000-20,000 µs) for desktop HDDs.<br />
*High degree of reliability.<br />
*No moving parts.<br />
*Minimal heat production.<br />
*Minimal power consumption - fractions of a W at idle and 1-2 W while reading/writing vs. 10-30 W for a HDD depending on RPMs.<br />
*Light-weight - ideal for laptops.<br />
<br />
===Limitations===<br />
*Higher cost per unit of storage (dollars per GB, vs. pennies per GB for rotating media).<br />
*Capacity of marketed models is lower than that of HDDs.<br />
*Large cells require different filesystem optimizations than rotating media. The flash translation layer hides the raw flash access which a modern OS could use to optimize access.<br />
*Partitions and filesystems need some SSD-specific tuning. Page size and erase page size are not autodetected.<br />
*Cells wear out. Consumer MLC cells at mature 50nm processes can handle 10000 writes each; 35nm generally handles 5000 writes, and 25nm 3000 (smaller being higher density and cheaper). If writes are properly spread out, are not too small, and align well with cells, this translates into a lifetime write volume for the SSD that is a multiple of its capacity. Daily write volumes have to be balanced against life expectancy.<br />
*Firmware and controllers are complex. They occasionally have bugs. Modern ones consume power comparable to HDDs. They [https://lwn.net/Articles/353411/ implement] the equivalent of a log-structured filesystem with garbage collection, and they translate SATA commands traditionally intended for rotating media. Some perform on-the-fly compression. They spread repeated writes across the entire area of the flash to prevent some cells from wearing out prematurely, and they coalesce writes so that small writes are not amplified into as many erase cycles of large cells. Finally, they move cells containing data so that the cells do not lose their contents over time.<br />
*Performance can drop as the disk gets filled. Garbage collection is not universally well implemented, meaning freed space is not always collected into entirely free cells.<br />
<br />
==Pre-Purchase Considerations==<br />
There are several key features to look for prior to purchasing a contemporary SSD.<br />
===Key Features===<br />
*Native [http://en.wikipedia.org/wiki/TRIM TRIM] support is a vital feature that both prolongs SSD lifetime and reduces loss of performance for write operations over time.<br />
*Buying a right-sized SSD is key. As with all filesystems, target <75% occupancy for all SSD partitions to ensure efficient use by the kernel.<br />
<br />
===Reviews===<br />
This section is not meant to be all-inclusive, but does capture some key reviews.<br />
*[http://www.anandtech.com/show/2738 SSD Anthology (history lesson, a bit dated)]<br />
*[http://www.anandtech.com/show/2829 SSD Relapse (refresher and more up to date)]<br />
*[http://forums.anandtech.com/showthread.php?t=2069761 One user's recommendations]<br />
*[http://techgage.com/article/enabling_and_testing_ssd_trim_support_under_linux/ Enabling and Testing SSD TRIM Support Under Linux]<br />
<br />
==Tips for Maximizing SSD Performance==<br />
===Mount Flags===<br />
There are several key mount flags to use in one's {{ic|/etc/fstab}} entries for SSD partitions.<br />
<br />
*'''noatime''' - Reads will no longer update the atime (access time) information associated with a file. This eliminates the writes the system would otherwise make for files which are merely being read. Since writes are relatively expensive, as mentioned in the previous section, this can yield measurable performance gains. Note that a file's modification time is still updated whenever the file is written to with this option enabled.<br />
** However, noatime will cause issues with some programs such as [[Mutt]], because a file's access time can end up earlier than its modification time. Using the '''relatime''' option instead of noatime ensures that the atime field is never earlier than the last modification time of a file.<br />
*'''discard''' - The discard flag will enable the benefits of the TRIM command as long as one is using kernel version >=2.6.33. It does not work with ext3; using the discard flag for an ext3 root partition will result in it being mounted read-only.<br />
<br />
/dev/sda1 / ext4 defaults,relatime,discard 0 1<br />
/dev/sda2 /home ext4 defaults,relatime,discard 0 1<br />
<br />
{{Warning|Users need to be certain that kernel version 2.6.33 or above is being used AND that their SSD supports TRIM before attempting to mount a partition with the {{ic|discard}} flag. Data loss can occur otherwise!}}<br />
===Enable TRIM for LVM===<br />
If LVM is in use on the system, also enable the {{ic|issue_discards}} option in {{ic|/etc/lvm/lvm.conf}} so that LVM passes discards down to the physical volume when logical volumes are removed or reduced.<br />
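For example, in the {{ic|devices}} section of {{ic|/etc/lvm/lvm.conf}} (a minimal sketch; only the {{ic|issue_discards}} line is changed, other options keep their defaults):<br />

```
devices {
    issue_discards = 1
}
```

Note that this option covers LVM's own operations such as {{ic|lvremove}} and {{ic|lvreduce}}; discards issued by a filesystem mounted with the {{ic|discard}} flag pass through device-mapper regardless.<br />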
<br />
===Enable TRIM With mkfs.ext4 or tune2fs (Discouraged)===<br />
One can set {{ic|discard}} statically as a default mount option, stored in the filesystem superblock, with tune2fs:<br />
 # tune2fs -o discard /dev/sdXY<br />
{{Note|{{ic|mkfs.ext4 -E discard}} only discards blocks once, at filesystem creation time; it does not set a persistent mount option.}}<br />
<br />
{{Note|After this option is set as described above, the discard option will not show up when the user checks mounted filesystems with "mount", even when discard is also passed on the CLI. See the following thread for a discussion about this: https://bbs.archlinux.org/viewtopic.php?id&#61;137314 }}<br />
<br />
===Apply TRIM via cron===<br />
Enabling TRIM on supported SSDs is definitely recommended. But sometimes it may cause some SSDs to [https://patrick-nagel.net/blog/archives/337 perform slowly] during deletion of files. If this is the case, one may choose to use fstrim as an alternative.<br />
# fstrim -v /<br />
The filesystem to which fstrim is to be applied must be mounted, and must be indicated by its mount point. <br />
<br />
If this method seems like a better alternative, it can be run periodically using cron. The default cron package (cronie) includes an anacron implementation which, by default, is set up for hourly, daily, weekly, and monthly jobs. To add to the list of daily cron tasks, simply create a script that performs the desired actions and place it in /etc/cron.daily (or /etc/cron.weekly, etc.). Appropriate nice and ionice values are recommended if this method is chosen. If implemented, the "discard" option may be removed from fstab.<br />
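A minimal daily job might look like the following (the mount points listed are examples; adjust them to your own partitions):<br />

```
#!/bin/sh
# /etc/cron.daily/trim -- example cron drop-in; trims the listed mounted filesystems.
# The idle ionice class and low nice priority keep the trim from competing with other I/O.
for mountpoint in / /home; do
    ionice -c3 nice -n19 fstrim -v "$mountpoint"
done
```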
<br />
{{Note|Use the 'discard' mount option as a first choice. This method should be considered second to the normal implementation of TRIM.}}<br />
<br />
=== I/O Scheduler ===<br />
Consider switching from [https://en.wikipedia.org/wiki/CFQ CFQ] (Completely Fair Queuing, the default scheduler since kernel 2.6.18) to [http://en.wikipedia.org/wiki/NOOP_scheduler NOOP] or [http://en.wikipedia.org/wiki/Deadline_scheduler Deadline]. The latter two offer performance boosts for SSDs. The NOOP scheduler, for example, implements a simple queue for all incoming I/O requests, without re-ordering and grouping those that are physically closer on the disk. On SSDs seek times are identical for all sectors, eliminating the need to re-order I/O queues based on them.<br />
<br />
For more on schedulers, see these: [http://www.linux-mag.com/id/7564 Linux Magazine article], [http://www.phoronix.com/scan.php?page=article&item=linux_iosched_2012 Phoronix benchmark] and {{Bug|22605}}.<br />
<br />
The CFQ scheduler is enabled by default on Arch. Verify this by viewing the contents of {{ic|/sys/block/sdX/queue/scheduler}}:<br />
$ cat /sys/block/sdX/queue/scheduler<br />
noop deadline [cfq]<br />
The scheduler currently in use is the one enclosed in brackets.<br />
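In a script, the active (bracketed) scheduler can be extracted like so; the {{ic|sched_line}} value is hard-coded here for illustration, where a real script would read it from {{ic|/sys/block/sdX/queue/scheduler}}:<br />

```shell
# Extract the bracketed (active) scheduler from the kernel's list.
sched_line="noop deadline [cfq]"    # illustrative; normally: sched_line=$(cat /sys/block/sdX/queue/scheduler)
current=$(printf '%s\n' "$sched_line" | sed 's/.*\[\(.*\)\].*/\1/')
echo "$current"   # → cfq
```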
<br />
===== Kernel parameter (for a single device) =====<br />
If the sole storage device in the system is an SSD, consider setting the I/O scheduler for the entire system via the {{ic|1=elevator=noop}} kernel parameter. See [[Kernel parameters]] for more info.<br />
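For example, with GRUB the parameter is appended to the kernel line in {{ic|/boot/grub/menu.lst}} (the root device shown is illustrative):<br />

```
kernel /boot/vmlinuz-linux root=/dev/sda1 ro elevator=noop
```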
<br />
===== Using the sys virtual filesystem (for multiple devices) =====<br />
This method is preferred when the system has several physical storage devices (for example an SSD and an HDD).<br />
<br />
Create the following tmpfiles.d entry, where "X" is the letter of the SSD device.<br />
<br />
{{hc| /etc/tmpfiles.d/set_IO_scheduler.conf |<nowiki><br />
w /sys/block/sdX/queue/scheduler - - - - noop<br />
</nowiki>}}<br />
<br />
Because of the potential for udev to assign different {{ic|/dev/}} nodes to drives before and after a kernel update, users must take care that the NOOP scheduler is applied to the correct device upon boot. One way to do this is by using the SSD's device ID to determine its {{ic|/dev/}} node. To do this automatically, use the following snippet instead of the line above and add it to {{ic|/etc/rc.local}}:<br />
declare -ar SSDS=(<br />
'scsi-SATA_SAMSUNG_SSD_PM8_S0NUNEAB861972'<br />
'ata-SAMSUNG_SSD_PM810_2.5__7mm_256GB_S0NUNEAB861972'<br />
)<br />
<br />
for SSD in "${SSDS[@]}" ; do<br />
BY_ID=/dev/disk/by-id/$SSD<br />
<br />
if &#91;&#91; -e $BY_ID ]] ; then<br />
 DEV_NAME=$(basename "$(readlink -f "$BY_ID")")<br />
SCHED=/sys/block/$DEV_NAME/queue/scheduler<br />
<br />
if &#91;&#91; -w $SCHED ]] ; then<br />
echo noop > $SCHED<br />
fi<br />
fi<br />
done<br />
where {{ic|SSDS}} is a Bash array containing the device IDs of all SSD devices. Device IDs are listed in {{ic|/dev/disk/by-id/}} as symbolic links pointing to their corresponding {{ic|/dev/}} nodes. To view the links listed with their targets, issue the following command:<br />
ls -l /dev/disk/by-id/<br />
<br />
=====Using udev for one device or HDD/SSD mixed environment=====<br />
Though the above will work, it is best considered a workaround. Note also that with the move to [[systemd]] there is no rc.local. It is therefore preferable to let the system that is responsible for the devices in the first place implement the scheduler; in this case that is udev, and all one needs is a simple [[udev]] rule.<br />
<br />
To do this, create and edit a file in /etc/udev/rules.d named something like '60-schedulers.rules'. In the file include the following:<br />
# set deadline scheduler for non-rotating disks<br />
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"<br />
<br />
# set cfq scheduler for rotating disks<br />
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="cfq"<br />
Of course, set deadline/cfq to the desired schedulers. Changes should occur upon next boot. To check success of the new rule:<br />
$ cat /sys/block/sdX/queue/scheduler #where X is the device in question<br />
<br />
{{note|Keep in mind cfq is the default scheduler, so with the standard kernel the second rule is not actually necessary. Also, sixty is chosen in the example because that is the number udev uses for its own persistent naming rules; block devices should therefore be ready to be modified at that point, making it a safe position for this particular rule. The rule can, however, be named anything so long as it ends in '.rules'. (Credit: falconindy and w0ng for posting on his blog)}}<br />
<br />
=== Swap Space on SSDs ===<br />
One can place a swap partition on an SSD. Note that most modern desktops with more than 2 GiB of memory rarely use swap at all; the notable exception is systems which use the hibernate feature. The following tweak is recommended for SSDs using a swap partition; it reduces the "swappiness" of the system, thus avoiding writes to swap:<br />
<br />
# echo 1 > /proc/sys/vm/swappiness<br />
<br />
Or one can simply modify {{ic|/etc/sysctl.conf}} as recommended in the [[Maximizing_performance#Swappiness|Maximizing Performance]] wiki article:<br />
<br />
vm.swappiness=1<br />
vm.vfs_cache_pressure=50<br />
<br />
=== SSD Memory Cell Clearing ===<br />
On occasion, users may wish to completely reset an SSD's cells to the same state they were in when the device was new, thus restoring its [http://www.anandtech.com/storage/showdoc.aspx?i=3531&p=8 factory default write performance]. Write performance is known to degrade over time even on SSDs with native TRIM support, since TRIM only safeguards against file deletes, not replacements such as an incremental save.<br />
<br />
The reset is easily accomplished in a three step procedure denoted on the [[SSD Memory Cell Clearing]] wiki article.<br />
<br />
==Tips for Minimizing SSD Read/Writes==<br />
An overarching theme for SSD usage should be 'simplicity': locate high-read/write operations either in RAM (Random Access Memory) or on a physical HDD rather than on an SSD. Doing so will add longevity to the SSD. This is primarily due to the large erase block size (512 KiB in some cases); many small writes amplify into much larger effective writes.<br />
<br />
{{Note|A 32GB SSD with a mediocre 10x write amplification factor, a standard 10000 write/erase cycle, and '''10GB of data written per day''', would get an '''8 years life expectancy'''. It gets better with bigger SSDs and modern controllers with less write amplification.}}<br />
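The note's estimate can be reproduced with simple shell arithmetic (the figures below are the note's own assumptions, not measurements):<br />

```shell
# Life expectancy = (capacity * erase cycles / write amplification) / daily write volume
capacity_gb=32      # drive capacity
cycles=10000        # write/erase cycles per cell
write_amp=10        # write amplification factor
daily_gb=10         # data written per day

total_gb=$(( capacity_gb * cycles / write_amp ))  # total host writes the drive can absorb
days=$(( total_gb / daily_gb ))
echo "about $(( days / 365 )) years"   # → about 8 years
```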
<br />
Use {{ic|iotop -oPa}} and sort by disk writes to see how much programs are writing to disk.<br />
<br />
=== Intelligent Partition Scheme ===<br />
*For systems with both an SSD and an HDD, consider relocating the {{ic|/var}} partition to the physical disk rather than the SSD itself to avoid read/write wear.<br />
*For systems with only an SSD (i.e. no HDDs), consider allocating a separate partition for {{ic|/var}} to allow for better crash recovery, for example in the event of a broken program or a runaway log file maxing out the space on {{ic|/}}.<br />
<br />
=== The noatime Mount Flag ===<br />
Assign the {{ic|noatime}} flag to partitions residing on SSDs. See the [[#Mount_Flags|Mount Flags]] section above for more.<br />
<br />
=== Locate High-Use Files to RAM ===<br />
==== Browser Profiles ====<br />
One can ''easily'' mount browser profile(s) such as chromium, firefox, opera, etc. into RAM via tmpfs and also use rsync to keep them synced with HDD-based backups. In addition to the obvious speed enhancements, users will also save read/write cycles on their SSD by doing so.<br />
<br />
The AUR contains several packages to automate this process, for example {{AUR|profile-sync-daemon}}.<br />
<br />
==== Others ====<br />
For the same reasons a browser's profile can be relocated to RAM, so can highly used directories such as {{ic|/srv/http}} (if running a web server). A sister project to {{AUR|profile-sync-daemon}} is {{AUR|anything-sync-daemon}}, which allows users to define '''any''' directory to sync to RAM using the same underlying logic and safe guards.<br />
<br />
{{Warning|Do NOT attempt to add /var/log to anything-sync-daemon; doing so causes problems with systemd.}}<br />
<br />
=== Compiling in tmpfs ===<br />
Intentionally compiling in {{ic|/tmp}}, which is typically mounted as tmpfs, keeps heavy compile-time writes in RAM and off the SSD. For systems with >4 GB of memory, if {{ic|/tmp}} keeps filling up, the tmp line in {{ic|/etc/fstab}} can be tweaked via the {{ic|1=size=}} flag to use more than the default half of physical memory.<br />
<br />
Example of a machine with 8 GB of physical memory:<br />
tmpfs /tmp tmpfs nodev,nosuid,size=7G 0 0<br />
<br />
=== Disabling Journaling on the filesystem ===<br />
Using ext3 or ext4 on an SSD without a journal is an option to decrease read/writes. The obvious drawback of disabling journaling is data loss as a result of an ungraceful dismount (i.e. after a power failure, kernel lockup, etc.). With modern SSDs, however, [http://tytso.livejournal.com/61830.html Ted Ts'o] argues that journaling can be left enabled at the cost of minimal extra read/write cycles under most circumstances:<br />
<br />
'''Amount of data written (in megabytes) on an ext4 file system mounted with noatime.'''<br />
{| border="1" cellpadding="5"<br />
! operation !! journal !! w/o journal !! percent change<br />
|-<br />
!git clone<br />
|367.0<br />
|353.0<br />
|3.81 %<br />
|-<br />
!make<br />
|207.6<br />
|199.4<br />
|3.95 %<br />
|-<br />
!make clean<br />
|6.45<br />
|3.73<br />
|42.17 %<br />
|}<br />
<br />
''"What the results show is that metadata-heavy workloads, such as make clean, do result in almost twice the amount data written to disk. This is to be expected, since all changes to metadata blocks are first written to the journal and the journal transaction committed before the metadata is written to their final location on disk. However, for more common workloads where we are writing data as well as modifying filesystem metadata blocks, the difference is much smaller."''<br />
<br />
{{Note|The make clean example from the table above typifies the importance of intentionally doing compiling in tmpfs as recommended in the [[Solid_State_Drives#Compiling_in_tmpfs|preceding section]] of this article!}}<br />
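For reference, should one nonetheless decide to disable journaling, the journal can be removed from an existing ext4 filesystem while it is unmounted; {{ic|/dev/sdXY}} is a placeholder:<br />

```
# tune2fs -O ^has_journal /dev/sdXY
# e2fsck -f /dev/sdXY
```

Running e2fsck afterwards verifies the filesystem is consistent after the feature change.<br />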
<br />
=== Choice of Filesystem ===<br />
Many options exist for file systems including Ext2/3/4, Btrfs, etc.<br />
<br />
==== Btrfs ====<br />
[http://en.wikipedia.org/wiki/Btrfs Btrfs] support has been included in the mainline Linux kernel since the 2.6.29 release. Some feel that it is not yet mature enough for production use, while there are also early adopters of this potential successor to ext4. Users are encouraged to read the [[Btrfs]] article for more info.<br />
<br />
==== Ext4 ====<br />
[http://en.wikipedia.org/wiki/Ext4 Ext4] is another filesystem with SSD support. It has been considered stable since 2.6.28 and is mature enough for daily use. Contrary to Btrfs, ext4 does not automatically detect the nature of the disk; users must explicitly enable TRIM command support using the {{ic|discard}} mount option in [[fstab]] (or with {{ic|tune2fs -o discard /dev/sdaX}}).<br />
See the [http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob;f=Documentation/filesystems/ext4.txt official in kernel tree documentation] for further information on ext4.<br />
<br />
== SSD Benchmarking ==<br />
See the [[SSD Benchmarking]] article for a general process of benchmarking SSDs or to see some of the SSDs in the database.<br />
<br />
== Firmware Updates ==<br />
=== OCZ ===<br />
OCZ has a command line utility available for Linux (i686 and x86_64); see [http://www.ocztechnology.com/ssd_tools/ OCZ's SSD tools page].<br />
<br />
== See also ==<br />
* [http://www.reddit.com/r/archlinux/comments/rkwjm/what_should_i_keep_in_mind_when_installing_on_ssd/ Discussion on Reddit about installing Arch on an SSD]<br />
* See the [[Flashcache]] article for advanced information on using solid-state with rotational drives for top performance.</div>