Difference between revisions of "Solid State Drives"

From ArchWiki
(Ext4: link to official in kernel tree documentation)
Line 290: Line 290:
==== Ext4 ====
[http://en.wikipedia.org/wiki/Ext4 Ext4] is another filesystem that has support for SSD. It is considered as stable since 2.6.28 and is mature enough for daily use. Contrary to Btrfs, ext4 does not automatically detect the disk nature; users must explicitly enable the TRIM command support using the '''discard''' mounting option in [[fstab]] (or with tune2fs -o discard /dev/sdaX).
See the [http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/filesystems/btrfs.txt;h=64087c34327fe0ba11e790e0a41224b8e7c1d30c;hb=HEAD official in kernel tree documentation] for further information on ext4.
See the [http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob;f=Documentation/filesystems/ext4.txt official in kernel tree documentation] for further information on ext4.
== SSD Benchmarking ==

Revision as of 22:17, 23 April 2012



Solid State Drives (SSDs) are not plug-and-play devices. Special considerations such as partition alignment, choice of file system, and TRIM support are needed to set up SSDs for optimal performance. This article attempts to capture referenced key lessons to enable users to get the most out of SSDs under Linux. Users are encouraged to read this article in its entirety before acting on recommendations, as the content is organized by topic and not in any systematic or chronological order.

Note: This article is targeted at users running Linux, but much of the content is also relevant to our friends using both Windows and Mac OS X.

Advantages over HDDs

  • Fast read speeds - 2-3x faster than modern desktop HDDs (7,200 RPM using SATA2 interface).
  • Sustained read speeds - No decrease in read speed across the entirety of the device. HDD performance tapers off as the drive heads move from the outer edges to the center of HDD platters.
  • Minimal access time - Approx. 100x faster than an HDD. For example, 0.1 ms (100 µs) vs. 12-20 ms (12,000-20,000 µs) for desktop HDDs.
  • High degree of reliability.
  • No moving parts.
  • Minimal heat production.
  • Minimal power consumption - Fractions of a W at idle and 1-2 W while reading/writing vs. 10-30 W for a HDD depending on RPMs.
  • Light-weight - ideal for laptops.

Limitations

  • Per-storage cost (dollars per GB, vs. pennies per GB for rotating media).
  • Capacity of marketed models is lower than that of HDDs.
  • Large cells require different filesystem optimizations than rotating media. The flash translation layer hides the raw flash access which a modern OS could use to optimize access.
  • Partitions and filesystems need some SSD-specific tuning. Page size and erase page size are not autodetected.
  • Cells wear out. Consumer MLC cells at mature 50nm processes can handle 10000 writes each; 35nm generally handles 5000 writes, and 25nm 3000 (smaller being higher density and cheaper). If writes are properly spread out, are not too small, and align well with cells, this translates into a lifetime write volume for the SSD that is a multiple of its capacity. Daily write volumes have to be balanced against life expectancy.
  • Firmwares and controllers are complex. They occasionally have bugs. Modern ones consume power comparable with HDDs. They implement the equivalent of a log-structured filesystem with garbage collection. They translate SATA commands traditionally intended for rotating media. Some of them do on the fly compression. They spread out repeated writes across the entire area of the flash, to prevent wearing out some cells prematurely. They also coalesce writes together so that small writes are not amplified into as many erase cycles of large cells. Finally they move cells containing data so that the cell does not lose its contents over time.
  • Performance can drop as the disk gets filled. Garbage collection is not universally well implemented, meaning freed space is not always collected into entirely free cells.

Pre-Purchase Considerations

There are several key features to look for prior to purchasing a contemporary SSD.

Key Features

  • Native TRIM support is a vital feature that both prolongs SSD lifetime and reduces loss of performance for write operations over time.
  • Buying the right sized SSD is key. As with all filesystems, target <75 % occupancy for all SSD partitions to ensure efficient use by the kernel.


This section is not meant to be all-inclusive, but does capture some key reviews.

Tips for Maximizing SSD Performance

Partition Alignment

High-level Overview

Proper partition alignment is essential for optimal performance and longevity. Key to alignment is partitioning to (at least) the EBS (erase block size) of the SSD.

Note: The EBS is largely vendor specific; a Google search on the model of interest would be a good idea! The Intel X25-M for example is thought to have an EBS of 512 KiB, but Intel has yet to publish anything officially to this end.
Note: If one does not know the EBS of one's SSD, use a size of 512 KiB. This value is greater than or equal to all current EBS sizes, and aligning partitions to it also aligns them for all smaller power-of-two sizes. This is how Windows 7 and Ubuntu "optimize" partitions to work with SSDs.

If the partitions are not aligned to begin at multiples of the EBS (512 KiB for example), aligning the file system is a pointless exercise because everything is skewed by the start offset of the partition. Traditionally, hard drives were addressed by indicating the cylinder, the head, and the sector at which data was to be read or written. These represented the radial position, the drive head (= platter and side) and the axial position of the data respectively. With LBA (logical block addressing), this is no longer the case. Instead, the entire hard drive is addressed as one continuous stream of data.
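
The alignment test described above comes down to a single remainder calculation. The following is a minimal sketch, not a definitive tool: the function name is_aligned is made up for illustration, and the 512-byte sector size and 512 KiB EBS are assumptions that may not match a given drive.

```shell
# is_aligned START_SECTOR EBS_KIB [SECTOR_BYTES]
# Succeeds when the partition start offset is a multiple of the erase block size.
# SECTOR_BYTES defaults to 512, which is an assumption about the drive.
is_aligned() {
    start_sector=$1
    ebs_kib=$2
    sector_bytes=${3:-512}
    offset_bytes=$((start_sector * sector_bytes))
    [ $((offset_bytes % (ebs_kib * 1024))) -eq 0 ]
}

# Sector 2048 (the current fdisk/gdisk default) versus sector 63 (the old MS-DOS default).
for start in 2048 63; do
    if is_aligned "$start" 512; then
        echo "start sector $start: aligned to 512 KiB"
    else
        echo "start sector $start: NOT aligned to 512 KiB"
    fi
done
```

For an existing partition, the start sector can be read from /sys/block/sdX/sdXN/start and passed to the function.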

Using GPT - Modern Method

GPT is an alternative, contemporary partitioning style. It is intended to replace the old Master Boot Record (MBR) system. GPT has several advantages over MBR, which has quirks dating back to MS-DOS times. With recent developments to the partitioning tools fdisk (MBR) and gdisk (GPT), it is equally easy to use GPT or MBR and get maximum performance.

Choosing between GPT and MBR

The choice basically boils down to this:

  • If using GRUB Legacy as the bootloader, one must use MBR. See #Using MBR - Legacy Method.
  • To dual-boot with Windows, one must use MBR. See #Using MBR - Legacy Method.
    • A special exception to this rule: dual-booting Windows Vista/7 64 bit, and using UEFI instead of BIOS, one must use GPT.
  • If none of the above apply, choose freely between GPT and MBR. Since GPT is more modern, it is recommended in this case.
Gdisk Usage Summary

The GPT-capable equivalent of fdisk, gdisk, can align partitions automatically on a 2048-sector (1024 KiB) boundary, which should be compatible with the vast majority of SSDs, if not all. GNU parted also supports GPT, but is less user-friendly for aligning partitions. A summary of the typical usage of gdisk:

  • Install gdisk (gptfdisk package) from the extra repository.
  • Simply start gdisk against your SSD.
  • If the SSD is brand new or if wanting to start over, create a new empty GUID partition table (aka GPT) with the 'o' command.
  • Create a new partition with the 'n' command (primary type/1st partition).
  • Assuming the partition is new, gdisk will pick the highest possible alignment. Otherwise, it will pick the largest power of two that divides all partition offsets.
  • If choosing to start on a sector before the 2048th, gdisk will automatically shift the partition start to the 2048th disk sector. This ensures a 2048-sector alignment (as a sector is 512 B, this is a 1024 KiB alignment, which should fit any SSD NAND erase block).
  • Use the +x{M,G} format to extend the partition by x megabytes or gigabytes. If choosing a size that is not a multiple of the alignment size (1024 KiB), gdisk will shrink the partition to the nearest lower multiple.
  • Select the partition's type id. The default, 'Linux/Windows data' (code 0700), should be fine for most uses. Press L to show the list of codes.
  • Assign other partitions in a like fashion.
  • Write the table to disk and exit via the 'w' command.
  • Create the filesystems as usual.
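
The alignment arithmetic in the steps above can be sketched as follows. This only illustrates the rounding as the list describes it and makes no claim about how gdisk is implemented internally; the function name round_down_sectors is hypothetical and 512-byte sectors are assumed.

```shell
# Assumed values: 512-byte sectors and the 2048-sector alignment described above.
SECTOR_BYTES=512
ALIGN_SECTORS=2048

# round_down_sectors SECTORS: largest multiple of ALIGN_SECTORS not above SECTORS.
round_down_sectors() {
    echo $(( $1 / ALIGN_SECTORS * ALIGN_SECTORS ))
}

echo "alignment: $((ALIGN_SECTORS * SECTOR_BYTES / 1024)) KiB"
echo "5000 sectors rounds down to $(round_down_sectors 5000) sectors"
```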
Warning: If planning to use the GPT partitioned SSD as a boot-disk on a BIOS based system (most systems except Apple computers and some very rare motherboard models with Intel chipset) one may have to create, preferably at the disk's beginning, a 1 MiB partition with the partition type as BIOS boot or bios_grub partition (gdisk type code EF02) for booting from the disk using GRUB2. For Syslinux, one does not need to create a separate 1 MiB bios_grub partition, but one needs to have separate /boot partition and enable Legacy BIOS Bootable partition attribute for that partition (using gdisk). See GPT for more information.
Warning: GRUB Legacy does not support the GUID partitioning scheme; users must use burg, GRUB2 or Syslinux.
Warning: If planning to dual boot with Windows (XP, Vista or 7) do NOT use GPT since they do NOT support booting from a GPT disk in BIOS systems! Users need to use the legacy MBR method described below for dual-boot in BIOS systems! This limitation does not apply if booting in UEFI mode and using Windows Vista (64bits) or 7 (64bits). For 32-bit Windows Vista and 7, and 32 and 64-bit Windows XP, users need to use MBR partitioning and boot in BIOS mode only.

Using MBR - Legacy Method

Using MBR, the utility for editing the partition table is called fdisk. Recent versions of fdisk have abandoned the deprecated system of using cylinders as the default display unit, as well as MS-DOS compatibility by default. The latest fdisk automatically aligns all partitions to 2048 sectors, or 1024 KiB, which should work for all EBS sizes that are known to be used by SSD manufacturers. This means that the default settings will give you proper alignment.

Note that in the olden days, fdisk used cylinders as the default display unit, and retained an MS-DOS compatibility quirk that messed with SSD alignment. Therefore one will find many guides around the internet from around 2008-2009 making a big deal out of getting everything correct. With the latest fdisk, things are much simpler, as reflected in this guide.

Fdisk Usage Summary
  • Start fdisk.
  • If the SSD is brand new, create a new empty DOS partition table with the 'o' command.
  • Create a new partition with the 'n' command (primary type/1st partition).
  • Use the +xG format to extend the partition x gigabytes.
  • Change the partition's system id from the default type of Linux (type 83) to the desired type via the 't' command. This is an optional step should the user wish to create another type of partition for example, swap, NTFS, LVM, etc. Note that a complete listing of all valid partition types is available via the 'l' command.
  • Assign other partitions in a like fashion.
  • Write the table to disk and exit via the 'w' command.

When finished, users may format their newly created partitions with the 'mkfs.x /dev/sdXN' where x is the filesystem, X is the drive letter, and N is the partition number. The following example will format the first partition on the first disk to ext4 using the defaults specified in /etc/mke2fs.conf:

# mkfs.ext4 /dev/sda1
Warning: Using the mkfs command can be dangerous as a simple mistake can result in formatting the WRONG partition and in data loss! TRIPLE check the target of this command before hitting the Enter key!
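
One possible safeguard before formatting is to confirm that the target is not currently mounted. The sketch below is only an illustration: is_mounted is a hypothetical helper, /dev/sdX1 is a placeholder, and a partition that is not mounted can still hold valuable data.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE appears as a mounted source in MOUNTS_FILE
# (defaults to /proc/mounts, the kernel's list of current mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

target=/dev/sdX1   # placeholder; substitute the real partition
if is_mounted "$target"; then
    echo "$target is mounted; refusing to format" >&2
else
    echo "$target is not mounted; double-check it with lsblk before running mkfs"
fi
```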
Special Considerations for Logical Partitions

---Place holder for content.

Special Considerations for RAID0 Setups with Multiple SSDs

---Place holder for content.

Encrypted partition

When using cryptsetup, define a sufficient payload (see here):

cryptsetup luksFormat --align-payload=8192 ...
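
cryptsetup interprets the --align-payload value as a count of 512-byte sectors, so the value above can be converted to KiB and compared against an erase block size. This is a small sketch; payload_offset_kib is a made-up name and the 512 KiB EBS is an assumption.

```shell
# payload_offset_kib SECTORS: convert a count of 512-byte sectors to KiB.
payload_offset_kib() {
    echo $(( $1 * 512 / 1024 ))
}

offset_kib=$(payload_offset_kib 8192)
ebs_kib=512   # assumed erase block size
echo "payload offset: ${offset_kib} KiB"
echo "remainder against ${ebs_kib} KiB EBS: $((offset_kib % ebs_kib))"
```

A remainder of zero means the encrypted payload starts on an erase block boundary relative to the start of the partition.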

But remember that the DISCARD/TRIM feature is NOT SUPPORTED by device-mapper, although work on it is under way (see here). August 2011 news: support will be in Linux 3.1, and involves a userspace dm-crypt update as well [1].

Mount Flags

There are several key mount flags to use in one's /etc/fstab entries for SSD partitions.

  • noatime - Reading accesses to the file system will no longer result in an update to the atime information associated with the file. The importance of the noatime setting is that it eliminates the need by the system to make writes to the file system for files which are simply being read. Since writes can be somewhat expensive as mentioned in previous section, this can result in measurable performance gains. Note that the write time information to a file will continue to be updated anytime the file is written to with this option enabled.
  • discard - The discard flag will enable the benefits of the TRIM command so long as one is using kernel version >=2.6.33. It does not work with ext3; using the discard flag for an ext3 root partition will result in it being mounted read-only.
/dev/sda1 / ext4 defaults,noatime,discard 0 1
/dev/sda2 /home ext4 defaults,noatime,discard 0 1
Warning: It is critically important that users switch the controller driving the SSD to AHCI mode (not IDE mode) to ensure that the kernel is able to use the TRIM command.
Warning: Users need to be certain that kernel version 2.6.33 or above is being used AND that their SSD supports TRIM before attempting to mount a partition with the discard flag. Data loss can occur otherwise!
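
The kernel version requirement from the warning above can be checked with a short script. This is a sketch under assumptions: version_at_least is a hypothetical helper, it relies on the -V option of GNU sort, and it only covers the kernel half of the check.

```shell
# version_at_least HAVE WANT: succeeds if version HAVE >= WANT.
# Relies on the -V (version sort) option of GNU sort.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

kernel=$(uname -r | cut -d- -f1)   # e.g. 3.3.2-1-ARCH becomes 3.3.2
if version_at_least "$kernel" 2.6.33; then
    echo "kernel $kernel is new enough for the discard flag"
else
    echo "kernel $kernel is too old for the discard flag"
fi

# TRIM support on the drive itself can be checked (as root) with:
#   hdparm -I /dev/sdX | grep -i trim
```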

Special considerations for Mac computers

By default, Apple's firmware switches SATA drives into IDE mode (not AHCI mode) when booting any OS besides Mac OS. It is easy to switch back to AHCI if using GRUB2 with an Intel SATA controller.

First determine the PCI identifier of the SATA controller. Run the command

# lspci -nn

and find the line that says "SATA AHCI Controller". The PCI identifier is in square brackets and should look like 8086:27c4 (but the last digits may be different).
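
Picking the identifier out of the lspci output can also be scripted. In the sketch below, pci_id_from_line is a made-up helper and the sample line is hypothetical; real lspci -nn output differs from machine to machine.

```shell
# pci_id_from_line LINE: print the last [vendor:device] pair found in LINE.
pci_id_from_line() {
    printf '%s\n' "$1" | grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]' | tail -n 1 | tr -d '[]'
}

# Hypothetical sample line; real lspci -nn output will differ per machine.
sample='00:1f.2 SATA controller [0106]: Intel Corporation SATA AHCI Controller [8086:27c4] (rev 02)'
pci_id_from_line "$sample"
```

On a real system the input line would come from something like lspci -nn | grep -i sata.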

Now edit /boot/grub/grub.cfg and add the line:

setpci -d 8086:27c4 90.b=40

right above the "set root" line of each OS for which AHCI will be enabled. Be sure to substitute the appropriate PCI identifier. Do not prefix the line with #, as GRUB would treat it as a comment.

(credit: http://darkfader.blogspot.com/2010/04/windows-on-intel-mac-and-ahci-mode.html)

A late 2008 unibody MacBook (5.1) does not have an Intel controller; it has an "nVidia Corporation MCP79 SATA Controller".

In that case, add this line to /boot/grub/grub.cfg instead:

setpci -d 10de:0ab5 9c.b=06

I/O Scheduler

Consider switching from the default scheduler, which under Arch is cfq (completely fair queuing), to the noop or deadline scheduler for an SSD. The latter two offer performance boosts over cfq. The noop scheduler, for example, simply processes requests in the order they are received, without giving any consideration to where the data physically resides on the disk. This option is thought to be advantageous for SSDs since seek times are identical for all sectors on the SSD.

However, some SSDs, particularly earlier, JMicron-based ones, may perform better with the default scheduler (see here for one such benchmark); on these, while seek times are similar for all sectors, random access throughput is bad enough to offset any advantage. If the SSD was manufactured within the last year or so, or is made by Intel, this probably does not apply.

For more on schedulers, see this Linux Magazine article (needs registration).

About the default scheduler for ssd drives: FS#22605

The cfq scheduler is enabled by default on Arch. Verify this by viewing the contents of /sys/block/sdX/queue/scheduler:

$ cat /sys/block/sdX/queue/scheduler
noop deadline [cfq]

The scheduler currently in use is denoted from the available schedulers by the brackets.
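
When scripting, the bracketed entry can be extracted with sed. This is a minimal sketch; active_scheduler is a made-up function name and the input string simply mirrors the example output above.

```shell
# active_scheduler TEXT: print the name enclosed in square brackets.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

active_scheduler 'noop deadline [cfq]'
```

On a live system the argument would be the contents of the sysfs file, for example active_scheduler "$(cat /sys/block/sdX/queue/scheduler)".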

There are several ways to change the scheduler.

Note: Only switch the scheduler to noop or deadline for SSDs. Keeping the cfq scheduler for all other physical HDDs is highly recommended.
Using the sys virtual filesystem

This method is preferred when the system has several physical storage devices (for example an SSD and an HDD). Add the following line in /etc/rc.local:

echo noop > /sys/block/sdX/queue/scheduler

where X is the letter for the SSD device.

Because of the potential for udev to assign different /dev/ nodes to drives before and after a kernel update, users must take care that the noop scheduler is applied to the correct device upon boot. One way to do this is by using the SSD's device ID to determine its /dev/ node. To do this automatically, use the following snippet instead of the line above and add it to /etc/rc.local:


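SSD=(device-id-1 device-id-2)  # placeholder IDs; replace with entries from /dev/disk/by-id/
declare -i i=0
while [ "${SSD[$i]}" != "" ]; do
  # Resolve the by-id symlink to its kernel device name, e.g. sda
  NODE=$(basename "$(readlink -f "/dev/disk/by-id/${SSD[$i]}")")
  echo noop > "/sys/block/$NODE/queue/scheduler"
  i+=1
done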

where SSD is a Bash array containing the device IDs of all SSD devices. Device IDs are listed in /dev/disk/by-id/ as symbolic links pointing to their corresponding /dev/ nodes. To view the links listed with their targets, issue the following command:

ls -l /dev/disk/by-id/
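
As a cross-check when identifying which node is the SSD, the kernel exposes a rotational flag for each block device in sysfs. The sketch below lists devices reporting 0; list_nonrotational is a hypothetical helper, and the flag should be treated as a hint only, since RAM disks, virtual devices and some controllers may report misleading values.

```shell
# list_nonrotational [SYSFS_BLOCK_DIR]
# Print block devices whose queue/rotational flag is 0.
list_nonrotational() {
    base=${1:-/sys/block}
    for flag in "$base"/*/queue/rotational; do
        [ -r "$flag" ] || continue
        if [ "$(cat "$flag")" = "0" ]; then
            # Two levels up from .../sdX/queue/rotational is the device name.
            basename "$(dirname "$(dirname "$flag")")"
        fi
    done
}

list_nonrotational
```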
Kernel parameter

If the sole storage device in the system is an SSD, consider setting the I/O scheduler for the entire system via the elevator kernel parameter:

elevator=noop

For example, with GRUB, in /boot/grub/menu.lst:

kernel /vmlinuz26 root=/dev/sda3 ro elevator=noop

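or with GRUB2, in /etc/default/grub (remember to regenerate the configuration afterwards, for example with grub-mkconfig -o /boot/grub/grub.cfg; update-grub is a Debian/Ubuntu wrapper for the same step):

GRUB_CMDLINE_LINUX_DEFAULT="elevator=noop"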

Swap Space on SSDs

One can place a swap partition on an SSD. Note that most modern desktops with more than 2 GB of memory rarely use swap at all. The notable exception is systems which make use of the hibernate feature. The following is a recommended tweak for SSDs using a swap partition; it reduces the "swappiness" of the system, thus avoiding writes to swap.

# echo 1 > /proc/sys/vm/swappiness

Or one can simply modify /etc/sysctl.conf as recommended in the Maximizing Performance wiki article.
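
To see the value currently in effect and the form a persistent setting would take, something like the following sketch can be used. The helper name sysctl_line and the SWAPPINESS_FILE override are assumptions made for illustration only.

```shell
# Print the current value; the path is overridable for illustration only.
swappiness_file=${SWAPPINESS_FILE:-/proc/sys/vm/swappiness}
if [ -r "$swappiness_file" ]; then
    echo "current vm.swappiness: $(cat "$swappiness_file")"
fi

# sysctl_line KEY VALUE: format a setting the way /etc/sysctl.conf expects it.
sysctl_line() {
    printf '%s=%s\n' "$1" "$2"
}

sysctl_line vm.swappiness 1
```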


SSD Memory Cell Clearing

On occasion, users may wish to completely reset an SSD's cells to the same virgin state they were in when the device was installed, thus restoring its factory default write performance. Write performance is known to degrade over time even on SSDs with native TRIM support. TRIM only safeguards against file deletes, not replacements such as an incremental save.

The reset is easily accomplished in a three-step procedure described in the SSD Memory Cell Clearing wiki article.

Tips for Minimizing SSD Read/Writes

An overarching theme for SSD usage should be 'simplicity' in terms of locating high-read/write operations either in RAM (Random Access Memory) or on a physical HDD rather than on an SSD. Doing so will add longevity to an SSD. This is primarily due to the large erase block size (512 KiB in some cases); a lot of small writes result in huge effective writes.

Note: A 32GB SSD with a mediocre 10x write amplification factor, a standard 10000 write/erase cycle, and 10GB of data written per day, would get an 8 years life expectancy. It gets better with bigger SSDs and modern controllers with less write amplification.
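
The estimate in the note above is simple integer arithmetic and can be reproduced in the shell. This is a rough sketch with a made-up function name (ssd_life_days); it assumes perfectly even wear and ignores every other failure mode.

```shell
# ssd_life_days CAPACITY_GB CYCLES WRITE_AMPLIFICATION GB_PER_DAY
# Integer estimate: total raw write budget divided by amplified daily writes.
ssd_life_days() {
    echo $(( $1 * $2 / ($3 * $4) ))
}

days=$(ssd_life_days 32 10000 10 10)
echo "$days days, roughly $((days / 365)) years"
```

With the figures from the note (32 GB, 10000 cycles, 10x amplification, 10 GB per day) this gives 3200 days, which is where the figure of about 8 years comes from.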

Use iotop -oPa and sort by disk writes to see how much programs are writing to disk.

Intelligent Partition Scheme

Consider relocating the /var partition to a physical HDD on the system rather than keeping it on the SSD itself, to avoid read/write wear. Many users elect to keep only / and /home on the SSD (/boot is okay too), locating /var and /tmp on a physical HDD.


/media/data (and other extra partitions, etc.)

If the SSD is the only storage device on the system (i.e. no HDDs), consider allocating a separate partition for /var to allow for better crash recovery for example in the event of a broken program wasting all the space on / or if some run away log file maxes out the space, etc.

Another intelligent option is to locate /tmp in RAM, provided the system has enough to spare. See #Compiling in tmpfs for more on this procedure.

The noatime Mount Flag

Assign the noatime flag to partitions residing on SSDs. See the Mount Flags section above for more.

Locate Browser Profiles to RAM

One can easily mount browser profile(s) such as chromium, firefox, opera, etc. into RAM via tmpfs and also use rsync to keep them synced with HDD-based backups. In addition to the obvious speed enhancements, users will also save read/write cycles on their SSD by doing so.

The AUR contains several packages to automate this process, for example profile-sync-daemon.

Locate /var/log and others to RAM

For the same reasons a browser's profile can be relocated to RAM, so can highly used directories such as /var/log and /srv/http (if running a web server). A sister project to profile-sync-daemon is anything-sync-daemon, which allows users to define any directory to sync to RAM using the same underlying logic and safeguards.

The AUR contains the needed package for anything-sync-daemon.

Compiling in tmpfs

Intentionally compiling in a tmpfs-backed /tmp is a great way to keep build-related writes off the SSD. For systems with >4 GB of memory, the tmp line in /etc/fstab can be tweaked to use more than 1/2 the physical memory on the system via the size flag.

Example of a machine with 8 GB of physical memory:

tmpfs /tmp tmpfs nodev,nosuid,size=7G 0 0
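
Whether /tmp is actually backed by tmpfs can be verified before relying on it for builds. The sketch below assumes GNU stat (coreutils), whose -f -c %T options print the filesystem type name; fs_type is a hypothetical helper.

```shell
# fs_type PATH: print the filesystem type name for PATH (GNU stat).
fs_type() {
    stat -f -c %T "$1"
}

if [ "$(fs_type /tmp)" = "tmpfs" ]; then
    echo "/tmp is on tmpfs; builds there stay in RAM"
else
    echo "/tmp is on $(fs_type /tmp), so builds there still hit the disk"
fi
```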

Disabling Journaling on the Filesystem?

Using a journaling filesystem such as ext3 or ext4 on an SSD WITHOUT a journal is an option to decrease read/writes. The obvious drawback of using a filesystem with journaling disabled is data loss as a result of an ungraceful dismount (i.e. post power failure, kernel lockup, etc.). With modern SSDs, Ted Tso argues that journaling can be left enabled with minimal extraneous read/write cycles under most circumstances:

Amount of data written (in megabytes) on an ext4 file system mounted with noatime.

operation     journal (MB)   w/o journal (MB)   percent change
git clone     367.0          353.0              3.81 %
make          207.6          199.4              3.95 %
make clean    6.45           3.73               42.17 %

"What the results show is that metadata-heavy workloads, such as make clean, do result in almost twice the amount data written to disk. This is to be expected, since all changes to metadata blocks are first written to the journal and the journal transaction committed before the metadata is written to their final location on disk. However, for more common workloads where we are writing data as well as modifying filesystem metadata blocks, the difference is much smaller."

Note: The make clean example from the table above typifies the importance of intentionally doing compiling in tmpfs as recommended in the preceding section of this article!
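
The percent change column in the table above is the reduction in data written relative to the journaled case. It can be recomputed with awk as a quick check; percent_saved is a made-up helper name and the inputs are the figures from the table.

```shell
# percent_saved WITH WITHOUT: relative reduction in MB written, two decimals.
percent_saved() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (a - b) / a * 100 }'
}

percent_saved 367.0 353.0   # git clone
percent_saved 207.6 199.4   # make
percent_saved 6.45 3.73     # make clean
```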

Choice of Filesystem

Many options exist for file systems including ext2, ext3, ext4, btrfs, etc.


Btrfs support has been included with the mainline 2.6.29 release of the Linux kernel. Some feel that it is not mature enough for production use while there are also early adopters of this potential successor to ext4. Users are encouraged to read the Btrfs article for more.


Ext4 is another filesystem that supports SSDs. It has been considered stable since kernel 2.6.28 and is mature enough for daily use. Unlike Btrfs, ext4 does not automatically detect the nature of the disk; users must explicitly enable TRIM support using the discard mount option in fstab (or with tune2fs -o discard /dev/sdaX). See the official in-kernel-tree documentation for further information on ext4.

SSD Benchmarking

See the SSD Benchmarking article for a general process of benchmarking SSDs or to see some of the SSDs in the database.

Firmware Updates


OCZ has a command line utility available for Linux (i686 and x86_64) on their forum here.

See also