Difference between revisions of "Ext4"

From ArchWiki
m (Creating ext4 partitions from scratch)
(Enabling metadata checksums: 'crc32c*' module is required; use module alias)
 
(122 intermediate revisions by 27 users not shown)
Line 5: Line 5:
 
[[fr:Ext4]]
 
[[fr:Ext4]]
 
[[it:Ext4]]
 
[[it:Ext4]]
 +
[[ja:Ext4]]
 
[[ru:Ext4]]
 
[[ru:Ext4]]
 
[[tr:Ext4]]
 
[[tr:Ext4]]
 
[[zh-CN:Ext4]]
 
[[zh-CN:Ext4]]
Ext4 is the evolution of the most used Linux filesystem, Ext3. In many ways, Ext4 is a deeper improvement over Ext3 than Ext3 was over Ext2. Ext3 was mostly about adding journaling to Ext2, but Ext4 modifies important data structures of the filesystem such as the ones destined to store the file data. The result is a filesystem with an improved design, better performance, reliability, and features.
+
{{Related articles start}}
 +
{{Related|File systems}}
 +
{{Related|Ext3}}
 +
{{Related articles end}}
 +
From [http://kernelnewbies.org/Ext4 Ext4 - Linux Kernel Newbies]:
 +
:Ext4 is the evolution of the most used Linux filesystem, Ext3. In many ways, Ext4 is a deeper improvement over Ext3 than Ext3 was over Ext2. Ext3 was mostly about adding journaling to Ext2, but Ext4 modifies important data structures of the filesystem such as the ones destined to store the file data. The result is a filesystem with an improved design, better performance, reliability, and features.
  
Source: [http://kernelnewbies.org/Ext4 Ext4 - Linux Kernel Newbies]
+
== Create a new ext4 filesystem ==
  
==Creating ext4 partitions from scratch==
+
To format a partition do:
  
# Upgrade your system: {{Ic|pacman -Syu}}
+
# mkfs.ext4 /dev/''partition''
# Format the partition: {{Ic|mkfs.ext4 /dev/sdxY}} (replace {{Ic|sdxY}} with the device to format (e.g. {{Ic|sda1}}))
+
# Mount the partition
+
# Add an entry to {{ic|/etc/[[fstab]]}}, using the filesystem 'type' ext4
+
# Remember to set appropriate permissions using [[chmod]] if the drive is not writable
+
  
{{Tip|See the mkfs.ext4 man page for more options; edit {{ic|/etc/mke2fs.conf}} to view/configure default options.}}
+
{{Tip|See the mkfs.ext4 [[man page]] for more options; edit {{ic|/etc/mke2fs.conf}} to view/configure default options.}}
  
Be aware that by default, {{Ic|mkfs.ext4}} uses a rather low bytes-per-inode ratio to calculate the fixed amount of inodes to be created.
+
=== Bytes-per-inode ratio ===
  
{{Note|Especially for contemporary HDDs (750 GB+) this usually results in a much too large inode number and thus many likely wasted GB.
+
From {{ic|man mkfs.ext4}}:
  
The ratio can be set directly via the {{Ic|-i}} option; one of 6291456 resulted in 476928 inodes for a 2 TB partition. After three years, this author's root partition uses about 415 thousand inodes.}}
+
:'''''mke2fs''' creates an inode for every ''bytes-per-inode'' bytes of space on the disk. The larger the ''bytes-per-inode'' ratio, the fewer inodes will be created.''
  
==Migrating from ext3 to ext4==
+
Creating a new file, directory, symlink etc. requires at least one free [[Wikipedia:Inode|inode]]. If the inode count is too low, no file can be created on the filesystem even though there is still space left on it.
  
There are two ways of migrating partitions from ext3 to ext4:
+
Because neither the bytes-per-inode ratio nor the inode count can be changed after the filesystem is created, {{ic|mkfs.ext4}} defaults to a rather low ratio of one inode per 16384 bytes (16 KiB) to avoid this situation.
* mounting ext3 partitions as ext4 without converting (compatibility)
+
* converting ext3 partitions to ext4 (performance)
+
  
These two approaches are described below.
+
However, for partitions hundreds or thousands of GB in size with an average file size in the megabyte range, this usually creates far more inodes than needed, because the number of files never approaches the number of inodes.
  
===Mounting ext3 partitions as ext4 without converting===
+
This wastes disk space, because each unused inode still takes up 256 bytes on the filesystem (this is also set in {{ic|/etc/mke2fs.conf}} but should not be changed). At the default ratio the inode tables occupy 256/16384, or about 1.6%, of the filesystem: roughly 16 GB per TB of capacity.
 +
 
 +
This situation can be evaluated by comparing the {{ic|Use%}} and {{ic|IUse%}} figures provided by {{ic|df}} and {{ic|df -i}} respectively:
 +
 
 +
{{hc|$ df -h /home|
 +
Filesystem              Size    Used  Avail  '''Use%'''  Mounted on
 +
/dev/mapper/lvm-home    115G    56G    59G    '''49%'''    /home}}
 +
{{hc|$ df -hi /home|
 +
Filesystem              Inodes  IUsed  IFree  '''IUse%'''  Mounted on
 +
/dev/mapper/lvm-home    1.8M    1.1K  1.8M  '''1%'''    /home}}
 +
 
 +
To specify a different bytes-per-inode ratio, you can use the {{ic|-T ''usage-type''}} option which hints at the expected usage of the filesystem using types defined in {{ic|/etc/mke2fs.conf}}. Among those types are the bigger {{ic|largefile}} and {{ic|largefile4}} which offer more relevant ratios of one inode every 1 MiB and 4 MiB respectively. It can be used as such:
 +
 
 +
# mkfs.ext4 -T largefile /dev/''device''
 +
 
 +
The bytes-per-inode ratio can also be set directly via the {{ic|-i}} option: ''e.g.'' use {{ic|-i 2097152}} for a 2 MiB ratio and  {{ic|-i 6291456}} for a 6 MiB ratio.
 +
 
 +
{{Tip|Conversely, if you are setting up a partition dedicated to hosting millions of small files like emails or newsgroup items, you can use smaller ''usage-type'' values such as {{ic|news}} (one inode for every 4096 bytes) or {{ic|small}} (same plus smaller inode and block sizes).}}
 +
 
 +
{{Warning|If you make heavy use of symbolic links, keep the inode count high enough by choosing a low bytes-per-inode ratio: although a new symbolic link takes no extra space, it consumes one new inode, so the filesystem may run out of them quickly.}}
 +
 
 +
=== Reserved blocks ===
 +
 
 +
By default, 5% of the filesystem blocks will be reserved for the super-user, to avoid fragmentation and "''allow root-owned daemons to continue to function correctly after non-privileged processes are prevented from writing to the filesystem''" (from {{ic|man mkfs.ext4}}).
 +
 
 +
For modern high-capacity disks, this is higher than necessary if the partition is used as a long-term archive or not crucial to system operations (like {{ic|/home}}). See [http://www.redhat.com/archives/ext3-users/2009-January/msg00026.html this email] for the opinion of ext4 developer Ted Ts'o on reserved blocks.
 +
 
 +
It is generally safe to reduce the percentage of reserved blocks to free up disk space when the partition is either:
 +
 
 +
* Very large (for example > 50G)
 +
* Used as long-term archive, i.e., where files will not be deleted and created very often
 +
 
 +
The {{ic|-m}} option of ext4-related utilities lets you specify the percentage of reserved blocks.
 +
 
 +
To totally prevent reserving blocks upon filesystem creation, use:
 +
 
 +
# mkfs.ext4 -m 0 /dev/''device''
 +
 
 +
To reduce it to 1% afterwards, use:
 +
 
 +
# tune2fs -m 1 /dev/''device''
 +
 
 +
You can use {{ic|findmnt(8)}} to find the device name:
 +
 
 +
$ findmnt ''/the/mount/point''
 +
 
 +
==Migrating from ext2/ext3 to ext4==
 +
 
 +
===Mounting ext2/ext3 partitions as ext4 without converting===
  
 
====Rationale====
 
====Rationale====
  
A compromise between fully converting to ext4 and simply remaining with ext3 is to mount existing ext3 partitions as ext4.
+
A compromise between fully converting to ext4 and simply remaining with ext2/ext3 is to mount the partitions as ext4.
  
 
'''Pros:'''
 
'''Pros:'''
* Compatibility (the filesystem can continue to be mounted as ext3) – This allows users to still read the filesystem from other distributions/operating systems without ext4 support (e.g. Windows with ext3 drivers)
+
* Compatibility (the filesystem can continue to be mounted as ext3) – This allows users to still read the filesystem from other operating systems without ext4 support (e.g. Windows with ext2/ext3 drivers)
* Improved performance (though not as much as a fully-converted ext4 partition) – See [http://kernelnewbies.org/Ext4 Ext4 - Linux Kernel Newbies] for details
+
* Improved performance (though not as much as a fully-converted ext4 partition).[http://kernelnewbies.org/Ext4] [http://events.linuxfoundation.org/slides/2010/linuxcon_japan/linuxcon_jp2010_fujita.pdf]
  
 
'''Cons:'''
 
'''Cons:'''
Line 53: Line 101:
 
====Procedure====
 
====Procedure====
  
# Edit {{ic|/etc/fstab}} and change the 'type' from ext3 to ext4 for any partitions you would like to mount as ext4.
+
# Edit {{ic|/etc/fstab}} and change the 'type' from ext2/ext3 to ext4 for any partitions you would like to mount as ext4.
 
# Re-mount the affected partitions.
 
# Re-mount the affected partitions.
# Done.
 
  
===Converting ext3 partitions to ext4===
+
===Converting ext2/ext3 partitions to ext4===
  
 
====Rationale====
 
====Rationale====
Line 64: Line 111:
  
 
'''Pros:'''
 
'''Pros:'''
* Improved performance and new features – See [http://kernelnewbies.org/Ext4 Ext4 - Linux Kernel Newbies] for details
+
* Improved performance and new features.[http://kernelnewbies.org/Ext4] [http://events.linuxfoundation.org/slides/2010/linuxcon_japan/linuxcon_jp2010_fujita.pdf]
  
 
'''Cons:'''
 
'''Cons:'''
* Read-only access from Windows can be provided by Ext2Explore, but there is currently no driver for writing data.
+
* Partitions that contain mostly static files, such as a {{ic|/boot}} partition, may not benefit from the new features. Also, adding a journal (which is implied by moving an ext2 partition to ext3/4) always incurs performance overhead.
* Irreversible (ext4 partitions cannot be 'downgraded' to ext3)
+
* Irreversible: ext4 partitions cannot be 'downgraded' to ext2/ext3. The filesystem does, however, remain backwards compatible until extent and other ext4-only options are enabled.
  
 
====Procedure====
 
====Procedure====
  
These instructions were adapted from http://ext4.wiki.kernel.org/index.php/Ext4_Howto and https://bbs.archlinux.org/viewtopic.php?id=61602. They have been tested and confirmed by this author as of January 16, 2009.
+
These instructions were adapted from [http://ext4.wiki.kernel.org/index.php/Ext4_Howto Kernel documentation] and a [https://bbs.archlinux.org/viewtopic.php?id=61602 BBS thread].
  
* '''UPGRADE!''' Perform a sysupgrade to ensure all required packages are up-to-date: {{Ic|pacman -Syu}}
+
{{Warning|
* '''[[Backup programs|BACK-UP!]]''' Back-up all data on any ext3 partitions that are to be converted to ext4. Although ext4 is considered 'stable' for general use, it is still a relatively young and untested file system. Furthermore, this conversion process was only tested on a relatively simple setup; it is impossible to test each of the many possible configurations the user may be running.
+
* If you convert the system's root filesystem, ensure that the 'fallback' initramfs is available at reboot. Alternatively, add {{ic|ext4}} according to [[Mkinitcpio#MODULES]] and re-create the 'default' initial ramdisk with {{Ic|mkinitcpio -p linux}} before starting.
* Edit {{ic|/etc/fstab}} and change the 'type' from ext3 to ext4 for any partitions that are to be converted to ext4.
+
* If you decide to convert a separate {{ic|/boot}} partition, ensure the [[bootloader]] supports booting from ext4.}}
  
{{Warning|ext4 is backwards-compatible with ext3 until extents and other new fancy options are enabled. If the user has a partition that is shared with another OS that cannot yet read ext4 partitions, it is possible to mount said partition as ext4 in Arch and still be able to use it as ext3 elsewhere at this point... Not so after the next step! Note, however, that there are fewer benefits to using ext4 if the partition is not fully converted.}}
+
In the following steps {{Ic|/dev/sdxX}} denotes the path to the partition to be converted, such as {{Ic|/dev/sda1}}.
  
* The conversion process with {{Ic|e2fsprogs}} must be done when the drive is not mounted. If converting one's root (/) partition, the simplest way to achieve this is to boot from some other live medium, as described in the 'Prerequisites' section above.
+
# '''[[Backup programs|BACK-UP!]]''' Back up all data on any ext2/ext3 partitions that are to be converted to ext4. A useful tool, especially for root partitions, is [http://clonezilla.org Clonezilla].
** Boot the live medium (if necessary).
+
# Edit {{ic|/etc/fstab}} and change the 'type' from ext2/ext3 to ext4 for any partitions that are to be converted to ext4.
** For each partition to be converted to ext4:
+
# Boot the live medium (if necessary). The conversion process with {{Pkg|e2fsprogs}} must be done when the drive is not mounted. If converting a root partition, the simplest way to achieve this is to boot from some other live medium.
*** Ensure the partition is '''NOT''' mounted
+
# Ensure the partition is '''NOT''' mounted
*** Run {{Ic|tune2fs -O extents,uninit_bg,dir_index /dev/the_partition}} (where {{Ic|/dev/the_partition}} is replaced by the path to the desired partition, such as {{Ic|/dev/sda1}})
+
# If you want to convert an ext2 partition, the first conversion step is to add a [[File systems#Journaling|journal]] by running {{ic|tune2fs -j /dev/sdxX}} as root, making it an ext3 partition.
*** Run {{Ic|fsck -fDp /dev/the_partition}}
+
# Run {{Ic|tune2fs -O extent,uninit_bg,dir_index /dev/sdxX}} as root. This command converts the ext3 filesystem to ext4 (irreversibly).
 +
# Run {{Ic|fsck -f /dev/sdxX}} as root.
 +
#* The user '''must ''fsck''''' the filesystem, or it '''will be unreadable'''! This ''fsck'' run is needed to return the filesystem to a consistent state. It '''will''' find checksum errors in the group descriptors - this '''is''' expected. The {{ic|-f}} option asks ''fsck'' to force checking even if the file system seems clean. The {{ic|-p}} option may be used on top to 'automatically repair' (otherwise, the user will be asked for input for each error).
 +
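# Recommended: mount the partition and run {{Ic|e4defrag -c -v /dev/sdxX}} as root to report how fragmented it is. The {{ic|-c}} option only calculates a fragmentation score; run ''e4defrag'' without it to actually rewrite the files.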
 +
#* Even though the filesystem is now converted to ext4, all files that have been written before the conversion do not yet take advantage of the extent option of ext4, which will improve large file performance and reduce fragmentation and filesystem check time. In order to fully take advantage of ext4, all files would have to be rewritten on disk. Use ''e4defrag'' to take care of this problem.
 +
# Reboot Arch Linux!
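
The command sequence from the steps above can be previewed with a small dry-run sketch. The function name {{ic|print_conversion_steps}} is hypothetical, and the script only prints the commands taken from this page; it does not run them or touch any device.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print the conversion commands from the steps
# above for one partition, without executing any of them.
# usage: print_conversion_steps <device> <current type: ext2|ext3>
print_conversion_steps() {
    local dev=$1 fstype=$2
    if [ "$fstype" = "ext2" ]; then
        # ext2 has no journal yet, so one is added first (ext2 -> ext3)
        echo "tune2fs -j $dev"
    fi
    # Enabling these features is the irreversible step
    echo "tune2fs -O extent,uninit_bg,dir_index $dev"
    # The forced check is mandatory after the feature change
    echo "fsck -f $dev"
}

# /dev/sdxX is the placeholder used throughout this section
print_conversion_steps /dev/sdxX ext2
```

Printing the commands first makes it easy to confirm the device path before running each one by hand, as root, on an unmounted partition.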
  
{{Note|The user '''MUST''' fsck the filesystem, or it will be unreadable! This fsck run is needed to return the filesystem to a consistent state. '''It WILL find checksum errors in the group descriptors''' -- this is expected. The '-f' parameter asks fsck to force checking even if the file system seems clean. The '-p' parameter asks fsck to 'automatically repair' (otherwise, the user will be asked for input for each error).
+
== Using file-based encryption ==
You may need to run fsck -f rather than fsck -fp.}}
+
  
* Reboot Arch Linux!
+
Since Linux 4.1, ext4 supports file-based encryption.  In a directory tree marked for encryption, file contents, filenames, and symbolic link targets are all encrypted.  Encryption keys are stored in the kernel keyring.  See also [http://blog.quarkslab.com/a-glimpse-of-ext4-filesystem-level-encryption.html Quarkslab's blog] entry with a write-up of the feature, an overview of the implementation state, and practical test results with kernel 4.1.
  
{{Warning|1=If the user converted their root (/) partition, a kernel panic may be encountered when attempting to boot. If this happens, simply reboot using the 'fallback' initial ramdisk and re-create the 'default' initial ramdisk: {{Ic|mkinitcpio -p linux}}}}
+
Make sure you are using a kernel with the option {{ic|CONFIG_EXT4_ENCRYPTION}} enabled and have the {{Pkg|e2fsprogs}} package updated to at least version 1.43.
  
====Migrating files to extents====
+
Then verify that your filesystem is using a supported block size for encryption:
{{warning|Do '''NOT''' use the following method with Mercurial repository that have been cloned locally, as doing so will corrupt the repository. It might also corrupt other hard link in the filesystem.}}
+
Even though the filesystem is now converted to ext4, all files that have been written before the conversion do not yet take advantage of the new ''extents'' of ext4, which will improve large file performance and reduce fragmentation and filesystem check time. In order to fully take advantage of ext4, all files would have to be rewritten on disk. A utility called ''e4defrag'' is being developed and will take care of this task ; however, it is not yet ready for production.
+
  
Fortunately, it is possible to use the ''chattr'' program, which will cause the kernel to rewrite the file using extents. It is possible to run this command on all files and directories of one partition (e.g. if /home is on a dedicated partition):
+
{{hc|# tune2fs -l /dev/''device'' {{!}} grep 'Block size'|
(Must be run as root)
+
Block size:              4096
find /home -xdev -type f -print0 | xargs -0 chattr +e
+
}}
find /home -xdev -type d -print0 | xargs -0 chattr +e
+
It is recommended to test this command on a small number of files first, and check if everything is going all right. It may also be useful to check the filesystem after conversion.
+
  
Using the ''lsattr'' command, it is possible to check that files are now using ''extents''. The letter 'e' should appear in the attribute list of the listed files.
+
{{hc|# getconf PAGE_SIZE|
 +
4096
 +
}}
  
==Tips and tricks==
+
If these values are not the same, then your filesystem will not support encryption, so '''do not proceed further'''.
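
The comparison of the two values can also be scripted. The following is a minimal sketch; both function names are hypothetical, and the sample line fed to the parser stands in for real {{ic|tune2fs -l}} output.

```shell
#!/bin/bash
# Hypothetical helper: extract the block size from `tune2fs -l` output on stdin.
block_size_from_tune2fs() {
    awk -F: '/^Block size/ { print $2 + 0 }'
}

# Hypothetical helper: succeed only if the given filesystem block size equals
# the kernel page size reported by getconf.
block_size_matches_page_size() {
    local block_size=$1
    [ "$block_size" -eq "$(getconf PAGE_SIZE)" ]
}

# Sample input; on a real system use: tune2fs -l /dev/device | block_size_from_tune2fs
bs=$(printf 'Block size:               4096\n' | block_size_from_tune2fs)
if block_size_matches_page_size "$bs"; then
    echo "block size $bs matches the page size"
else
    echo "block size $bs differs from the page size: do not proceed"
fi
```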
===Remove reserved blocks===
+
  
By default 5% of a filesystem will be flagged as reserved for root user. For modern high-capacity disks, this is much higher than necessary - particularly if the partition is not being used for system files.  It is generally safe to reduce the percentage of reserved blocks to free up disk space when the partition is either
+
Next, enable the encryption feature flag on your filesystem:
  
*Very large (for example >50 G)
+
# tune2fs -O encrypt /dev/''device''
*Not being used for system files
+
  
Use the tune2fs utility to do this. The command below would set the percentage of reserved blocks on the partition /dev/sdXY to 1.0%:
+
{{Warning|Once the encryption feature flag is enabled, kernels older than 4.1 will be unable to mount the filesystem.}}
  
tune2fs -m 1.0 /dev/sdXY
+
Next, make a directory to encrypt:
  
If you need to know your drives labels type the following:
+
# mkdir /encrypted
  
  df -T | awk '{print $1,$2,$NF}' | grep "^/dev"
+
Note that encryption can only be applied to an empty directory. The encryption setting (or "encryption policy") is inherited by new files and subdirectories.  Encrypting existing files is not yet supported.
  
==Troubleshooting==
+
Now generate and add a new key to your keyring.  This step must be repeated every time you flush your keyring (reboot):
  
===Data corruption===
+
# e4crypt add_key
Some early adopters of ext4 encountered data corruption after a hard reboot. Please read [http://www.h-online.com/open/Ext4-data-loss-explanations-and-workarounds--/news/112892 Ext4 data loss; explanations and workarounds] for more information.
+
Enter passphrase (echo disabled):  
 +
Added key with descriptor [f88747555a6115f5]
  
Since kernel 2.6.30, ext4 is considered "safe(r)." Several patches improved the robustness of ext4 - albeit at a slight performance cost. A new mount option ({{Ic|auto_da_alloc}}) can be used to disable this behavior. For more information, please read [http://kernelnewbies.org/Linux_2_6_30#head-329ba44b44a7f58c98ae22b8f2730418cdd6630d Linux 2 6 30 - Filesystems performance improvements].
+
{{Warning|If you forget your passphrase, there will be no way to decrypt your files!  It also isn't yet possible to change a passphrase after you've set it.}}
  
For kernel versions earlier than 2.6.30, consider adding {{Ic|1=rootflags=data=ordered}} to the {{Ic|kernel}} line in GRUB's {{ic|menu.lst}} as a preventative measure.
+
{{Note|To help prevent [[Wikipedia:Dictionary_attack|dictionary attacks]] on your passphrase, a random [[Wikipedia:Salt_(cryptography)|salt]] is automatically generated and stored in the ext4 filesystem superblock. Both the passphrase ''and'' the salt are used to derive the actual encryption key. As a consequence of this, if you have multiple ext4 filesystems with encryption enabled mounted, then {{ic|e4crypt add_key}} will actually add multiple keys, one per filesystem.  Although any key can be used on any filesystem, it would be wise to only use, on a given filesystem, keys using that filesystem's salt.  Otherwise, you risk being unable to decrypt files on filesystem A if filesystem B is unmounted.  Alternatively, you can use the {{ic|-S}} option to {{ic|e4crypt add_key}} to specify a salt yourself.}}
  
=== Barriers and Performance ===
+
Now you know the descriptor for your key.  Make sure the key is in your session keyring:
  
Since kernel 2.6.30, ext4 performance has decreased due to changes that serve to improve data integrity [http://www.phoronix.com/scan.php?page=article&item=ext4_then_now&num=1].
+
# keyctl show
 +
Session Keyring
 +
1021618178 --alswrv  1000  1000  keyring: _ses
 +
  176349519 --alsw-v  1000  1000  \_ logon: ext4:f88747555a6115f5
  
''Most file systems (XFS, ext3, ext4, reiserfs) send write barriers to disk after fsync or during transaction commits. Write barriers enforce proper ordering of writes, making volatile disk write caches safe to use (at some performance penalty). If your disks are battery-backed in one way or another, disabling barriers may safely improve performance.''
+
Almost done. Now set an encryption policy on the directory (assign the key to it):
  
''Sending write barriers can be disabled using the barrier=0 mount option (for ext3, ext4, and reiserfs), or using the nobarrier mount option (for XFS)'' [http://doc.opensuse.org/products/draft/SLES/SLES-tuning_sd_draft/cha.tuning.io.html].
+
# e4crypt set_policy f88747555a6115f5 /encrypted
 +
 
 +
That is all. If you try accessing the directory without adding the key into your keyring, filenames and their contents will be seen as encrypted gibberish.
 +
 
 +
== Tips and tricks ==
 +
 
 +
=== E4rat ===
 +
 
 +
[[E4rat]] is a preload application designed for the ext4 filesystem. It monitors files opened during boot, optimizes their placement on the partition to improve access time, and preloads them at the very beginning of the boot process. [[E4rat]] does not offer improvements with [[SSD]]s, whose access time is negligible compared to hard disks.
 +
 
 +
=== Barriers and performance ===
 +
 
 +
Since kernel 2.6.30, ext4 performance has decreased due to changes that serve to improve data integrity.[http://www.phoronix.com/scan.php?page=article&item=ext4_then_now&num=1]
 +
 
 +
:Most file systems (XFS, ext3, ext4, reiserfs) send write barriers to disk after fsync or during transaction commits. Write barriers enforce proper ordering of writes, making volatile disk write caches safe to use (at some performance penalty). If your disks are battery-backed in one way or another, disabling barriers may safely improve performance.
 +
 
 +
:Sending write barriers can be disabled using the {{Ic|1=barrier=0}} mount option (for ext3, ext4, and reiserfs), or using the {{Ic|1=nobarrier}} mount option (for XFS).
  
 
{{Warning|Disabling barriers when disks cannot guarantee caches are properly written in case of power failure can lead to severe file system corruption and data loss.}}
 
{{Warning|Disabling barriers when disks cannot guarantee caches are properly written in case of power failure can lead to severe file system corruption and data loss.}}
  
To turn barriers off add the option {{Ic|1=barrier=0}} to the desired filesystem in {{ic|/etc/fstab}}. For example:
+
To turn barriers off add the option {{Ic|1=barrier=0}} to the desired filesystem. For example:
 +
 
 +
{{hc|/etc/fstab|2=
 +
/dev/sda5    /    ext4    noatime,barrier=0    0    1
 +
}}
 +
 
 +
==Enabling metadata checksums==
 +
 
 +
In both cases of enabling metadata checksums for new and existing filesystems, you will need to load some kernel modules.
 +
 
 +
If your CPU supports SSE 4.2, make sure the {{Ic|crc32c_intel}} kernel module is loaded in order to enable the hardware accelerated CRC32C algorithm. If not you will need to load the {{Ic|crc32c_generic}} module.
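
As a rough sketch of the check described above, the CPU flags in {{ic|/proc/cpuinfo}} can be inspected for {{ic|sse4_2}}, which is how Linux reports SSE 4.2 support. The function name is hypothetical; it only suggests a module name and loads nothing.

```shell
#!/bin/bash
# Hypothetical helper: read /proc/cpuinfo-style text on stdin and print the
# CRC32C module suggested above (crc32c_intel if SSE 4.2 is advertised).
suggest_crc32c_module() {
    if grep -qw 'sse4_2'; then
        echo crc32c_intel
    else
        echo crc32c_generic
    fi
}

# Only query the running system when /proc/cpuinfo is readable
[ -r /proc/cpuinfo ] && suggest_crc32c_module < /proc/cpuinfo
```

The suggested module can then be loaded with {{ic|modprobe}} as root.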
 +
 
 +
If this is the root file-system, add {{Ic|crypto-crc32c}} module (an [[Kernel modules#Obtaining information|alias]] to all CRC32C modules) to {{Ic|/etc/mkinitcpio.conf}}:
 +
 
 +
MODULES="... crypto-crc32c"
 +
 
 +
And then regenerate the initramfs. See [[Mkinitcpio#Image creation and activation]].
 +
 
 +
After this, you are ready to enable support for metadata checksums as described in the following two sections. In both cases the file system must not be mounted.
 +
 
 +
More about metadata checksums can be read on the [https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums ext4 wiki].
 +
 
 +
=== New filesystem ===
 +
 
 +
To enable support for ext4 metadata checksums on a new file system make sure that you have {{Ic|e2fsprogs 1.43}} or newer and simply do a:
 +
 
 +
# mkfs.ext4 ''/dev/path/to/disk''
 +
 
 +
The {{Ic|metadata_csum}} and {{Ic|64bit}} options will be enabled by default.
 +
 
 +
The file-system can then be mounted as usual.
 +
 
 +
=== Existing filesystem ===
 +
 
 +
To enable support on an existing ext4 file system do the following.
 +
 
 +
This needs to be done with the partition unmounted, so if you want to convert the root filesystem, you will need to boot from a USB live distribution.
 +
 
 +
First the partition needs to be checked and optimized using:
 +
 
 +
# e2fsck -Df ''/dev/path/to/disk'' 
 +
 
 +
Then the file-system needs to be converted to 64bit:
 +
 
 +
# resize2fs -b ''/dev/path/to/disk''
 +
 
 +
Finally, checksums can be added:
 +
 
 +
# tune2fs -O metadata_csum ''/dev/path/to/disk''
 +
 
 +
The file-system can then be mounted as usual.
 +
 
 +
You can check whether the features were successfully enabled by running:
 +
 
 +
# dumpe2fs ''/dev/path/to/disk''
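
Since the full {{ic|dumpe2fs}} output is long, the check can be narrowed to the feature list. This is a minimal sketch: the function name {{ic|has_features}} is hypothetical, and it assumes the superblock summary contains a {{ic|Filesystem features:}} line, as printed by {{ic|dumpe2fs -h}}.

```shell
#!/bin/bash
# Hypothetical helper: read `dumpe2fs -h` output on stdin and succeed only if
# every feature named as an argument appears in "Filesystem features".
has_features() {
    local features f
    features=$(awk -F: '/^Filesystem features/ { print $2 }')
    for f in "$@"; do
        case " $features " in
            *" $f "*) ;;                              # feature is listed
            *) echo "missing: $f"; return 1 ;;
        esac
    done
    echo "all present: $*"
}

# Sample input; on a real system use:
#   dumpe2fs -h /dev/path/to/disk | has_features metadata_csum 64bit
has_features metadata_csum 64bit <<'EOF'
Filesystem features:      has_journal ext_attr dir_index extent 64bit flex_bg metadata_csum
EOF
```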
 +
 
 +
=== Impact on performance ===
  
# /dev/sda5    /    ext4   noatime,barrier=0    0    1
+
Keep in mind that the intel module consistently performs 10x faster than the generic one, peaking at 20x faster as can be seen in [https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums#Benchmarking this benchmark].
  
==== E4rat ====
+
== See also ==
  
[[E4rat]] is a preload application designed for the ext4 filesystem. It monitors files opened during boot, optimizes their placement on the partition to improve access time, and preloads them at the very beginning of the boot process.
+
* [https://ext4.wiki.kernel.org/ Official Ext4 wiki]
 +
* [https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout Ext4 Disk Layout] described in its wiki
 +
* [http://lwn.net/Articles/639427/ Ext4 Encryption] LWN article
 +
* Kernel commits for ext4 encryption [https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6162e4b0bedeb3dac2ba0a5e1b1f56db107d97ec] [https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8663da2c0919896788321cd8a0016af08588c656]
 +
* [http://e2fsprogs.sourceforge.net/e2fsprogs-release.html e2fsprogs Changelog]
 +
* [https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums Ext4 Metadata Checksums]

Latest revision as of 10:51, 30 November 2016


From Ext4 - Linux Kernel Newbies:

Ext4 is the evolution of the most used Linux filesystem, Ext3. In many ways, Ext4 is a deeper improvement over Ext3 than Ext3 was over Ext2. Ext3 was mostly about adding journaling to Ext2, but Ext4 modifies important data structures of the filesystem such as the ones destined to store the file data. The result is a filesystem with an improved design, better performance, reliability, and features.

Create a new ext4 filesystem

To format a partition do:

# mkfs.ext4 /dev/partition
Tip: See the mkfs.ext4 man page for more options; edit /etc/mke2fs.conf to view/configure default options.

Bytes-per-inode ratio

From man mkfs.ext4:

mke2fs creates an inode for every bytes-per-inode bytes of space on the disk. The larger the bytes-per-inode ratio, the fewer inodes will be created.

Creating a new file, directory, symlink etc. requires at least one free inode. If the inode count is too low, no file can be created on the filesystem even though there is still space left on it.

Because neither the bytes-per-inode ratio nor the inode count can be changed after the filesystem is created, mkfs.ext4 defaults to a rather low ratio of one inode per 16384 bytes (16 KiB) to avoid this situation.

However, for partitions hundreds or thousands of GB in size with an average file size in the megabyte range, this usually creates far more inodes than needed, because the number of files never approaches the number of inodes.

This wastes disk space, because each unused inode still takes up 256 bytes on the filesystem (this is also set in /etc/mke2fs.conf but should not be changed). At the default ratio the inode tables occupy 256/16384, or about 1.6%, of the filesystem: roughly 16 GB per TB of capacity.
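
As a rough illustration of this arithmetic, the following sketch estimates how many inodes would be created and how much space their tables occupy. The function name and the 2 TiB example size are assumptions made for illustration; the count actually chosen by mke2fs is rounded per block group and may differ slightly.

```shell
#!/bin/bash
# Hypothetical helper: estimate inode count and inode-table size.
# usage: inode_overhead <filesystem size in bytes> [bytes-per-inode] [inode size]
inode_overhead() {
    local fs_bytes=$1
    local ratio=${2:-16384}      # default bytes-per-inode ratio quoted above
    local inode_size=${3:-256}   # default inode size quoted above
    local inodes=$(( fs_bytes / ratio ))
    local table_mib=$(( inodes * inode_size / 1024 / 1024 ))
    echo "inodes=${inodes} table_mib=${table_mib}"
}

# 2 TiB filesystem with the default ratio, then with a 4 MiB ratio
inode_overhead $(( 2 * 1024 ** 4 ))
inode_overhead $(( 2 * 1024 ** 4 )) $(( 4 * 1024 * 1024 ))
```

For the 2 TiB example this gives 32768 MiB (32 GiB) of inode tables at the default ratio, against 128 MiB at a 4 MiB ratio.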

This situation can be evaluated by comparing the Use% and IUse% figures provided by df and df -i respectively:

$ df -h /home
Filesystem              Size    Used   Avail  Use%   Mounted on
/dev/mapper/lvm-home    115G    56G    59G    49%    /home
$ df -hi /home
Filesystem              Inodes  IUsed  IFree  IUse%  Mounted on
/dev/mapper/lvm-home    1.8M    1.1K   1.8M   1%     /home
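
Both percentages can also be printed on one line. The sketch below assumes GNU df from coreutils, whose --output option accepts the pcent and ipcent fields; the function name is hypothetical. Some filesystems do not report inode usage, in which case df prints a dash for IUse%.

```shell
#!/bin/bash
# Hypothetical helper: print block usage and inode usage side by side.
# Relies on the --output option of GNU df (coreutils).
usage_summary() {
    local mountpoint=${1:-/}
    # Drop the header line and keep only the values for the mount point
    df --output=target,pcent,ipcent "$mountpoint" | tail -n 1
}

usage_summary /
```

A Use% far above IUse%, as in the /home example above, suggests that a larger bytes-per-inode ratio would have been sufficient.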

To specify a different bytes-per-inode ratio, you can use the -T usage-type option which hints at the expected usage of the filesystem using types defined in /etc/mke2fs.conf. Among those types are the bigger largefile and largefile4 which offer more relevant ratios of one inode every 1 MiB and 4 MiB respectively. It can be used as such:

# mkfs.ext4 -T largefile /dev/device

The bytes-per-inode ratio can also be set directly via the -i option: e.g. use -i 2097152 for a 2 MiB ratio and -i 6291456 for a 6 MiB ratio.
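
The byte values passed to -i are simply the desired ratio in MiB multiplied by 1024 * 1024. A small sketch of that conversion follows; the function name is hypothetical, and the mkfs.ext4 command is only printed, not executed.

```shell
#!/bin/bash
# Hypothetical helper: convert a ratio in MiB to the byte value expected by -i.
mib_ratio_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_ratio_to_bytes 2   # 2097152
mib_ratio_to_bytes 6   # 6291456

# Print (do not run) the resulting command; /dev/device is a placeholder.
echo "mkfs.ext4 -i $(mib_ratio_to_bytes 2) /dev/device"
```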

Tip: Conversely, if you are setting up a partition dedicated to hosting millions of small files like emails or newsgroup items, you can use smaller usage-type values such as news (one inode for every 4096 bytes) or small (same plus smaller inode and block sizes).
Warning: If you make heavy use of symbolic links, keep the inode count high enough by choosing a low bytes-per-inode ratio: although a new symbolic link takes no extra space, it consumes one new inode, so the filesystem may run out of them quickly.

Reserved blocks

By default, 5% of the filesystem blocks will be reserved for the super-user, to avoid fragmentation and "allow root-owned daemons to continue to function correctly after non-privileged processes are prevented from writing to the filesystem" (from man mkfs.ext4).

For modern high-capacity disks, this is higher than necessary if the partition is used as a long-term archive or not crucial to system operations (like /home). See this email for the opinion of ext4 developer Ted Ts'o on reserved blocks.

It is generally safe to reduce the percentage of reserved blocks to free up disk space when the partition is either:

  • Very large (for example > 50G)
  • Used as long-term archive, i.e., where files will not be deleted and created very often

The -m option of ext4-related utilities lets you specify the percentage of reserved blocks.

To totally prevent reserving blocks upon filesystem creation, use:

# mkfs.ext4 -m 0 /dev/device

To reduce it to 1% afterwards, use:

# tune2fs -m 1 /dev/device

You can use findmnt(8) to find the device name:

$ findmnt /the/mount/point
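
To see how much space the reservation currently takes, the superblock summary printed by tune2fs -l can be parsed. The following is a minimal sketch: the function name is hypothetical, and the sample figures are made up for illustration (a filesystem of about 115 GiB with the default 5% reserved).

```shell
#!/bin/bash
# Hypothetical helper: read `tune2fs -l` output on stdin and report how much
# space is reserved, using the "Reserved block count", "Block count" and
# "Block size" fields.
reserved_space() {
    awk -F: '
        /^Reserved block count/ { reserved = $2 + 0 }
        /^Block count/          { total = $2 + 0 }
        /^Block size/           { bsize = $2 + 0 }
        END {
            if (total == 0 || bsize == 0) exit 1
            printf "%d MiB reserved (%.1f%%)\n", reserved * bsize / 1048576, 100 * reserved / total
        }'
}

# Sample input; on a real system use: tune2fs -l /dev/device | reserved_space
reserved_space <<'EOF'
Block count:              30146560
Reserved block count:     1507328
Block size:               4096
EOF
```

With the sample input this reports 5888 MiB reserved (5.0%).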

Migrating from ext2/ext3 to ext4

Mounting ext2/ext3 partitions as ext4 without converting

Rationale

A compromise between fully converting to ext4 and simply remaining with ext2/ext3 is to mount the partitions as ext4.

Pros:

  • Compatibility (the filesystem can continue to be mounted as ext3) – This allows users to still read the filesystem from other operating systems without ext4 support (e.g. Windows with ext2/ext3 drivers)
  • Improved performance (though not as much as a fully-converted ext4 partition).[1] [2]

Cons:

  • Fewer features of ext4 are used (only those that do not change the disk format such as multiblock allocation and delayed allocation)
Note: Except for the relative novelty of ext4 (which can be seen as a risk), there is no major drawback to this technique.

Procedure

  1. Edit /etc/fstab and change the 'type' from ext2/ext3 to ext4 for any partitions you would like to mount as ext4.
  2. Re-mount the affected partitions.
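The edit in step 1 can also be scripted. The following is a minimal sketch, assuming a standard whitespace-separated fstab; the function name fstab_to_ext4 is hypothetical, and the output should be reviewed before it replaces /etc/fstab.

```shell
# Print an fstab with the type of one mount point switched to ext4.
# Reads fstab text on stdin; only ext2/ext3 entries are changed.
# Note: awk rebuilds the edited line with single spaces between fields.
fstab_to_ext4() {
    awk -v mp="$1" '
        $1 !~ /^#/ && $2 == mp && ($3 == "ext2" || $3 == "ext3") { $3 = "ext4" }
        { print }
    '
}

# Usage: write the result to a scratch file and review it first:
#   fstab_to_ext4 /home < /etc/fstab > /tmp/fstab.new
```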

Converting ext2/ext3 partitions to ext4

Rationale

To gain the full benefits of ext4, an irreversible conversion process must be completed.

Pros:

  • Improved performance and new features.[3] [4]

Cons:

  • Partitions that contain mostly static files, such as a /boot partition, may not benefit from the new features. Also, adding a journal (which is implied by moving an ext2 partition to ext3/4) always incurs performance overhead.
  • Irreversible: ext4 partitions cannot be 'downgraded' to ext2/ext3. The filesystem does, however, remain backwards compatible until extent and other ext4-only options are enabled.

Procedure

These instructions were adapted from the Kernel documentation and a BBS thread.

Warning:
  • If you convert the system's root filesystem, ensure that the 'fallback' initramfs is available at reboot. Alternatively, add ext4 according to Mkinitcpio#MODULES and re-create the 'default' initial ramdisk with mkinitcpio -p linux before starting.
  • If you decide to convert a separate /boot partition, ensure the bootloader supports booting from ext4.

In the following steps /dev/sdxX denotes the path to the partition to be converted, such as /dev/sda1.

  1. Back up! Back up all data on any ext3 partitions that are to be converted to ext4. A useful tool, especially for root partitions, is Clonezilla.
  2. Edit /etc/fstab and change the 'type' from ext3 to ext4 for any partitions that are to be converted to ext4.
  3. Boot the live medium (if necessary). The conversion process with e2fsprogs must be done when the drive is not mounted. If converting a root partition, the simplest way to achieve this is to boot from some other live medium.
  4. Ensure the partition is NOT mounted.
  5. If you want to convert an ext2 partition, the first conversion step is to add a journal by running tune2fs -j /dev/sdxX as root, which makes it an ext3 partition.
  6. Run tune2fs -O extent,uninit_bg,dir_index /dev/sdxX as root. This command converts the ext3 filesystem to ext4 (irreversibly).
  7. Run fsck -f /dev/sdxX as root.
    • You must fsck the filesystem, or it will be unreadable! This fsck run is needed to return the filesystem to a consistent state. It will find checksum errors in the group descriptors; this is expected. The -f option forces a check even if the filesystem seems clean. The -p option may be added to repair automatically (otherwise, you will be asked for input for each error).
  8. Recommended: mount the partition and run e4defrag -c -v /dev/sdxX as root.
    • Even though the filesystem is now converted to ext4, files written before the conversion do not yet use the extent feature, which improves large-file performance and reduces fragmentation and filesystem check time. To take full advantage of ext4, all files have to be rewritten on disk, and e4defrag can do this. Note that the -c option only reports the current fragmentation and does not move any data; run e4defrag without -c to actually rewrite the files.
  9. Reboot Arch Linux!
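The tune2fs and fsck steps above can be summarised as a dry run. The following is a minimal sketch: the function name ext4_conversion_plan is hypothetical, and it only prints the commands from steps 5 to 7 for review instead of executing them. The backup, fstab, and unmount steps still have to be done by hand.

```shell
# Print (do not run) the conversion commands for one partition.
# $1: partition path, $2: current type (ext2 or ext3).
ext4_conversion_plan() {
    dev=$1
    fstype=$2
    if [ "$fstype" = "ext2" ]; then
        echo "tune2fs -j $dev"      # add a journal first (ext2 -> ext3)
    fi
    echo "tune2fs -O extent,uninit_bg,dir_index $dev"
    echo "fsck -f $dev"
}

ext4_conversion_plan /dev/sdxX ext2
```

Printing the plan first makes it easier to confirm the device path before running anything irreversible as root.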

Using file-based encryption

Since Linux 4.1, ext4 supports file-based encryption. In a directory tree marked for encryption, file contents, filenames, and symbolic link targets are all encrypted. Encryption keys are stored in the kernel keyring. See also Quarkslab's blog entry with a write-up of the feature, an overview of the implementation state, and practical test results with kernel 4.1.

Make sure you are using a kernel with the option CONFIG_EXT4_ENCRYPTION enabled and have the e2fsprogs package updated to at least version 1.43.

Then verify that your filesystem is using a supported block size for encryption:

# tune2fs -l /dev/device | grep 'Block size'
Block size:               4096
# getconf PAGE_SIZE
4096

If these values are not the same, then your filesystem will not support encryption, so do not proceed further.
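The comparison can be automated. The following is a minimal sketch that assumes the "Block size" field of tune2fs -l shown above; the function name block_size_matches_page is hypothetical.

```shell
# Succeeds if the filesystem block size (from `tune2fs -l` output on
# stdin) equals the page size given as $1.
block_size_matches_page() {
    bs=$(awk -F: '/^Block size/ { print $2 + 0 }')
    [ "$bs" = "$1" ]
}

# Usage (as root; /dev/device is a placeholder):
#   tune2fs -l /dev/device | block_size_matches_page "$(getconf PAGE_SIZE)" \
#       && echo "block size OK for encryption"
```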

Next, enable the encryption feature flag on your filesystem:

# tune2fs -O encrypt /dev/device
Warning: Once the encryption feature flag is enabled, kernels older than 4.1 will be unable to mount the filesystem.

Next, make a directory to encrypt:

# mkdir /encrypted

Note that encryption can only be applied to an empty directory. The encryption setting (or "encryption policy") is inherited by new files and subdirectories. Encrypting existing files is not yet supported.

Now generate and add a new key to your keyring. This step must be repeated every time you flush your keyring (reboot):

# e4crypt add_key
Enter passphrase (echo disabled): 
Added key with descriptor [f88747555a6115f5]
Warning: If you forget your passphrase, there will be no way to decrypt your files! It is also not yet possible to change a passphrase after you have set it.
Note: To help prevent dictionary attacks on your passphrase, a random salt is automatically generated and stored in the ext4 filesystem superblock. Both the passphrase and the salt are used to derive the actual encryption key. As a consequence of this, if you have multiple ext4 filesystems with encryption enabled mounted, then e4crypt add_key will actually add multiple keys, one per filesystem. Although any key can be used on any filesystem, it would be wise to only use, on a given filesystem, keys using that filesystem's salt. Otherwise, you risk being unable to decrypt files on filesystem A if filesystem B is unmounted. Alternatively, you can use the -S option to e4crypt add_key to specify a salt yourself.

Now you know the descriptor for your key. Make sure the key is in your session keyring:

# keyctl show
Session Keyring
1021618178 --alswrv   1000  1000  keyring: _ses
 176349519 --alsw-v   1000  1000   \_ logon: ext4:f88747555a6115f5

Almost done. Now set an encryption policy on the directory (assign the key to it):

# e4crypt set_policy f88747555a6115f5 /encrypted

That is all. If you try accessing the directory without adding the key into your keyring, filenames and their contents will be seen as encrypted gibberish.

Tips and tricks

E4rat

E4rat is a preload application designed for the ext4 filesystem. It monitors files opened during boot, optimizes their placement on the partition to improve access time, and preloads them at the very beginning of the boot process. E4rat does not offer improvements with SSDs, whose access time is negligible compared to hard disks.

Barriers and performance

Since kernel 2.6.30, ext4 performance has decreased due to changes that serve to improve data integrity.[5]

Most file systems (XFS, ext3, ext4, reiserfs) send write barriers to disk after fsync or during transaction commits. Write barriers enforce proper ordering of writes, making volatile disk write caches safe to use (at some performance penalty). If your disks are battery-backed in one way or another, disabling barriers may safely improve performance.
Sending write barriers can be disabled using the barrier=0 mount option (for ext3, ext4, and reiserfs), or using the nobarrier mount option (for XFS).
Warning: Disabling barriers when disks cannot guarantee caches are properly written in case of power failure can lead to severe file system corruption and data loss.

To turn barriers off, add the barrier=0 mount option to the entry of the desired filesystem. For example:

/etc/fstab
/dev/sda5    /    ext4    noatime,barrier=0    0    1

Enabling metadata checksums

Whether you enable metadata checksums on a new or an existing filesystem, you will need to load some kernel modules.

If your CPU supports SSE 4.2, make sure the crc32c_intel kernel module is loaded in order to enable the hardware-accelerated CRC32C algorithm. If not, you will need to load the crc32c_generic module.
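Whether the CPU supports SSE 4.2 can be read from the flags line of /proc/cpuinfo, where the feature appears as sse4_2. The following is a minimal sketch; the function name crc32c_module_for is hypothetical, and the module names are the ones given above.

```shell
# Pick the CRC32C module name from a CPU flags line.
# $1: the contents of a "flags" line from /proc/cpuinfo.
crc32c_module_for() {
    case " $1 " in
        *" sse4_2 "*) echo crc32c_intel ;;
        *)            echo crc32c_generic ;;
    esac
}

# Usage on the running system:
#   crc32c_module_for "$(grep -m1 '^flags' /proc/cpuinfo)"
```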

If this is the root filesystem, add the crypto-crc32c module (an alias for all CRC32C modules) to /etc/mkinitcpio.conf:

MODULES="... crypto-crc32c"

And then regenerate the initramfs. See Mkinitcpio#Image creation and activation.

After this, you are ready to enable support for metadata checksums as described in the following two sections. In both cases the file system must not be mounted.

More about metadata checksums can be read on the ext4 wiki.

New filesystem

To enable support for ext4 metadata checksums on a new filesystem, make sure that you have e2fsprogs 1.43 or newer and run:

# mkfs.ext4 /dev/path/to/disk

The metadata_csum and 64bit options will be enabled by default.

The file-system can then be mounted as usual.

Existing filesystem

To enable support on an existing ext4 file system do the following.

This needs to be done with the partition unmounted, so if you want to convert the root partition, you will need to boot from a USB live system.

First the partition needs to be checked and optimized using:

# e2fsck -Df /dev/path/to/disk  

Then the file-system needs to be converted to 64bit:

# resize2fs -b /dev/path/to/disk 

Finally, checksums can be added:

# tune2fs -O metadata_csum /dev/path/to/disk

The file-system can then be mounted as usual.

You can check whether the features were successfully enabled by running:

# dumpe2fs /dev/path/to/disk
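The full dumpe2fs output is long, so it can be easier to test for a single feature. The following is a minimal sketch that assumes the "Filesystem features:" line of the dumpe2fs superblock summary; the function name has_feature is hypothetical.

```shell
# Succeeds if the feature named in $1 appears on the
# "Filesystem features:" line of dumpe2fs output read from stdin.
has_feature() {
    awk -v f="$1" '
        /^Filesystem features:/ {
            for (i = 3; i <= NF; i++) if ($i == f) found = 1
        }
        END { exit !found }
    '
}

# Usage (as root; -h limits the output to the superblock information):
#   dumpe2fs -h /dev/path/to/disk 2>/dev/null | has_feature metadata_csum \
#       && echo "metadata_csum enabled"
```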

Impact on performance

Keep in mind that the crc32c_intel module consistently performs about 10 times faster than the generic one, peaking at 20 times faster, as can be seen in this benchmark.

See also