Difference between revisions of "Ext4"

From ArchWiki
(kernel modules for csums)
m (it was unclear whether this command was to be issued after the previous, or after someone had created a file system with their own parameters. By making this change, the previous percent of reserved blocks is not assumed.)
 
(115 intermediate revisions by 24 users not shown)
Line 6: Line 6:
 
[[it:Ext4]]
 
[[it:Ext4]]
 
[[ja:Ext4]]
 
[[ja:Ext4]]
 +
[[pt:Ext4]]
 
[[ru:Ext4]]
 
[[ru:Ext4]]
[[tr:Ext4]]
+
[[zh-hans:Ext4]]
[[zh-CN:Ext4]]
 
 
{{Related articles start}}
 
{{Related articles start}}
 
{{Related|File systems}}
 
{{Related|File systems}}
Line 22: Line 22:
 
  # mkfs.ext4 /dev/''partition''
 
  # mkfs.ext4 /dev/''partition''
  
{{Tip|See the mkfs.ext4 [[man page]] for more options; edit {{ic|/etc/mke2fs.conf}} to view/configure default options.}}
+
{{Tip|
 +
* See {{man|8|mke2fs}} for more options; edit {{ic|/etc/mke2fs.conf}} to view/configure default options.
 +
* If supported, you may want to enable [[#Enabling metadata checksums|metadata checksums]].
 +
}}
  
 
=== Bytes-per-inode ratio ===
 
=== Bytes-per-inode ratio ===
  
From {{ic|man mkfs.ext4}}:
+
From {{man|8|mke2fs}}:
  
 
:'''''mke2fs''' creates an inode for every ''bytes-per-inode'' bytes of space on the disk. The larger the ''bytes-per-inode'' ratio, the fewer inodes will be created.''
 
:'''''mke2fs''' creates an inode for every ''bytes-per-inode'' bytes of space on the disk. The larger the ''bytes-per-inode'' ratio, the fewer inodes will be created.''
Line 32: Line 35:
 
Creating a new file, directory, symlink etc. requires at least one free [[Wikipedia:Inode|inode]]. If the inode count is too low, no file can be created on the filesystem even though there is still space left on it.  
 
Creating a new file, directory, symlink etc. requires at least one free [[Wikipedia:Inode|inode]]. If the inode count is too low, no file can be created on the filesystem even though there is still space left on it.  
  
Because it is not possible to change either the bytes-per-inode ratio or the inode count after the filesystem is created, {{ic|mkfs.ext4}} uses by default a rather low ratio of one inode every 16384 bytes (16 Kb) to avoid this situation.
+
Because it is not possible to change either the bytes-per-inode ratio or the inode count after the filesystem is created, {{ic|mkfs.ext4}} uses by default a rather low ratio of one inode every 16384 bytes (16 KiB) to avoid this situation.
  
 
However, for partitions with size in the hundreds or thousands of GB and average file size in the megabyte range, this usually results in a much too large inode number because the number of files created never reaches the number of inodes.
 
However, for partitions with size in the hundreds or thousands of GB and average file size in the megabyte range, this usually results in a much too large inode number because the number of files created never reaches the number of inodes.
Line 42: Line 45:
 
{{hc|$ df -h /home|
 
{{hc|$ df -h /home|
 
Filesystem              Size    Used  Avail  '''Use%'''  Mounted on
 
Filesystem              Size    Used  Avail  '''Use%'''  Mounted on
/dev/mapper/lvm-home    115G    56G    59G    '''49%'''    /home}}
+
/dev/mapper/lvm-home    115G    56G    59G    '''49%'''    /home
 +
}}
 +
 
 
{{hc|$ df -hi /home|
 
{{hc|$ df -hi /home|
 
Filesystem              Inodes  IUsed  IFree  '''IUse%'''  Mounted on
 
Filesystem              Inodes  IUsed  IFree  '''IUse%'''  Mounted on
/dev/mapper/lvm-home    1.8M    1.1K  1.8M  '''1%'''    /home}}
+
/dev/mapper/lvm-home    1.8M    1.1K  1.8M  '''1%'''    /home
 +
}}
  
 
To specify a different bytes-per-inode ratio, you can use the {{ic|-T ''usage-type''}} option which hints at the expected usage of the filesystem using types defined in {{ic|/etc/mke2fs.conf}}. Among those types are the bigger {{ic|largefile}} and {{ic|largefile4}} which offer more relevant ratios of one inode every 1 MiB and 4 MiB respectively. It can be used as such:
 
To specify a different bytes-per-inode ratio, you can use the {{ic|-T ''usage-type''}} option which hints at the expected usage of the filesystem using types defined in {{ic|/etc/mke2fs.conf}}. Among those types are the bigger {{ic|largefile}} and {{ic|largefile4}} which offer more relevant ratios of one inode every 1 MiB and 4 MiB respectively. It can be used as such:
Line 59: Line 65:
 
=== Reserved blocks ===
 
=== Reserved blocks ===
  
By default, 5% of the filesystem blocks will be reserved for the super-user, to avoid fragmentation and "''allow root-owned daemons to continue to function correctly after non-privileged processes are prevented from writing to the filesystem''" (from {{ic|man mkfs.ext4}}).
+
By default, 5% of the filesystem blocks will be reserved for the super-user, to avoid fragmentation and "''allow root-owned daemons to continue to function correctly after non-privileged processes are prevented from writing to the filesystem''" (from {{man|8|mke2fs}}).
  
 
For modern high-capacity disks, this is higher than necessary if the partition is used as a long-term archive or not crucial to system operations (like {{ic|/home}}). See [http://www.redhat.com/archives/ext3-users/2009-January/msg00026.html this email] for the opinion of ext4 developer Ted Ts'o on reserved blocks.
 
For modern high-capacity disks, this is higher than necessary if the partition is used as a long-term archive or not crucial to system operations (like {{ic|/home}}). See [http://www.redhat.com/archives/ext3-users/2009-January/msg00026.html this email] for the opinion of ext4 developer Ted Ts'o on reserved blocks.
Line 74: Line 80:
 
  # mkfs.ext4 -m 0 /dev/''device''
 
  # mkfs.ext4 -m 0 /dev/''device''
  
To reduce it to 1% afterwards, use:
+
To change it to 1% afterwards, use:
  
 
  # tune2fs -m 1 /dev/''device''
 
  # tune2fs -m 1 /dev/''device''
  
You can use {{ic|findmnt(8)}} to find the device name:
+
You can use {{man|8|findmnt}} to find the device name:
  
 
  $ findmnt ''/the/mount/point''
 
  $ findmnt ''/the/mount/point''
  
==Migrating from ext2/ext3 to ext4==
+
== Migrating from ext2/ext3 to ext4 ==
  
===Mounting ext2/ext3 partitions as ext4 without converting===
+
=== Mounting ext2/ext3 partitions as ext4 without converting ===
  
====Rationale====
+
==== Rationale ====
  
 
A compromise between fully converting to ext4 and simply remaining with ext2/ext3 is to mount the partitions as ext4.
 
A compromise between fully converting to ext4 and simply remaining with ext2/ext3 is to mount the partitions as ext4.
  
 
'''Pros:'''
 
'''Pros:'''
 +
 
* Compatibility (the filesystem can continue to be mounted as ext3) – This allows users to still read the filesystem from other operating systems without ext4 support (e.g. Windows with ext2/ext3 drivers)
 
* Compatibility (the filesystem can continue to be mounted as ext3) – This allows users to still read the filesystem from other operating systems without ext4 support (e.g. Windows with ext2/ext3 drivers)
 
* Improved performance (though not as much as a fully-converted ext4 partition).[http://kernelnewbies.org/Ext4] [http://events.linuxfoundation.org/slides/2010/linuxcon_japan/linuxcon_jp2010_fujita.pdf]  
 
* Improved performance (though not as much as a fully-converted ext4 partition).[http://kernelnewbies.org/Ext4] [http://events.linuxfoundation.org/slides/2010/linuxcon_japan/linuxcon_jp2010_fujita.pdf]  
  
 
'''Cons:'''
 
'''Cons:'''
 +
 
* Fewer features of ext4 are used (only those that do not change the disk format such as multiblock allocation and delayed allocation)
 
* Fewer features of ext4 are used (only those that do not change the disk format such as multiblock allocation and delayed allocation)
  
 
{{Note|Except for the relative novelty of ext4 (which can be seen as a risk), '''there is no major drawback to this technique'''.}}
 
{{Note|Except for the relative novelty of ext4 (which can be seen as a risk), '''there is no major drawback to this technique'''.}}
  
====Procedure====
+
==== Procedure ====
  
 
# Edit {{ic|/etc/fstab}} and change the 'type' from ext2/ext3 to ext4 for any partitions you would like to mount as ext4.
 
# Edit {{ic|/etc/fstab}} and change the 'type' from ext2/ext3 to ext4 for any partitions you would like to mount as ext4.
 
# Re-mount the affected partitions.
 
# Re-mount the affected partitions.
  
===Converting ext2/ext3 partitions to ext4===
+
=== Converting ext2/ext3 partitions to ext4 ===
  
====Rationale====
+
==== Rationale ====
  
 
To experience the benefits of ext4, an irreversible conversion process must be completed.
 
To experience the benefits of ext4, an irreversible conversion process must be completed.
  
 
'''Pros:'''
 
'''Pros:'''
 +
 
* Improved performance and new features.[http://kernelnewbies.org/Ext4] [http://events.linuxfoundation.org/slides/2010/linuxcon_japan/linuxcon_jp2010_fujita.pdf]  
 
* Improved performance and new features.[http://kernelnewbies.org/Ext4] [http://events.linuxfoundation.org/slides/2010/linuxcon_japan/linuxcon_jp2010_fujita.pdf]  
  
 
'''Cons:'''
 
'''Cons:'''
 +
 
* Partitions that contain mostly static files, such as a {{ic|/boot}} partition, may not benefit from the new features. Also, adding a journal (which is implied by moving a ext2 partition to ext3/4) always incurs performance overhead.  
 
* Partitions that contain mostly static files, such as a {{ic|/boot}} partition, may not benefit from the new features. Also, adding a journal (which is implied by moving a ext2 partition to ext3/4) always incurs performance overhead.  
 
* Irreversible (ext4 partitions cannot be 'downgraded' to ext2/ext3. It is, however, backwards compatible until extent and other unique options are enabled)
 
* Irreversible (ext4 partitions cannot be 'downgraded' to ext2/ext3. It is, however, backwards compatible until extent and other unique options are enabled)
  
====Procedure====
+
==== Procedure ====
  
 
These instructions were adapted from [http://ext4.wiki.kernel.org/index.php/Ext4_Howto Kernel documentation] and an [https://bbs.archlinux.org/viewtopic.php?id=61602 BBS thread].  
 
These instructions were adapted from [http://ext4.wiki.kernel.org/index.php/Ext4_Howto Kernel documentation] and an [https://bbs.archlinux.org/viewtopic.php?id=61602 BBS thread].  
  
 
{{Warning|
 
{{Warning|
* If you convert the system's root filesystem, ensure that the 'fallback' initramfs is available at reboot. Alternatively, add {{ic|ext4}} according to [[Mkinitcpio#MODULES]] and re-create the 'default' initial ramdisk with {{Ic|mkinitcpio -p linux}} before starting.
+
* If you convert the system's root filesystem, ensure that the 'fallback' initramfs is available at reboot. Alternatively, add {{ic|ext4}} according to [[Mkinitcpio#MODULES]] and [[regenerate the initramfs]] before starting.
* If you decide to convert a separate {{ic|/boot}} partition, ensure the [[bootloader]] supports booting from ext4.}}
+
* If you decide to convert a separate {{ic|/boot}} partition, ensure the [[bootloader]] supports booting from ext4.
 +
}}
  
 
In the following steps {{Ic|/dev/sdxX}} denotes the path to the partition to be converted, such as {{Ic|/dev/sda1}}.  
 
In the following steps {{Ic|/dev/sdxX}} denotes the path to the partition to be converted, such as {{Ic|/dev/sda1}}.  
  
# '''[[Backup programs|BACK-UP!]]''' Back-up all data on any ext3 partitions that are to be converted to ext4. A useful package, especially for root partitions, is [http://clonezilla.org Clonezilla].
+
# [[Backup programs|Back up]] all data on any ext3 partitions that are to be converted to ext4. A useful package, especially for root partitions, is {{Pkg|clonezilla}}.
 
# Edit {{ic|/etc/fstab}} and change the 'type' from ext3 to ext4 for any partitions that are to be converted to ext4.
 
# Edit {{ic|/etc/fstab}} and change the 'type' from ext3 to ext4 for any partitions that are to be converted to ext4.
 
# Boot the live medium (if necessary). The conversion process with {{Pkg|e2fsprogs}} must be done when the drive is not mounted. If converting a root partition, the simplest way to achieve this is to boot from some other live medium.
 
# Boot the live medium (if necessary). The conversion process with {{Pkg|e2fsprogs}} must be done when the drive is not mounted. If converting a root partition, the simplest way to achieve this is to boot from some other live medium.
# Ensure the partition is '''NOT''' mounted
+
# Ensure the partition is ''not'' mounted
 
# If you want to convert a ext2 partition, the first conversion step is to add a [[File systems#Journaling|journal]] by running {{ic|tune2fs -j /dev/sdxX}} as root; making it a ext3 partition.   
 
# If you want to convert a ext2 partition, the first conversion step is to add a [[File systems#Journaling|journal]] by running {{ic|tune2fs -j /dev/sdxX}} as root; making it a ext3 partition.   
 
# Run {{Ic|tune2fs -O extent,uninit_bg,dir_index /dev/sdxX}} as root. This command converts the ext3 filesystem to ext4 (irreversibly).  
 
# Run {{Ic|tune2fs -O extent,uninit_bg,dir_index /dev/sdxX}} as root. This command converts the ext3 filesystem to ext4 (irreversibly).  
 
# Run {{Ic|fsck -f /dev/sdxX}} as root.  
 
# Run {{Ic|fsck -f /dev/sdxX}} as root.  
#* The user '''must ''fsck''''' the filesystem, or it '''will be unreadable'''! This ''fsck'' run is needed to return the filesystem to a consistent state. It '''will''' find checksum errors in the group descriptors - this '''is''' expected. The {{ic|-f}} option asks ''fsck'' to force checking even if the file system seems clean. The {{ic|-p}} option may be used on top to 'automatically repair' (otherwise, the user will be asked for input for each error).  
+
#* This step is necessary, otherwise the filesystem '''will be unreadable'''. This ''fsck'' run is needed to return the filesystem to a consistent state. It will find checksum errors in the group descriptors - this is expected. The {{ic|-f}} option asks ''fsck'' to force checking even if the file system seems clean. The {{ic|-p}} option may be used on top to "automatically repair" (otherwise, the user will be asked for input for each error).  
 
# Recommended: mount the partition and run {{Ic|e4defrag -c -v /dev/sdxX}} as root.
 
# Recommended: mount the partition and run {{Ic|e4defrag -c -v /dev/sdxX}} as root.
 
#* Even though the filesystem is now converted to ext4, all files that have been written before the conversion do not yet take advantage of the extent option of ext4, which will improve large file performance and reduce fragmentation and filesystem check time. In order to fully take advantage of ext4, all files would have to be rewritten on disk. Use ''e4defrag'' to take care of this problem.
 
#* Even though the filesystem is now converted to ext4, all files that have been written before the conversion do not yet take advantage of the extent option of ext4, which will improve large file performance and reduce fragmentation and filesystem check time. In order to fully take advantage of ext4, all files would have to be rewritten on disk. Use ''e4defrag'' to take care of this problem.
# Reboot Arch Linux!
+
# Reboot
  
== Using ext4 per directory encryption ==  
+
== Improving performance ==
  
Linux comes with an Ext4 feature to encrypt directories of a filesystem. See also [http://blog.quarkslab.com/a-glimpse-of-ext4-filesystem-level-encryption.html Quarkslab's blog] entry with a write-up of the features, an overview of the implementation state and practical test results with kernel 4.1.
+
=== E4rat ===
  
Encryption keys are stored in the keyring. To get started, make sure you have enabled {{ic|CONFIG_KEYS}} and {{ic|CONFIG_EXT4_ENCRYPTION}} kernel options and you have kernel 4.1 or higher. Note the Arch default {{Pkg|linux}} does not have {{ic|CONFIG_EXT4_ENCRYPTION}} set yet.  
+
[[E4rat]] is a preload application designed for the ext4 filesystem. It monitors files opened during boot, optimizes their placement on the partition to improve access time, and preloads them at the very beginning of the boot process. ''E4rat'' does not offer improvements with [[SSD]]s, whose access time is negligible compared to hard disks.
  
First of all, you need to update {{Pkg|e2fsprogs}} to at least version 1.43.
+
=== Disabling access time update ===
  
Let us make a directory that we will encrypt. Encryption policy can be set only on new empty directories. For example, if we are to encrypt {{ic|/encrypted/dir}}, create the upper level directory:
+
The ''ext4'' file system records information about when a file was last accessed and there is a cost associated with recording it. With the {{ic|noatime}} option, the access timestamps on the filesystem are not updated.
  
# mkdir /encrypted
+
{{hc|/etc/fstab|
 +
/dev/sda5    /    ext4    defaults,'''noatime'''    0    1
 +
}}
  
First generate a random salt value and store it in a safe place:
+
Doing so breaks applications that rely on access time, see [[fstab#atime options]] for possible solutions.
  
# head -c 16 /dev/random | xxd -p
+
=== Increasing commit interval ===
877282f53bd0adbbef92142fc4cac459
 
  
Now generate and add a new key into your keyring:
+
The sync interval for data and metadata can be increased by providing a higher time delay to the {{ic|commit}} option.
this step should be repeated every time you flush your keychain (reboot)
 
  
# e4crypt add_key -S 0x877282f53bd0adbbef92142fc4cac459
+
The default 5 sec means that if the power is lost, one will lose as much as the latest 5 seconds of work.
Enter passphrase (echo disabled):
+
It forces a full sync of all data/journal to physical media every 5 seconds. The filesystem will not be damaged though, thanks to the journaling.
Added key with descriptor [f88747555a6115f5]
+
The following [[fstab]] illustrates the use of {{ic|commit}}:
  
Now you know a descriptor for your key.
+
{{hc|/etc/fstab|2=
Make sure you have added a key into your keychain:
+
/dev/sda5    /    ext4    defaults,noatime,'''commit=60'''    0    1
 +
}}
  
# keyctl show
+
=== Turning barriers off ===
Session Keyring
 
1021618178 --alswrv  1000  1000  keyring: _ses
 
  176349519 --alsw-v  1000  1000  \_ logon: ext4:f88747555a6115f5
 
  
Almost done. Now set an encryption policy for a directory:
+
{{Warning|Disabling barriers for disks without battery-backed cache is not recommended and can lead to severe file system corruption and data loss.}}
  
# e4crypt set_policy f88747555a6115f5 /encrypted/dir
+
''Ext4'' enables write barriers by default. It ensures that file system metadata is correctly written and ordered on disk, even when write caches lose power. This goes with a performance cost especially for applications that use ''fsync'' heavily or create and delete many small files. For disks that have a write cache that is battery-backed in one way or another, disabling barriers may safely improve performance.
  
That is all. If you try accessing the disk without adding a key into keychain,
+
To turn barriers off, add the option {{Ic|1=barrier=0}} to the desired filesystem. For example:
filenames and their contents will be seen as encrypted gibberish. Be careful
 
running old versions of e2fsck on your filesystem - it will treat encrypted
 
filenames as invalid.
 
  
== Tips and tricks ==
+
{{hc|/etc/fstab|2=
 +
/dev/sda5    /    ext4    noatime,'''barrier=0'''    0    1
 +
}}
 +
 
 +
=== Disabling journaling ===
 +
 
 +
{{Warning|Using a filesystem without journaling can result in data loss in case of sudden dismount like power failure or kernel lockup.}}
 +
 
 +
Disabling the journal with ''ext4'' can be done with the following command on an unmounted disk:
 +
 
 +
# tune2fs -O "^has_journal" /dev/sdXN
 +
 
 +
=== Use external journal to optimize performance ===
 +
 
 +
{{Style|Complicated to read, needs style fixing.}}
 +
 
 +
For those with concerns about both data integrity and performance, the journaling can be significantly sped up with the {{ic|journal_async_commit}} mount option. Note that it [https://patchwork.ozlabs.org/patch/414750/ does not work with] the balanced default of {{ic|1=data=ordered}}, so this is only recommended when the filesystem is already cautiously using {{ic|1=data=journal}}.
  
=== E4rat ===
+
You can then format a dedicated device to journal to with {{ic|mke2fs -O journal_dev /dev/journal_device}}. Use {{ic|1=tune2fs -J device=/dev/journal_device /dev/ext4_fs}} to assign the journal to an existing device, or replace {{ic|tune2fs}} with {{ic|mkfs.ext4}} if you are making a new filesystem.
  
[[E4rat]] is a preload application designed for the ext4 filesystem. It monitors files opened during boot, optimizes their placement on the partition to improve access time, and preloads them at the very beginning of the boot process. [[E4rat]] does not offer improvements with [[SSD]]s, whose access time is negligible compared to hard disks.
+
== Tips and tricks ==
  
=== Barriers and performance ===
+
=== Using file-based encryption ===
  
Since kernel 2.6.30, ext4 performance has decreased due to changes that serve to improve data integrity.[http://www.phoronix.com/scan.php?page=article&item=ext4_then_now&num=1]
+
Since Linux 4.1, ext4 natively supports file encryption. Encryption is applied at the directory level, and different directories can use different encryption keys. This is different from both [[dm-crypt]], which is block-device level encryption, and from [[eCryptfs]], which is a stacked cryptographic filesystem. To use ext4's native encryption support, see the [[fscrypt]] article.
  
:Most file systems (XFS, ext3, ext4, reiserfs) send write barriers to disk after fsync or during transaction commits. Write barriers enforce proper ordering of writes, making volatile disk write caches safe to use (at some performance penalty). If your disks are battery-backed in one way or another, disabling barriers may safely improve performance.
+
=== Enabling metadata checksums ===
  
:Sending write barriers can be disabled using the {{Ic|1=barrier=0}} mount option (for ext3, ext4, and reiserfs), or using the {{Ic|1=nobarrier}} mount option (for XFS).[http://doc.opensuse.org/products/draft/SLES/SLES-tuning_sd_draft/cha.tuning.io.html]{{dead link|2016|7|5}}
+
When a filesystem has been created with {{Pkg|e2fsprogs}} 1.44 or later, metadata checksums should already be enabled by default. Existing filesystems may be converted to enable metadata checksum support.
  
{{Warning|Disabling barriers when disks cannot guarantee caches are properly written in case of power failure can lead to severe file system corruption and data loss.}}
+
If the CPU supports SSE 4.2, make sure the {{ic|crc32c_intel}} [[kernel module]] is loaded in order to enable the hardware accelerated CRC32C algorithm [https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums#Benchmarking]. If not, load the {{Ic|crc32c_generic}} module instead.
  
To turn barriers off add the option {{Ic|1=barrier=0}} to the desired filesystem. For example:
+
To read more about metadata checksums, see the [https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums ext4 wiki].
  
{{hc|/etc/fstab|2=
+
{{Tip|Use {{ic|dumpe2fs}} to check the features that are enabled on the filesystem:
/dev/sda5    /   ext4    noatime,barrier=0    0    1
+
# dumpe2fs ''/dev/path/to/disk''
 
}}
 
}}
  
==Enabling metadata checksums==
+
==== New filesystem ====
 +
 
 +
To enable support for ext4 metadata checksums on creating a new file system:
  
Starting with [http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.43 e2fsprogs 1.43] we can safely enable support for ext4 checksums and 64bit.
+
# mkfs.ext4 -O metadata_csum ''/dev/path/to/disk''
More about metadata checksums can be [https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums read here].
 
  
If your CPU supports SSE 4.2, make sure the {{Ic|crc32c_intel}} kernel module is loaded in order to enable the hardware accelerated CRC32C algorithm. If not you'll need to load the {{Ic|crc32c_generic}} module.
+
==== Convert existing filesystem ====
  
This needs to be done with the partition unmounted, so if you want to convert the root, you'll need to run off an USB live distro.
+
{{Note|The filesystem should not be mounted.}}
  
First the partition needs to be checked and optimized using {{Ic |e2fsck -D /dev/sdX}}
+
First the partition needs to be checked and optimized using {{ic|e2fsck}}:
  
Then the file-system needs to be converted to 64bit: {{Ic |resize2fs -b /dev/sdX}}.  
+
# e2fsck -Df ''/dev/path/to/disk''  
  
Finally checksums can be added: {{Ic |tune2fs -O metadata_csum /dev/sdX}}.
+
Convert the filesystem to 64bit:
  
The file-system can then be mounted as usual.
+
# resize2fs -b ''/dev/path/to/disk''
If this is the root file-system these modules might need to be added to {{Ic|/etc/mkinitcpio.conf}}:
 
  
    MODULES="... crc32c_intel crc32c_generic"
+
Finally enable checksums support:
  
And then regenerate the kernel image using {{Ic|mkinitcpio -p linux}}.
+
# tune2fs -O metadata_csum ''/dev/path/to/disk''
  
 
== See also ==
 
== See also ==
Line 228: Line 250:
 
* [https://ext4.wiki.kernel.org/ Official Ext4 wiki]
 
* [https://ext4.wiki.kernel.org/ Official Ext4 wiki]
 
* [https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout Ext4 Disk Layout] described in its wiki
 
* [https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout Ext4 Disk Layout] described in its wiki
* [http://lwn.net/Articles/639427/ Ext4 Encryption] LWM article
+
* [http://lwn.net/Articles/639427/ Ext4 Encryption] LWN article
 
* Kernel commits for ext4 encryption [https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6162e4b0bedeb3dac2ba0a5e1b1f56db107d97ec] [https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8663da2c0919896788321cd8a0016af08588c656]
 
* Kernel commits for ext4 encryption [https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6162e4b0bedeb3dac2ba0a5e1b1f56db107d97ec] [https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8663da2c0919896788321cd8a0016af08588c656]
 +
* [http://e2fsprogs.sourceforge.net/e2fsprogs-release.html e2fsprogs Changelog]
 +
* [https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums Ext4 Metadata Checksums]

Latest revision as of 18:27, 2 November 2019

From Ext4 - Linux Kernel Newbies:

Ext4 is the evolution of the most used Linux filesystem, Ext3. In many ways, Ext4 is a deeper improvement over Ext3 than Ext3 was over Ext2. Ext3 was mostly about adding journaling to Ext2, but Ext4 modifies important data structures of the filesystem such as the ones destined to store the file data. The result is a filesystem with an improved design, better performance, reliability, and features.

Create a new ext4 filesystem

To format a partition do:

# mkfs.ext4 /dev/partition
Tip:
  • See mke2fs(8) for more options; edit /etc/mke2fs.conf to view/configure default options.
  • If supported, you may want to enable metadata checksums.

Bytes-per-inode ratio

From mke2fs(8):

mke2fs creates an inode for every bytes-per-inode bytes of space on the disk. The larger the bytes-per-inode ratio, the fewer inodes will be created.

Creating a new file, directory, symlink etc. requires at least one free inode. If the inode count is too low, no file can be created on the filesystem even though there is still space left on it.

Because it is not possible to change either the bytes-per-inode ratio or the inode count after the filesystem is created, mkfs.ext4 uses by default a rather low ratio of one inode every 16384 bytes (16 KiB) to avoid this situation.

However, for partitions with size in the hundreds or thousands of GB and average file size in the megabyte range, this usually results in far too many inodes, because the number of files created never reaches the number of inodes.

This wastes disk space, because each unused inode takes up 256 bytes on the filesystem (this is also set in /etc/mke2fs.conf but should not be changed). With several million unused inodes, that can add up to gigabytes of wasted space.
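As a rough illustration of the arithmetic above, the following sketch estimates the space taken by inode tables for a given filesystem size and bytes-per-inode ratio. The function name inode_table_bytes and the 1 TiB example size are assumptions made for illustration; the 256-byte inode size is the figure quoted above, and other metadata is ignored.

```shell
# Rough estimate of the space used by inode tables.
# Assumes 256-byte inodes (the figure quoted above); ignores other metadata.
inode_table_bytes() {
    fs_bytes=$1         # filesystem size in bytes
    bytes_per_inode=$2  # bytes-per-inode ratio
    echo $(( fs_bytes / bytes_per_inode * 256 ))
}

one_tib=$(( 1024 * 1024 * 1024 * 1024 ))
inode_table_bytes "$one_tib" 16384     # 16 KiB ratio: 17179869184 bytes (16 GiB)
inode_table_bytes "$one_tib" 1048576   # 1 MiB ratio:  268435456 bytes (256 MiB)
```

On this hypothetical 1 TiB filesystem, moving from a 16 KiB to a 1 MiB ratio would shrink the inode tables from about 16 GiB to about 256 MiB.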

This situation can be evaluated by comparing the Use% and IUse% figures provided by df and df -i:

$ df -h /home
Filesystem              Size    Used   Avail  Use%   Mounted on
/dev/mapper/lvm-home    115G    56G    59G    49%    /home
$ df -hi /home
Filesystem              Inodes  IUsed  IFree  IUse%  Mounted on
/dev/mapper/lvm-home    1.8M    1.1K   1.8M   1%     /home

To specify a different bytes-per-inode ratio, you can use the -T usage-type option which hints at the expected usage of the filesystem using types defined in /etc/mke2fs.conf. Among those types are the bigger largefile and largefile4 which offer more relevant ratios of one inode every 1 MiB and 4 MiB respectively. For example:

# mkfs.ext4 -T largefile /dev/device

The bytes-per-inode ratio can also be set directly via the -i option: e.g. use -i 2097152 for a 2 MiB ratio and -i 6291456 for a 6 MiB ratio.
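The values passed to -i are plain byte counts. A minimal sketch of the conversion from MiB, using a hypothetical helper named mib_to_bytes:

```shell
# Convert a ratio expressed in MiB into the byte count expected by -i.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 2   # 2097152
mib_to_bytes 6   # 6291456
```

The result could then be substituted on the command line, for example as -i "$(mib_to_bytes 2)".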

Tip: Conversely, if you are setting up a partition dedicated to hosting millions of small files like emails or newsgroup items, you can use smaller usage-type values such as news (one inode for every 4096 bytes) or small (same plus smaller inode and block sizes).
Warning: If you make heavy use of symbolic links, keep the inode count high enough by using a low bytes-per-inode ratio: although a new symbolic link does not take up more space, it consumes one new inode, so the filesystem may run out of inodes quickly.

Reserved blocks

By default, 5% of the filesystem blocks will be reserved for the super-user, to avoid fragmentation and "allow root-owned daemons to continue to function correctly after non-privileged processes are prevented from writing to the filesystem" (from mke2fs(8)).

For modern high-capacity disks, this is higher than necessary if the partition is used as a long-term archive or not crucial to system operations (like /home). See this email for the opinion of ext4 developer Ted Ts'o on reserved blocks.

It is generally safe to reduce the percentage of reserved blocks to free up disk space when the partition is either:

  • Very large (for example > 50G)
  • Used as long-term archive, i.e., where files will not be deleted and created very often

The -m option of ext4-related utilities lets you specify the percentage of reserved blocks.

To totally prevent reserving blocks upon filesystem creation, use:

# mkfs.ext4 -m 0 /dev/device

To change it to 1% afterwards, use:

# tune2fs -m 1 /dev/device

You can use findmnt(8) to find the device name:

$ findmnt /the/mount/point
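
To get a feel for how much space a given reserved-blocks percentage represents, the following sketch does the arithmetic for a filesystem size given in GiB. The helper name reserved_mib is hypothetical, and the result is only an approximation: the real reservation is counted in filesystem blocks, and the integer arithmetic here rounds down.

```shell
# Approximate reserved space in MiB for a filesystem size in GiB
# and a reserved-blocks percentage (integer arithmetic, rounds down).
reserved_mib() {
    size_gib=$1
    percent=$2
    echo $(( size_gib * 1024 * percent / 100 ))
}

reserved_mib 115 5   # 5888 (about 5.75 GiB at the default 5%)
reserved_mib 115 1   # 1177 (after tune2fs -m 1)
```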

Migrating from ext2/ext3 to ext4

Mounting ext2/ext3 partitions as ext4 without converting

Rationale

A compromise between fully converting to ext4 and simply remaining with ext2/ext3 is to mount the partitions as ext4.

Pros:

  • Compatibility (the filesystem can continue to be mounted as ext3) – This allows users to still read the filesystem from other operating systems without ext4 support (e.g. Windows with ext2/ext3 drivers)
  • Improved performance (though not as much as a fully-converted ext4 partition).[1] [2]

Cons:

  • Fewer features of ext4 are used (only those that do not change the disk format such as multiblock allocation and delayed allocation)
Note: Except for the relative novelty of ext4 (which can be seen as a risk), there is no major drawback to this technique.

Procedure

  1. Edit /etc/fstab and change the 'type' from ext2/ext3 to ext4 for any partitions you would like to mount as ext4.
  2. Re-mount the affected partitions.
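
The fstab edit in step 1 can be sketched as a small filter. This is only an illustration under stated assumptions: the function name fstab_to_ext4 is hypothetical, it treats the third whitespace-separated field as the filesystem type, and awk rewrites modified lines with single spaces between fields. It reads fstab-formatted text on standard input and writes the result to standard output, so the real /etc/fstab is never touched directly.

```shell
# Change the type (third field) of ext2/ext3 entries to ext4.
# Comment lines and all other entries are printed unchanged.
# Note: modified lines are re-joined with single spaces by awk.
fstab_to_ext4() {
    awk '$1 !~ /^#/ && ($3 == "ext2" || $3 == "ext3") { $3 = "ext4" } { print }'
}

printf '%s\n' '/dev/sda5 / ext3 defaults 0 1' | fstab_to_ext4
```

Review the output before copying it into place, and keep a backup of the original file.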

Converting ext2/ext3 partitions to ext4

Rationale

To experience the benefits of ext4, an irreversible conversion process must be completed.

Pros:

  • Improved performance and new features.[3] [4]

Cons:

  • Partitions that contain mostly static files, such as a /boot partition, may not benefit from the new features. Also, adding a journal (which is implied by moving an ext2 partition to ext3/4) always incurs performance overhead.
  • Irreversible (ext4 partitions cannot be 'downgraded' to ext2/ext3. It is, however, backwards compatible until extent and other unique options are enabled)

Procedure

These instructions were adapted from the kernel documentation and a BBS thread.

Warning:
  • If you convert the system's root filesystem, ensure that the 'fallback' initramfs is available at reboot. Alternatively, add ext4 according to Mkinitcpio#MODULES and regenerate the initramfs before starting.
  • If you decide to convert a separate /boot partition, ensure the bootloader supports booting from ext4.

In the following steps /dev/sdxX denotes the path to the partition to be converted, such as /dev/sda1.

  1. Back up all data on any ext3 partitions that are to be converted to ext4. A useful package, especially for root partitions, is clonezilla.
  2. Edit /etc/fstab and change the 'type' from ext3 to ext4 for any partitions that are to be converted to ext4.
  3. Boot the live medium (if necessary). The conversion process with e2fsprogs must be done when the drive is not mounted. If converting a root partition, the simplest way to achieve this is to boot from some other live medium.
  4. Ensure the partition is not mounted.
  5. If you want to convert an ext2 partition, the first conversion step is to add a journal by running tune2fs -j /dev/sdxX as root, making it an ext3 partition.
  6. Run tune2fs -O extent,uninit_bg,dir_index /dev/sdxX as root. This command converts the ext3 filesystem to ext4 (irreversibly).
  7. Run fsck -f /dev/sdxX as root.
    • This step is necessary; otherwise, the filesystem will be unreadable. The fsck run returns the filesystem to a consistent state. It will find checksum errors in the group descriptors, which is expected. The -f option forces fsck to check the filesystem even if it seems clean. The -p option may be added to repair errors automatically (otherwise, the user is asked for input on each error).
  8. Recommended: mount the partition and run e4defrag -c -v /dev/sdxX as root.
    • Even though the filesystem is now converted to ext4, files written before the conversion do not yet take advantage of ext4's extent option, which improves large file performance and reduces fragmentation and filesystem check time. To fully benefit from ext4, all files have to be rewritten on disk. Use e4defrag to take care of this problem. Note that with -c, e4defrag only reports the fragmentation score; run it without -c to actually defragment the files.
  9. Reboot
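
The tune2fs and fsck steps above can be summarised as a dry run that only prints the commands for review. This is a sketch under stated assumptions: conversion_plan is a hypothetical helper name, it executes nothing itself, and it does not cover the backup, fstab and defragmentation steps.

```shell
# Hypothetical helper: print (without running) the conversion commands.
# Arguments: the partition path and its current type (ext2 or ext3).
conversion_plan() {
    dev=$1
    fstype=$2
    case $fstype in
        ext2) echo "tune2fs -j $dev" ;;   # ext2 only: add a journal first
        ext3) ;;                          # ext3 already has a journal
        *) echo "unsupported filesystem type: $fstype" >&2; return 1 ;;
    esac
    echo "tune2fs -O extent,uninit_bg,dir_index $dev"
    echo "fsck -f $dev"
}
```

For example, conversion_plan /dev/sdxX ext2 prints the three commands in the order they should be run as root on the unmounted partition.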

Improving performance

E4rat

E4rat is a preload application designed for the ext4 filesystem. It monitors files opened during boot, optimizes their placement on the partition to improve access time, and preloads them at the very beginning of the boot process. E4rat does not offer improvements with SSDs, whose access time is negligible compared to hard disks.

Disabling access time update

The ext4 file system records information about when a file was last accessed and there is a cost associated with recording it. With the noatime option, the access timestamps on the filesystem are not updated.

/etc/fstab
/dev/sda5    /    ext4    defaults,noatime    0    1

Doing so breaks applications that rely on access time; see fstab#atime options for possible solutions.

Increasing commit interval

The sync interval for data and metadata can be increased by providing a higher time delay to the commit option.

With the default of 5 seconds, a full sync of all data and journal entries to physical media is forced every 5 seconds, so a power loss costs at most the last 5 seconds of work. The filesystem itself will not be damaged, thanks to journaling. The following fstab entry illustrates the use of commit:

/etc/fstab
/dev/sda5    /    ext4    defaults,noatime,commit=60    0    1
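
Adding a mount option such as commit=60 to an existing entry can also be scripted. The sketch below is illustrative only: add_mount_option is a hypothetical helper that reads fstab text on stdin and prints the modified text, so the result can be reviewed before it replaces /etc/fstab. It assumes the mount point is the second field and the options are the fourth.

```shell
# Hypothetical helper: add_mount_option MOUNTPOINT OPTION
# Reads fstab text on stdin and prints it with OPTION appended to the
# options field of the entry for MOUNTPOINT, unless it is already present.
# Whitespace on a modified line is normalised to single spaces.
add_mount_option() {
    awk -v mp="$1" -v opt="$2" '
        # Leave comments and blank lines untouched.
        /^[ \t]*(#|$)/ { print; next }
        $2 == mp {
            n = split($4, opts, ",")
            present = 0
            for (i = 1; i <= n; i++) if (opts[i] == opt) present = 1
            if (!present) $4 = $4 "," opt
        }
        { print }'
}
```

For example, add_mount_option / commit=60 < /etc/fstab prints the edited table without touching the file.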

Turning barriers off

Warning: Disabling barriers for disks without battery-backed cache is not recommended and can lead to severe file system corruption and data loss.

Ext4 enables write barriers by default. Barriers ensure that file system metadata is correctly written and ordered on disk, even when write caches lose power. This comes at a performance cost, especially for applications that use fsync heavily or create and delete many small files. For disks whose write cache is battery-backed in one way or another, disabling barriers may safely improve performance.

To turn barriers off, add the option barrier=0 to the mount options of the desired filesystem. For example:

/etc/fstab
/dev/sda5    /    ext4    noatime,barrier=0    0    1

Disabling journaling

Warning: Using a filesystem without journaling can result in data loss after a sudden unmount, such as a power failure or kernel lockup.

Disabling the journal with ext4 can be done with the following command on an unmounted disk:

# tune2fs -O "^has_journal" /dev/sdXN

Use external journal to optimize performance

For those with concerns about both data integrity and performance, journaling can be significantly sped up with the journal_async_commit mount option. Note that it does not work with the balanced default of data=ordered, so this is only recommended when the filesystem is already cautiously using data=journal.

To use an external journal, first format a dedicated device as a journal device with mke2fs -O journal_dev /dev/journal_device. Then use tune2fs -J device=/dev/journal_device /dev/ext4_fs to assign the journal to an existing filesystem, or replace tune2fs with mkfs.ext4 if you are creating a new filesystem.

Tips and tricks

Using file-based encryption

Since Linux 4.1, ext4 natively supports file encryption. Encryption is applied at the directory level, and different directories can use different encryption keys. This is different from both dm-crypt, which is block-device level encryption, and from eCryptfs, which is a stacked cryptographic filesystem. To use ext4's native encryption support, see the fscrypt article.

Enabling metadata checksums

When a filesystem has been created with e2fsprogs 1.44 or later, metadata checksums should already be enabled by default. Existing filesystems may be converted to enable metadata checksum support.

If the CPU supports SSE 4.2, make sure the crc32c_intel kernel module is loaded in order to enable the hardware accelerated CRC32C algorithm [5]. If not, load the crc32c_generic module instead.
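
One way to choose between the two modules is to look for the sse4_2 flag in /proc/cpuinfo. This is a minimal sketch: crc32c_module is a hypothetical helper name, and the optional file argument exists only so the check can be tried against a saved copy of the CPU information.

```shell
# Hypothetical helper: print the CRC32C module name suited to this CPU.
# Reads /proc/cpuinfo by default; another file may be given as $1.
crc32c_module() {
    # -w matches sse4_2 as a whole word, so flags like sse4_1 do not count.
    if grep -qw 'sse4_2' "${1:-/proc/cpuinfo}"; then
        echo crc32c_intel
    else
        echo crc32c_generic
    fi
}
```

The result could then be passed to modprobe as root, for example modprobe "$(crc32c_module)".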

To read more about metadata checksums, see the ext4 wiki.

Tip: Use dumpe2fs to check the features that are enabled on the filesystem:
# dumpe2fs /dev/path/to/disk

New filesystem

To enable support for ext4 metadata checksums when creating a new filesystem:

# mkfs.ext4 -O metadata_csum /dev/path/to/disk

Convert existing filesystem

Note: The filesystem should not be mounted.

First the partition needs to be checked and optimized using e2fsck:

# e2fsck -Df /dev/path/to/disk  

Convert the filesystem to 64-bit:

# resize2fs -b /dev/path/to/disk 

Finally, enable checksum support:

# tune2fs -O metadata_csum /dev/path/to/disk
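
After the conversion, the feature list reported by dumpe2fs can be checked to confirm that checksums are active. The following sketch assumes the output contains a "Filesystem features:" line; has_feature is a hypothetical helper name.

```shell
# Hypothetical helper: has_feature NAME
# Reads dumpe2fs output on stdin and succeeds (exit status 0) if NAME
# is listed on the "Filesystem features:" line.
has_feature() {
    awk -v want="$1" '
        /^Filesystem features:/ {
            # Fields 1 and 2 are the label; feature names start at field 3.
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit !found }'
}
```

For example, as root: dumpe2fs -h /dev/path/to/disk | has_feature metadata_csum && echo "checksums enabled".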

See also