Difference between revisions of "Btrfs"

From ArchWiki
m (Subvolumes)
m
(44 intermediate revisions by 22 users not shown)
Line 1: Line 1:
 
[[Category:File systems]]
 
[[Category:File systems]]
 +
[[ja:Btrfs]]
 
[[zh-CN:Btrfs]]
 
[[zh-CN:Btrfs]]
 
{{Article summary start}}
 
{{Article summary start}}
Line 7: Line 8:
 
{{Article summary end}}
 
{{Article summary end}}
  
Btrfs is a new copy on write (COW) filesystem for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration. Jointly developed by Oracle, Red Hat, Fujitsu, Intel, SUSE and many others, Btrfs is licensed under the GPL and open for contribution from anyone.
+
Btrfs is an abbreviation for '''B-tree''' '''F'''ile '''S'''ystem and is also known as "Butter FS" or "Better FS".  Btrfs is a copy-on-write (COW) file system written from the ground up for Linux.  It is aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration. Jointly developed by Oracle, Red Hat, Fujitsu, Intel, SUSE and many others, Btrfs is licensed under the GPL and open for contribution from anyone.
 
+
== Recent Developments and News Links ==
+
*[http://www.phoronix.com/scan.php?page=news_item&px=MTA4Mzc Summary of Chris Mason's talk from LFCS 2012]
+
*On 2012-03-28, {{Pkg|btrfs-progs}} includes btrfsck, a tool that can fix errors on btrfs filesystems.
+
*Oracle has packaged this version of btrfs-progs and released it to their customers of Oracle Linux 6 and backported to 5.
+
*Arch Linux supplies this version in core/btrfs-progs (since version 0.19.20120328-1).
+
  
 
== Installation ==
 
== Installation ==
  
Btrfs support is included in the {{pkg|linux}} package (hardcoded into the kernel). User space utilities are available in {{pkg|btrfs-progs}}.
+
As of the beginning of the year 2013 Btrfs is included in the default kernel and its tools ({{pkg|btrfs-progs}}) are part of the default installation.  [[GRUB|GRUB 2]], mkinitcpio, and [[Syslinux]] have support for Btrfs and require no additional configuration.
  
If using btrfs as the root filesystem, users ''may'' want to install {{AUR|mkinitcpio-btrfs}} from the [[Arch User Repository|AUR]]. This package will install a mkinitcpio hook intended for those who wish to have a single or multi-drive BTRFS file system as their {{ic|/}} (root).  The hook will ensure that the chosen root device from the kernel command line is intact and safe to boot.  If root is not a btrfs device, the hook is quietly skipped.
+
== File system creation ==
  
== Creating a Btrfs Partition ==
+
A Btrfs file system can either be newly created or have one converted.
=== Format a New Partition to Btrfs ===
+
  
# mkfs.btrfs [options] dev [dev ...]
+
=== Creating a new file system ===
  
One can select multiple devices to create a RAID. Supported RAID levels include RAID 0, RAID 1 and RAID 10. By default, metadata is mirrored and data is striped.
+
To format a partition, run:
  
=== Convert Ext3/4 to Btrfs ===
+
# mkfs.btrfs /dev/<partition>
{{Warning|[[GRUB Legacy]] cannot boot with btrfs as root. Users need to install either [[GRUB]] or [[Syslinux]]. This guide assumes users are aware of this limitation.}}
+
  
# Boot a live CD (Arch for example)
+
Multiple devices can be entered to create a RAID. Supported RAID levels include RAID 0, RAID 1 and RAID 10. By default the metadata is mirrored and data is striped.
# Enable [remote-core] and [remote-testing]
+
# Setup the network
+
# {{ic|modprobe btrfs}}
+
# Install btrfs-progs (make sure versions of dependencies match: glibc, e2fsprogs)
+
# Run {{ic|btrfs-convert}}
+
# Mount the converted partition and modify the {{ic|/etc/fstab}} file specifying either {{ic|auto}} or {{ic|btrfs}} for the partition type.
+
# Chroot into the system and rebuild the GRUB entry (see [[Install from Existing Linux]] and [[GRUB]] articles, if unfamiliar with this procedure.
+
  
== Btrfs Features ==
+
# mkfs.btrfs [options] /dev/<part1> /dev/<part2>
=== Subvolumes ===
+
  
One of the features of btrfs is the use of subvolumes. Subvolumes are basically a named btree that holds files and directories. They have inodes inside the tree of tree roots and can have non-root owners and groups. Subvolumes can optionally be given a quota of blocks.  All of the blocks and file extents inside of subvolumes are reference counted to allow snapshotting. Similar to the dynamically expanding storage of a virtual machine that will only use as much space on a device as needed. Eliminating several half-filled partitions.  One can also mount the subvolumes with different mount options giving more flexibility in security.
+
=== Convert from Ext3/4 ===
  
To create a subvolume:
+
Boot from an install CD, then convert by doing:
 +
 
 +
# btrfs-convert /dev/<partition>
 +
 
 +
Mount the partition and test the conversion by checking the files.  Be sure to change the {{ic|/etc/fstab}} to reflect the change ('''type''' to btrfs and '''fs_passno''' [the last field] to 0 as Btrfs does not do a file system check on boot).  {{ic|chroot}} into the system and rebuild the GRUB menu list (see [[Install from Existing Linux]] and [[GRUB]] articles).
 +
 
 +
To complete, delete the saved image, delete the sub-volume that image is on, then balance the drive to reclaim the space.
 +
 
 +
# rm /ext2_saved/*
 +
# btrfs subvolume delete /ext2_saved
 +
 
 +
== Limitations ==
 +
 
 +
A few limitations should be known before trying.
 +
 
 +
=== Encryption ===
 +
 
 +
Btrfs has no built-in encryption support (this may come in future), but you can encrypt the partition before running <code>mkfs.btrfs</code>. See [[Dm-crypt with LUKS]].
 +
 
 +
(If you've already created a btrfs file system, you can also use something like [[EncFS]] or [[TrueCrypt]], though perhaps without some of btrfs' features.)
 +
 
 +
=== Swap file ===
 +
 
 +
Btrfs does not support swap files. This is due to swap files requiring a function that Btrfs doesn't have for possibility of corruptions.<sup>[https://btrfs.wiki.kernel.org/index.php/FAQ#Does_btrfs_support_swap_files.3F link]</sup> A swap file can be mounted on a loop device with poorer performance but will not be able to hibernate. A systemd service file is available {{AUR|systemd-loop-swapfile}}.
 +
 
 +
=== GRUB2 and core.img ===
 +
 
 +
[[GRUB|Grub 2]] can boot Btrfs partitions however the module is larger than e.g. ext4 and the core.img file made by grub-install may not fit between the MBR and the first partition.  This can be solved by using GPT or by putting an extra 1 or 2 MB of free space before the first partition.
 +
 
 +
== Features ==
 +
 
 +
Various features are available and can be adjusted.
 +
 
 +
=== Copy-On-Write (CoW) ===
 +
 
 +
CoW comes with some advantages, but can negatively affect performance with large files that have small random writes. It is recommended to disable CoW for database files and virtual machine images.
 +
You can disable CoW for the entire block device by mounting it with "nodatacow" option. However, this will disable CoW for the entire file system.
 +
 
 +
To disable CoW for single files/directories do:
 +
 
 +
# chattr +C </dir/file>
 +
 
 +
Note, from chattr man page: For btrfs, the 'C' flag should be set on new or empty files.  If it is set on a file which already has data blocks, it is undefined when the blocks assigned to the file will be fully stable. If the 'C' flag is set on a directory, it will have no effect on the directory, but new files created in that directory will have the No_COW attribute.
 +
 
 +
Likewise, to save space by forcing CoW when copying files use:
 +
 
 +
# cp --reflink source dest
 +
 
 +
As dest file is changed, only those blocks that are changed from source will be written to the disk. One might consider aliasing cp to 'cp --reflink=auto'.
 +
 
 +
=== Multi-device filesystem and RAID feature ===
 +
====Multi-device filesystem====
 +
 
 +
When creating a ''btrfs'' filesystem, you can pass as many partitions or disk devices as you want to ''mkfs.btrfs''. The filesystem will be created across these devices. You can '''"'''merge'''"''' this way, multiple partitions or devices to get a big ''btrfs'' filesystem.
 +
 
 +
You can also add or remove device from an existing btrfs filesystem (caution is mandatory).
 +
 
 +
A multi-device ''btrfs'' filesystem (also called a btrfs volume) is not recognized until
 +
  # btrfs device scan
 +
has been run. This is the purpose of the ''btrfs'' mkinitcpio hook or the ''USEBTRFS'' variable in /etc/rc.conf
 +
 
 +
====RAID features====
 +
 
 +
When creating multi-device filesystem, you can also specify to use RAID0, RAID1 or RAID10 across the devices you have added to the filesystem. RAID levels can be applied independently to data and meta data. By default, meta data is duplicated on single volumes or RAID1 on multi-disk sets.
 +
 
 +
btrfs works in block-pairs for raid0, raid1, and raid10. This means:
 +
 
 +
raid0 - block-pair striped across 2 devices<br>
 +
raid1 - block-pair written to 2 devices
 +
 
 +
For 2 disk sets, this matches raid levels as defined in md-raid (mdadm). For 3+ disk-sets, the result is entirely different than md-raid.
 +
 
 +
For example:<br>
 +
3 1TB disks in an md based raid1 yields a /dev/md0 with 1TB free space and the ability to safely lose 2 disks without losing data.
 +
3 1TB disks in a btrfs volume with data=raid1 will allow the storage of approximately 1.5TB of data before reporting full. Only 1 disk can safely be lost without losing data.
 +
 
 +
btrfs uses a round-robin scheme to decide how block-pairs are spread among disks. As of Linux 3.0, a quasi-round-robin scheme is used which prefers larger disks when distributing block pairs. This allows raid0 and raid1 to take advantage of most (and sometimes all) space in a disk set made of multiple disks. For example, a set consisting of a 1TB disk and 2 500GB disks with data=raid1 will place a copy of every block on the 1TB disk and alternate (round-robin) placing blocks on each of the 500GB disks. Full space utilization will be made. A set made from a 1TB disk, a 750GB disk, and a 500GB disk will work the same, but the filesystem will report full with 250GB unusable on the 750GB disk. To always take advantage of the full space (even in the last example), use data=single. (data=single is akin to JBOD defined by some raid controllers) See [https://btrfs.wiki.kernel.org/index.php/FAQ#How_much_space_do_I_get_with_unequal_devices_in_RAID-1_mode.3F the BTRFS FAQ] for more info.
 +
 
 +
=== Sub-volumes ===
 +
 
 +
One of the features of Btrfs is the use of sub-volumes. Sub-volumes are basically a named btree that holds files and directories. They have inodes inside the tree of tree roots and can have non-root owners and groups. Sub-volumes can optionally be given a quota of blocks.  All of the blocks and file extents inside of sub-volumes are reference counted to allow snapshotting. This is similar to the dynamically expanding storage of a virtual machine that will only use as much space on a device as needed, eliminating several half-filled partitions.  One can also mount the sub-volumes with different mount options, giving more flexibility in security.
 +
 
 +
To create a sub-volume:
  
 
  # btrfs subvolume create [<dest>/]
 
  # btrfs subvolume create [<dest>/]
  
For increased flexibility, install your system into a dedicated subvolume, and use:
+
For increased flexibility, install your system into a dedicated sub-volume, and, in the kernel boot parameters, use:
  
 
{{bc|1=rootflags=subvol=<whatever you called the subvol>}}
 
{{bc|1=rootflags=subvol=<whatever you called the subvol>}}
  
In the kernel boot parameters. It makes system rollbacks possible.
+
This makes system rollbacks possible.
  
If using for the root partition, it is advisable to add '''crc32c''' to the modules array in {{ic|/etc/mkinitcpio.conf}} as well as adding {{ic|btrfs}} to the HOOKS.
+
If using for the root partition, it is advisable to add '''crc32c''' (or '''crc32c-intel''' for Intel machines) to the modules array in {{ic|/etc/mkinitcpio.conf}}.
  
 
=== Snapshots ===
 
=== Snapshots ===
Line 66: Line 133:
  
 
=== Defragmentation ===
 
=== Defragmentation ===
Btrfs supports online defragmentation. To defragment the metadata of the root folder, simply do:
+
 
 +
Btrfs supports online defragmentation. To defragment the metadata of the root folder do:
 +
 
 
  # btrfs filesystem defragment /
 
  # btrfs filesystem defragment /
This ''will not'' defragment the entire system. For more information, see [https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#Defragmenting_a_directory_doesn.27t_work this page] on the btrfs wiki.
+
 
 +
This ''will not'' defragment the entire system. For more information read [https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#Defragmenting_a_directory_doesn.27t_work this page] on the btrfs wiki.
 +
 
 +
To defragment the entire system verbosely do:
 +
 
 +
# find / -xdev -type f -print -exec btrfs filesystem defrag '{}' \;
  
 
=== Compression ===
 
=== Compression ===
Btrfs supports transparent compression, which means every file on the partition is automatically compressed. This does not only reduce the size of those files, but also [http://www.phoronix.com/scan.php?page=article&item=btrfs_compress_2635&num=1 improves performance], in particular if using the [http://www.phoronix.com/scan.php?page=article&item=btrfs_lzo_2638&num=1 lzo algorithm]. Compression is enabled using the compress=gzip or compress=lzo mount options. Only files created or modified after the mount option is added will be compressed, so to fully benefit from compression it should be enabled during installation. After [[Beginners%27_Guide#Prepare_hard_drive|preparing the hard drive]], simply switch to another terminal ({{keypress|Ctrl+Alt+number}}), and run the following command:
 
# mount -o remount,compress=lzo /dev/partition /mnt/target # note: replace /dev/partition by the partition on which Arch Linux is installed.
 
  
Verify if compression is enabled with the mount command. After the installation is finished, add compress=lzo to the mount options of the root filesystem in {{ic|/etc/fstab}}.
+
Btrfs supports transparent compression, which means every file on the partition is automatically compressed. This does not only reduce the size of those files, but also [http://www.phoronix.com/scan.php?page=article&item=btrfs_compress_2635&num=1 improves performance], in particular if using the [http://www.phoronix.com/scan.php?page=article&item=btrfs_lzo_2638&num=1 lzo algorithm]. Compression is enabled using the {{ic|1=compress=gzip}} or {{ic|1=compress=lzo}} mount options. Only files created or modified after the mount option is added will be compressed, so to fully benefit from compression it should be enabled during installation. After [[Beginners%27_Guide#Prepare_the_storage_drive|preparing the storage drive]], simply switch to another terminal ({{keypress|Ctrl+Alt+number}}), and run the following command:
 +
 
 +
# mount -o remount,compress=lzo /dev/sdXY /mnt/target
 +
 
 +
After the installation is finished, add {{ic|1=compress=lzo}} to the mount options of the root filesystem in {{ic|/etc/[[fstab]]}}.
  
 
== Resources ==
 
== Resources ==
 +
 
* [https://btrfs.wiki.kernel.org/ Btrfs Wiki]
 
* [https://btrfs.wiki.kernel.org/ Btrfs Wiki]
 
* [https://btrfs.wiki.kernel.org/index.php/Problem_FAQ BTRFS Problem FAQ] - Official FAQ
 
* [https://btrfs.wiki.kernel.org/index.php/Problem_FAQ BTRFS Problem FAQ] - Official FAQ
* [http://www.funtoo.org/wiki/BTRFS_Fun Funtoo Btrfs wiki entry] - Very well-written albeit slightly out dated article.
+
* [http://www.funtoo.org/wiki/BTRFS_Fun Funtoo Btrfs wiki entry] - Very well-written article
 +
* [http://www.phoronix.com/scan.php?page=news_item&px=MTA0ODU Avi Miller presenting BTRFS] at SCALE 10x.  Jan/2012.
 +
* [http://www.phoronix.com/scan.php?page=news_item&px=MTA4Mzc Summary of Chris Mason's talk from LFCS 2012]
 +
* On 2012-03-28, {{Pkg|btrfs-progs}} includes btrfsck, a tool that can fix errors on btrfs filesystems.
 +
* Oracle has packaged this version of btrfs-progs and released it to their customers of Oracle Linux 6 and backported to 5.
 +
* {{AUR|mkinitcpio-btrfs}}: for roll-back abilities (currently unmaintained).

Revision as of 01:02, 8 May 2013


Btrfs is an abbreviation for B-tree File System and is also known as "Butter FS" or "Better FS". Btrfs is a copy-on-write (COW) file system written from the ground up for Linux. It is aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration. Jointly developed by Oracle, Red Hat, Fujitsu, Intel, SUSE and many others, Btrfs is licensed under the GPL and open for contribution from anyone.

Installation

As of early 2013, Btrfs is included in the default kernel and its tools (btrfs-progs) are part of the default installation. GRUB 2, mkinitcpio, and Syslinux support Btrfs and require no additional configuration.

File system creation

A Btrfs file system can either be created from scratch or converted from an existing Ext3/4 file system.

Creating a new file system

To format a partition, run:

# mkfs.btrfs /dev/<partition>

Multiple devices can be given to create a RAID. Supported RAID levels include RAID 0, RAID 1 and RAID 10. By default, the metadata is mirrored and the data is striped.

# mkfs.btrfs [options] /dev/<part1> /dev/<part2>

Convert from Ext3/4

Boot from an install CD, then convert the partition with:

# btrfs-convert /dev/<partition>

Mount the partition and test the conversion by checking the files. Be sure to update /etc/fstab to reflect the change: set the type to btrfs and fs_passno (the last field) to 0, as Btrfs does not do a file system check on boot. Then chroot into the system and rebuild the GRUB menu list (see the Install from Existing Linux and GRUB articles).

To complete the conversion, delete the saved image and the sub-volume that holds it, then balance the file system to reclaim the space.

# rm /ext2_saved/*
# btrfs subvolume delete /ext2_saved
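
The fstab change described above can be sketched as a small filter. This is only an illustration and not part of btrfs-progs: the function name fstab_to_btrfs is hypothetical, and it assumes a whitespace-separated, six-field fstab entry whose first field is the device path exactly as given (entries that use UUID= or LABEL= would need that string instead). It prints the rewritten file to standard output rather than editing in place, so the result can be reviewed first.

```shell
# Hypothetical helper (not part of btrfs-progs): print FILE with the entry
# for DEVICE changed so that the type (3rd field) is "btrfs" and
# fs_passno (6th field) is 0. Other lines are printed unchanged.
fstab_to_btrfs() {
    device="$1"
    file="$2"
    awk -v dev="$device" '
        $1 == dev { $3 = "btrfs"; $6 = 0 }   # rewrite only the matching entry
        { print }
    ' "$file"
}
```

Running it as fstab_to_btrfs /dev/sda1 /etc/fstab (with /dev/sda1 as a placeholder) leaves the original file untouched. Note that awk rejoins the fields of the rewritten entry with single spaces, so any column alignment on that line is lost.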

Limitations

A few limitations should be known before using Btrfs.

Encryption

Btrfs has no built-in encryption support (this may come in the future), but you can encrypt the partition before running mkfs.btrfs. See Dm-crypt with LUKS.

(If you've already created a btrfs file system, you can also use something like EncFS or TrueCrypt, though perhaps without some of btrfs' features.)

Swap file

Btrfs does not support swap files. Swap files rely on a function that Btrfs does not provide, because using it could lead to corruption (see the FAQ on the btrfs wiki). A swap file can be set up on a loop device, at the cost of poorer performance, but hibernation will not work with it. A systemd service file is available in systemd-loop-swapfile (AUR).

GRUB2 and core.img

GRUB 2 can boot from Btrfs partitions; however, the Btrfs module is larger than, for example, the ext4 module, and the core.img file made by grub-install may not fit between the MBR and the first partition. This can be solved by using GPT or by leaving an extra 1 or 2 MB of free space before the first partition.
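
To judge how much room precedes the first partition, its start sector can be converted to a size. A minimal sketch, assuming 512-byte sectors as reported in /sys/block/<disk>/<partition>/start; the function name gap_kib is made up for this example.

```shell
# Hypothetical helper: size in KiB of the space before a partition,
# given its start sector (counted in 512-byte sectors).
gap_kib() {
    echo $(( $1 * 512 / 1024 ))
}
```

A start sector of 63, as used by older partitioning tools, leaves about 31 KiB, while a start sector of 2048 leaves 1 MiB.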

Features

Various features are available and can be adjusted.

Copy-On-Write (CoW)

CoW comes with some advantages, but can negatively affect performance with large files that have small random writes. It is recommended to disable CoW for database files and virtual machine images. CoW can be disabled by mounting with the "nodatacow" option; note, however, that this disables CoW for the entire file system.

To disable CoW for individual files or directories, run:

# chattr +C </dir/file>

Note, from chattr man page: For btrfs, the 'C' flag should be set on new or empty files. If it is set on a file which already has data blocks, it is undefined when the blocks assigned to the file will be fully stable. If the 'C' flag is set on a directory, it will have no effect on the directory, but new files created in that directory will have the No_COW attribute.

Likewise, to save space by forcing CoW when copying files, use:

# cp --reflink source dest 

As the dest file is changed, only the blocks that differ from source are written to the disk. One might consider aliasing cp to 'cp --reflink=auto'.
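
With GNU cp, a bare --reflink means --reflink=always, which fails when the file system cannot share blocks, whereas --reflink=auto falls back to an ordinary copy. The alias idea can also be written as a function. This is a small sketch assuming GNU cp from coreutils; the name cow_copy is hypothetical.

```shell
# Hypothetical wrapper: copy SRC to DEST, sharing blocks where the
# file system supports it and falling back to a regular copy otherwise.
# Assumes GNU cp (coreutils).
cow_copy() {
    cp --reflink=auto -- "$1" "$2"
}
```

Because of the fallback, the same function works unchanged on file systems other than Btrfs; it simply saves no space there.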

Multi-device filesystem and RAID feature

Multi-device filesystem

When creating a btrfs filesystem, you can pass as many partitions or disk devices as you want to mkfs.btrfs. The filesystem will be created across these devices. In this way you can "merge" multiple partitions or devices into one large btrfs filesystem.

You can also add devices to or remove devices from an existing btrfs filesystem (proceed with caution).

A multi-device btrfs filesystem (also called a btrfs volume) is not recognized until

 # btrfs device scan

has been run. This is the purpose of the btrfs mkinitcpio hook or the USEBTRFS variable in /etc/rc.conf.

RAID features

When creating a multi-device filesystem, you can also specify RAID0, RAID1 or RAID10 across the devices you have added to the filesystem. RAID levels can be applied independently to data and metadata. By default, metadata is duplicated on single volumes and stored as RAID1 on multi-disk sets.

btrfs works in block-pairs for raid0, raid1, and raid10. This means:

raid0 - block-pair striped across 2 devices
raid1 - block-pair written to 2 devices

For 2-disk sets, this matches the raid levels as defined in md-raid (mdadm). For sets of 3 or more disks, the result is entirely different from md-raid.

For example:
Three 1TB disks in an md-based raid1 yield a /dev/md0 with 1TB of free space and the ability to safely lose 2 disks without losing data.
Three 1TB disks in a btrfs volume with data=raid1 allow the storage of approximately 1.5TB of data before reporting full. Only 1 disk can safely be lost without losing data.
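
The arithmetic behind these two figures can be sketched as follows. The helper names are hypothetical, sizes are in GB, and both functions assume that all disks in the set have the same size; unequal disks are not covered.

```shell
# Hypothetical helpers; arguments are the number of disks and the size of
# each disk in GB. Both assume that all disks have the same size.

# md raid1 mirrors every block onto all N disks: capacity of one disk.
md_raid1_capacity() {
    echo "$2"
}

# btrfs data=raid1 keeps two copies of every block: half the total space.
btrfs_raid1_capacity() {
    echo $(( $1 * $2 / 2 ))
}
```

For three 1000GB disks, md_raid1_capacity 3 1000 prints 1000 and btrfs_raid1_capacity 3 1000 prints 1500, matching the 1TB and 1.5TB figures above.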

btrfs uses a round-robin scheme to decide how block-pairs are spread among disks. As of Linux 3.0, a quasi-round-robin scheme is used which prefers larger disks when distributing block pairs. This allows raid0 and raid1 to take advantage of most (and sometimes all) of the space in a set made of several disks. For example, a set consisting of a 1TB disk and two 500GB disks with data=raid1 will place a copy of every block on the 1TB disk and alternate (round-robin) placing blocks on each of the 500GB disks, so the full space is used. A set made from a 1TB disk, a 750GB disk, and a 500GB disk will work the same way, but the filesystem will report full with 250GB unusable on the 750GB disk. To always take advantage of the full space (even in the last example), use data=single, which is akin to the JBOD mode defined by some raid controllers. See the BTRFS FAQ for more info.

Sub-volumes

One of the features of Btrfs is the use of sub-volumes. A sub-volume is essentially a named B-tree that holds files and directories. Sub-volumes have inodes inside the tree of tree roots and can have non-root owners and groups. They can optionally be given a quota of blocks. All of the blocks and file extents inside sub-volumes are reference counted to allow snapshotting. This is similar to the dynamically expanding storage of a virtual machine, which only uses as much space on a device as needed, eliminating several half-filled partitions. One can also mount sub-volumes with different mount options, giving more flexibility in security.

To create a sub-volume:

# btrfs subvolume create [<dest>/]<name>

For increased flexibility, install your system into a dedicated sub-volume, and, in the kernel boot parameters, use:

rootflags=subvol=<whatever you called the subvol>

This makes system rollbacks possible.
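
To see which sub-volume a given kernel command line selects, the rootflags value can be parsed. A rough sketch only: the function name root_subvol is made up for illustration, and it expects the whole command line as a single argument.

```shell
# Hypothetical helper: print the sub-volume named by rootflags=subvol=...
# in a kernel command line passed as one string. The body runs in a
# subshell, so the IFS change does not leak into the calling shell.
root_subvol() (
    for word in $1; do
        case "$word" in
            rootflags=*)
                IFS=,   # rootflags holds comma-separated mount options
                for flag in ${word#rootflags=}; do
                    case "$flag" in
                        subvol=*) printf '%s\n' "${flag#subvol=}" ;;
                    esac
                done
                ;;
        esac
    done
)
```

On a running system it could be called as root_subvol "$(cat /proc/cmdline)"; it prints nothing when no subvol flag is present.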

If Btrfs is used for the root partition, it is advisable to add crc32c (or crc32c-intel on Intel machines) to the modules array in /etc/mkinitcpio.conf.

Snapshots

To create a snapshot:

# btrfs subvolume snapshot <source> [<dest>/]<name>

Snapshots are not recursive: every sub-volume nested inside the source sub-volume will be an empty directory inside the snapshot.

Defragmentation

Btrfs supports online defragmentation. To defragment the metadata of the root folder, run:

# btrfs filesystem defragment /

This will not defragment the entire system. For more information, see the Problem FAQ on the btrfs wiki.

To defragment the entire system verbosely, run:

# find / -xdev -type f -print -exec btrfs filesystem defrag '{}' \;
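
Before running this over a whole system, it can help to preview which files would be processed. The following sketch wraps the same find invocation; the function name defrag_tree and the DEFRAG_CMD variable are hypothetical, introduced only for this example.

```shell
# Hypothetical wrapper around the find invocation above.
# DEFRAG_CMD defaults to the real tool; set it to "echo" for a dry run
# that only lists the files that would be defragmented.
defrag_tree() {
    root="$1"
    find "$root" -xdev -type f -exec ${DEFRAG_CMD:-btrfs filesystem defragment} '{}' \;
}
```

With DEFRAG_CMD=echo set, defrag_tree / prints one path per line and changes nothing; with the variable unset, it runs the defragmentation on each file.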

Compression

Btrfs supports transparent compression, which means every file on the partition is automatically compressed. This not only reduces the size of those files, but also improves performance, in particular when using the lzo algorithm. Compression is enabled with the compress=gzip or compress=lzo mount options. Only files created or modified after the mount option is added will be compressed, so to fully benefit from compression it should be enabled during installation. After preparing the storage drive, switch to another terminal (Ctrl+Alt+number) and run the following command:

# mount -o remount,compress=lzo /dev/sdXY /mnt/target

After the installation is finished, add compress=lzo to the mount options of the root filesystem in /etc/fstab.
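
To verify that compression is active on a mount point, the mount table can be inspected. A minimal sketch, assuming the /proc/mounts layout (device, mount point, type, options); the function name has_compress is hypothetical.

```shell
# Hypothetical helper: succeed if MOUNTPOINT is a btrfs mount whose
# options include compress or compress-force. The second argument is the
# mount table to read and defaults to /proc/mounts.
has_compress() {
    mountpoint="$1"
    table="${2:-/proc/mounts}"
    awk -v mnt="$mountpoint" '
        $2 == mnt && $3 == "btrfs" && $4 ~ /(^|,)compress(-force)?(=|,|$)/ { found = 1 }
        END { exit !found }
    ' "$table"
}
```

For example, has_compress /mnt/target && echo "compression enabled" reports on the installation target used above.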

Resources