[[Category:File systems]]<br />
[[Category:Oracle]]<br />
[[ja:ZFS]]<br />
[[pt:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|ZFS/Virtual disks}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005.<br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 exabyte]] file size, a maximum storage capacity of 256 quadrillion [[Wikipedia:Zettabyte|zettabytes]], and no limit on the number of filesystems (datasets) or files [https://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [https://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Because it is licensed under the CDDL, which is incompatible with the GPL, ZFS cannot be distributed along with the Linux kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [https://openzfs.org OpenZFS], previously named [https://zfsonlinux.org/ ZFS on Linux] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and supercomputers.<br />
<br />
{{Note|<br />
Due to potential legal incompatibilities between the CDDL license of ZFS code and the GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]), ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
<br />
* The ZFSonLinux project must keep up with Linux kernel versions. After each stable ZFSonLinux release, the Arch ZFS maintainers release the corresponding packages.<br />
* This situation sometimes blocks the normal rolling update process with unsatisfied dependencies, because the new kernel version proposed by the update is not yet supported by ZFSonLinux.<br />
}}<br />
<br />
== Installation ==<br />
<br />
=== General ===<br />
<br />
{{Warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip|You can [[downgrade]] your kernel to the version supported by the [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Unofficial user repositories#archzfs|archzfs]] repository or alternatively the [[Arch User Repository]]:<br />
<br />
* {{AUR|zfs-linux}} for [https://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/openzfs/zfs development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for development releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for development releases for hardened kernels.<br />
* {{AUR|zfs-linux-zen}} for stable releases for zen kernels.<br />
* {{AUR|zfs-linux-zen-git}} for development releases for zen kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
* {{AUR|zfs-dkms-git}} for development releases for versions with dynamic kernel module support.<br />
<br />
These packages have a dependency on the {{ic|zfs-utils}} package.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
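<br />
For example, a quick sanity check right after installation (assuming no pool has been created yet, so the module must be loaded manually and no pools are listed):<br />
<br />
# modprobe zfs<br />
$ zpool status<br />
no pools available<br />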
<br />
=== Root on ZFS ===<br />
<br />
See [[Install Arch Linux on ZFS#Installation]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of [[DKMS]] to rebuild the ZFS modules automatically with every kernel upgrade. <br />
<br />
{{Note|When installing {{Pkg|dkms}}, read [[Dynamic Kernel Module Support#Installation]]}}<br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}}.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
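<br />
For example, a possible entry (a sketch; the package names depend on which variant you installed, {{AUR|zfs-dkms}} is assumed here):<br />
<br />
{{hc|/etc/pacman.conf|2=<br />
IgnorePkg = zfs-dkms zfs-utils<br />
}}<br />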
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs), which can be simple files like {{ic|~/zfs0.img}}, {{ic|~/zfs1.img}}, {{ic|~/zfs2.img}} and so on, with no possibility of real data loss, are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, {{Ic|zfs-import-cache.service}} must be enabled to import the pools and {{Ic|zfs-mount.service}} must be enabled to mount the filesystems available in the pools. A benefit to this is that it is not necessary to mount ZFS filesystems in {{ic|/etc/fstab}}. {{Ic|zfs-import-cache.service}} imports the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each [[#Importing a pool created by id|imported pool]] you want automatically imported by {{Ic|zfs-import-cache.service}} execute:<br />
<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
{{Note|Beginning with [https://github.com/openzfs/zfs/releases/tag/zfs-0.6.5.8 OpenZFS version 0.6.5.8] the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run. See [https://github.com/archzfs/archzfs/issues/72 ArchZFS issue 72] for more information.}}<br />
<br />
[[Enable]] the relevant service ({{ic|zfs-import-cache.service}}) and targets ({{ic|zfs.target}} and {{ic|zfs-import.target}}) so the pools are automatically imported at boot time:<br />
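<br />
# systemctl enable zfs-import-cache.service zfs.target zfs-import.target<br />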
<br />
To mount the ZFS filesystems, you have 2 choices:<br />
* Enable the [[#Using zfs-mount.service|zfs-mount.service]]<br />
* Use [[#Using zfs-mount-generator|zfs-mount-generator]]<br />
<br />
==== Using zfs-mount.service ====<br />
<br />
In order to mount ZFS filesystems automatically on boot you need to [[enable]] {{ic|zfs-mount.service}} and {{ic|zfs.target}}.<br />
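<br />
For example:<br />
<br />
# systemctl enable zfs-mount.service zfs.target<br />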
<br />
{{Note|1=<nowiki/><br />
* This does not work with separated {{ic|/var}} dataset because it does not get mounted early enough. Instead you should use [[#Using zfs-mount-generator|zfs-mount-generator]] option. See [https://github.com/openzfs/zfs/issues/3768 OpenZFS issue #3768] for more information.<br />
* It appears that the ZFS module is loaded too late for using this method, see [https://bbs.archlinux.org/viewtopic.php?id=274044 BBS#274044] and [[Talk:ZFS#zfs hook in mkinitcpio.conf]]. The workaround is to use the {{ic|zfs}} [[mkinitcpio]] hook.<br />
}}<br />
<br />
==== Using zfs-mount-generator ====<br />
<br />
You can also use the zfs-mount-generator to create systemd mount units for your ZFS filesystems at boot. systemd will automatically mount the filesystems based on the mount units without having to use the {{Ic|zfs-mount.service}}. To do that, you need to:<br />
<br />
# Create the {{Ic|/etc/zfs/zfs-list.cache}} directory.<br />
# Enable the ZFS Event Daemon (ZED) script (called a ZEDLET) required to create a list of mountable ZFS filesystems. (This link is created automatically if you are using OpenZFS >= 2.0.0.) {{bc|# ln -s /usr/lib/zfs/zed.d/history_event-zfs-list-cacher.sh /etc/zfs/zed.d}}<br />
# [[Enable]] {{ic|zfs.target}} and [[enable/start]] the ZFS Event Daemon ({{ic|zfs-zed.service}}). This service is responsible for running the script in the previous step. <br />
# You need to create an empty file named after your pool in {{Ic|/etc/zfs/zfs-list.cache}}. The ZEDLET will only update the list of filesystems if the file for the pool already exists. {{bc|# touch /etc/zfs/zfs-list.cache/<pool-name>}}<br />
# Check the contents of {{Ic|/etc/zfs/zfs-list.cache/<pool-name>}}. If it is empty, make sure that the {{Ic|zfs-zed.service}} is running and just change the canmount property of any of your ZFS filesystems by running: {{bc|1=# zfs set canmount=off zroot/fs1}} This change causes ZFS to raise an event which is captured by ZED, which in turn runs the ZEDLET to update the file in {{Ic|/etc/zfs/zfs-list.cache}}. If the file in {{Ic|/etc/zfs/zfs-list.cache}} is updated, you can set the {{Ic|canmount}} property of the filesystem back by running: {{bc|1=# zfs set canmount=on zroot/fs1}}<br />
You need to add a file in {{Ic|/etc/zfs/zfs-list.cache}} for each ZFS pool in your system. Make sure the pools are imported by enabling {{Ic|zfs-import-cache.service}} and {{Ic|zfs-import.target}} as [[#Automatic Start|explained above]].<br />
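<br />
Putting the steps together for a hypothetical pool named {{ic|zroot}}, the whole setup might look like this (a sketch, adjust names to your system):<br />
<br />
# mkdir /etc/zfs/zfs-list.cache<br />
# ln -s /usr/lib/zfs/zed.d/history_event-zfs-list-cacher.sh /etc/zfs/zed.d<br />
# systemctl enable zfs.target<br />
# systemctl enable --now zfs-zed.service<br />
# touch /etc/zfs/zfs-list.cache/zroot<br />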
<br />
== Storage pools ==<br />
<br />
It is not necessary to partition the drives before creating the ZFS filesystem. It is recommended to point ZFS at an entire disk (i.e. {{ic|/dev/sdx}} rather than {{ic|/dev/sdx1}}), which will [https://www.reddit.com/r/zfs/comments/667na0/zfs_on_raw_or_gpt/dgh0l9t/ automatically create a GPT (GUID Partition Table)] and add an 8 MB reserved partition at the end of the disk for legacy bootloaders. However, you can specify a partition or a file within an existing filesystem, if you wish to create multiple volumes with different redundancy properties.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to [[mdadm#Prepare the devices|erase any old RAID configuration information]].}}<br />
<br />
{{Warning|For [[Advanced Format]] disks with a 4 KiB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes use an ashift value that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also slightly decrease available capacity. See the OpenZFS FAQ: [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#performance-considerations Performance Considerations], [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#advanced-format-disks Advanced Format Disks], and [https://web.archive.org/web/20170913063528/https://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
=== Identify disks ===<br />
<br />
OpenZFS recommends using device IDs when creating ZFS storage pools of less than 10 devices [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#selecting-dev-names-when-creating-a-pool-linux]. Use [[Persistent block device naming#by-id and by-path]] to identify the list of drives to be used for the ZFS pool.<br />
<br />
The disk IDs should look similar to the following:<br />
<br />
{{hc|$ ls -lh /dev/disk/by-id/|<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb}}<br />
<br />
{{Warning|If you create zpools using device names (e.g. {{ic|/dev/sda}},{{ic|/dev/sdb}},...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
==== Using GPT labels ====<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels, but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning, rather than using the whole disk for ZFS, offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and, if desired, you can easily over-provision SSDs, and slightly over-provision spindle drives, to ensure that different models with slightly different sector counts can {{ic|zpool replace}} into your mirrors. This provides a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUIDs are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters ([[wikipedia:GUID_Partition_Table#Partition_entries]]), allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
{{hc|$ ls -l /dev/disk/by-partlabel|<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1}}<br />
<br />
{{hc|$ ls -l /dev/disk/by-partuuid|<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1}}<br />
<br />
{{Tip|To minimize typing and copy/paste errors, set a local variable with the target PARTUUID: {{ic|<nowiki>$ UUID=$(lsblk --noheadings --output PARTUUID /dev/sd</nowiki>''XY''<nowiki>)</nowiki>}} }}<br />
<br />
=== Creating ZFS pools ===<br />
<br />
To create a ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> [raidz(2|3)|mirror] <ids><br />
<br />
{{Tip|One may want to read [[#Advanced Format disks]] first, as it may be recommended to set {{ic|ashift}} at pool creation.}}<br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz(2|3)|mirror''': This is the type of virtual device that will be created from the list of devices. raidz uses a single disk of parity (similar to RAID 5), raidz2 uses two disks of parity (similar to RAID 6), and raidz3 uses three disks of parity. Also available is '''mirror''', which is similar to RAID 1 or RAID 10, but is not constrained to just two devices. If not specified, each device will be added as a vdev, which is similar to RAID 0. After creation, a device can be added to each single-drive vdev to turn it into a mirror, which can be useful for migrating data.<br />
<br />
* '''ids''': The [[Persistent block device naming#by-id and by-path|IDs]] of the drives or partitions to include in the pool.<br />
<br />
Create pool with single raidz vdev:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
Create pool with two mirror vdevs:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
mirror \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
mirror \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
==== Advanced Format disks ====<br />
<br />
At pool creation, '''ashift=12''' should always be used, except with SSDs that have 8k sectors where '''ashift=13''' is correct. A vdev of 512 byte disks using 4k sectors will not experience performance issues, but a 4k disk using 512 byte sectors will. Since '''ashift''' cannot be changed after pool creation, even a pool with only 512 byte disks should use 4k because those disks may need to be replaced with 4k disks or the pool may be expanded by adding a vdev composed of 4k disks. Because correct detection of 4k disks is not reliable, {{ic|<nowiki>-o ashift=12</nowiki>}} should always be specified during pool creation. See the [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#advanced-format-disks OpenZFS FAQ] for more details.<br />
<br />
{{Tip|Use {{man|8|blockdev}} (part of {{Pkg|util-linux}}) to print the sector size reported by the device's ioctls: {{ic|blockdev --getpbsz /dev/sd''XY''}} as the root user.}}<br />
<br />
Create pool with ashift=12 and single raidz vdev:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
==== GRUB-compatible pool creation ====<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]], you must only enable features supported by GRUB, otherwise GRUB will not be able to read the pool. ZFS includes compatibility files (see {{ic|/usr/share/zfs/compatibility.d}}) to assist in creating pools with specific feature sets, of which grub2 is an option.<br />
<br />
You can create a pool with only the compatible features enabled:<br />
<br />
# zpool create -o compatibility=grub2 $POOL_NAME $VDEVS<br />
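<br />
For example, with a hypothetical pool name and device ID:<br />
<br />
# zpool create -o compatibility=grub2 bootpool /dev/disk/by-id/ata-disk0-part2<br />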
<br />
=== Verifying pool status ===<br />
<br />
If the command is successful, there will be no output. Using the [[mount]] command will show that the pool is mounted. Using {{ic|zpool status}} will show that the pool has been created:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
raidz1-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto-mount and you will need to import it to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
{{Warning|Do not run {{ic|zpool import ''pool''}}! This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine.}}<br />
<br />
Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with:<br />
<br />
# zpool import -d /dev/disk/by-id bigdata<br />
# zpool import -d /dev/disk/by-partlabel bigdata<br />
# zpool import -d /dev/disk/by-partuuid bigdata<br />
<br />
{{Note|Use the {{ic|-l}} flag when importing a pool that contains [[#Native encryption|encrypted datasets]], e.g.:<br />
# zpool import -l -d /dev/disk/by-id bigdata<br />
}}<br />
<br />
Finally check the state of the pool:<br />
<br />
# zpool status -v bigdata<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device.<br />
<br />
{{Warning|This command destroys '''any data''' contained in the pool and/or dataset.}}<br />
<br />
To destroy the pool:<br />
# zpool destroy <pool><br />
<br />
To destroy a dataset:<br />
# zfs destroy <pool>/<dataset><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|<br />
no pools available<br />
}}<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding {{ic|1=zfs_force=1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool:<br />
<br />
# zpool export <pool><br />
<br />
=== Extending an existing zpool ===<br />
<br />
A device (a partition or a disk) can be added to an existing zpool:<br />
<br />
# zpool add <pool> <device-id><br />
<br />
To import a pool which consists of multiple devices:<br />
<br />
# zpool import -d <device-id-1> -d <device-id-2> <pool><br />
<br />
or simply:<br />
<br />
# zpool import -d /dev/disk/by-id/ <pool><br />
<br />
=== Attaching a device to (create) a mirror ===<br />
<br />
A device (a partition or a disk) can be attached aside an existing device to be its mirror (similar to RAID 1):<br />
<br />
# zpool attach <pool> <device-id|mirror> <new-device-id><br />
<br />
You can attach the new device to an already existing mirror vdev (e.g. to upgrade from a 2-device to a 3-device mirror) or [https://askubuntu.com/a/1303462 attach it to a single device to create a new mirror vdev].<br />
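<br />
For example, using the disk IDs from the earlier examples, turning the vdev backed by a single disk into a two-way mirror in ''bigdata'' might look like:<br />
<br />
# zpool attach bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JTM1<br />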
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Upgrade zpools ===<br />
<br />
When using a newer {{ic|zfs}} module, zpools may display an upgrade indication:<br />
<br />
{{hc|$ zpool status -v|2=<br />
pool: bigdata<br />
state: ONLINE<br />
status: Some supported features are not enabled on the pool. The pool can still be used, but some features are unavailable.<br />
action: Enable all features using 'zpool upgrade'. Once this is done, the pool may no longer be accessible by software that does not support the features. See zpool-features(5) for details.<br />
}}<br />
<br />
{{Note|<br />
* Lower version {{ic|zfs}} modules will not be able to import a zpool of a higher version.<br />
* When dealing with important data, one may want to create a [[backup]] prior to running a {{ic|zpool upgrade}}.<br />
}}<br />
<br />
To upgrade the version of zpool ''bigdata'':<br />
<br />
# zpool upgrade bigdata<br />
<br />
To upgrade the version of all zpools:<br />
<br />
# zpool upgrade -a<br />
<br />
== Creating datasets ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, see {{man|8|zfs|url=}} or {{man|8|zpool|url=}}.<br />
<br />
=== Native encryption ===<br />
<br />
ZFS offers the following supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-gcm}} will be used. See [https://openzfs.github.io/openzfs-docs/man/8/zfs-change-key.8.html#Encryption zfs-change-key(8)] for a description of the native encryption, including limitations.<br />
<br />
The following keyformats are supported: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}.<br />
<br />
One can also specify/increase the default iteration count for PBKDF2 when using {{ic|passphrase}} with {{ic|-o pbkdf2iters <n>}}, although doing so will increase the time needed to unlock the dataset.<br />
<br />
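For example (a sketch; the iteration count shown is arbitrary and higher values slow down unlocking):<br />
<br />
# zfs create -o encryption=on -o keyformat=passphrase -o pbkdf2iters=1000000 <nameofzpool>/<nameofdataset><br />
<br />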
{{Tip|<br />
* To import a pool with keys, one needs to specify the {{ic|-l}} flag, without this flag encrypted datasets will be left unavailable until the keys are loaded. See [[#Importing a pool created by id]].<br />
* Native ZFS encryption has been made available in the stable 0.8.0 release or newer. Previously it was only available in development versions provided by packages like {{AUR|zfs-linux-git}}, {{AUR|zfs-dkms-git}} or other development builds. Users who were only using the development versions for the native encryption may now switch to the stable releases if they wish.<br />
* The default encryption suite was changed from {{ic|aes-256-ccm}} to {{ic|aes-256-gcm}} in the 0.8.4 release.<br />
}}<br />
<br />
To create a dataset including native encryption with a passphrase, use:<br />
<br />
# zfs create -o encryption=on -o keyformat=passphrase <nameofzpool>/<nameofdataset><br />
<br />
To use a key instead of using a passphrase:<br />
<br />
# dd if=/dev/random of=/path/to/key bs=1 count=32<br />
# zfs create -o encryption=on -o keyformat=raw -o keylocation=file:///path/to/key <nameofzpool>/<nameofdataset><br />
<br />
The easy way to make a key in human-readable form ({{ic|keyformat{{=}}hex}}):<br />
# od -Anone -x -N 32 -w64 /dev/random | tr -d '[:blank:]' > /path/to/hex.key<br />
<br />
To verify the key location:<br />
# zfs get keylocation <nameofzpool>/<nameofdataset><br />
<br />
To change the key location:<br />
<br />
# zfs set keylocation=file:///path/to/key <nameofzpool>/<nameofdataset><br />
<br />
You can also manually load the keys by using one of the following commands:<br />
<br />
# zfs load-key <nameofzpool>/<nameofdataset> # load key for a specific dataset<br />
# zfs load-key -a # load all keys<br />
# zfs load-key -r zpool/dataset # load all keys in a dataset<br />
<br />
To mount the created encrypted dataset:<br />
<br />
# zfs mount <nameofzpool>/<nameofdataset><br />
<br />
==== Unlock/Mount at boot time: systemd ====<br />
<br />
It is possible to automatically unlock a pool dataset at boot time by using a [[systemd]] unit. For example, create the following service to unlock any specific dataset:<br />
{{hc|1=/etc/systemd/system/zfs-load-key@.service|2=<nowiki>[Unit]<br />
Description=Load %I encryption keys<br />
Before=systemd-user-sessions.service zfs-mount.service<br />
After=zfs-import.target<br />
Requires=zfs-import.target<br />
DefaultDependencies=no<br />
<br />
[Service]<br />
Type=oneshot<br />
RemainAfterExit=yes<br />
ExecStart=/usr/bin/bash -c 'until (systemd-ask-password "Encrypted ZFS password for %I" --no-tty | zfs load-key %I); do echo "Try again!"; done'<br />
<br />
[Install]<br />
WantedBy=zfs-mount.service<br />
</nowiki><br />
}}<br />
<br />
[[Enable/start]] the service for each encrypted dataset (e.g. {{ic|zfs-load-key@pool0-dataset0.service}}). Note the use of {{ic|-}}, which is an escaped {{ic|/}} in systemd unit definitions. See {{man|1|systemd-escape}} for more info.<br />
{{Note|The {{ic|1=Before=systemd-user-sessions.service}} ensures that systemd-ask-password is invoked before the local IO devices are handed over to the [[desktop environment]].}}<br />
<br />
An alternative is to load all possible keys:<br />
<br />
{{hc|/etc/systemd/system/zfs-load-key.service|2=<br />
[Unit]<br />
Description=Load encryption keys<br />
DefaultDependencies=no<br />
After=zfs-import.target<br />
Before=zfs-mount.service<br />
<br />
[Service]<br />
Type=oneshot<br />
RemainAfterExit=yes<br />
ExecStart=/usr/bin/zfs load-key -a<br />
StandardInput=tty-force<br />
<br />
[Install]<br />
WantedBy=zfs-mount.service<br />
}}<br />
<br />
[[Enable/start]] {{ic|zfs-load-key.service}}.<br />
<br />
==== Unlock at login time: PAM ====<br />
<br />
If you are not encrypting the root volume, but only the home volume or a user-specific volume, another idea is to [https://blog.trifork.com/2020/05/22/linux-homedir-encryption/ wait until login to decrypt it]. The advantages of this method are that the system boots uninterrupted, and that when the user logs in, the same password can be used both to authenticate and to decrypt the home volume, so that the password is only entered once.<br />
<br />
First set the mountpoint to legacy to avoid having it mounted by {{ic|zfs mount -a}}:<br />
{{bc|1=# zfs set mountpoint=legacy zroot/data/home}}<br />
Ensure that it is in {{ic|/etc/fstab}} so that {{ic|mount /home}} will work:<br />
{{hc|/etc/fstab|zroot/data/home /home zfs rw,xattr,posixacl,noauto 0 0}}<br />
<br />
On a single-user system, with only one /home volume having the same encryption password as the user's password, it can be decrypted at login as follows. First, create {{ic|/sbin/mount-zfs-homedir}}:<br />
{{hc|/sbin/mount-zfs-homedir|2=<br />
#!/bin/bash<br />
<br />
# simplified from https://talldanestale.dk/2020/04/06/zfs-and-homedir-encryption/<br />
<br />
set -eu<br />
<br />
# Password is given to us via stdin, save it in a variable for later<br />
PASS=$(cat -)<br />
<br />
VOLNAME="zroot/data/home"<br />
<br />
# Unlock and mount the volume<br />
zfs load-key "$VOLNAME" <<< "$PASS" {{!}}{{!}} continue<br />
zfs mount "$VOLNAME" {{!}}{{!}} true # ignore errors<br />
}}<br />
<br />
Do not forget to run {{ic|chmod a+x /sbin/mount-zfs-homedir}}; then get PAM to run it by adding the following line to {{ic|/etc/pam.d/system-auth}}:<br />
<br />
{{hc|/etc/pam.d/system-auth|auth optional pam_exec.so expose_authtok /sbin/mount-zfs-homedir}}<br />
<br />
Now it will transparently decrypt and mount the /home volume when you log in anywhere: on the console, via ssh, etc. A caveat is that since your ~/.ssh directory is not mounted, if you log in via ssh, you must use the default password authentication the first time rather than relying on {{ic|~/.ssh/authorized_keys}}.<br />
<br />
If you want to have separate volumes for each user, each encrypted with the user's password, try the [https://blog.trifork.com/2020/05/22/linux-homedir-encryption/ linked method].<br />
<br />
=== Swap volume ===<br />
<br />
{{Warning|<br />
* On systems with extremely high memory pressure, using a zvol for swap can result in lockup, regardless of how much swap is still available. This issue is currently being investigated in [https://github.com/openzfs/zfs/issues/7734 OpenZFS issue #7734]<br />
* Swap on zvol does not support resume from hibernation, attempt to resume will result in pool corruption. Possible workaround: https://github.com/openzfs/zfs/issues/260#issuecomment-758782144<br />
}}<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4 KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8 GiB ZFS volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) -o compression=zle \<br />
-o logbias=throughput -o sync=always \<br />
-o primarycache=metadata -o secondarycache=none \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
=== Access Control Lists ===<br />
<br />
To use [[ACL]] on a dataset:<br />
<br />
# zfs set acltype=posixacl <nameofzpool>/<nameofdataset><br />
# zfs set xattr=sa <nameofzpool>/<nameofdataset><br />
<br />
Setting {{ic|xattr}} is recommended for performance reasons [https://github.com/openzfs/zfs/issues/170#issuecomment-27348094].<br />
<br />
It may be preferable to enable ACL on the zpool as datasets will inherit the ACL parameters. Setting {{ic|1=aclinherit=passthrough}} may be wanted as the default mode is {{ic|restricted}} [https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gbaaz/index.html]; however, it is worth noting that {{ic|aclinherit}} does not affect POSIX ACLs [https://askubuntu.com/questions/342553/activate-acl-for-zfs-pool-ubuntu-13-04#comment1220577_392008]:<br />
<br />
# zfs set aclinherit=passthrough <nameofzpool><br />
# zfs set acltype=posixacl <nameofzpool><br />
# zfs set xattr=sa <nameofzpool><br />
<br />
=== Databases ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128 KiB, which means it will dynamically allocate blocks of any size from 512 B to 128 KiB depending on the size of the file being written. This can often help reduce fragmentation and speed up file access, at the cost that ZFS has to allocate new 128 KiB blocks each time only a few bytes are written.<br />
<br />
{{Accuracy|At least MariaDB uses a default of 16Kib pages! Check your specific DBMS before setting this value.}}<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and Oracle database, all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
<br />
{{Note|[[#L2ARC|L2ARC]] requires {{ic|primarycache}} to function, because it is fed with data evicted from {{ic|primarycache}}. If you intend to use the L2ARC, do not set the option below, otherwise no actual data will be cached on L2ARC.}}<br />
<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
ZFS uses the [[#ZIL|ZIL]] for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer to not use the ZIL, and in which case, data is only committed to the file system once. However, doing so on non-solid state storage (e.g. HDDs) can result in decreased read performance due to fragmentation ([https://openzfs.org/wiki/ZFS_on_high_latency_devices OpenZFS Wiki]) -- with mechanical hard drives, please consider using a dedicated SSD as [[#ZIL|ZIL]] rather than setting the option below. In addition, setting this for non-database file systems, or for pools with configured log devices, can also ''negatively'' impact the performance, so beware:<br />
<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (eg, with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to [[mask]] (disable) [[systemd]]'s automatic tmpfs-backed /tmp ({{ic|tmp.mount}}), else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
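<br />
# systemctl mask tmp.mount<br />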
<br />
=== Transmitting snapshots with ZFS Send and ZFS Recv ===<br />
<br />
It is possible to pipe ZFS snapshots to an arbitrary target by pairing {{ic|zfs send}} and {{ic|zfs recv}}. This is done through standard output, which allows the data to be sent to any file, device, across the network, or manipulated mid-stream by incorporating additional programs in the pipe. <br />
<br />
Below are examples of common scenarios:<br />
<br />
==== Basic ZFS Send ====<br />
<br />
First, let's create a snapshot of some ZFS filesystem:<br />
# zfs snapshot zpool0/archive/books@snap<br />
<br />
Now let's send the snapshot to a new location on a different zpool:<br />
# zfs send -v zpool0/archive/books@snap | zfs recv zpool4/library<br />
<br />
The contents of {{ic|zpool0/archive/books@snap}} are now live at {{ic|zpool4/library}}.<br />
<br />
{{Tip| See [https://openzfs.github.io/openzfs-docs/man/8/zfs-send.8.html man zfs-send] and [https://openzfs.github.io/openzfs-docs/man/8/zfs-recv.8.html man zfs-recv] for details on flags.}}<br />
<br />
===== To and from files =====<br />
<br />
First, let's create a snapshot of some ZFS filesystem:<br />
# zfs snapshot zpool0/archive/books@snap<br />
<br />
Write the snapshot to a gzip file:<br />
# zfs send zpool0/archive/books@snap | gzip > /tmp/mybooks.gz<br />
<br />
{{Warning| Make sure to run {{ic|zfs send}} with {{ic|-w}} flag if you wish to preserve encryption during the send.}}<br />
<br />
Now restore the snapshot from the file:<br />
<br />
# zcat /tmp/mybooks.gz | zfs recv -F zpool0/archive/books<br />
<br />
==== Send over ssh ====<br />
<br />
First, let's create a snapshot of some ZFS filesystem:<br />
<br />
# zfs snapshot zpool1/filestore@snap<br />
<br />
Next we pipe our "send" traffic over an ssh session running "recv":<br />
<br />
# zfs send -v zpool1/filestore@snap | ssh $HOST zfs recv coldstore/backups<br />
<br />
The {{ic|-v}} flag prints information about the datastream being generated. If you are using a passphrase or passkey, you will be prompted to enter it.<br />
<br />
==== Incremental Backups ====<br />
<br />
You may wish to update a previously sent ZFS filesystem without retransmitting all of the data over again.<br />
Alternatively, it may be necessary to keep a filesystem online during a lengthy transfer and it is now time to send writes that were made since the initial snapshot.<br />
<br />
First, let's create a snapshot of some ZFS filesystem:<br />
<br />
# zfs snapshot zpool1/filestore@initial<br />
<br />
Next we pipe our "send" traffic over an ssh session running "recv":<br />
<br />
# zfs send -v -R zpool1/filestore@initial | ssh $HOST zfs recv coldstore/backups<br />
<br />
Once changes are written, make another snapshot:<br />
<br />
# zfs snapshot zpool1/filestore@snap2<br />
<br />
The following will send the differences that exist locally between zpool1/filestore@initial and zpool1/filestore@snap2 and create an additional snapshot for the remote filesystem coldstore/backups:<br />
<br />
# zfs send -v -R -i zpool1/filestore@initial zpool1/filestore@snap2 | ssh $HOST zfs recv coldstore/backups<br />
<br />
Now both zpool1/filestore and coldstore/backups have the @initial and @snap2 snapshots.<br />
<br />
On the remote host, you may now roll back to the latest snapshot to make it the active filesystem:<br />
<br />
# zfs rollback coldstore/backups@snap2<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
ZFS pools and datasets can be further adjusted using parameters.<br />
<br />
{{Note|All settable properties, with the exception of quotas and reservations, inherit their value from the parent dataset.}}<br />
<br />
To retrieve the current pool parameter status:<br />
<br />
# zfs get all <pool><br />
<br />
To retrieve the current dataset parameter status:<br />
<br />
# zfs get all <pool>/<dataset><br />
<br />
To disable access time (atime), which is enabled by default:<br />
<br />
# zfs set atime=off <pool><br />
<br />
To disable access time (atime) on a particular dataset:<br />
<br />
# zfs set atime=off <pool>/<dataset><br />
<br />
As an alternative to turning off atime completely, {{ic|relatime}} is available. This brings the default ext4/XFS atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between {{ic|1=atime=off}} and {{ic|1=atime=on}}. This property ''only'' takes effect if {{ic|atime}} is {{ic|on}}:<br />
<br />
# zfs set atime=on <pool><br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. ZFS supports a few different algorithms; presently lz4 is the default. ''gzip'' is also available for seldom-written yet highly-compressible data. Consult the [https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html OpenZFS Wiki] for more details.<br />
<br />
To enable compression:<br />
<br />
# zfs set compression=on <pool><br />
<br />
To reset a property of a pool and/or dataset to its default state, use {{ic|zfs inherit}}:<br />
<br />
# zfs inherit -rS atime <pool><br />
# zfs inherit -rS atime <pool>/<dataset><br />
<br />
{{Warning|Using the {{ic|-r}} flag will recursively reset all datasets of the zpool.}}<br />
<br />
=== Scrubbing ===<br />
<br />
Whenever data is read and ZFS encounters an error, it is silently repaired when possible, rewritten back to disk and logged so you can obtain an overview of errors on your pools. There is no fsck or equivalent tool for ZFS. Instead, ZFS supports a feature known as scrubbing. This traverses through all the data in a pool and verifies that all blocks can be read.<br />
<br />
To scrub a pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To cancel a running scrub:<br />
<br />
# zpool scrub -s <pool><br />
<br />
==== How often should I do this? ====<br />
<br />
From the Oracle blog post [https://blogs.oracle.com/wonders-of-zfs-storage/disk-scrub-why-and-when-v2 Disk Scrub - Why and When?]:<br />
<br />
:This question is challenging for Support to answer, because as always the true answer is "It Depends". So before I offer a general guideline, here are a few tips to help you create an answer more tailored to your use pattern.<br />
<br />
:* What is the expiration of your oldest backup? You should probably scrub your data at least as often as your oldest tapes expire so that you have a known-good restore point.<br />
:* How often are you experiencing disk failures? While the recruitment of a hot-spare disk invokes a "resilver" -- a targeted scrub of just the VDEV which lost a disk -- you should probably scrub at least as often as you experience disk failures on average in your specific environment.<br />
:* How often is the oldest piece of data on your disk read? You should scrub occasionally to prevent very old, very stale data from experiencing bit-rot and dying without you knowing it.<br />
<br />
:If any of your answers to the above are "I do not know", the general guideline is: you should probably be scrubbing your zpool at least once per month. It is a schedule that works well for most use cases, provides enough time for scrubs to complete before starting up again on all but the busiest & most heavily-loaded systems, and even on very large zpools (192+ disks) should complete fairly often between disk failures.<br />
<br />
In the [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ ZFS Administration Guide], Aaron Toponce advises scrubbing consumer disks once a week.<br />
<br />
==== Start with a service or timer ====<br />
<br />
{{Note|Starting with OpenZFS 2.1.3, weekly and monthly [[systemd]] timers/services are included. To use these, [[enable]]/[[start]] {{ic|zfs-scrub-weekly@''pool-to-scrub''.timer}} or {{ic|zfs-scrub-monthly@''pool-to-scrub''.timer}} for the desired pool.}}<br />
<br />
Using a [[systemd]] timer/service it is possible to automatically scrub pools.<br />
<br />
To perform scrubbing monthly on a particular pool:<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.timer|2=<br />
[Unit]<br />
Description=Monthly zpool scrub on %i<br />
<br />
[Timer]<br />
OnCalendar=monthly<br />
AccuracySec=1h<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.service|2=<br />
[Unit]<br />
Description=zpool scrub on %i<br />
<br />
[Service]<br />
Nice=19<br />
IOSchedulingClass=idle<br />
KillSignal=SIGINT<br />
ExecStart=/usr/bin/zpool scrub %i<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
[[Enable]]/[[start]] {{ic|zfs-scrub@''pool-to-scrub''.timer}} unit for monthly scrubbing the specified zpool.<br />
<br />
=== Enabling TRIM ===<br />
<br />
To quickly query your vdevs' [[TRIM]] support, you can include trimming information in {{Ic|zpool status}} with {{Ic|-t}}.<br />
<br />
{{hc|$ zpool status -t tank|2=<br />
pool: tank<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
tank ONLINE 0 0 0<br />
ata-ST31000524AS_5RP4SSNR-part1 ONLINE 0 0 0 (trim unsupported)<br />
ata-CT480BX500SSD1_2134A59B933D-part1 ONLINE 0 0 0 (untrimmed)<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
ZFS is capable of trimming supported vdevs either on-demand or periodically via the {{Ic|autotrim}} property.<br />
<br />
Manually performing a TRIM operation on a zpool:<br />
<br />
# zpool trim <zpool><br />
<br />
Enabling periodic trimming on all supported vdevs in a pool:<br />
<br />
# zpool set autotrim=on <zpool><br />
<br />
{{Note|Because of how the automatic TRIM and a full {{Ic|zpool trim}} differ in their operation, [https://github.com/openzfs/zfs/commit/1b939560be5c51deecf875af9dada9d094633bf7 it can make sense to run a manual trim occasionally.] }}<br />
<br />
To perform a full {{Ic|zpool trim}} monthly on a particular pool using a [[systemd]] timer/service:<br />
<br />
{{hc|/etc/systemd/system/zfs-trim@.timer|2=<br />
[Unit]<br />
Description=Monthly zpool trim on %i<br />
<br />
[Timer]<br />
OnCalendar=monthly<br />
AccuracySec=1h<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
{{hc|/etc/systemd/system/zfs-trim@.service|2=<br />
[Unit]<br />
Description=zpool trim on %i<br />
Documentation=man:zpool-trim(8)<br />
Requires=zfs.target<br />
After=zfs.target<br />
ConditionACPower=true<br />
ConditionPathIsDirectory=/sys/module/zfs<br />
<br />
[Service]<br />
Nice=19<br />
IOSchedulingClass=idle<br />
KillSignal=SIGINT<br />
ExecStart=/bin/sh -c '\<br />
if /usr/bin/zpool status %i {{!}} grep "trimming"; then\<br />
exec /usr/bin/zpool wait -t trim %i;\<br />
else exec /usr/bin/zpool trim -w %i; fi'<br />
ExecStop=-/bin/sh -c '/usr/bin/zpool trim -s %i 2>/dev/null {{!}}{{!}} true'<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
[[Enable]]/[[start]] {{ic|zfs-trim@''pool-to-trim''.timer}} unit for monthly trimming of the specified zpool.<br />
<br />
=== SSD Caching ===<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL); a dedicated log device is also called a SLOG. If your data disks are slow (e.g. HDD), it is highly recommended to configure the ZIL on solid state drives for better write performance, and also to consider a layer 2 adaptive replacement cache (L2ARC). The process to add them is very similar to adding a new VDEV.<br />
<br />
All of the below references to device-id are the IDs from {{ic|/dev/disk/by-id/*}}.<br />
<br />
==== ZIL ====<br />
<br />
To add a mirrored ZIL:<br />
<br />
# zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
Or to add a single device ZIL (unsafe):<br />
<br />
# zpool add <pool> log <device-id><br />
<br />
Because the ZIL device stores data that has not been written to the pool, it is important to use devices that can finish writes when power is lost. It is also important to use redundancy, since a device failure can cause data loss. In addition, the ZIL is only used for sync writes, so may not provide any performance improvement when your data drives are as fast as your ZIL drive(s).<br />
<br />
==== L2ARC ====<br />
To add L2ARC:<br />
# zpool add <pool> cache <device-id><br />
L2ARC is only a read cache, so redundancy is unnecessary. Since [https://github.com/openzfs/zfs/releases/tag/zfs-2.0.0 ZFS version 2.0.0], L2ARC is persisted across reboots.[https://github.com/openzfs/zfs/pull/9582]<br />
<br />
L2ARC is generally only useful in workloads where the amount of hot data is '''bigger''' than system memory, but small enough to fit into L2ARC. The L2ARC is indexed by the ARC in system memory, consuming 70 bytes per record (default 128 KiB). Thus, the equation for RAM usage is:<br />
(L2ARC size) / (recordsize) * 70 bytes<br />
For example, a 1 TiB L2ARC filled with 128 KiB records holds 2^23 records and therefore consumes roughly 2^23 * 70 bytes ≈ 560 MiB of ARC memory. Because of this, L2ARC can, in certain workloads, harm performance as it takes memory away from ARC.<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size for ZVOLs ({{ic|volblocksize}}) is 8 KiB already. If possible, it is best to align any partitions contained in a ZVOL to your block size (current versions of fdisk and gdisk by default automatically align at 1 MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the ZVOL as necessary (though 8 KiB tends to be a good value for most file systems, even when using 4 KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity, and if you have physical media with logical block sizes of 4096 B, 8192 B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the '''volblocksize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/openzfs/zfs/issues/1807 OpenZFS issue #1807] for details.<br />
<br />
=== I/O Scheduler ===<br />
<br />
While ZFS is expected to work well with modern schedulers, including {{ic|mq-deadline}} and {{ic|none}}, experimenting with [[Improving performance#Changing I/O scheduler|manually setting]] the I/O scheduler on ZFS disks may yield performance gains.<br />
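<br />
As a sketch, a [[udev]] rule like the following sets the {{ic|none}} scheduler for all SATA/SCSI disks at boot (adjust the kernel name match so it only covers your ZFS member disks):<br />
<br />
{{hc|/etc/udev/rules.d/60-zfs-scheduler.rules|2=<br />
ACTION=="add{{!}}change", KERNEL=="sd[a-z]*", ATTR{queue/scheduler}="none"<br />
}}<br />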
<br />
== Troubleshooting ==<br />
<br />
=== Creating a zpool fails ===<br />
<br />
If the following error occurs, it can be fixed:<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because ZFS expects pool creation to take less than 1 second[https://github.com/openzfs/zfs/issues/2582][https://github.com/openzfs/zfs/issues/1646]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# parted /dev/sda rm 2<br />
# dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads and writes on a drive. By reading from the disk in parallel with zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running <br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MiB)<br />
<br />
In case the default value of {{ic|zfs_arc_min}} (1/32 of system memory) is higher than the specified {{ic|zfs_arc_max}}, it is also necessary to add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_min=268435456 # (for 256MiB, needs to be lower than zfs.zfs_arc_max)<br />
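<br />
The same values can alternatively be set in a modprobe configuration file, so they apply whenever the module is loaded (if the module is included in the initramfs, [[regenerate the initramfs]] afterwards):<br />
<br />
{{hc|1=/etc/modprobe.d/zfs.conf|2=<br />
options zfs zfs_arc_max=536870912<br />
options zfs zfs_arc_min=268435456<br />
}}<br />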
<br />
For a more detailed description, as well as other configuration options, see [[Gentoo:ZFS#ARC]].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the SPL hostid. There are two solutions for this. Either place the SPL hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then [[regenerate the initramfs]] image, which will copy the hostid into it.<br />
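<br />
For the second solution, the {{ic|zgenhostid}} tool shipped with OpenZFS can write the file; a sketch that reuses the current hostid (add {{ic|-f}} to overwrite an existing {{ic|/etc/hostid}}):<br />
<br />
 # zgenhostid $(hostid)<br />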
<br />
=== Pool cannot be found while booting from SAS/SCSI devices ===<br />
<br />
In case you are booting from SAS/SCSI based devices, you might occasionally get boot problems where the pool you are trying to boot from cannot be found. A likely reason for this is that your devices are initialized too late in the process, meaning that ZFS cannot find any devices at the time it tries to assemble your pool.<br />
<br />
In this case you should force the scsi driver to wait for devices to come online before continuing. You can do this by putting this into {{ic|/etc/modprobe.d/zfs.conf}}:<br />
<br />
{{hc|1=/etc/modprobe.d/zfs.conf|2=<br />
options scsi_mod scan=sync<br />
}}<br />
<br />
Afterwards, [[regenerate the initramfs]].<br />
<br />
This works because the zfs hook will copy {{ic|/etc/modprobe.d/zfs.conf}} into the initcpio at build time, so the setting is applied during early boot.<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force-import the zpool:<br />
<br />
 # zpool import -a -f<br />
<br />
Now export the pool:<br />
<br />
 # zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [https://web.archive.org/web/20151101094022/http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then [[regenerate the initramfs]] in the normally booted system.<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The value below is just an example.<br />
<br />
{{hc|$ hostid|<br />
0a0af0f8<br />
}}<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|1=spl.spl_hostid=0x0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Install Arch Linux on ZFS#Configure systemd ZFS mounts|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check adding {{ic|1=zfs_force=1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#performance-considerations OpenZFS FAQ: Performance Considerations] and [https://web.archive.org/web/20170913063528/https://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
=== Pool resilvering stuck/restarting/slow? ===<br />
<br />
According to the ZFSonLinux GitHub, it has been a known issue since 2012 that ZFS-ZED can cause the resilvering process to constantly restart, sometimes get stuck, and be generally slow on some hardware. The simplest mitigation is to [[stop]] {{ic|zfs-zed.service}} until the resilver completes.<br />
<br />
=== Fix slow boot caused by failed import of unavailable pools in the initramfs zpool.cache ===<br />
<br />
Your boot time can be significantly impacted if you update your initramfs (e.g. when doing a kernel update) while you have additional but non-permanently attached pools imported, because these pools will get added to your initramfs zpool.cache and ZFS will attempt to import these extra pools on every boot, regardless of whether you have exported them and removed them from your regular zpool.cache.<br />
<br />
If you notice ZFS trying to import unavailable pools at boot, first run:<br />
<br />
$ zdb -C<br />
<br />
to check your zpool.cache for pools you do not want imported at boot. If this command shows any additional, currently unavailable pools, run:<br />
<br />
# zpool set cachefile=/etc/zfs/zpool.cache zroot<br />
<br />
to clear the zpool.cache of any pools other than the pool named zroot. Sometimes there is no need to refresh your zpool.cache; instead, all you need to do is [[regenerate the initramfs]].<br />
<br />
=== ZFS Command History ===<br />
<br />
ZFS natively logs changes to a pool's structure as a ring buffer of executed commands, which cannot be turned off.<br />
The log may be helpful when restoring a degraded or failed pool.<br />
<br />
{{hc|# zpool history zpool|<nowiki><br />
History for 'zpool':<br />
2023-02-19.16:28:44 zpool create zpool raidz1 /scratch/disk_1.img /scratch/disk_2.img /scratch/disk_3.img<br />
2023-02-19.16:31:29 zfs set compression=lz4 zpool<br />
2023-02-19.16:41:45 zpool scrub zpool<br />
2023-02-19.17:00:57 zpool replace zpool /scratch/disk_1.img /scratch/bigger_disk_1.img<br />
2023-02-19.17:01:34 zpool scrub zpool<br />
2023-02-19.17:01:42 zpool replace zpool /scratch/disk_2.img /scratch/bigger_disk_2.img<br />
2023-02-19.17:01:46 zpool replace zpool /scratch/disk_3.img /scratch/bigger_disk_3.img<br />
</nowiki>}}<br />
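<br />
Long and internal formats are also available. For example, for the pool above:<br />
<br />
 # zpool history -il zpool<br />
<br />
Here {{ic|-i}} additionally displays internally logged ZFS events, and {{ic|-l}} shows the user, hostname, and zone that issued each command.<br />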
<br />
== Tips and tricks ==<br />
<br />
=== Create an Archiso image with ZFS support ===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image. To include ZFS support in the image, you can either build your choice of PKGBUILDs from the AUR or include prebuilt packages from one of the [[unofficial user repositories]].<br />
<br />
==== Using self-built ZFS packages from the AUR ====<br />
<br />
Build the ZFS packages you want by following the [[Arch User Repository#Build the package|normal procedures]]. If you are unsure, {{aur|zfs-dkms}} and {{aur|zfs-utils}} are likely to be compatible with the widest range of other modifications to the Archiso image you may wish to perform. Proceed to set up [[Pacman/Tips and tricks#Custom local repository|a custom local repository]]. [[Archiso#Custom local repository|Include the resulting repository]] in the Pacman configuration of your new profile.<br />
<br />
Include the built packages in the list of packages to be installed. The example below presumes you want to include only the {{aur|zfs-dkms}} and {{aur|zfs-utils}} packages.<br />
<br />
{{hc|packages.x86_64|<br />
...<br />
zfs-dkms<br />
zfs-utils<br />
}}<br />
<br />
If you include any [[DKMS]] packages, make sure you also include headers for any kernels you are including in the ISO ({{pkg|linux-headers}} for the default kernel).<br />
<br />
==== Using the archzfs unofficial user repository ====<br />
<br />
Add the [[Unofficial user repositories#archzfs|archzfs]] unofficial user repository to {{ic|pacman.conf}} in your new Archiso profile.<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
{{Note|If you later have problems running {{ic|modprobe zfs}}, include {{Pkg|linux-headers}} in {{ic|packages.x86_64}}.}}<br />
<br />
==== Finishing up ====<br />
<br />
Regardless of where you source your ZFS packages from, you should finish by [[Archiso#Build the ISO|building the ISO]].<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== zrepl ====<br />
<br />
The {{AUR|zrepl}} package provides a ZFS automatic replication service, which could also be used as a snapshotting service much like [[snapper]].<br />
<br />
For details on how to configure the zrepl daemon, see the zrepl [https://zrepl.github.io/ documentation]. The configuration file should be located at {{ic|/etc/zrepl/zrepl.yml}}. Then, run {{ic|zrepl configcheck}} to make sure that the syntax of the config file is correct. Finally, enable {{ic|zrepl.service}}.<br />
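<br />
As a starting point, a minimal snapshot-only job might look like the sketch below (job name, dataset, interval, and retention are placeholders; verify the schema against the zrepl documentation):<br />
<br />
{{hc|/etc/zrepl/zrepl.yml|<nowiki><br />
jobs:<br />
  - name: snapjob<br />
    type: snap<br />
    filesystems:<br />
      "zroot/data<": true<br />
    snapshotting:<br />
      type: periodic<br />
      interval: 15m<br />
      prefix: zrepl_<br />
    pruning:<br />
      keep:<br />
        - type: last_n<br />
          count: 32<br />
</nowiki>}}<br />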
<br />
==== sanoid ====<br />
<br />
{{AUR|sanoid}} is a policy-driven tool for taking snapshots. Sanoid also includes {{ic|syncoid}}, which is for replicating snapshots. It comes with systemd services and a timer.<br />
<br />
Sanoid only prunes snapshots on the local system. To prune snapshots on the remote system, run sanoid there as well with prune options. Either use the {{ic|--prune-snapshots}} command line option or use the {{ic|--cron}} command line option together with the {{ic|1=autoprune = yes}} and {{ic|1=autosnap = no}} configuration options.<br />
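<br />
A minimal sketch of a sanoid policy (the dataset name and retention counts are placeholders; see the sanoid documentation for all options):<br />
<br />
{{hc|/etc/sanoid/sanoid.conf|<nowiki><br />
[zroot/data]<br />
        use_template = production<br />
<br />
[template_production]<br />
        frequently = 0<br />
        hourly = 36<br />
        daily = 30<br />
        monthly = 3<br />
        yearly = 0<br />
        autosnap = yes<br />
        autoprune = yes<br />
</nowiki>}}<br />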
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
{{Note| {{AUR|zfs-auto-snapshot-git}} has not seen any updates since 2019, and the functionality is extremely limited. You are advised to switch to a newer tool like {{AUR|zrepl}}. }}<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc.), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible per label; for example, if no monthlies are to be kept for a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}}, as shown below.<br />
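<br />
For example (pool and dataset names are placeholders):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />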
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots during [[#Scrubbing|scrubbing]]. It is possible to override this by [[Systemd#Editing provided units|editing the provided systemd unit]] and removing {{ic|--skip-scrub}} from the {{ic|ExecStart}} line. The consequences of doing so have not been documented.}}<br />
<br />
Once the package has been installed, [[systemd/Timers|enable and start the selected timers]] ({{ic|<nowiki>zfs-auto-snapshot-{frequent,daily,weekly,monthly}.timer</nowiki>}}).<br />
<br />
=== Creating a share ===<br />
<br />
ZFS has support for creating shares by [[NFS]] or [[SMB]].<br />
<br />
==== NFS ====<br />
<br />
Make sure [[NFS]] has been installed/configured; note that there is no need to edit the {{ic|/etc/exports}} file. For sharing over NFS, the services {{ic|nfs-server.service}} and {{ic|zfs-share.service}} should be [[start|started]].<br />
<br />
To make a pool available on the network:<br />
<br />
# zfs set sharenfs=on ''nameofzpool''<br />
<br />
To make a dataset available on the network:<br />
<br />
# zfs set sharenfs=on ''nameofzpool''/''nameofdataset''<br />
<br />
To enable read/write access for specific IP range(s):<br />
<br />
# zfs set sharenfs="rw=@192.168.1.100/24,rw=@10.0.0.0/24" ''nameofzpool''/''nameofdataset''<br />
<br />
To check if the dataset is exported successfully:<br />
<br />
{{hc|# showmount -e `hostname`|<br />
Export list for hostname:<br />
/path/of/dataset 192.168.1.100/24<br />
}}<br />
<br />
To view the current loaded exports state in more detail, use:<br />
<br />
{{hc|# exportfs -v|2=<br />
''/path/of/dataset''<br />
192.168.1.100/24(sync,wdelay,hide,no_subtree_check,mountpoint,sec=sys,rw,secure,no_root_squash,no_all_squash)<br />
}}<br />
<br />
To view the current NFS share list by ZFS:<br />
<br />
# zfs get sharenfs<br />
<br />
==== SMB ====<br />
<br />
When sharing through SMB, using {{ic|usershares}} in {{ic|/etc/samba/smb.conf}} will allow ZFS to set up and create the shares. See [[Samba#Enable Usershares]] for details.<br />
<br />
{{hc|/etc/samba/smb.conf|2=<br />
[global]<br />
usershare path = /var/lib/samba/usershares<br />
usershare max shares = 100<br />
usershare allow guests = yes<br />
usershare owner only = no<br />
}}<br />
<br />
Create and set permissions on the user directory as root:<br />
<br />
# mkdir /var/lib/samba/usershares<br />
# chmod +t /var/lib/samba/usershares<br />
<br />
To make a pool available on the network:<br />
<br />
# zfs set sharesmb=on ''nameofzpool''<br />
<br />
To make a dataset available on the network:<br />
<br />
# zfs set sharesmb=on ''nameofzpool''/''nameofdataset''<br />
<br />
To check if the dataset is exported successfully:<br />
<br />
{{hc|# smbclient -L localhost -U%|<br />
Sharename Type Comment<br />
--------- ---- -------<br />
IPC$ IPC IPC Service (SMB Server Name)<br />
''nameofzpool''_''nameofdataset'' Disk Comment: path/of/dataset<br />
SMB1 disabled -- no workgroup available<br />
}}<br />
<br />
To view the current SMB share list by ZFS:<br />
<br />
# zfs get sharesmb<br />
<br />
=== Encryption in ZFS using dm-crypt ===<br />
<br />
Before [https://github.com/openzfs/zfs/releases/tag/zfs-0.8.0 OpenZFS version 0.8.0], ZFS did not support encryption directly (see [[#Native encryption]]). Instead, zpools can be created on [[dm-crypt]] block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness. Furthermore, utilizing dm-crypt will encrypt the zpool's metadata, which native encryption inherently cannot provide.[https://openzfs.github.io/openzfs-docs/man/8/zfs-change-key.8.html#Encryption]<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even from the same input you get different output (thanks to salting), making deduplication impossible. To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from a live system for maintenance, there are two options:<br />
<br />
# Build a custom archiso with ZFS support as described in [[#Create an Archiso image with ZFS support]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial user repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partition and EFI system partition (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux but using the matching kernel modules directory name under the chroot's /lib/modules)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
[[Regenerate the initramfs]]. There should be no errors.<br />
<br />
=== Bind mount ===<br />
<br />
Here a bind mount from {{ic|/mnt/zfspool}} to {{ic|/srv/nfs4/music}} is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
<br />
See {{man|5|systemd.mount}} for more information on how systemd converts fstab into mount unit files with {{man|8|systemd-fstab-generator}}.<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
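<br />
Equivalently, a sketch of writing the mount unit by hand instead of using fstab (the unit file name must correspond to the {{ic|Where}} path):<br />
<br />
{{hc|/etc/systemd/system/srv-nfs4-music.mount|<nowiki><br />
[Unit]<br />
Requires=zfs-mount.service<br />
After=zfs-mount.service<br />
<br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
</nowiki>}}<br />
<br />
Then [[enable/start]] {{ic|srv-nfs4-music.mount}}.<br />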
<br />
=== Monitoring / Mailing on Events ===<br />
<br />
See [https://ramsdenj.com/2016/08/29/arch-linux-on-zfs-part-3-followup.html ZED: The ZFS Event Daemon] for more information.<br />
<br />
An email forwarder, such as [[S-nail]], is required to accomplish this. Test it to be sure it is working correctly.<br />
<br />
Uncomment the following in the configuration file:<br />
<br />
{{hc|/etc/zfs/zed.d/zed.rc|<nowiki><br />
ZED_EMAIL_ADDR="root"<br />
ZED_EMAIL_PROG="mailx"<br />
ZED_NOTIFY_VERBOSE=0<br />
ZED_EMAIL_OPTS="-s '@SUBJECT@' @ADDRESS@"<br />
</nowiki>}}<br />
<br />
Update 'root' in {{ic|1=ZED_EMAIL_ADDR="root"}} to the email address you want to receive notifications at.<br />
<br />
If you are keeping your mailrc in your home directory, you can tell mail to get it from there by setting {{ic|MAILRC}}:<br />
<br />
{{hc|/etc/zfs/zed.d/zed.rc|2=<br />
export MAILRC=/home/<user>/.mailrc<br />
}}<br />
<br />
This works because ZED sources this file, so {{ic|mailx}} sees this environment variable.<br />
<br />
If you want to receive an email no matter the state of your pool, you will want to set {{ic|1=ZED_NOTIFY_VERBOSE=1}}. You will need to do this temporarily to test.<br />
<br />
[[Start]] and [[enable]] {{ic|zfs-zed.service}}.<br />
<br />
With {{ic|1=ZED_NOTIFY_VERBOSE=1}}, you can test by running a scrub as root: {{ic|1=zpool scrub <pool-name>}}.<br />
<br />
=== Wrap shell commands in pre & post snapshots ===<br />
<br />
Since it is so cheap to make a snapshot, we can use this as a measure of security for sensitive commands such as system and package upgrades. If we make a snapshot before, and one after, we can later diff these snapshots to find out what changed on the filesystem after the command executed. Furthermore, we can also roll back in case the outcome was not desired.<br />
<br />
==== znp ====<br />
<br />
E.g.:<br />
<br />
# zfs snapshot -r zroot@pre<br />
# pacman -Syu<br />
# zfs snapshot -r zroot@post<br />
# zfs diff zroot@pre zroot@post <br />
# zfs rollback zroot@pre<br />
<br />
A utility that automates the creation of pre and post snapshots around a shell command is [https://gist.github.com/erikw/eeec35be33e847c211acd886ffb145d5 znp].<br />
<br />
E.g.:<br />
<br />
# znp pacman -Syu<br />
# znp find / -name "something*" -delete<br />
<br />
and you would get snapshots created before and after the supplied command, and the output of the command logged to a file for future reference, so that you know what command created the diff seen in a pair of pre/post snapshots.<br />
<br />
=== Remote unlocking of ZFS encrypted root ===<br />
<br />
As of [https://github.com/archzfs/archzfs/pull/261 PR #261], {{ic|archzfs}} supports SSH unlocking of natively-encrypted ZFS datasets. This section describes how to use this feature, and is largely based on [[dm-crypt/Specialties#Busybox based initramfs (built with mkinitcpio)]].<br />
<br />
# Install {{Pkg|mkinitcpio-netconf}} to provide hooks for setting up early user space networking.<br />
# Choose an SSH server to use in early user space. The options are {{Pkg|mkinitcpio-tinyssh}} or {{Pkg|mkinitcpio-dropbear}}, and are mutually exclusive.<br />
## If using {{Pkg|mkinitcpio-tinyssh}}, it is also recommended to install {{Pkg|tinyssh}} or {{AUR|tinyssh-convert-git}}, which can convert an existing OpenSSH hostkey to the TinySSH key format, preserving the key fingerprint and avoiding connection warnings. The TinySSH and Dropbear mkinitcpio install scripts will automatically convert existing hostkeys when generating a new initcpio image.<br />
# Decide whether to use an existing OpenSSH key or generate a new one (recommended) for the host that will be connecting to and unlocking the encrypted ZFS machine. Copy the public key into {{ic|/etc/tinyssh/root_key}} or {{ic|/etc/dropbear/root_key}}. When generating the initcpio image, this file will be added to {{ic|authorized_keys}} for the root user and is only valid in the initrd environment.<br />
# Add the {{ic|1=ip=}} [[kernel parameter]] to your boot loader configuration. The {{ic|ip}} string is [https://docs.kernel.org/admin-guide/nfs/nfsroot.html highly configurable]. A simple DHCP example is shown below.{{bc|1=ip=:::::eth0:dhcp}}<br />
# Edit {{ic|/etc/mkinitcpio.conf}} to include the {{ic|netconf}}, {{ic|dropbear}} or {{ic|tinyssh}}, and {{ic|zfsencryptssh}} hooks before the {{ic|zfs}} hook:{{bc|1=HOOKS=(... netconf <tinyssh>{{!}}<dropbear> zfsencryptssh zfs ...)}}<br />
# [[Regenerate the initramfs]].<br />
# Reboot and try it out!<br />
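<br />
After rebooting, unlocking should amount to connecting from the host whose key was authorized above and entering the dataset passphrase at the prompt; a sketch (hostname is a placeholder):<br />
<br />
 $ ssh root@<hostname><br />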
<br />
==== Changing the SSH server port ====<br />
<br />
By default, {{Pkg|mkinitcpio-tinyssh}} and {{Pkg|mkinitcpio-dropbear}} listen on port {{ic|22}}. You may wish to change this.<br />
<br />
For '''TinySSH''', copy {{ic|/usr/lib/initcpio/hooks/tinyssh}} to {{ic|/etc/initcpio/hooks/tinyssh}}, and find/modify the following line in the {{ic|run_hook()}} function:<br />
<br />
{{hc|/etc/initcpio/hooks/tinyssh|<br />
/usr/bin/tcpserver -HRDl0 0.0.0.0 <new_port> /usr/sbin/tinysshd -v /etc/tinyssh/sshkeydir &<br />
}}<br />
<br />
For '''Dropbear''', copy {{ic|/usr/lib/initcpio/hooks/dropbear}} to {{ic|/etc/initcpio/hooks/dropbear}}, and find/modify the following line in the {{ic|run_hook()}} function:<br />
<br />
{{hc|/etc/initcpio/hooks/dropbear|<br />
/usr/sbin/dropbear -E -s -j -k -p <new_port><br />
}}<br />
<br />
[[Regenerate the initramfs]].<br />
<br />
==== Unlocking from a Windows machine using PuTTY/Plink ====<br />
<br />
First, we need to use {{ic|puttygen.exe}} to import and convert the OpenSSH key generated earlier into PuTTY's ''.ppk'' private key format. Let us call it {{ic|zfs_unlock.ppk}} for this example.<br />
<br />
The mkinitcpio-netconf process above does not set up a shell (nor do we need one). However, because there is no shell, PuTTY will immediately close after a successful connection. This can be disabled in the PuTTY SSH configuration (''Connection > SSH > [X] Do not start a shell or command at all''), but it still does not allow us to see stdout or enter the encryption passphrase. Instead, we use {{ic|plink.exe}} with the following parameters:<br />
<br />
plink.exe -ssh -l root -i c:\path\to\zfs_unlock.ppk <hostname><br />
<br />
The plink command can be put into a batch script for ease of use.<br />
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [https://zfsonlinux.org/ ZFS on Linux]<br />
* [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html OpenZFS FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook - The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [https://web.archive.org/web/20161028084224/http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide ZFS Best Practices Guide]<br />
* [https://docs.oracle.com/cd/E23823_01/html/819-5461/gavwg.html ZFS Troubleshooting Guide]<br />
* [https://web.archive.org/web/20171213164254/http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]<br />
* [https://github.com/danboid/creating-ZFS-disks-under-Linux How to create cross platform ZFS disks under Linux]<br />
* [https://blog.heckel.xyz/2017/01/08/zfs-encryption-openzfs-zfs-on-linux/ How-To: Using ZFS Encryption at Rest in OpenZFS (ZFS on Linux, ZFS on FreeBSD, …)]<br />
* [https://archzfs.leibelt.de/ Archzfs iso download page: Frequently updated and downloadable archzfs linux iso with full OpenZFS support since 2016]</div>Chungy
<hr />
<div>[[Category:File systems]]<br />
[[Category:Oracle]]<br />
[[ja:ZFS]]<br />
[[pt:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|ZFS/Virtual disks}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005.<br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 exabyte]] file size, and a maximum 256 quadrillion [[Wikipedia:Zettabyte|zettabyte]] storage with no limit on number of filesystems (datasets) or files[https://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [https://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [https://openzfs.org OpenZFS], previously named [https://zfsonlinux.org/ ZFS on Linux] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|<br />
Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.<br />
}}<br />
<br />
== Installation ==<br />
<br />
=== General ===<br />
<br />
{{Warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Unofficial user repositories#archzfs|archzfs]] repository or alternatively the [[Arch User Repository]]:<br />
<br />
* {{AUR|zfs-linux}} for [https://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/openzfs/zfs development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for development releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for development releases for hardened kernels.<br />
* {{AUR|zfs-linux-zen}} for stable releases for zen kernels.<br />
* {{AUR|zfs-linux-zen-git}} for development releases for zen kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
* {{AUR|zfs-dkms-git}} for development releases for versions with dynamic kernel module support.<br />
<br />
These packages have a dependency on the {{ic|zfs-utils}} package.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
See [[Install Arch Linux on ZFS#Installation]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of [[DKMS]] to rebuild the ZFS modules automatically with every kernel upgrade. <br />
<br />
{{Note|When installing {{Pkg|dkms}}, read [[Dynamic Kernel Module Support#Installation]]}}<br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}}.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, {{Ic|zfs-import-cache.service}} must be enabled to import the pools and {{Ic|zfs-mount.service}} must be enabled to mount the filesystems available in the pools. A benefit to this is that it is not necessary to mount ZFS filesystems in {{ic|/etc/fstab}}. {{Ic|zfs-import-cache.service}} imports the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each [[#Importing a pool created by id|imported pool]] you want automatically imported by {{Ic|zfs-import-cache.service}} execute:<br />
<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
{{Note|Beginning with [https://github.com/openzfs/zfs/releases/tag/zfs-0.6.5.8 OpenZFS version 0.6.5.8] the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run. See [https://github.com/archzfs/archzfs/issues/72 ArchZFS issue 72] for more information.}}<br />
<br />
[[Enable]] the relevant service ({{ic|zfs-import-cache.service}}) and targets ({{ic|zfs.target}} and {{ic|zfs-import.target}}) so the pools are automatically imported at boot time:<br />
<br />
To mount the ZFS filesystems, you have 2 choices:<br />
* Enable the [[#Using zfs-mount.service|zfs-mount.service]]<br />
* Using [[#Using zfs-mount-generator|zfs-mount-generator]]<br />
<br />
==== Using zfs-mount.service ====<br />
<br />
In order to mount ZFS filesystems automatically on boot you need to [[enable]] {{ic|zfs-mount.service}} and {{ic|zfs.target}}.<br />
<br />
{{Note|This does not work with separated /var dataset because it does not get mounted early enough. Instead you should use [[#Using zfs-mount-generator|zfs-mount-generator]] option. See [https://github.com/openzfs/zfs/issues/3768 OpenZFS issue #3768] for more information.}}<br />
<br />
==== Using zfs-mount-generator ====<br />
<br />
You can also use the zfs-mount-generator to create systemd mount units for your ZFS filesystems at boot. systemd will automatically mount the filesystems based on the mount units without having to use the {{Ic|zfs-mount.service}}. To do that, you need to:<br />
<br />
# Create the {{Ic|/etc/zfs/zfs-list.cache}} directory.<br />
# Enable the ZFS Event Daemon(ZED) script (called a ZEDLET) required to create a list of mountable ZFS filesystems. (This link is created automatically if you are using OpenZFS >= 2.0.0.) {{bc|# ln -s /usr/lib/zfs/zed.d/history_event-zfs-list-cacher.sh /etc/zfs/zed.d}}<br />
# [[Enable]] {{ic|zfs.target}} and [[enable/start]] the ZFS Event Daemon ({{ic|zfs-zed.service}}). This service is responsible for running the script in the previous step. <br />
# You need to create an empty file named after your pool in {{Ic|/etc/zfs/zfs-list.cache}}. The ZEDLET will only update the list of filesystems if the file for the pool already exists. {{bc|# touch /etc/zfs/zfs-list.cache/<pool-name>}}<br />
# Check the contents of {{Ic|/etc/zfs/zfs-list.cache/<pool-name>}}. If it is empty, make sure that the {{Ic|zfs-zed.service}} is running and just change the canmount property of any of your ZFS filesystem by running: {{bc|1=zfs set canmount=off zroot/fs1}} This change causes ZFS to raise an event which is captured by ZED, which in turn runs the ZEDLET to update the file in {{Ic|/etc/zfs/zfs-list.cache}}. If the file in {{Ic|/etc/zfs/zfs-list.cache}} is updated, you can set the {{Ic|canmount}} property of the filesystem back by running: {{bc|1=zfs set canmount=on zroot/fs1}}<br />
You need to add a file in {{Ic|/etc/zfs/zfs-list.cache}} for each ZFS pool in your system. Make sure the pools are imported by enabling {{Ic|zfs-import-cache.service}} and {{Ic|zfs-import.target}} as [[#Automatic Start|explained above]].<br />
<br />
== Storage pools ==<br />
<br />
It is not necessary to partition the drives before creating the ZFS filesystem. It is recommended to point ZFS at an entire disk (ie. {{ic|/dev/sdx}} rather than {{ic|/dev/sdx1}}), which will [https://www.reddit.com/r/zfs/comments/667na0/zfs_on_raw_or_gpt/dgh0l9t/ automatically create a GPT (GUID Partition Table)] and add an 8 MB reserved partition at the end of the disk for legacy bootloaders. However, you can specify a partition or a file within an existing filesystem, if you wish to create multiple volumes with different redundancy properties.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to [[mdadm#Prepare the devices|erase any old RAID configuration information]].}}<br />
<br />
{{Warning|For [[Advanced Format]] Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See the OpenZFS FAQ: [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#performance-considerations Performance Considerations], [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#advanced-format-disks Advanced Format Disks], and [https://web.archive.org/web/20170913063528/https://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
=== Identify disks ===<br />
<br />
OpenZFS recommends using device IDs when creating ZFS storage pools of less than 10 devices[https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#selecting-dev-names-when-creating-a-pool-linux]. Use [[Persistent block device naming#by-id and by-path]] to identify the list of drives to be used for ZFS pool. <br />
<br />
The disk IDs should look similar to the following:<br />
<br />
{{hc|$ ls -lh /dev/disk/by-id/|<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb}}<br />
<br />
{{Warning|If you create zpools using device names (e.g. {{ic|/dev/sda}},{{ic|/dev/sdb}},...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
==== Using GPT labels ====<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUID can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUID and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and if desired, you can easily over provision SSD drives, and slightly over provision spindle drives to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition the all or part of the drive as a single partition. gdisk does not automatically name partitions so if partition labels are desired use gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUID are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters [[wikipedia:GUID_Partition_Table#Partition_entries]] allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUID that look like this. <br />
<br />
{{hc|$ ls -l /dev/disk/by-partlabel|<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1}}<br />
<br />
{{hc|$ ls -l /dev/disk/by-partuuid|<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1}}<br />
<br />
{{Tip|To minimize typing and copy/paste errors, set a local variable with the target PARTUUID: {{ic|<nowiki>$ UUID=$(lsblk --noheadings --output PARTUUID /dev/sd</nowiki>''XY''<nowiki>)</nowiki>}} }}<br />
<br />
=== Creating ZFS pools ===<br />
<br />
To create a ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> [raidz(2|3)|mirror] <ids><br />
<br />
{{Tip|One may want to read [[#Advanced Format disks]] first as it may be recommend to set {{ic|ashift}} on pool creation.}}<br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz(2|3)|mirror''': This is the type of virtual device that will be created from the list of devices. Raidz is a single disk of parity (similar to raid5), raidz2 for 2 disks of parity (similar to raid6) and raidz3 for 3 disks of parity. Also available is '''mirror''', which is similar to raid1 or raid10, but is not constrained to just 2 device. If not specified, each device will be added as a vdev which is similar to raid0. After creation, a device can be added to each single drive vdev to turn it into a mirror, which can be useful for migrating data.<br />
<br />
* '''ids''': The [[Persistent block device naming#by-id and by-path|ID's]] of the drives or partitions that to include into the pool.<br />
<br />
Create pool with single raidz vdev:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
Create pool with two mirror vdevs:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
mirror \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
mirror \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
==== Advanced Format disks ====<br />
<br />
At pool creation, '''ashift=12''' should always be used, except with SSDs that have 8k sectors where '''ashift=13''' is correct. A vdev of 512 byte disks using 4k sectors will not experience performance issues, but a 4k disk using 512 byte sectors will. Since '''ashift''' cannot be changed after pool creation, even a pool with only 512 byte disks should use 4k because those disks may need to be replaced with 4k disks or the pool may be expanded by adding a vdev composed of 4k disks. Because correct detection of 4k disks is not reliable, {{ic|<nowiki>-o ashift=12</nowiki>}} should always be specified during pool creation. See the [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#advanced-format-disks OpenZFS FAQ] for more details.<br />
<br />
{{Tip|Use {{man|8|blockdev}} (part of {{Pkg|util-linux}}) to print the sector size reported by the device's ioctls: {{ic|blockdev --getpbsz /dev/sd''XY''}} as the root user.}}<br />
<br />
Create pool with ashift=12 and single raidz vdev:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
==== GRUB-compatible pool creation ====<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]] you must only enable features supported by GRUB otherwise GRUB will not be able to read the pool. ZFS includes compatibility files (see {{ic|/usr/share/zfs/compatibility.d}}) to assist in creating pools at specific feature sets, of which grub2 is an option.<br />
<br />
You can create a pool with only the compatible features enabled:<br />
<br />
# zpool create -o compatibility=grub2 $POOL_NAME $VDEVS<br />
<br />
=== Verifying pool status ===<br />
<br />
If the command is successful, there will be no output. Using the [[mount]] command will show that the pool is mounted. Using {{ic|zpool status}} will show that the pool has been created:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you need to import to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
{{Warning|Do not run {{ic|zpool import ''pool''}}! This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine.}}<br />
<br />
Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with:<br />
<br />
# zpool import -d /dev/disk/by-id bigdata<br />
# zpool import -d /dev/disk/by-partlabel bigdata<br />
# zpool import -d /dev/disk/by-partuuid bigdata<br />
<br />
{{Note|Use the {{ic|-l}} flag when importing a pool that contains [[#Native encryption|encrypted datasets keys]], e.g.:<br />
# zpool import -l -d /dev/disk/by-id bigdata<br />
}}<br />
<br />
Finally check the state of the pool:<br />
<br />
# zpool status -v bigdata<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device.<br />
<br />
{{Warning|This command destroys '''any data''' containing in the pool and/or dataset.}}<br />
<br />
To destroy the pool:<br />
# zpool destroy <pool><br />
<br />
To destroy a dataset:<br />
# zfs destroy <pool>/<dataset><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|<br />
no pools available<br />
}}<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|1=zfs_force=1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool:<br />
<br />
# zpool export <pool><br />
<br />
=== Extending an existing zpool ===<br />
<br />
A device (a partition or a disk) can be added to an existing zpool:<br />
<br />
# zpool add <pool> <device-id><br />
<br />
To import a pool which consists of multiple devices:<br />
<br />
# zpool import -d <device-id-1> -d <device-id-2> <pool><br />
<br />
or simply:<br />
<br />
# zpool import -d /dev/disk-by-id/ <pool><br />
<br />
=== Attaching a device to (create) a mirror ===<br />
<br />
A device (a partition or a disk) can be attached aside an existing device to be its mirror (similar to RAID 1):<br />
<br />
# zpool attach <pool> <device-id|mirror> <new-device-id><br />
<br />
You can attach the new device to an already existing mirror vdev (e.g. to upgrade from a 2-device to a 3-device mirror) or [https://askubuntu.com/a/1303462 attach it to single device to create a new mirror vdev].<br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Upgrade zpools ===<br />
<br />
When using a newer {{ic|zfs}} module, zpools may display an upgrade indication:<br />
<br />
{{hc|$ zpool status -v|2=<br />
pool: bigdata<br />
state: ONLINE<br />
status: Some supported features are not enabled on the pool. The pool can still be used, but some features are unavailable.<br />
action: Enable all features using 'zpool upgrade'. Once this is done, the pool may no longer be accessible by software that does not support the features. See zpool-features(5) for details.<br />
}}<br />
<br />
{{Note|<br />
* Lower version {{ic|zfs}} modules will not be able to import a zpool of a higher version.<br />
* When dealing with important data, one may want to create a [[backup]] prior running a {{ic|zpool upgrade}}.<br />
}}<br />
<br />
To upgrade the version of zpool ''bigdata'':<br />
<br />
# zpool upgrade bigdata<br />
<br />
To upgrade the version of all zpools:<br />
<br />
# zpool upgrade -a<br />
<br />
== Creating datasets ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, see {{man|8|zfs|url=}} or {{man|8|zpool|url=}}.<br />
<br />
=== Native encryption ===<br />
<br />
ZFS offers the following supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-gcm}} will be used. See [https://openzfs.github.io/openzfs-docs/man/8/zfs-change-key.8.html#Encryption zfs-change-key(8)] for a description of the native encryption, including limitations.<br />
<br />
The following keyformats are supported: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}.<br />
<br />
One can also specify/increase the default iterations of PBKDF2 when using {{ic|passphrase}} with {{ic|-o pbkdf2iters <n>}}, although it may increase the decryption time.<br />
<br />
{{Tip|<br />
* To import a pool with keys, one needs to specify the {{ic|-l}} flag, without this flag encrypted datasets will be left unavailable until the keys are loaded. See [[#Importing a pool created by id]].<br />
* Native ZFS encryption has been made available in the stable 0.8.0 release or newer. Previously it was only available in development versions provided by packages like {{AUR|zfs-linux-git}}, {{AUR|zfs-dkms-git}} or other development builds. Users who were only using the development versions for the native encryption, may now switch to the stable releases if they wish.<br />
* The default encryption suite was changed from {{ic|aes-256-ccm}} to {{ic|aes-256-gcm}} in the 0.8.4 release.<br />
}}<br />
<br />
To create a dataset including native encryption with a passphrase, use:<br />
<br />
# zfs create -o encryption=on -o keyformat=passphrase <nameofzpool>/<nameofdataset><br />
<br />
To use a key instead of using a passphrase:<br />
<br />
# dd if=/dev/random of=/path/to/key bs=1 count=32<br />
# zfs create -o encryption=on -o keyformat=raw -o keylocation=file:///path/to/key <nameofzpool>/<nameofdataset><br />
<br />
To verify the key location:<br />
# zfs get keylocation <nameofzpool>/<nameofdataset><br />
<br />
To change the key location:<br />
<br />
# zfs set keylocation=file:///path/to/key <nameofzpool>/<nameofdataset><br />
<br />
You can also manually load the keys by using one of the following commands:<br />
<br />
# zfs load-key <nameofzpool>/<nameofdataset> # load key for a specific dataset<br />
# zfs load-key -a # load all keys<br />
# zfs load-key -r zpool/dataset # load all keys in a dataset<br />
<br />
To mount the created encrypted dataset:<br />
<br />
# zfs mount <nameofzpool>/<nameofdataset><br />
<br />
==== Unlock/Mount at boot time: systemd ====<br />
<br />
It is possible to automatically unlock a pool dataset on boot time by using a [[systemd]] unit. For example create the following service to unlock any specific dataset: <br />
{{hc|1=/etc/systemd/system/zfs-load-key@.service|2=<nowiki>[Unit]<br />
Description=Load %I encryption keys<br />
Before=systemd-user-sessions.service zfs-mount.service<br />
After=zfs-import.target<br />
Requires=zfs-import.target<br />
DefaultDependencies=no<br />
<br />
[Service]<br />
Type=oneshot<br />
RemainAfterExit=yes<br />
ExecStart=/usr/bin/bash -c 'until (systemd-ask-password "Encrypted ZFS password for %I" --no-tty | zfs load-key %I); do echo "Try again!"; done'<br />
<br />
[Install]<br />
WantedBy=zfs-mount.service<br />
</nowiki><br />
}}<br />
<br />
[[Enable/start]] the service for each encrypted dataset (e.g. {{ic|zfs-load-key@pool0-dataset0.service}}). Note the use of {{ic|-}}, which is an escaped {{ic|/}} in systemd unit definitions. See {{man|1|systemd-escape}} for more info.<br />
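For example, the unit instance name for a hypothetical dataset {{ic|pool0/dataset0}} can be derived with:<br />
<br />
{{hc|$ systemd-escape pool0/dataset0|pool0-dataset0}}<br />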
{{Note|The {{ic|1=Before=systemd-user-sessions.service}} ensures that systemd-ask-password is invoked before the local IO devices are handed over to the [[desktop environment]].}}<br />
<br />
An alternative is to load all possible keys:<br />
<br />
{{hc|/etc/systemd/system/zfs-load-key.service|2=<br />
[Unit]<br />
Description=Load encryption keys<br />
DefaultDependencies=no<br />
After=zfs-import.target<br />
Before=zfs-mount.service<br />
<br />
[Service]<br />
Type=oneshot<br />
RemainAfterExit=yes<br />
ExecStart=/usr/bin/zfs load-key -a<br />
StandardInput=tty-force<br />
<br />
[Install]<br />
WantedBy=zfs-mount.service<br />
}}<br />
<br />
[[Enable/start]] {{ic|zfs-load-key.service}}.<br />
<br />
==== Unlock at login time: PAM ====<br />
<br />
If you are not encrypting the root volume, but only the home volume or a user-specific volume, another idea is to [https://blog.trifork.com/2020/05/22/linux-homedir-encryption/ wait until login to decrypt it]. The advantages of this method are that the system boots uninterrupted, and that when the user logs in, the same password can be used both to authenticate and to decrypt the home volume, so that the password is only entered once.<br />
<br />
First set the mountpoint to legacy to avoid having it mounted by {{ic|zfs mount -a}}:<br />
{{bc|1=zfs set mountpoint=legacy zroot/data/home}}<br />
Ensure that it is in /etc/fstab so that {{ic|mount /home}} will work:<br />
{{hc|/etc/fstab|zroot/data/home /home zfs rw,xattr,posixacl,noauto 0 0}}<br />
<br />
On a single-user system, with only one /home volume having the same encryption password as the user's password, it can be decrypted at login as follows. First, create {{ic|/sbin/mount-zfs-homedir}}:<br />
{{hc|/sbin/mount-zfs-homedir|2=<br />
#!/bin/bash<br />
<br />
# simplified from https://talldanestale.dk/2020/04/06/zfs-and-homedir-encryption/<br />
<br />
set -eu<br />
<br />
# Password is given to us via stdin, save it in a variable for later<br />
PASS=$(cat -)<br />
<br />
VOLNAME="zroot/data/home"<br />
<br />
# Unlock and mount the volume<br />
zfs load-key "$VOLNAME" <<< "$PASS" {{!}}{{!}} continue<br />
zfs mount "$VOLNAME" {{!}}{{!}} true # ignore errors<br />
}}<br />
<br />
Do not forget to make it executable with {{ic|chmod a+x /sbin/mount-zfs-homedir}}; then get PAM to run it by adding the following line to {{ic|/etc/pam.d/system-auth}}:<br />
<br />
{{hc|/etc/pam.d/system-auth|auth optional pam_exec.so expose_authtok /sbin/mount-zfs-homedir}}<br />
<br />
Now it will transparently decrypt and mount the /home volume when you log in anywhere: on the console, via ssh, etc. A caveat is that since your ~/.ssh directory is not mounted, if you log in via ssh, you must use the default password authentication the first time rather than relying on {{ic|~/.ssh/authorized_keys}}.<br />
<br />
If you want to have separate volumes for each user, each encrypted with the user's password, try the [https://blog.trifork.com/2020/05/22/linux-homedir-encryption/ linked method].<br />
<br />
=== Swap volume ===<br />
<br />
{{Warning|<br />
* On systems with extremely high memory pressure, using a zvol for swap can result in lockup, regardless of how much swap is still available. This issue is currently being investigated in [https://github.com/openzfs/zfs/issues/7734 OpenZFS issue #7734]<br />
* Swap on zvol does not support resume from hibernation, attempt to resume will result in pool corruption. Possible workaround: https://github.com/openzfs/zfs/issues/260#issuecomment-758782144<br />
}}<br />
<br />
ZFS does not support swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4 KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
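<br />
For example, to confirm the page size (output shown for a typical x86_64 system):<br />
<br />
{{hc|$ getconf PAGESIZE|4096}}<br />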
<br />
Create an 8 GiB ZFS volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) -o compression=zle \<br />
-o logbias=throughput -o sync=always\<br />
-o primarycache=metadata -o secondarycache=none \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
=== Access Control Lists ===<br />
<br />
To use [[ACL]] on a dataset:<br />
<br />
# zfs set acltype=posixacl <nameofzpool>/<nameofdataset><br />
# zfs set xattr=sa <nameofzpool>/<nameofdataset><br />
<br />
Setting {{ic|xattr}} is recommended for performance reasons [https://github.com/openzfs/zfs/issues/170#issuecomment-27348094].<br />
<br />
It may be preferable to enable ACL on the zpool as datasets will inherit the ACL parameters. Setting {{ic|1=aclinherit=passthrough}} may be wanted as the default mode is {{ic|restricted}} [https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gbaaz/index.html]; however, it is worth noting that {{ic|aclinherit}} does not affect POSIX ACLs [https://askubuntu.com/questions/342553/activate-acl-for-zfs-pool-ubuntu-13-04#comment1220577_392008]:<br />
<br />
# zfs set aclinherit=passthrough <nameofzpool><br />
# zfs set acltype=posixacl <nameofzpool><br />
# zfs set xattr=sa <nameofzpool><br />
<br />
=== Databases ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help reduce fragmentation and speed up file access, at the cost that ZFS has to copy and rewrite a whole 128KiB block each time only a few bytes within it change.<br />
<br />
{{Accuracy|At least MariaDB uses a default of 16Kib pages! Check your specific DBMS before setting this value.}}<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and Oracle database, all three of them use an 8KiB block size ''by default''. Both for performance and to keep snapshot differences to a minimum (which is helpful for backup purposes), it is usually desirable to tune ZFS to accommodate the databases, using a command such as:<br />
<br />
# zfs set recordsize=8K <pool>/postgres<br />
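<br />
The change can be verified with:<br />
<br />
 # zfs get recordsize <pool>/postgres<br />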
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
<br />
{{Note|[[#L2ARC|L2ARC]] requires {{ic|primarycache}} to function, because it is fed with data evicted from {{ic|primarycache}}. If you intend to use the L2ARC, do not set the option below, otherwise no actual data will be cached on L2ARC.}}<br />
<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
ZFS uses the [[#ZIL|ZIL]] for crash recovery, but databases often sync their data files to the file system on their own transaction commits anyway. The end result is that ZFS commits data '''twice''' to the data disks, which can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is committed to the file system only once. However, doing so on non-solid-state storage (e.g. HDDs) can result in decreased read performance due to fragmentation ([https://openzfs.org/wiki/ZFS_on_high_latency_devices OpenZFS Wiki]); with mechanical hard drives, please consider using a dedicated SSD as [[#ZIL|ZIL]] rather than setting the option below. In addition, setting this for non-database file systems, or for pools with configured log devices, can also ''negatively'' impact the performance, so beware:<br />
<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects to be on disk may not have actually been written out following a crash.<br />
<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Note also that if you want /tmp on ZFS, you will need to [[mask]] (disable) [[systemd]]'s automatic tmpfs-backed /tmp ({{ic|tmp.mount}}), else ZFS will be unable to mount your dataset at boot time or import time.<br />
<br />
=== Transmitting snapshots with ZFS Send and ZFS Recv ===<br />
<br />
It is possible to pipe ZFS snapshots to an arbitrary target by pairing {{ic|zfs send}} and {{ic|zfs recv}}. This is done through standard output, which allows the data to be sent to any file, device, across the network, or manipulated mid-stream by incorporating additional programs in the pipe. <br />
<br />
Below are examples of common scenarios:<br />
<br />
==== Basic ZFS Send ====<br />
<br />
First, let's create a snapshot of some ZFS filesystem:<br />
# zfs snapshot zpool0/archive/books@snap<br />
<br />
Now let's send the snapshot to a new location on a different zpool:<br />
# zfs send -v zpool0/archive/books@snap | zfs recv zpool4/library<br />
<br />
The contents of {{ic|zpool0/archive/books@snap}} are now live at {{ic|zpool4/library}}.<br />
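<br />
For example, the received dataset can be listed to confirm the transfer:<br />
<br />
 # zfs list zpool4/library<br />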
<br />
{{Tip| See [https://openzfs.github.io/openzfs-docs/man/8/zfs-send.8.html man zfs-send] and [https://openzfs.github.io/openzfs-docs/man/8/zfs-recv.8.html man zfs-recv] for details on flags.}}<br />
<br />
===== To and from files =====<br />
<br />
First, let's create a snapshot of some ZFS filesystem:<br />
# zfs snapshot zpool0/archive/books@snap<br />
<br />
Write the snapshot to a gzip-compressed file:<br />
 # zfs send zpool0/archive/books@snap | gzip > /tmp/mybooks.gz<br />
<br />
{{Warning| Make sure to run {{ic|zfs send}} with {{ic|-w}} flag if you wish to preserve encryption during the send.}}<br />
<br />
Now restore the snapshot from the file:<br />
<br />
 # zcat /tmp/mybooks.gz | zfs recv -F zpool0/archive/books<br />
<br />
==== Send over ssh ====<br />
<br />
First, let's create a snapshot of some ZFS filesystem:<br />
<br />
# zfs snapshot zpool1/filestore@snap<br />
<br />
Next we pipe our "send" traffic over an ssh session running "recv":<br />
<br />
# zfs send -v zpool1/filestore@snap | ssh $HOST zfs recv coldstore/backups<br />
<br />
The {{ic|-v}} flag prints information about the datastream being generated. If you are using a passphrase or passkey, you will be prompted to enter it.<br />
<br />
==== Incremental Backups ====<br />
<br />
You may wish to update a previously sent ZFS filesystem without retransmitting all of the data over again.<br />
Alternatively, it may have been necessary to keep a filesystem online during a lengthy transfer, and it is now time to send the writes that were made since the initial snapshot.<br />
<br />
First, let's create a snapshot of some ZFS filesystem:<br />
# zfs snapshot zpool1/filestore@initial<br />
Next we pipe our "send" traffic over an ssh session running "recv":<br />
# zfs send -v -R zpool1/filestore@initial | ssh $HOST zfs recv coldstore/backups<br />
Once changes are written, make another snapshot:<br />
# zfs snapshot zpool1/filestore@snap2<br />
The following will send the differences that exist locally between zpool1/filestore@initial and zpool1/filestore@snap2 and create an additional snapshot for the remote filesystem coldstore/backups:<br />
 # zfs send -v -R -i zpool1/filestore@initial zpool1/filestore@snap2 | ssh $HOST zfs recv coldstore/backups<br />
<br />
Now both zpool1/filestore and coldstore/backups have the @initial and @snap2 snapshots.<br />
<br />
On the remote host, you may now roll back to the latest snapshot to make it the active filesystem:<br />
 # zfs rollback coldstore/backups@snap2<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
ZFS pools and datasets can be further adjusted using parameters.<br />
<br />
{{Note|All settable properties, with the exception of quotas and reservations, inherit their value from the parent dataset.}}<br />
<br />
To retrieve the current pool parameter status:<br />
<br />
# zfs get all <pool><br />
<br />
To retrieve the current dataset parameter status:<br />
<br />
# zfs get all <pool>/<dataset><br />
<br />
To disable access time (atime), which is enabled by default:<br />
<br />
# zfs set atime=off <pool><br />
<br />
To disable access time (atime) on a particular dataset:<br />
<br />
# zfs set atime=off <pool>/<dataset><br />
<br />
As an alternative to turning off atime completely, {{ic|relatime}} is available. This brings the default ext4/XFS atime semantics to ZFS, where the access time is only updated if the modification or change time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between {{ic|1=atime=off}} and {{ic|1=atime=on}}. This property ''only'' takes effect if {{ic|atime}} is {{ic|on}}:<br />
<br />
# zfs set atime=on <pool><br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that: transparent compression of data. ZFS supports a few different algorithms; presently ''lz4'' is the default, and ''gzip'' is also available for seldom-written yet highly-compressible data. Consult the [https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html OpenZFS Wiki] for more details.<br />
<br />
To enable compression:<br />
<br />
# zfs set compression=on <pool><br />
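<br />
To select a specific algorithm instead, for example ''gzip'' for a seldom-written, highly-compressible dataset:<br />
<br />
 # zfs set compression=gzip <pool>/<dataset><br />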
<br />
To reset a property of a pool and/or dataset to its default state, use {{ic|zfs inherit}}:<br />
<br />
# zfs inherit -rS atime <pool><br />
# zfs inherit -rS atime <pool>/<dataset><br />
<br />
{{Warning|Using the {{ic|-r}} flag will recursively reset all datasets of the zpool.}}<br />
<br />
=== Scrubbing ===<br />
<br />
Whenever data is read and ZFS encounters an error, it is silently repaired when possible, rewritten back to disk and logged so you can obtain an overview of errors on your pools. There is no fsck or equivalent tool for ZFS. Instead, ZFS supports a feature known as scrubbing. This traverses through all the data in a pool and verifies that all blocks can be read.<br />
<br />
To scrub a pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To cancel a running scrub:<br />
<br />
# zpool scrub -s <pool><br />
<br />
==== How often should I do this? ====<br />
<br />
From the Oracle blog post [https://blogs.oracle.com/wonders-of-zfs-storage/disk-scrub-why-and-when-v2 Disk Scrub - Why and When?]:<br />
<br />
:This question is challenging for Support to answer, because as always the true answer is "It Depends". So before I offer a general guideline, here are a few tips to help you create an answer more tailored to your use pattern.<br />
<br />
:* What is the expiration of your oldest backup? You should probably scrub your data at least as often as your oldest tapes expire so that you have a known-good restore point.<br />
:* How often are you experiencing disk failures? While the recruitment of a hot-spare disk invokes a "resilver" -- a targeted scrub of just the VDEV which lost a disk -- you should probably scrub at least as often as you experience disk failures on average in your specific environment.<br />
:* How often is the oldest piece of data on your disk read? You should scrub occasionally to prevent very old, very stale data from experiencing bit-rot and dying without you knowing it.<br />
<br />
:If any of your answers to the above are "I do not know", the general guideline is: you should probably be scrubbing your zpool at least once per month. It is a schedule that works well for most use cases, provides enough time for scrubs to complete before starting up again on all but the busiest & most heavily-loaded systems, and even on very large zpools (192+ disks) should complete fairly often between disk failures.<br />
<br />
In the [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ ZFS Administration Guide], Aaron Toponce advises scrubbing consumer disks once a week.<br />
<br />
==== Start with a service or timer ====<br />
<br />
{{Note|Starting with OpenZFS 2.1.3, weekly and monthly [[systemd]] timers/services are included. To use these, [[enable]]/[[start]] {{ic|zfs-scrub-weekly@''pool-to-scrub''.timer}} or {{ic|zfs-scrub-monthly@''pool-to-scrub''.timer}} for the desired pool.}}<br />
<br />
Using a [[systemd]] timer/service it is possible to automatically scrub pools.<br />
<br />
To perform scrubbing monthly on a particular pool:<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.timer|2=<br />
[Unit]<br />
Description=Monthly zpool scrub on %i<br />
<br />
[Timer]<br />
OnCalendar=monthly<br />
AccuracySec=1h<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.service|2=<br />
[Unit]<br />
Description=zpool scrub on %i<br />
<br />
[Service]<br />
Nice=19<br />
IOSchedulingClass=idle<br />
KillSignal=SIGINT<br />
ExecStart=/usr/bin/zpool scrub %i<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
[[Enable]]/[[start]] {{ic|zfs-scrub@''pool-to-scrub''.timer}} unit for monthly scrubbing the specified zpool.<br />
<br />
=== Enabling TRIM ===<br />
<br />
To quickly query your vdevs' [[TRIM]] support, you can include trimming information in {{Ic|zpool status}} with {{Ic|-t}}.<br />
<br />
{{hc|$ zpool status -t tank|2=<br />
pool: tank<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
tank ONLINE 0 0 0<br />
ata-ST31000524AS_5RP4SSNR-part1 ONLINE 0 0 0 (trim unsupported)<br />
ata-CT480BX500SSD1_2134A59B933D-part1 ONLINE 0 0 0 (untrimmed)<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
ZFS is capable of trimming supported vdevs either on-demand or periodically via the {{Ic|autotrim}} property.<br />
<br />
Manually performing a TRIM operation on a zpool:<br />
<br />
# zpool trim <zpool><br />
<br />
Enabling periodic trimming on all supported vdevs in a pool:<br />
<br />
# zpool set autotrim=on <zpool><br />
<br />
{{Note|Because of how the automatic TRIM and a full {{Ic|zpool trim}} differ in their operation, [https://github.com/openzfs/zfs/commit/1b939560be5c51deecf875af9dada9d094633bf7 it can make sense to run a manual trim occasionally.] }}<br />
<br />
To perform a full {{Ic|zpool trim}} monthly on a particular pool using a [[systemd]] timer/service:<br />
<br />
{{hc|/etc/systemd/system/zfs-trim@.timer|2=<br />
[Unit]<br />
Description=Monthly zpool trim on %i<br />
<br />
[Timer]<br />
OnCalendar=monthly<br />
AccuracySec=1h<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
{{hc|/etc/systemd/system/zfs-trim@.service|2=<br />
[Unit]<br />
Description=zpool trim on %i<br />
Documentation=man:zpool-trim(8)<br />
Requires=zfs.target<br />
After=zfs.target<br />
ConditionACPower=true<br />
ConditionPathIsDirectory=/sys/module/zfs<br />
<br />
[Service]<br />
Nice=19<br />
IOSchedulingClass=idle<br />
KillSignal=SIGINT<br />
ExecStart=/bin/sh -c '\<br />
if /usr/bin/zpool status %i {{!}} grep "trimming"; then\<br />
exec /usr/bin/zpool wait -t trim %i;\<br />
else exec /usr/bin/zpool trim -w %i; fi'<br />
ExecStop=-/bin/sh -c '/usr/bin/zpool trim -s %i 2>/dev/null {{!}}{{!}} true'<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
[[Enable]]/[[start]] {{ic|zfs-trim@''pool-to-trim''.timer}} unit for monthly trimming of the specified zpool.<br />
<br />
=== SSD Caching ===<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL); a dedicated log device is also called a SLOG. If your data disks are slow (e.g. HDDs), it is highly recommended to configure the ZIL on solid state drives for better write performance, and also to consider a level 2 adaptive replacement cache (L2ARC). The process to add them is very similar to adding a new VDEV.<br />
<br />
All of the below references to device-id are the IDs from {{ic|/dev/disk/by-id/*}}.<br />
<br />
==== ZIL ====<br />
<br />
To add a mirrored ZIL:<br />
<br />
# zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
Or to add a single device ZIL (unsafe):<br />
<br />
# zpool add <pool> log <device-id><br />
<br />
Because the ZIL device stores data that has not been written to the pool, it is important to use devices that can finish writes when power is lost. It is also important to use redundancy, since a device failure can cause data loss. In addition, the ZIL is only used for sync writes, so may not provide any performance improvement when your data drives are as fast as your ZIL drive(s).<br />
<br />
==== L2ARC ====<br />
To add L2ARC:<br />
# zpool add <pool> cache <device-id><br />
L2ARC is only a read cache, so redundancy is unnecessary. Since [https://github.com/openzfs/zfs/releases/tag/zfs-2.0.0 ZFS version 2.0.0], L2ARC is persisted across reboots.[https://github.com/openzfs/zfs/pull/9582]<br />
<br />
L2ARC is generally only useful in workloads where the amount of hot data is '''bigger''' than system memory, but small enough to fit into L2ARC. The L2ARC is indexed by the ARC in system memory, consuming 70 bytes per record (default 128KiB). Thus, the equation for RAM usage is:<br />
(L2ARC size) / (recordsize) * 70 bytes<br />
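For example, a 512 GiB L2ARC filled with records of the default 128 KiB size would consume (512 GiB / 128 KiB) × 70 bytes = 4194304 × 70 bytes ≈ 280 MiB of ARC; smaller record sizes inflate this figure proportionally.<br />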
Because of this, L2ARC can, in certain workloads, harm performance as it takes memory away from ARC.<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size ({{ic|volblocksize}}) for ZVOLs is 8 KiB already. If possible, it is best to align any partitions contained in a ZVOL to your block size (current versions of fdisk and gdisk by default automatically align at 1 MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the {{ic|volblocksize}} to accommodate the data inside the ZVOL as necessary (though 8 KiB tends to be a good value for most file systems, even when using 4 KiB blocks on that level).<br />
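<br />
For example, a ZVOL intended to hold a file system using 8 KiB blocks might be created as follows (a sketch; the name and size are illustrative):<br />
<br />
 # zfs create -V 32G -o volblocksize=8K <pool>/<zvolname><br />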
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the {{ic|volblocksize}} to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/openzfs/zfs/issues/1807 OpenZFS issue #1807] for details.<br />
<br />
=== I/O Scheduler ===<br />
<br />
While ZFS is expected to work well with modern schedulers, including {{ic|mq-deadline}} and {{ic|none}}, experimenting with [[Improving performance#Changing I/O scheduler|manually setting]] the I/O scheduler on ZFS disks may yield performance gains.<br />
<br />
== Troubleshooting ==<br />
<br />
=== Creating a zpool fails ===<br />
<br />
Creating a zpool may fail with the following error:<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because ZFS expects pool creation to take less than 1 second[https://github.com/openzfs/zfs/issues/2582][https://github.com/openzfs/zfs/issues/1646]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
 # parted /dev/sda rm 1<br />
 # dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
 # zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause for creation slowdown can be slow burst reads/writes on a drive. By reading from the disk in parallel to the zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done for multiple drives by saving the above command for each drive on separate lines in a file and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MiB)<br />
<br />
If the default value of {{ic|zfs_arc_min}} (1/32 of system memory) is higher than the specified {{ic|zfs_arc_max}}, it is also necessary to add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_min=268435456 # (for 256MiB, needs to be lower than zfs.zfs_arc_max)<br />
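<br />
The resulting limits and the current ARC consumption can be inspected at runtime via the module's ARC statistics, for example:<br />
<br />
 $ grep -E '^(c_min|c_max|size)' /proc/spl/kstat/zfs/arcstats<br />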
<br />
For a more detailed description, as well as other configuration options, see [[Gentoo:ZFS#ARC]].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error can occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use the {{ic|-f}} flag with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
The following error may appear at boot, before the initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the SPL hostid. There are two solutions for this. Either place the SPL hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}} and then [[regenerate the initramfs]] image, which will copy the hostid into the initramfs image.<br />
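<br />
The {{ic|zgenhostid}} utility shipped with OpenZFS can create this file; for example, to write the current hostid to {{ic|/etc/hostid}}:<br />
<br />
 # zgenhostid $(hostid)<br />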
<br />
=== Pool cannot be found while booting from SAS/SCSI devices ===<br />
<br />
If you are booting from SAS/SCSI-based devices, you might occasionally get boot problems where the pool you are trying to boot from cannot be found. A likely reason for this is that your devices are initialized too late in the process, which means that ZFS cannot find any devices at the time it tries to assemble your pool.<br />
<br />
In this case you should force the scsi driver to wait for devices to come online before continuing. You can do this by putting this into {{ic|/etc/modprobe.d/zfs.conf}}:<br />
<br />
{{hc|1=/etc/modprobe.d/zfs.conf|2=<br />
options scsi_mod scan=sync<br />
}}<br />
<br />
Afterwards, [[regenerate the initramfs]].<br />
<br />
This works because the zfs hook will copy the file at {{ic|/etc/modprobe.d/zfs.conf}} into the initcpio which will then be used at build time.<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force-import the zpool:<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [https://web.archive.org/web/20151101094022/http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then [[regenerate the initramfs]] in normally booted system.<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid that marks ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses ZFS. Manually tell ZFS the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The value below is just an example.<br />
<br />
{{hc|$ hostid|<br />
0a0af0f8<br />
}}<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|1=spl.spl_hostid=0x0a0af0f8}}. Another solution is writing the hostid inside the initram image; see the [[Install Arch Linux on ZFS#Configure systemd ZFS mounts|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|1=zfs_force=1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#performance-considerations OpenZFS FAQ: Performance Considerations] and [https://web.archive.org/web/20170913063528/https://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
=== Pool resilvering stuck/restarting/slow? ===<br />
<br />
According to the ZFSonLinux GitHub, this is a known issue since 2012: with some hardware, ZFS-ZED causes the resilvering process to constantly restart, sometimes get stuck, and be generally slow. The simplest mitigation is to stop {{ic|zfs-zed.service}} until the resilver completes.<br />
<br />
=== Fix slow boot caused by failed import of unavailable pools in the initramfs zpool.cache ===<br />
<br />
Your boot time can be significantly impacted if you update your initramfs (e.g. when doing a kernel update) while you have additional but non-permanently attached pools imported: these pools will get added to your initramfs zpool.cache, and ZFS will attempt to import these extra pools on every boot, regardless of whether you have exported them and removed them from your regular zpool.cache.<br />
<br />
If you notice ZFS trying to import unavailable pools at boot, first run:<br />
<br />
 $ zdb -C<br />
<br />
to check your zpool.cache for pools you do not want imported at boot. If this command shows additional, currently unavailable pools, run:<br />
<br />
 # zpool set cachefile=/etc/zfs/zpool.cache zroot<br />
<br />
to clear the zpool.cache of any pools other than the pool named zroot. Sometimes there is no need to refresh your zpool.cache; instead, all you need to do is [[regenerate the initramfs]].<br />
<br />
=== ZFS Command History ===<br />
<br />
ZFS logs changes to a pool's structure natively as a log of executed commands in a ring buffer (which cannot be turned off).<br />
The log may be helpful when restoring a degraded or failed pool.<br />
<br />
{{hc|# zpool history zpool|<nowiki><br />
History for 'zpool':<br />
2023-02-19.16:28:44 zpool create zpool raidz1 /scratch/disk_1.img /scratch/disk_2.img /scratch/disk_3.img<br />
2023-02-19.16:31:29 zfs set compression=lz4 zpool<br />
2023-02-19.16:41:45 zpool scrub zpool<br />
2023-02-19.17:00:57 zpool replace zpool /scratch/disk_1.img /scratch/bigger_disk_1.img<br />
2023-02-19.17:01:34 zpool scrub zpool<br />
2023-02-19.17:01:42 zpool replace zpool /scratch/disk_2.img /scratch/bigger_disk_2.img<br />
2023-02-19.17:01:46 zpool replace zpool /scratch/disk_3.img /scratch/bigger_disk_3.img<br />
</nowiki>}}<br />
<br />
== Tips and tricks ==<br />
<br />
=== Create an Archiso image with ZFS support ===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image. To include ZFS support in the image, you can either build your choice of PKGBUILDs from the AUR or include prebuilt packages from one of the [[unofficial user repositories]].<br />
<br />
==== Using self-built ZFS packages from the AUR ====<br />
<br />
Build the ZFS packages you want by following the [[Arch User Repository#Build the package|normal procedures]]. If you are unsure, {{aur|zfs-dkms}} and {{aur|zfs-utils}} are likely to be compatible with the widest range of other modifications to the Archiso image you may wish to perform. Proceed to set up [[Pacman/Tips and tricks#Custom local repository|a custom local repository]]. [[Archiso#Custom local repository|Include the resulting repository]] in the Pacman configuration of your new profile.<br />
<br />
Include the built packages in the list of packages to be installed. The example below presumes you want to include only the {{aur|zfs-dkms}} and {{aur|zfs-utils}} packages.<br />
<br />
{{hc|packages.x86_64|<br />
...<br />
zfs-dkms<br />
zfs-utils<br />
}}<br />
<br />
If you include any [[DKMS]] packages, make sure you also include headers for any kernels you are including in the ISO ({{pkg|linux-headers}} for the default kernel).<br />
<br />
==== Using the archzfs unofficial user repository ====<br />
<br />
Add the [[Unofficial user repositories#archzfs|archzfs]] unofficial user repository to {{ic|pacman.conf}} in your new Archiso profile.<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
{{Note|If you later have problems running {{ic|modprobe zfs}}, include {{pkg|linux-headers}} in {{ic|packages.x86_64}}.}}<br />
<br />
==== Finishing up ====<br />
<br />
Regardless of where you source your ZFS packages from, you should finish by [[Archiso#Build the ISO|building the ISO]].<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== zrepl ====<br />
<br />
The {{AUR|zrepl}} package provides a ZFS automatic replication service, which could also be used as a snapshotting service much like [[snapper]].<br />
<br />
For details on how to configure the zrepl daemon, see the zrepl [https://zrepl.github.io/ documentation]. The configuration file should be located at {{ic|/etc/zrepl/zrepl.yml}}. Then, run {{ic|zrepl configcheck}} to make sure that the syntax of the config file is correct. Finally, enable {{ic|zrepl.service}}.<br />
<br />
==== sanoid ====<br />
<br />
{{AUR|sanoid}} is a policy-driven tool for taking snapshots. Sanoid also includes {{ic|syncoid}}, which is for replicating snapshots. It comes with systemd services and a timer.<br />
<br />
Sanoid only prunes snapshots on the local system. To prune snapshots on the remote system, run sanoid there as well with prune options. Either use the {{ic|--prune-snapshots}} command line option or use the {{ic|--cron}} command line option together with the {{ic|1=autoprune = yes}} and {{ic|1=autosnap = no}} configuration options.<br />
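<br />
A minimal policy might look like the following sketch (the dataset name and retention values are purely illustrative; consult the sanoid documentation for the full syntax):<br />
<br />
{{hc|/etc/sanoid/sanoid.conf|2=<br />
[zroot/data]<br />
use_template = production<br />
<br />
[template_production]<br />
hourly = 24<br />
daily = 30<br />
monthly = 3<br />
autosnap = yes<br />
autoprune = yes<br />
}}<br />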
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
{{Note| {{AUR|zfs-auto-snapshot-git}} has not seen any updates since 2019, and the functionality is extremely limited. You are advised to switch to a newer tool like {{AUR|zrepl}}. }}<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc.), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is available per label: if, for example, no monthlies are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}}, as shown below.<br />
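<br />
For example, to set these properties on a dataset (placeholder names follow the conventions used elsewhere in this article):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />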
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots during [[#Scrubbing|scrubbing]]. It is possible to override this by [[Systemd#Editing provided units|editing the provided systemd unit]] and removing {{ic|--skip-scrub}} from the {{ic|ExecStart}} line, but the consequences of doing so are untested.}}<br />
<br />
Once the package has been installed, [[systemd/Timers|enable and start the selected timers]] ({{ic|<nowiki>zfs-auto-snapshot-{frequent,daily,weekly,monthly}.timer</nowiki>}}).<br />
<br />
=== Creating a share ===<br />
<br />
ZFS has support for creating shares by [[NFS]] or [[SMB]].<br />
<br />
==== NFS ====<br />
<br />
Make sure [[NFS]] has been installed/configured; note that there is no need to edit the {{ic|/etc/exports}} file. For sharing over NFS, the services {{ic|nfs-server.service}} and {{ic|zfs-share.service}} should be [[start|started]].<br />
<br />
To make a pool available on the network:<br />
<br />
# zfs set sharenfs=on ''nameofzpool''<br />
<br />
To make a dataset available on the network:<br />
<br />
# zfs set sharenfs=on ''nameofzpool''/''nameofdataset''<br />
<br />
To enable read/write access for specific IP ranges:<br />
<br />
# zfs set sharenfs="rw=@192.168.1.100/24,rw=@10.0.0.0/24" ''nameofzpool''/''nameofdataset''<br />
<br />
To check if the dataset is exported successfully:<br />
<br />
{{hc|# showmount -e `hostname`|<br />
Export list for hostname:<br />
/path/of/dataset 192.168.1.100/24<br />
}}<br />
<br />
To view the current loaded exports state in more detail, use:<br />
<br />
{{hc|# exportfs -v|2=<br />
''/path/of/dataset''<br />
192.168.1.100/24(sync,wdelay,hide,no_subtree_check,mountpoint,sec=sys,rw,secure,no_root_squash,no_all_squash)<br />
}}<br />
<br />
To view the current NFS share list by ZFS:<br />
<br />
# zfs get sharenfs<br />
<br />
==== SMB ====<br />
<br />
When sharing through SMB, using {{ic|usershares}} in {{ic|/etc/samba/smb.conf}} will allow ZFS to set up and create the shares. See [[Samba#Enable Usershares]] for details.<br />
<br />
{{hc|/etc/samba/smb.conf|2=<br />
[global]<br />
usershare path = /var/lib/samba/usershares<br />
usershare max shares = 100<br />
usershare allow guests = yes<br />
usershare owner only = no<br />
}}<br />
<br />
Create and set permissions on the usershare directory as root:<br />
<br />
# mkdir /var/lib/samba/usershares<br />
# chmod +t /var/lib/samba/usershares<br />
<br />
To make a pool available on the network:<br />
<br />
# zfs set sharesmb=on ''nameofzpool''<br />
<br />
To make a dataset available on the network:<br />
<br />
# zfs set sharesmb=on ''nameofzpool''/''nameofdataset''<br />
<br />
To check if the dataset is exported successfully:<br />
<br />
{{hc|# smbclient -L localhost -U%|<br />
Sharename Type Comment<br />
--------- ---- -------<br />
IPC$ IPC IPC Service (SMB Server Name)<br />
''nameofzpool''_''nameofdataset'' Disk Comment: path/of/dataset<br />
SMB1 disabled -- no workgroup available<br />
}}<br />
<br />
To view the current SMB share list by ZFS:<br />
<br />
# zfs get sharesmb<br />
<br />
=== Encryption in ZFS using dm-crypt ===<br />
<br />
Before [https://github.com/openzfs/zfs/releases/tag/zfs-0.8.0 OpenZFS version 0.8.0], ZFS did not support encryption directly (see [[#Native encryption]]). Instead, zpools can be created on [[dm-crypt]] block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness. Furthermore, utilizing dm-crypt will encrypt the zpool's metadata, which native encryption inherently cannot provide.[https://openzfs.github.io/openzfs-docs/man/8/zfs-change-key.8.html#Encryption]<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical input produces different output (thanks to salting), making deduplication impossible. To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Create an Archiso image with ZFS support]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial user repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partition and EFI system partition (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux but using the matching kernel modules directory name under the chroot's /lib/modules)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
[[Regenerate the initramfs]]. There should be no errors.<br />
<br />
=== Bind mount ===<br />
<br />
Here, a bind mount from {{ic|/mnt/zfspool}} to {{ic|/srv/nfs4/music}} is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
<br />
See {{man|5|systemd.mount}} for more information on how systemd converts fstab into mount unit files with {{man|8|systemd-fstab-generator}}.<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
<br />
=== Monitoring / Mailing on Events ===<br />
<br />
See [https://ramsdenj.com/2016/08/29/arch-linux-on-zfs-part-3-followup.html ZED: The ZFS Event Daemon] for more information.<br />
<br />
An email forwarder, such as [[S-nail]], is required to accomplish this. Test it to be sure it is working correctly.<br />
<br />
Uncomment the following in the configuration file:<br />
<br />
{{hc|/etc/zfs/zed.d/zed.rc|<nowiki><br />
ZED_EMAIL_ADDR="root"<br />
ZED_EMAIL_PROG="mailx"<br />
ZED_NOTIFY_VERBOSE=0<br />
ZED_EMAIL_OPTS="-s '@SUBJECT@' @ADDRESS@"<br />
</nowiki>}}<br />
<br />
Update 'root' in {{ic|1=ZED_EMAIL_ADDR="root"}} to the email address you want to receive notifications at.<br />
<br />
If you are keeping your mailrc in your home directory, you can tell mail to get it from there by setting {{ic|MAILRC}}:<br />
<br />
{{hc|/etc/zfs/zed.d/zed.rc|2=<br />
export MAILRC=/home/<user>/.mailrc<br />
}}<br />
<br />
This works because ZED sources this file, so {{ic|mailx}} sees this environment variable.<br />
<br />
If you want to receive an email no matter the state of your pool, set {{ic|1=ZED_NOTIFY_VERBOSE=1}}. You will need to do this temporarily for testing.<br />
<br />
[[Start]] and [[enable]] {{ic|zfs-zed.service}}.<br />
<br />
With {{ic|1=ZED_NOTIFY_VERBOSE=1}}, you can test by running a scrub as root: {{ic|1=zpool scrub <pool-name>}}.<br />
<br />
=== Wrap shell commands in pre & post snapshots ===<br />
<br />
Since it is so cheap to make a snapshot, we can use this as a measure of security for sensitive commands such as system and package upgrades. If we make a snapshot before, and one after, we can later diff these snapshots to find out what changed on the filesystem after the command executed. Furthermore we can also rollback in case the outcome was not desired.<br />
<br />
==== znp ====<br />
<br />
E.g.:<br />
<br />
# zfs snapshot -r zroot@pre<br />
# pacman -Syu<br />
# zfs snapshot -r zroot@post<br />
# zfs diff zroot@pre zroot@post <br />
# zfs rollback zroot@pre<br />
<br />
A utility that automates the creation of pre and post snapshots around a shell command is [https://gist.github.com/erikw/eeec35be33e847c211acd886ffb145d5 znp].<br />
<br />
E.g.:<br />
<br />
# znp pacman -Syu<br />
# znp find / -name "something*" -delete<br />
<br />
and you would get snapshots created before and after the supplied command, with the command's output logged to a file for future reference, so we know what command created the diff seen in a pair of pre/post snapshots.<br />
<br />
=== Remote unlocking of ZFS encrypted root ===<br />
<br />
As of [https://github.com/archzfs/archzfs/pull/261 PR #261], {{ic|archzfs}} supports SSH unlocking of natively-encrypted ZFS datasets. This section describes how to use this feature, and is largely based on [[dm-crypt/Specialties#Remote unlocking (hooks: netconf, dropbear, tinyssh, ppp)]].<br />
<br />
# Install {{Pkg|mkinitcpio-netconf}} to provide hooks for setting up early user space networking.<br />
# Choose an SSH server to use in early user space. The options are {{Pkg|mkinitcpio-tinyssh}} or {{Pkg|mkinitcpio-dropbear}}, and are mutually exclusive.<br />
## If using {{Pkg|mkinitcpio-tinyssh}}, it is also recommended to install {{Pkg|tinyssh}} or {{AUR|tinyssh-convert-git}}. This tool converts an existing OpenSSH hostkey to the TinySSH key format, preserving the key fingerprint and avoiding connection warnings. The TinySSH and Dropbear mkinitcpio install scripts will automatically convert existing hostkeys when generating a new initcpio image.<br />
# Decide whether to use an existing OpenSSH key or generate a new one (recommended) for the host that will be connecting to and unlocking the encrypted ZFS machine. Copy the public key into {{ic|/etc/tinyssh/root_key}} or {{ic|/etc/dropbear/root_key}}. When generating the initcpio image, this file will be added to {{ic|authorized_keys}} for the root user and is only valid in the initrd environment.<br />
# Add the {{ic|1=ip=}} [[kernel parameter]] to your boot loader configuration. The {{ic|ip}} string is [https://docs.kernel.org/admin-guide/nfs/nfsroot.html highly configurable]. A simple DHCP example is shown below.{{bc|1=ip=:::::eth0:dhcp}}<br />
# Edit {{ic|/etc/mkinitcpio.conf}} to include the {{ic|netconf}}, {{ic|dropbear}} or {{ic|tinyssh}}, and {{ic|zfsencryptssh}} hooks before the {{ic|zfs}} hook:{{bc|1=HOOKS=(... netconf <tinyssh>{{!}}<dropbear> zfsencryptssh zfs ...)}}<br />
# [[Regenerate the initramfs]].<br />
# Reboot and try it out!<br />
<br />
==== Changing the SSH server port ====<br />
<br />
By default, {{Pkg|mkinitcpio-tinyssh}} and {{Pkg|mkinitcpio-dropbear}} listen on port {{ic|22}}. You may wish to change this.<br />
<br />
For '''TinySSH''', copy {{ic|/usr/lib/initcpio/hooks/tinyssh}} to {{ic|/etc/initcpio/hooks/tinyssh}}, and find/modify the following line in the {{ic|run_hook()}} function:<br />
<br />
{{hc|/etc/initcpio/hooks/tinyssh|<br />
/usr/bin/tcpserver -HRDl0 0.0.0.0 <new_port> /usr/sbin/tinysshd -v /etc/tinyssh/sshkeydir &<br />
}}<br />
<br />
For '''Dropbear''', copy {{ic|/usr/lib/initcpio/hooks/dropbear}} to {{ic|/etc/initcpio/hooks/dropbear}}, and find/modify the following line in the {{ic|run_hook()}} function:<br />
<br />
{{hc|/etc/initcpio/hooks/dropbear|<br />
/usr/sbin/dropbear -E -s -j -k -p <new_port><br />
}}<br />
<br />
[[Regenerate the initramfs]].<br />
<br />
==== Unlocking from a Windows machine using PuTTY/Plink ====<br />
<br />
First, we need to use {{ic|puttygen.exe}} to import and convert the OpenSSH key generated earlier into PuTTY's ''.ppk'' private key format. Let us call it {{ic|zfs_unlock.ppk}} for this example.<br />
<br />
The mkinitcpio-netconf process above does not set up a shell (nor do we need one). However, because there is no shell, PuTTY will immediately close after a successful connection. This can be disabled in the PuTTY SSH configuration (''Connection > SSH > [X] Do not start a shell or command at all''), but it still does not allow us to see stdout or enter the encryption passphrase. Instead, we use {{ic|plink.exe}} with the following parameters:<br />
<br />
plink.exe -ssh -l root -i c:\path\to\zfs_unlock.ppk <hostname><br />
<br />
The plink command can be put into a batch script for ease of use.<br />
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [https://zfsonlinux.org/ ZFS on Linux]<br />
* [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html OpenZFS FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook - The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [https://web.archive.org/web/20161028084224/http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide ZFS Best Practices Guide]<br />
* [https://docs.oracle.com/cd/E23823_01/html/819-5461/gavwg.html ZFS Troubleshooting Guide]<br />
* [https://web.archive.org/web/20171213164254/http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]<br />
* [https://github.com/danboid/creating-ZFS-disks-under-Linux How to create cross platform ZFS disks under Linux]<br />
* [https://blog.heckel.xyz/2017/01/08/zfs-encryption-openzfs-zfs-on-linux/ How-To: Using ZFS Encryption at Rest in OpenZFS (ZFS on Linux, ZFS on FreeBSD, …)]<br />
* [https://archzfs.leibelt.de/ Archzfs iso download page: Frequently updated and downloadable archzfs linux iso with full OpenZFS support since 2016]</div>
<hr />
<div>[[Category:File systems]]<br />
[[Category:Oracle]]<br />
[[ja:ZFS]]<br />
[[ru:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|ZFS/Virtual disks}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|<br />
Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.<br />
}}<br />
<br />
== Installation ==<br />
<br />
=== General ===<br />
<br />
{{Warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
* {{AUR|zfs-linux}} for [http://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for hardened kernels.<br />
* {{AUR|zfs-linux-zen}} for stable releases for zen kernels.<br />
* {{AUR|zfs-linux-zen-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for zen kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
* {{AUR|zfs-dkms-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for versions with dynamic kernel module support.<br />
<br />
These packages declare dependencies on the {{ic|zfs-utils}}, {{ic|spl}} and {{ic|spl-utils}} packages. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-linux}} or {{AUR|zfs-dkms}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can use [[Dynamic Kernel Module Support]] (DKMS) to rebuild the ZFS modules automatically with every kernel upgrade.<br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
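<br />
For example, assuming the DKMS packages named above are in use (a sketch; adjust the package list to what is actually installed):<br />
<br />
{{hc|1=/etc/pacman.conf|2=<br />
[options]<br />
# hold back ZFS packages during regular updates<br />
IgnorePkg = zfs-dkms zfs-utils<br />
}}<br />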
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, {{Ic|zfs-import-cache.service}} must be enabled to import the pools and {{Ic|zfs-mount.service}} must be enabled to mount the filesystems available in the pools. A benefit to this is that it is not necessary to mount ZFS filesystems in {{ic|/etc/fstab}}. {{Ic|zfs-import-cache.service}} imports the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each [[#Importing a pool created by id|imported pool]] you want automatically imported by {{Ic|zfs-import-cache.service}} execute:<br />
<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
{{Note|Beginning with ZOL version 0.6.5.8 the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run. See [https://github.com/archzfs/archzfs/issues/72 https://github.com/archzfs/archzfs/issues/72] for more information.}}<br />
<br />
Enable the relevant service and target so the pools are automatically imported at boot time:<br />
<br />
# systemctl enable zfs-import-cache<br />
# systemctl enable zfs-import.target<br />
<br />
To mount the ZFS filesystems, you have 2 choices:<br />
* Enable the [[#Using zfs-mount.service|zfs-mount.service]]<br />
* Using [[#Using zfs-mount-generator|zfs-mount-generator]]<br />
<br />
==== Using zfs-mount.service ====<br />
<br />
In order to mount ZFS filesystems automatically on boot you need to enable the following services and targets:<br />
<br />
# systemctl enable zfs-mount<br />
# systemctl enable zfs.target<br />
<br />
==== Using zfs-mount-generator ====<br />
<br />
You can also use the zfs-mount-generator to create systemd mount units for your ZFS filesystems at boot. systemd will automatically mount the filesystems based on the mount units without having to use the {{Ic|zfs-mount.service}}. To do that, you need to:<br />
<br />
# Create the {{Ic|/etc/zfs/zfs-list.cache}} directory.<br />
# Enable the ZFS Event Daemon (ZED) script (called a ZEDLET) required to create a list of mountable ZFS filesystems. {{bc|# ln -s /usr/lib/zfs-0.8.0/zfs/zed.d/history_event-zfs-list-cacher.sh /etc/zfs/zed.d}}<br />
# Enable and start the ZFS Event Daemon. This service is responsible for running the script in the previous step. {{bc|# systemctl enable zfs-zed.service<br># systemctl enable zfs.target<br># systemctl start zfs-zed.service}}<br />
# You need to create an empty file named after your pool in {{Ic|/etc/zfs/zfs-list.cache}}. The ZEDLET will only update the list of filesystems if the file for the pool already exists. {{bc|# touch /etc/zfs/zfs-list.cache/<pool-name>}}<br />
# Check the contents of {{Ic|/etc/zfs/zfs-list.cache/<pool-name>}}. If it is empty, make sure that the {{Ic|zfs-zed.service}} is running and just change the canmount property of any of your ZFS filesystems by running: {{bc|1=# zfs set canmount=off zroot/fs1}} This change causes ZFS to raise an event which is captured by ZED, which in turn runs the ZEDLET to update the file in {{Ic|/etc/zfs/zfs-list.cache}}. If the file in {{Ic|/etc/zfs/zfs-list.cache}} is updated, you can set the {{Ic|canmount}} property of the filesystem back by running: {{bc|1=# zfs set canmount=on zroot/fs1}}<br />
You need to add a file in {{Ic|/etc/zfs/zfs-list.cache}} for each ZFS pool in your system. Make sure the pools are imported by enabling {{Ic|zfs-import-cache.service}} and {{Ic|zfs-import.target}} as [[#Automatic Start|explained above]].<br />
<br />
== Creating a storage pool ==<br />
<br />
Make sure the wanted disks contain an empty [[GPT]]/[[MBR]] [[partition table]]. It is neither necessary nor recommended to partition the drives before creating the ZFS filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to [[mdadm#Prepare the devices|erase any old RAID configuration information]].}}<br />
<br />
{{Warning|For [[Advanced Format]] Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
=== Identify disks ===<br />
<br />
[https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool ZFS on Linux] recommends using device IDs when creating ZFS storage pools of less than 10 devices. Use [[Persistent block device naming#by-id and by-path]] to identify the list of drives to be used for the ZFS pool.<br />
<br />
The disk IDs should look similar to the following:<br />
<br />
{{hc|# ls -lh /dev/disk/by-id/|<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb}}<br />
<br />
{{Warning|If you create zpools using device names (e.g. {{ic|/dev/sda}},{{ic|/dev/sdb}},...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
==== Using GPT labels ====<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels, but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages: the OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and, if desired, you can easily over-provision SSDs (and slightly over-provision spindle drives) to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUIDs are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters [[wikipedia:GUID_Partition_Table#Partition_entries]], allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
{{hc|# ls -l /dev/disk/by-partlabel|<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
}}<br />
<br />
{{hc|# ls -l /dev/disk/by-partuuid|<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
}}<br />
<br />
{{Tip|To minimize typing and copy/paste errors, set a local variable with the target PARTUUID: {{ic|<nowiki>$ UUID=$( lsblk --noheadings --output PARTUUID /dev/sd</nowiki>''XY''<nowiki> )</nowiki>}} }}<br />
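<br />
As a sketch using the example labels above, a mirrored pool could then be created from the labeled partitions:<br />
<br />
 # zpool create -f -m /mnt/data bigdata \<br />
       mirror \<br />
       /dev/disk/by-partlabel/zfsdata1 \<br />
       /dev/disk/by-partlabel/zfsdata2<br />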
<br />
=== Creating ZFS pools ===<br />
<br />
To create a ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> [raidz(2|3)|mirror] <ids><br />
<br />
{{Tip|One may want to read [[#Advanced Format disks]] first, as it may be recommended to set {{ic|ashift}} on pool creation.}}<br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz(2|3)|mirror''': This is the type of virtual device that will be created from the pool of devices. Raidz is a single disk of parity, raidz2 for 2 disks of parity and raidz3 for 3 disks of parity, similar to raid5 and raid6. Also available is '''mirror''', which is similar to raid1 or raid10, but is not constrained to just two devices. If not specified, each device will be added as a vdev, which is similar to raid0. After creation, a device can be added to each single-drive vdev to turn it into a mirror (see the {{ic|zpool attach}} example below), which can be useful for migrating data.<br />
<br />
* '''ids''': The [[Persistent_block_device_naming#by-id_and_by-path|IDs]] of the drives or partitions to include in the pool.<br />
<br />
Create pool with single raidz vdev:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
Create pool with two mirror vdevs:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
mirror \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
mirror \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
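<br />
As mentioned above, a single-drive vdev can later be converted into a mirror by attaching a second device (a sketch; the new device resilvers from the existing one):<br />
<br />
 # zpool attach <pool> <device-id-existing> <device-id-new><br />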
<br />
==== Advanced Format disks ====<br />
<br />
At pool creation, '''ashift=12''' should always be used, except with SSDs that have 8k sectors where '''ashift=13''' is correct. A vdev of 512 byte disks using 4k sectors will not experience performance issues, but a 4k disk using 512 byte sectors will. Since '''ashift''' cannot be changed after pool creation, even a pool with only 512 byte disks should use 4k because those disks may need to be replaced with 4k disks or the pool may be expanded by adding a vdev composed of 4k disks. Because correct detection of 4k disks is not reliable, {{ic|<nowiki>-o ashift=12</nowiki>}} should always be specified during pool creation. See the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ] for more details.<br />
<br />
{{Tip|Use {{man|8|blockdev}} (part of {{Pkg|util-linux}}) to print the sector size reported by the device's ioctls: {{ic|# blockdev --getpbsz /dev/sd''XY''}}.}}<br />
<br />
Create pool with ashift=12 and single raidz vdev:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
==== GRUB-compatible pool creation ====<br />
<br />
{{note|This section frequently goes out of date with updates to GRUB and ZFS. Consult the manual pages for the most up-to-date information.}}<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]], you must enable only the features supported by GRUB, otherwise GRUB will not be able to read the pool. GRUB 2.02 supports the read-write features {{ic|lz4_compress}}, {{ic|hole_birth}}, {{ic|embedded_data}}, {{ic|extensible_dataset}}, and {{ic|large_blocks}}; this does not cover all the features of ZFSonLinux 0.8.0, so unsupported features must be left disabled. Features to enable can be named explicitly with the {{ic|-d}} argument to {{ic|zpool create}}, which disables all features by default.<br />
<br />
You can create a pool with only the compatible features enabled:<br />
<br />
# zpool create -d -o feature@allocation_classes=enabled \<br />
-o feature@async_destroy=enabled \<br />
-o feature@bookmarks=enabled \<br />
-o feature@embedded_data=enabled \<br />
-o feature@empty_bpobj=enabled \<br />
-o feature@enabled_txg=enabled \<br />
-o feature@extensible_dataset=enabled \<br />
-o feature@filesystem_limits=enabled \<br />
-o feature@hole_birth=enabled \<br />
-o feature@large_blocks=enabled \<br />
-o feature@lz4_compress=enabled \<br />
-o feature@project_quota=enabled \<br />
-o feature@resilver_defer=enabled \<br />
-o feature@spacemap_histogram=enabled \<br />
-o feature@spacemap_v2=enabled \<br />
-o feature@userobj_accounting=enabled \<br />
-o feature@zpool_checkpoint=enabled \<br />
$POOL_NAME $VDEVS<br />
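<br />
To verify afterwards that only the intended features are in use, list the feature properties; anything not enabled above should read {{ic|disabled}}:<br />
<br />
 # zpool get all $POOL_NAME | grep feature@<br />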
<br />
=== Verifying pool status ===<br />
<br />
If the command is successful, there will be no output. Using the [[mount]] command will show that the pool is mounted. Using {{ic|zpool status}} will show that the pool has been created:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to mount automatically and you will need to import it to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
{{Warning|Do not run {{ic|zpool import ''pool''}}! This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PCs would not boot when a floppy disk was left in a machine.}}<br />
<br />
Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with:<br />
<br />
# zpool import -d /dev/disk/by-id bigdata<br />
# zpool import -d /dev/disk/by-partlabel bigdata<br />
# zpool import -d /dev/disk/by-partuuid bigdata<br />
<br />
{{Note|Use the {{ic|-l}} flag when importing a pool that contains [[#Native encryption|encrypted datasets keys]]:<br />
# zpool import -l -d /dev/disk/by-id bigdata<br />
}}<br />
<br />
Finally check the state of the pool:<br />
<br />
# zpool status -v bigdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
ZFS pools can be further adjusted using parameters, most commonly {{ic|atime}} and {{ic|compression}}.<br />
<br />
To retrieve the current pool parameter status:<br />
<br />
# zfs get all <pool><br />
<br />
To disable access time (atime), which is enabled by default:<br />
<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, {{ic|relatime}} is available. This brings the default ext4/XFS atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between {{ic|1=atime=off}} and {{ic|1=atime=on}}. This property ''only'' takes effect if {{ic|atime}} is {{ic|on}}:<br />
<br />
# zfs set atime=on <pool><br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. ZFS supports a few different algorithms, presently lz4 is the default, ''gzip'' is also available for seldom-written yet highly-compressible data; consult the [http://open-zfs.org/wiki/Performance_tuning#Compression OpenZFS Wiki] for more details. <br />
<br />
To enable compression:<br />
<br />
# zfs set compression=on <pool><br />
<br />
{{Note| To reset a property of a pool and/or dataset to its default state, use {{ic|zfs inherit}}:<br />
# zfs inherit -rS atime <pool><br />
# zfs inherit -rS atime <pool>/<dataset><br />
}}<br />
<br />
=== Scrubbing ===<br />
<br />
Whenever data is read and ZFS encounters an error, it is silently repaired when possible, rewritten back to disk and logged so you can obtain an overview of errors on your pools. There is no fsck or equivalent tool for ZFS. Instead, ZFS supports a feature known as scrubbing. This traverses through all the data in a pool and verifies that all blocks can be read.<br />
<br />
To scrub a pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To cancel a running scrub:<br />
<br />
# zpool scrub -s <pool><br />
<br />
==== How often should I do this? ====<br />
<br />
From the Oracle blog post [https://blogs.oracle.com/wonders-of-zfs-storage/disk-scrub-why-and-when-v2 Disk Scrub - Why and When?]:<br />
<br />
:This question is challenging for Support to answer, because as always the true answer is "It Depends". So before I offer a general guideline, here are a few tips to help you create an answer more tailored to your use pattern.<br />
<br />
:* What is the expiration of your oldest backup? You should probably scrub your data at least as often as your oldest tapes expire so that you have a known-good restore point.<br />
:* How often are you experiencing disk failures? While the recruitment of a hot-spare disk invokes a "resilver" -- a targeted scrub of just the VDEV which lost a disk -- you should probably scrub at least as often as you experience disk failures on average in your specific environment.<br />
:* How often is the oldest piece of data on your disk read? You should scrub occasionally to prevent very old, very stale data from experiencing bit-rot and dying without you knowing it.<br />
<br />
:If any of your answers to the above are "I do not know", the general guideline is: you should probably be scrubbing your zpool at least once per month. It is a schedule that works well for most use cases, provides enough time for scrubs to complete before starting up again on all but the busiest & most heavily-loaded systems, and even on very large zpools (192+ disks) should complete fairly often between disk failures.<br />
<br />
In the [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ ZFS Administration Guide] by Aaron Toponce, he advises to scrub consumer disks once a week.<br />
<br />
==== Start with a service or timer ====<br />
Using a [[systemd]] timer/service it is possible to automatically scrub pools monthly:<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.timer|2=<br />
[Unit]<br />
Description=Monthly zpool scrub on %i<br />
<br />
[Timer]<br />
OnCalendar=monthly<br />
AccuracySec=1h<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.service|2=<br />
[Unit]<br />
Description=zpool scrub on %i<br />
<br />
[Service]<br />
Nice=19<br />
IOSchedulingClass=idle<br />
KillSignal=SIGINT<br />
ExecStart=/usr/bin/zpool scrub %i<br />
}}<br />
<br />
[[Enable]]/[[start]] {{ic|zfs-scrub@''pool-to-scrub''.timer}} unit for monthly scrubbing the specified zpool.<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device.<br />
<br />
{{Warning|This command destroys '''any data''' contained in the pool and/or dataset.}}<br />
<br />
To destroy the pool:<br />
# zpool destroy <pool><br />
<br />
To destroy a dataset:<br />
 # zfs destroy <pool>/<dataset><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|<br />
no pools available<br />
}}<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time, abruptly dropping the system into the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool:<br />
<br />
# zpool export <pool><br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== SSD Caching ===<br />
<br />
You can add SSD devices as a write intent log (external ZIL or SLOG) and also as a layer 2 adaptive replacement cache (L2ARC). The process to add them is very similar to adding a new VDEV.<br />
<br />
All of the below references to device-id are the IDs from {{ic|/dev/disk/by-id/*}}.<br />
<br />
==== SLOG ====<br />
<br />
To add a mirrored SLOG:<br />
<br />
# zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
Or to add a single device SLOG (unsafe):<br />
<br />
# zpool add <pool> log <device-id><br />
<br />
Because the SLOG device stores data that has not been written to the pool, it is important to use devices that can finish writes when power is lost. It is also important to use redundancy, since a device failure can cause data loss. In addition, the SLOG is only used for sync writes, so may not provide any performance improvement.<br />
<br />
==== L2ARC ====<br />
<br />
To add L2ARC:<br />
<br />
# zpool add <pool> cache <device-id><br />
<br />
Because every block cached in L2ARC uses a small amount of memory, it is generally only useful in workloads where the amount of hot data is ''bigger'' than the maximum amount of memory that can fit in the computer, but small enough to fit into L2ARC. It is also cleared at reboot and is only a read cache, so redundancy is unnecessary. Unintuitively, L2ARC can actually harm performance since it takes memory away from ARC.<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size ({{ic|volblocksize}}) for ZVOLs is already 8 KiB. If possible, it is best to align any partitions contained in a ZVOL to your block size (current versions of fdisk and gdisk by default automatically align at 1 MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the ZVOL as necessary (though 8 KiB tends to be a good value for most file systems, even when using 4 KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, which can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the '''volblocksize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807] for details.<br />
<br />
=== I/O Scheduler ===<br />
<br />
When the pool is imported, for whole-disk vdevs, the block device I/O scheduler is set according to the {{ic|zfs_vdev_scheduler}} module parameter [https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#zfs_vdev_scheduler]. The most common schedulers are: ''noop'', ''cfq'', ''bfq'', and ''deadline''.<br />
<br />
In some cases, the scheduler is not changeable using this method. Known schedulers that cannot be changed are: ''scsi_mq'' and ''none''. In these cases, the scheduler is unchanged and an error message can be reported to logs. [[Improving_performance#Changing_I/O_scheduler|Manually setting]] one of the common schedulers used by {{ic|zfs_vdev_scheduler}} can result in more consistent performance.<br />
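<br />
For example, to request the ''deadline'' scheduler at module load time (a sketch; {{ic|zfs_vdev_scheduler}} is the module parameter referenced above and may be absent from newer releases):<br />
<br />
{{hc|1=/etc/modprobe.d/zfs.conf|2=<br />
options zfs zfs_vdev_scheduler=deadline<br />
}}<br />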
<br />
== Creating datasets ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS-specific attributes to the dataset. For example, one could assign a quota limit to a nested dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, see {{man|8|zfs|url=}} or {{man|8|zpool|url=}}.<br />
<br />
=== Native encryption ===<br />
<br />
ZFS offers the following supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-ccm}} will be used.<br />
<br />
The following keyformats are supported: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}.<br />
<br />
One can also specify/increase the default iterations of PBKDF2 when using {{ic|passphrase}} with {{ic|1=-o pbkdf2iters=<n>}}, although it may increase the time needed to load the key.<br />
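<br />
For example, to create a passphrase-encrypted dataset with an increased iteration count (a sketch; {{ic|1000000}} is an arbitrary illustrative value):<br />
<br />
 # zfs create -o encryption=on -o keyformat=passphrase -o pbkdf2iters=1000000 <nameofzpool>/<nameofdataset><br />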
<br />
{{Note|<br />
* Native ZFS encryption has been made available in the stable 0.8.0 release or newer. Previously it was only available in development versions provided by packages like {{AUR|zfs-linux-git}}, {{AUR|zfs-dkms-git}} or other development builds. Users who were only using the development versions for the native encryption, may now switch to the stable releases if they wish.<br />
* To import a pool with keys, one needs to specify the {{ic|-l}} flag, without this flag encrypted datasets will be left unavailable until the keys are loaded. See [[#Importing a pool created by id]].}}<br />
<br />
To create a dataset including native encryption with a passphrase, use:<br />
<br />
# zfs create -o encryption=on -o keyformat=passphrase <nameofzpool>/<nameofdataset><br />
<br />
To use a key instead of using a passphrase:<br />
<br />
# dd if=/dev/random of=/path/to/key bs=1 count=32<br />
# zfs create -o encryption=on -o keyformat=raw -o keylocation=file:///path/to/key <nameofzpool>/<nameofdataset><br />
<br />
To verify the key location:<br />
# zfs get keylocation <nameofzpool>/<nameofdataset><br />
<br />
To change the key location:<br />
<br />
# zfs set keylocation=file:///path/to/key <nameofzpool>/<nameofdataset><br />
<br />
You can also manually load the keys by using one of the following commands:<br />
<br />
# zfs load-key <nameofzpool>/<nameofdataset> # load key for a specific dataset<br />
# zfs load-key -a # load all keys<br />
# zfs load-key -r zpool/dataset # load all keys in a dataset<br />
<br />
To mount the created encrypted dataset:<br />
<br />
# zfs mount <nameofzpool>/<nameofdataset><br />
<br />
==== Unlock at boot time ====<br />
It is possible to automatically unlock a pool dataset at boot time by using a [[systemd]] unit. For example, create the following service to unlock any dataset of the pool named ''tank'': <br />
<br />
{{hc|/etc/systemd/system/zfskey-tank@.service|2=<br />
[Unit]<br />
Description=Load pool encryption keys<br />
#Before=systemd-user-sessions.service<br />
Before=zfs-mount.service<br />
After=zfs-import.target<br />
<br />
[Service]<br />
Type=oneshot<br />
RemainAfterExit=yes<br />
ExecStart=/usr/bin/bash -c '/usr/bin/zfs load-key %j/%i'<br />
<br />
[Install]<br />
WantedBy=zfs-mount.service<br />
}}<br />
<br />
[[Enable]]/[[start]] the service for each encrypted dataset, e.g. {{ic|# systemctl enable zfskey-tank@dataset}}.<br />
<br />
{{Note|The {{ic|1=Before=systemd-user-sessions.service}} ensures that systemd-ask-password is invoked before the local IO devices are handed over to the [[desktop environment]].}}<br />
<br />
An alternative is to load all possible keys:<br />
<br />
{{hc|/etc/systemd/system/zfs-load-key.service|2=<br />
[Unit]<br />
Description=Load encryption keys<br />
Before=zfs-mount.service<br />
After=zfs-import.target<br />
<br />
[Service]<br />
Type=oneshot<br />
RemainAfterExit=yes<br />
ExecStart=/usr/bin/bash -c '/usr/bin/zfs load-key -a'<br />
<br />
[Install]<br />
WantedBy=zfs-mount.service<br />
}}<br />
<br />
[[Enable]]/[[start]] {{ic|zfs-load-key.service}}.<br />
<br />
=== Swap volume ===<br />
<br />
{{Warning|On systems with extremely high memory pressure, using a zvol for swap can result in lockup, regardless of how much swap is still available. This issue is currently being investigated in https://github.com/zfsonlinux/zfs/issues/7734}}<br />
<br />
ZFS does not support swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4 KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8 GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
       -o logbias=throughput -o sync=always \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
=== Access Control Lists ===<br />
<br />
To use [[ACL]] on a ZFS pool:<br />
<br />
# zfs set acltype=posixacl <nameofzpool>/<nameofdataset><br />
# zfs set xattr=sa <nameofzpool>/<nameofdataset><br />
<br />
Setting {{ic|xattr}} is recommended for performance reasons [https://github.com/zfsonlinux/zfs/issues/170#issuecomment-27348094].<br />
<br />
It may be preferable to enable ACL on the zpool instead. Setting {{ic|1=aclinherit=passthrough}} may be wanted as the default mode is {{ic|restricted}} [https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gbaaz/index.html]:<br />
<br />
# zfs set aclinherit=passthrough <nameofzpool><br />
# zfs set acltype=posixacl <nameofzpool><br />
# zfs set xattr=sa <nameofzpool><br />
<br />
=== Databases ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128 KiB, which means it will dynamically allocate blocks of any size from 512 B to 128 KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS would have to allocate new 128 KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is committed to the file system only once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
<br />
# systemctl mask tmp.mount<br />
<br />
== Troubleshooting ==<br />
<br />
=== Creating a zpool fails ===<br />
<br />
Creating a zpool may fail with the following error:<br />
<br />
 the kernel failed to rescan the partition table: 16<br />
 cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
 # parted /dev/sda rm 1<br />
 # parted /dev/sda rm 2<br />
 # dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
 # zpool labelclear /dev/sda<br />
<br />
A brute-force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads/writes on a drive. By reading from the disk in parallel to zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
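<br />
Alternatively, the same limit can be applied when the module is loaded; if zfs is included in the initramfs, [[regenerate the initramfs]] afterwards:<br />
<br />
{{hc|1=/etc/modprobe.d/zfs.conf|2=<br />
options zfs zfs_arc_max=536870912<br />
}}<br />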
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error may occur when attempting to create a zfs filesystem:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the SPL hostid. There are two solutions for this. Either place the SPL hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then [[regenerate the initramfs]] image, which will copy the hostid into it.<br />
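<br />
For example, assuming the {{ic|zgenhostid}} utility shipped with the ZFS packages is available, the current hostid can be written to {{ic|/etc/hostid}} before regenerating the initramfs:<br />
<br />
 # zgenhostid $(hostid)<br />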
<br />
=== Pool cannot be found while booting from SAS/SCSI devices ===<br />
<br />
In case you are booting from SAS/SCSI-based devices, you might occasionally get boot problems where the pool you are trying to boot from cannot be found. A likely reason for this is that your devices are initialized too late in the process, which means that zfs cannot find any devices at the time it tries to assemble your pool.<br />
<br />
In this case you should force the scsi driver to wait for devices to come online before continuing. You can do this by putting this into {{ic|/etc/modprobe.d/zfs.conf}}:<br />
<br />
{{hc|1=/etc/modprobe.d/zfs.conf|2=<br />
options scsi_mod scan=sync<br />
}}<br />
<br />
Afterwards, [[regenerate the initramfs]].<br />
<br />
This works because the zfs hook will copy the file at {{ic|/etc/modprobe.d/zfs.conf}} into the initcpio which will then be used at build time.<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then [[regenerate the initramfs]] in normally booted system.<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again; if the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following is just an example:<br />
<br />
{{hc|$ hostid|<br />
0a0af0f8<br />
}}<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|1=spl.spl_hostid=0x0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|1=zfs_force=1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
=== Pool resilvering stuck/restarting/slow? ===<br />
<br />
According to the ZFSonLinux GitHub, it has been a known issue since 2012 that ZFS-ZED can cause the resilvering process to constantly restart, sometimes get stuck, and be generally slow on some hardware. The simplest mitigation is to stop {{ic|zfs-zed.service}} until the resilver completes:<br />
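<br />
 # systemctl stop zfs-zed.service<br />
<br />
Start the service again once {{ic|zpool status}} shows that the resilver has finished.<br />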
<br />
=== Fix slow boot caused by failed import of unavailable pools in the initramfs zpool.cache ===<br />
<br />
Your boot time can be significantly impacted if you update your initramfs (e.g. when doing a kernel update) while you have additional but non-permanently attached pools imported, because these pools will get added to your initramfs zpool.cache and ZFS will attempt to import these extra pools on every boot, regardless of whether you have exported them and removed them from your regular zpool.cache.<br />
<br />
If you notice ZFS trying to import unavailable pools at boot, first run:<br />
<br />
$ zdb -C<br />
<br />
This checks your zpool.cache for pools you do not want imported at boot. If this command shows (an) additional, currently unavailable pool(s), run:<br />
<br />
# zpool set cachefile=/etc/zfs/zpool.cache zroot<br />
<br />
This clears the zpool.cache of any pools other than the pool named zroot. Sometimes there is no need to refresh your zpool.cache; instead, all you need to do is [[regenerate the initramfs]].<br />
<br />
== Tips and tricks ==<br />
<br />
=== Embed the archzfs packages into an archiso ===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Follow [[Archiso#Build_the_ISO|Build the ISO]] to build the ISO.<br />
<br />
{{Note|If you later have problems running {{ic|modprobe zfs}}, include {{ic|linux-headers}} in {{ic|packages.x86_64}}.}}<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible by label; for example, if no monthlies are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}}.<br />
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots while the pool is being [[#Scrubbing|scrubbed]]. It is possible to override this by [[Systemd#Editing provided units|editing the provided systemd unit]] and removing {{ic|--skip-scrub}} from the {{ic|ExecStart}} line. The consequences of doing so have not been documented.}}<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup rotation scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
=== Creating a share ===<br />
<br />
ZFS has support for creating shares by SMB or [[NFS]].<br />
<br />
==== NFS ====<br />
<br />
Make sure [[NFS]] has been installed/configured, note there is no need to edit the {{ic|/etc/exports}} file. For sharing over NFS the services {{ic|nfs-server.service}} and {{ic|zfs-share.service}} should be [[start|started]].<br />
<br />
To make a pool available on the network:<br />
<br />
# zfs set sharenfs=on <nameofzpool><br />
<br />
To make a dataset available on the network:<br />
<br />
# zfs set sharenfs=on <nameofzpool>/<nameofdataset><br />
<br />
To enable read/write access for specific IP ranges:<br />
<br />
# zfs set sharenfs="rw=@192.168.1.100/24,rw=@10.0.0.0/24" <nameofzpool>/<nameofdataset><br />
<br />
To check if the dataset is exported successfully:<br />
<br />
{{hc|# showmount -e `hostname`|<br />
Export list for hostname:<br />
/path/of/dataset 192.168.1.100/24<br />
}}<br />
<br />
To view the current loaded exports state in more detail, use:<br />
<br />
{{hc|# exportfs -v|2=<br />
/path/of/dataset<br />
192.168.1.100/24(sync,wdelay,hide,no_subtree_check,mountpoint,sec=sys,rw,secure,no_root_squash,no_all_squash)<br />
}}<br />
<br />
==== SMB ====<br />
<br />
To share datasets over SMB, configure usershares in your smb.conf; this allows ZFS to set up and create the shares.<br />
<br />
{{bc|1=<br />
[global]<br />
usershare path = /var/lib/samba/usershares<br />
usershare max shares = 100<br />
usershare allow guests = yes<br />
usershare owner only = no<br />
}}<br />
<br />
Create and set permissions on the usershare directory as root:<br />
<br />
# mkdir /var/lib/samba/usershares<br />
# chmod +t /var/lib/samba/usershares<br />
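<br />
With usershares configured, a dataset can then be shared analogously to [[#NFS|NFS]] via the {{ic|sharesmb}} property (the {{ic|zfs-share.service}} mentioned above should be running):<br />
<br />
 # zfs set sharesmb=on <nameofzpool>/<nameofdataset><br />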
<br />
=== Encryption in ZFS using dm-crypt ===<br />
<br />
The stable release of ZFS on Linux did not support encryption directly until version 0.8.0 (see [[#Native encryption]]), but zpools can also be created in dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so the {{ic|zpool create}} commands only need to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can be created in multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
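<br />
For a non-root pool, the plain dm-crypt mapping has to be re-created after every boot, with exactly the same options (plain mode derives the encryption purely from them), before the pool can be imported again. A minimal sketch, reusing the example above:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool import -d /dev/mapper zroot<br />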
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical input produces different output (thanks to salting), making deduplication impossible. To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
{{ic|uname}} will show the kernel version of the archiso. If the two versions differ, run {{ic|depmod}} (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH<br />
<br />
The version is the one gathered from {{ic|pacman -Qi linux}}, but use the matching kernel modules directory name under the chroot's {{ic|/lib/modules}}.<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
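<br />
To see which module directories actually exist in the chroot, and thus the exact version string to pass to {{ic|depmod}}, list them directly:<br />
<br />
# ls /lib/modules<br />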
<br />
[[Regenerate the initramfs]]. There should be no errors.<br />
<br />
=== Bind mount ===<br />
<br />
Here a bind mount from {{ic|/mnt/zfspool}} to {{ic|/srv/nfs4/music}} is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
<br />
=== Monitoring / Mailing on Events ===<br />
<br />
See [https://ramsdenj.com/2016/08/29/arch-linux-on-zfs-part-3-followup.html ZED: The ZFS Event Daemon] for more information.<br />
<br />
An email forwarder, such as [[S-nail]] (installed as part of {{Grp|base}}), is required to accomplish this. Test it to be sure it is working correctly.<br />
<br />
Uncomment the following in the configuration file:<br />
<br />
{{hc|/etc/zfs/zed.d/zed.rc|<nowiki><br />
ZED_EMAIL_ADDR="root"<br />
ZED_EMAIL_PROG="mailx"<br />
ZED_NOTIFY_VERBOSE=0<br />
ZED_EMAIL_OPTS="-s '@SUBJECT@' @ADDRESS@"<br />
</nowiki>}}<br />
<br />
Update 'root' in {{ic|1=ZED_EMAIL_ADDR="root"}} to the email address you want to receive notifications at.<br />
<br />
If you are keeping your mailrc in your home directory, you can tell mail to get it from there by setting {{ic|MAILRC}}:<br />
<br />
{{hc|/etc/zfs/zed.d/zed.rc|2=<br />
export MAILRC=/home/<user>/.mailrc<br />
}}<br />
<br />
This works because ZED sources this file, so {{ic|mailx}} sees this environment variable.<br />
<br />
If you want to receive an email no matter the state of your pool, you will want to set {{ic|1=ZED_NOTIFY_VERBOSE=1}}. You will need to do this temporarily to test.<br />
<br />
[[Start]] and [[enable]] {{ic|zfs-zed.service}}.<br />
<br />
With {{ic|1=ZED_NOTIFY_VERBOSE=1}}, you can test by running a scrub as root: {{ic|1=zpool scrub <pool-name>}}.<br />
<br />
=== Wrap shell commands in pre & post snapshots ===<br />
<br />
Since it is so cheap to make a snapshot, we can use this as a measure of safety for sensitive commands such as system and package upgrades. If we make a snapshot before, and one after, we can later diff these snapshots to find out what changed on the filesystem after the command executed. Furthermore we can also roll back in case the outcome was not desired.<br />
<br />
E.g.:<br />
<br />
# zfs snapshot -r zroot@pre<br />
# pacman -Syu<br />
# zfs snapshot -r zroot@post<br />
# zfs diff zroot@pre zroot@post <br />
# zfs rollback zroot@pre<br />
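<br />
{{ic|zfs diff}} prefixes each changed path with the kind of change: {{ic|+}} (created), {{ic|-}} (removed), {{ic|M}} (modified) and {{ic|R}} (renamed). After a package upgrade the output might look similar to this (paths illustrative):<br />
<br />
{{hc|# zfs diff zroot@pre zroot@post|<br />
M /var/log/pacman.log<br />
+ /usr/bin/newtool<br />
}}<br />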
<br />
A utility that automates the creation of pre and post snapshots around a shell command is [https://gist.github.com/erikw/eeec35be33e847c211acd886ffb145d5 znp].<br />
<br />
E.g.:<br />
<br />
# znp pacman -Syu<br />
# znp find / -name "something*" -delete<br />
<br />
This would create snapshots before and after the supplied command, and also log the command's output to a file for future reference, so we know which command created the diff seen in a pair of pre/post snapshots.<br />
<br />
=== Remote unlocking of ZFS encrypted root ===<br />
<br />
As of [https://github.com/archzfs/archzfs/pull/261 PR #261], {{ic|archzfs}} supports SSH unlocking of natively-encrypted ZFS datasets. This section describes how to use this feature, and is largely based on [[dm-crypt/Specialties#Remote unlocking (hooks: netconf, dropbear, tinyssh, ppp)]].<br />
<br />
#Install {{Pkg|mkinitcpio-netconf}} to provide hooks for setting up early user space networking.<br />
#Choose an SSH server to use in early user space. The options are {{Pkg|mkinitcpio-tinyssh}} or {{Pkg|mkinitcpio-dropbear}}, and are mutually exclusive.<br />
##If using {{Pkg|mkinitcpio-tinyssh}}, it is also recommended to install {{Pkg|tinyssh-convert}} or {{AUR|tinyssh-convert-git}}. This tool converts an existing OpenSSH hostkey to the TinySSH key format, preserving the key fingerprint and avoiding connection warnings. The TinySSH and Dropbear mkinitcpio install scripts will automatically convert existing hostkeys when generating a new initcpio image.<br />
#Decide whether to use an existing OpenSSH key or generate a new one (recommended) for the host that will be connecting to and unlocking the encrypted ZFS machine. Copy the public key into {{ic|/etc/tinyssh/root_key}} or {{ic|/etc/dropbear/root_key}}. When generating the initcpio image, this file will be added to {{ic|authorized_keys}} for the root user and is only valid in the initrd environment.<br />
#Add the {{ic|1=ip=}} [[kernel parameter]] to your boot loader configuration. The {{ic|ip}} string is [https://www.kernel.org/doc/Documentation/filesystems/nfs/nfsroot.txt highly configurable]. A simple DHCP example is shown below.{{bc|1=ip=:::::eth0:dhcp}}<br />
#Edit {{ic|/etc/mkinitcpio.conf}} to include the {{ic|netconf}}, {{ic|dropbear}} or {{ic|tinyssh}}, and {{ic|zfsencryptssh}} hooks before the {{ic|zfs}} hook:{{bc|1=HOOKS=(... netconf <tinyssh>{{!}}<dropbear> zfsencryptssh zfs ...)}}<br />
#[[Regenerate the initramfs]].<br />
#Reboot and try it out!<br />
<br />
====Changing the SSH server port====<br />
<br />
By default, {{Pkg|mkinitcpio-tinyssh}} and {{Pkg|mkinitcpio-dropbear}} listen on port {{ic|22}}. You may wish to change this.<br />
<br />
For '''TinySSH''', copy {{ic|/usr/lib/initcpio/hooks/tinyssh}} to {{ic|/etc/initcpio/hooks/tinyssh}}, and find/modify the following line in the {{ic|run_hook()}} function:<br />
<br />
{{hc|/etc/initcpio/hooks/tinyssh|<br />
/usr/bin/tcpserver -HRDl0 0.0.0.0 <new_port> /usr/sbin/tinysshd -v /etc/tinyssh/sshkeydir &<br />
}}<br />
<br />
For '''Dropbear''', copy {{ic|/usr/lib/initcpio/hooks/dropbear}} to {{ic|/etc/initcpio/hooks/dropbear}}, and find/modify the following line in the {{ic|run_hook()}} function:<br />
<br />
{{hc|/etc/initcpio/hooks/dropbear|<br />
/usr/sbin/dropbear -E -s -j -k -p <new_port><br />
}}<br />
<br />
[[Regenerate the initramfs]].<br />
<br />
==== Unlocking from a Windows machine using PuTTY/Plink ====<br />
<br />
First, we need to use {{ic|puttygen.exe}} to import and convert the OpenSSH key generated earlier into PuTTY's ''.ppk'' private key format. Let us call it {{ic|zfs_unlock.ppk}} for this example.<br />
<br />
The mkinitcpio-netconf process above does not set up a shell (nor do we need one). However, because there is no shell, PuTTY will immediately close after a successful connection. This can be disabled in the PuTTY SSH configuration (''Connection -> SSH -> [X] Do not start a shell or command at all''), but it still does not allow us to see stdout or enter the encryption passphrase. Instead, we use {{ic|plink.exe}} with the following parameters:<br />
<br />
plink.exe -ssh -l root -i c:\path\to\zfs_unlock.ppk <hostname><br />
<br />
The plink command can be put into a batch script for ease of use.<br />
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook -- The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]{{Dead link|2017|05|30}}<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]<br />
* [https://github.com/danboid/creating-ZFS-disks-under-Linux How to create cross platform ZFS disks under Linux]<br />
* [https://blog.heckel.xyz/2017/01/08/zfs-encryption-openzfs-zfs-on-linux/ How-To: Using ZFS Encryption at Rest in OpenZFS (ZFS on Linux, ZFS on FreeBSD, …)]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=574729ZFS2019-06-06T18:02:50Z<p>Chungy: /* GRUB-compatible pool creation */ Change of strategy: explicitly enable supported features. While there may be new ones in the future, this at least won't break when copy-pasting.</p>
<hr />
<div>[[Category:File systems]]<br />
[[Category:Oracle]]<br />
[[ja:ZFS]]<br />
[[ru:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|ZFS/Virtual disks}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|<br />
Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.<br />
}}<br />
<br />
== Installation ==<br />
<br />
=== General ===<br />
<br />
{{Warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
* {{AUR|zfs-linux}} for [http://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for hardened kernels.<br />
* {{AUR|zfs-linux-zen}} for stable releases for zen kernels.<br />
* {{AUR|zfs-linux-zen-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for zen kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
* {{AUR|zfs-dkms-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for versions with dynamic kernel module support.<br />
<br />
These packages depend on {{ic|zfs-utils}}, {{ic|spl}} and {{ic|spl-utils}}. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-linux}} or {{AUR|zfs-dkms}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of DKMS ([[Dynamic Kernel Module Support]]) to rebuild the ZFS modules automatically with every kernel upgrade.<br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, {{Ic|zfs-import-cache.service}} must be enabled to import the pools and {{Ic|zfs-mount.service}} must be enabled to mount the filesystems available in the pools. A benefit to this is that it is not necessary to mount ZFS filesystems in {{ic|/etc/fstab}}. {{Ic|zfs-import-cache.service}} imports the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each [[#Importing a pool created by id|imported pool]] you want automatically imported by {{Ic|zfs-import-cache.service}}, execute:<br />
<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
{{Note|Beginning with ZOL version 0.6.5.8 the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run. See [https://github.com/archzfs/archzfs/issues/72 https://github.com/archzfs/archzfs/issues/72] for more information.}}<br />
<br />
Enable the relevant service and target so the pools are automatically imported at boot time:<br />
<br />
# systemctl enable zfs-import-cache<br />
# systemctl enable zfs-import.target<br />
<br />
To mount the ZFS filesystems, you have two choices:<br />
* Enable the [[#Using zfs-mount.service|zfs-mount.service]]<br />
* Using [[#Using zfs-mount-generator|zfs-mount-generator]]<br />
<br />
==== Using zfs-mount.service ====<br />
<br />
In order to mount ZFS filesystems automatically on boot you need to enable the following services and targets:<br />
<br />
# systemctl enable zfs-mount<br />
# systemctl enable zfs.target<br />
<br />
==== Using zfs-mount-generator ====<br />
<br />
You can also use the zfs-mount-generator to create systemd mount units for your ZFS filesystems at boot. systemd will automatically mount the filesystems based on the mount units without having to use the {{Ic|zfs-mount.service}}. To do that, you need to:<br />
<br />
# Create the {{Ic|/etc/zfs/zfs-list.cache}} directory.<br />
# Enable the ZFS Event Daemon (ZED) script (called a ZEDLET) required to create a list of mountable ZFS filesystems. {{bc|# ln -s /usr/lib/zfs-0.8.0/zfs/zed.d/history_event-zfs-list-cacher.sh /etc/zfs/zed.d}}<br />
# Enable and start the ZFS Event Daemon. This service is responsible for running the script in the previous step. {{bc|# systemctl enable zfs-zed.service<br># systemctl enable zfs.target<br># systemctl start zfs-zed.service}}<br />
# You need to create an empty file named after your pool in {{Ic|/etc/zfs/zfs-list.cache}}. The ZEDLET will only update the list of filesystems if the file for the pool already exists. {{bc|# touch /etc/zfs/zfs-list.cache/<pool-name>}}<br />
# Check the contents of {{Ic|/etc/zfs/zfs-list.cache/<pool-name>}}. If it is empty, make sure that {{Ic|zfs-zed.service}} is running and change the canmount property of any of your ZFS filesystems by running: {{bc|1=# zfs set canmount=off zroot/fs1}} This change causes ZFS to raise an event which is captured by ZED, which in turn runs the ZEDLET to update the file in {{Ic|/etc/zfs/zfs-list.cache}}. Once the file is updated, set the {{Ic|canmount}} property of the filesystem back by running: {{bc|1=# zfs set canmount=on zroot/fs1}}<br />
You need to add a file in {{Ic|/etc/zfs/zfs-list.cache}} for each ZFS pool in your system. Make sure the pools are imported by enabling {{Ic|zfs-import-cache.service}} and {{Ic|zfs-import.target}} as [[#Automatic Start|explained above]].<br />
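<br />
When everything is in place, each cache file contains one line per mountable filesystem, starting with the dataset name and its mountpoint. A quick way to verify (pool name illustrative):<br />
<br />
# cat /etc/zfs/zfs-list.cache/<pool-name><br />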
<br />
== Creating a storage pool ==<br />
<br />
Make sure the wanted disks contain an empty [[GPT]]/[[MBR]] [[partition table]]. It is neither necessary nor recommended to partition the drives before creating the ZFS filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to [[mdadm#Prepare the devices|erase any old RAID configuration information]].}}<br />
<br />
{{Warning|For [[Advanced Format]] disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
=== Identify disks ===<br />
<br />
[https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool ZFS on Linux] recommends using device IDs when creating ZFS storage pools of fewer than 10 devices. Use [[Persistent block device naming#by-id and by-path]] to identify the list of drives to be used for the ZFS pool.<br />
<br />
The disk IDs should look similar to the following:<br />
<br />
{{hc|# ls -lh /dev/disk/by-id/|<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb}}<br />
<br />
{{Warning|If you create zpools using device names (e.g. {{ic|/dev/sda}},{{ic|/dev/sdb}},...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
==== Using GPT labels ====<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages: the OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and, if desired, you can easily over-provision SSD drives, and slightly over-provision spindle drives, to ensure that different models with slightly different sector counts can {{ic|zpool replace}} into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command "c" to label the partitions. Some reasons to prefer labels over UUIDs are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters ([[wikipedia:GUID_Partition_Table#Partition_entries]]), allowing large data pools to be labeled in an organized fashion.<br />
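<br />
As a hypothetical sketch, a whole disk can be turned into a single labeled partition with {{ic|sgdisk}}; the label and the Solaris-root type code {{ic|bf00}} are illustrative choices:<br />
<br />
# sgdisk --new=1:0:0 --typecode=1:bf00 --change-name=1:zfsdata1 /dev/sdX<br />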
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
{{hc|# ls -l /dev/disk/by-partlabel|<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
}}<br />
<br />
{{hc|# ls -l /dev/disk/by-partuuid|<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
}}<br />
<br />
{{Tip|To minimize typing and copy/paste errors, set a local variable with the target PARTUUID: {{ic|<nowiki>$ UUID=$( lsblk --noheadings --output PARTUUID /dev/sd</nowiki>''XY''<nowiki> )</nowiki>}} }}<br />
<br />
=== Creating ZFS pools ===<br />
<br />
To create a ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> [raidz(2|3)|mirror] <ids><br />
<br />
{{Tip|One may want to read [[#Advanced Format disks]] first, as it may be recommended to set {{ic|ashift}} on pool creation.}}<br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz(2|3)|mirror''': This is the type of virtual device that will be created from the pool of devices. Raidz is a single disk of parity, raidz2 for 2 disks of parity and raidz3 for 3 disks of parity, similar to raid5 and raid6. Also available is '''mirror''', which is similar to raid1 or raid10, but is not constrained to just 2 devices. If not specified, each device will be added as a vdev which is similar to raid0. After creation, a device can be added to each single drive vdev to turn it into a mirror, which can be useful for migrating data.<br />
<br />
* '''ids''': The [[Persistent_block_device_naming#by-id_and_by-path|IDs]] of the drives or partitions to include in the pool.<br />
<br />
Create pool with single raidz vdev:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
Create pool with two mirror vdevs:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
mirror \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
mirror \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
==== Advanced Format disks ====<br />
<br />
At pool creation, '''ashift=12''' should always be used, except with SSDs that have 8k sectors where '''ashift=13''' is correct. A vdev of 512 byte disks using 4k sectors will not experience performance issues, but a 4k disk using 512 byte sectors will. Since '''ashift''' cannot be changed after pool creation, even a pool with only 512 byte disks should use {{ic|1=ashift=12}}, because those disks may need to be replaced with 4k disks or the pool may be expanded by adding a vdev composed of 4k disks. Because correct detection of 4k disks is not reliable, {{ic|<nowiki>-o ashift=12</nowiki>}} should always be specified during pool creation. See the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ] for more details.<br />
<br />
{{Tip|Use {{man|8|blockdev}} (part of {{Pkg|util-linux}}) to print the sector size reported by the device's ioctls: {{ic|# blockdev --getpbsz /dev/sd''XY''}}.}}<br />
<br />
Create pool with ashift=12 and single raidz vdev:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
==== GRUB-compatible pool creation ====<br />
<br />
{{note|This section frequently goes out of date with updates to GRUB and ZFS. Consult the manual pages for the most up-to-date information.}}<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]], you must only enable features supported by GRUB, otherwise GRUB will not be able to read the pool. GRUB 2.02 supports the read-write features {{ic|lz4_compress}}, {{ic|hole_birth}}, {{ic|embedded_data}}, {{ic|extensible_dataset}}, and {{ic|large_blocks}}; it does not support all the features of ZFSonLinux 0.8.0, so the unsupported ones must be left disabled. Pass the {{ic|-d}} argument to {{ic|zpool create}} to disable all features by default, and then explicitly name the features to enable.<br />
<br />
You can create a pool with only the compatible features enabled:<br />
<br />
# zpool create -d -o feature@allocation_classes=enabled \<br />
-o feature@async_destroy=enabled \<br />
-o feature@bookmarks=enabled \<br />
-o feature@empty_bpobj=enabled \<br />
-o feature@enabled_txg=enabled \<br />
-o feature@extensible_dataset=enabled \<br />
-o feature@filesystem_limits=enabled \<br />
-o feature@hole_birth=enabled \<br />
-o feature@large_blocks=enabled \<br />
-o feature@lz4_compress=enabled \<br />
-o feature@project_quota=enabled \<br />
-o feature@resilver_defer=enabled \<br />
-o feature@spacemap_histogram=enabled \<br />
-o feature@spacemap_v2=enabled \<br />
-o feature@userobj_accounting=enabled \<br />
-o feature@zpool_checkpoint=enabled \<br />
$POOL_NAME $VDEVS<br />
<br />
=== Verifying pool status ===<br />
<br />
If the command is successful, there will be no output. Using the [[mount]] command will show that the pool is mounted. Using {{ic|zpool status}} will show that the pool has been created:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
raidz1-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you will need to import it to bring the pool back. Take care to avoid the most obvious solution.<br />
<br />
{{Warning|Do not run {{ic|zpool import ''pool''}}! This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PCs would not boot when a floppy disk was left in a machine.}}<br />
<br />
Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with:<br />
<br />
# zpool import -d /dev/disk/by-id bigdata<br />
# zpool import -d /dev/disk/by-partlabel bigdata<br />
# zpool import -d /dev/disk/by-partuuid bigdata<br />
<br />
{{Note|Use the {{ic|-l}} flag when importing a pool that contains [[#Native encryption|encrypted datasets keys]]:<br />
# zpool import -l -d /dev/disk/by-id bigdata<br />
}}<br />
<br />
Finally check the state of the pool:<br />
<br />
# zpool status -v bigdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
ZFS pools can be further adjusted using parameters, most commonly {{ic|atime}} and {{ic|compression}}.<br />
<br />
To retrieve the current pool parameter status:<br />
<br />
# zfs get all <pool><br />
<br />
To disable access time (atime), which is enabled by default:<br />
<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, {{ic|relatime}} is available. This brings the default ext4/XFS atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between {{ic|1=atime=off}} and {{ic|1=atime=on}}. This property ''only'' takes effect if {{ic|atime}} is {{ic|on}}:<br />
<br />
# zfs set atime=on <pool><br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. ZFS supports a few different algorithms, presently lz4 is the default, ''gzip'' is also available for seldom-written yet highly-compressible data; consult the [http://open-zfs.org/wiki/Performance_tuning#Compression OpenZFS Wiki] for more details. <br />
<br />
To enable compression:<br />
<br />
# zfs set compression=on <pool><br />
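<br />
To select a specific algorithm instead, for example ''gzip'' for the seldom-written, highly-compressible case mentioned above:<br />
<br />
# zfs set compression=gzip <pool>/<dataset><br />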
<br />
{{Note|To reset a property of a pool and/or dataset to its default state, use {{ic|zfs inherit}}:<br />
# zfs inherit -rS atime <pool><br />
# zfs inherit -rS atime <pool>/<dataset><br />
}}<br />
<br />
=== Scrubbing ===<br />
<br />
Whenever data is read and ZFS encounters an error, it is silently repaired when possible, rewritten back to disk and logged so you can obtain an overview of errors on your pools. There is no fsck or equivalent tool for ZFS. Instead, ZFS supports a feature known as scrubbing. This traverses through all the data in a pool and verifies that all blocks can be read.<br />
<br />
To scrub a pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To cancel a running scrub:<br />
<br />
# zpool scrub -s <pool><br />
<br />
==== How often should I do this? ====<br />
<br />
From the Oracle blog post [https://blogs.oracle.com/wonders-of-zfs-storage/disk-scrub-why-and-when-v2 Disk Scrub - Why and When?]:<br />
<br />
:This question is challenging for Support to answer, because as always the true answer is "It Depends". So before I offer a general guideline, here are a few tips to help you create an answer more tailored to your use pattern.<br />
<br />
:* What is the expiration of your oldest backup? You should probably scrub your data at least as often as your oldest tapes expire so that you have a known-good restore point.<br />
:* How often are you experiencing disk failures? While the recruitment of a hot-spare disk invokes a "resilver" -- a targeted scrub of just the VDEV which lost a disk -- you should probably scrub at least as often as you experience disk failures on average in your specific environment.<br />
:* How often is the oldest piece of data on your disk read? You should scrub occasionally to prevent very old, very stale data from experiencing bit-rot and dying without you knowing it.<br />
<br />
:If any of your answers to the above are "I do not know", the general guideline is: you should probably be scrubbing your zpool at least once per month. It is a schedule that works well for most use cases, provides enough time for scrubs to complete before starting up again on all but the busiest & most heavily-loaded systems, and even on very large zpools (192+ disks) should complete fairly often between disk failures.<br />
<br />
In the [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ ZFS Administration Guide] by Aaron Toponce, he advises to scrub consumer disks once a week.<br />
<br />
==== Start with a service or timer ====<br />
Using a [[systemd]] timer/service it is possible to automatically scrub pools monthly:<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.timer|2=<br />
[Unit]<br />
Description=Monthly zpool scrub on %i<br />
<br />
[Timer]<br />
OnCalendar=monthly<br />
AccuracySec=1h<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=timers.target<br />
}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.service|2=<br />
[Unit]<br />
Description=zpool scrub on %i<br />
<br />
[Service]<br />
Nice=19<br />
IOSchedulingClass=idle<br />
KillSignal=SIGINT<br />
ExecStart=/usr/bin/zpool scrub %i<br />
}}<br />
<br />
[[Enable]]/[[start]] {{ic|zfs-scrub@''pool-to-scrub''.timer}} unit for monthly scrubbing the specified zpool.<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device.<br />
<br />
{{Warning|This command destroys '''any data''' contained in the pool and/or dataset.}}<br />
<br />
To destroy the pool:<br />
# zpool destroy <pool><br />
<br />
To destroy a dataset:<br />
# zfs destroy <pool>/<dataset><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|<br />
no pools available<br />
}}<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool:<br />
<br />
# zpool export <pool><br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== SSD Caching ===<br />
<br />
You can add SSD devices as a write intent log (external ZIL or SLOG) and also as a layer 2 adaptive replacement cache (L2ARC). The process to add them is very similar to adding a new VDEV.<br />
<br />
All of the below references to device-id are the IDs from {{ic|/dev/disk/by-id/*}}.<br />
<br />
==== SLOG ====<br />
<br />
To add a mirrored SLOG:<br />
<br />
# zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
Or to add a single device SLOG (unsafe):<br />
<br />
# zpool add <pool> log <device-id><br />
<br />
Because the SLOG device stores data that has not been written to the pool, it is important to use devices that can finish writes when power is lost. It is also important to use redundancy, since a device failure can cause data loss. In addition, the SLOG is only used for sync writes, so may not provide any performance improvement.<br />
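<br />
{{ic|zpool iostat}} lists log devices separately, which makes it easy to confirm that the SLOG is attached and being written to:<br />
<br />
# zpool iostat -v <pool><br />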
<br />
==== L2ARC ====<br />
<br />
To add L2ARC:<br />
<br />
# zpool add <pool> cache <device-id><br />
<br />
Because every block cached in L2ARC uses a small amount of memory, it is generally only useful in workloads where the amount of hot data is ''bigger'' than the maximum amount of memory that can fit in the computer, but small enough to fit into L2ARC. It is also cleared at reboot and is only a read cache, so redundancy is unnecessary. Un-intuitively, L2ARC can actually harm performance since it takes memory away from ARC.<br />
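<br />
Whether the ARC and L2ARC are actually absorbing reads can be judged from their kstat counters; a quick sketch using the hit/miss fields exposed by the module:<br />
<br />
# grep -E '^(hits|misses|l2_hits|l2_misses) ' /proc/spl/kstat/zfs/arcstats<br />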
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size ({{ic|volblocksize}}) for ZVOLs is already 8 KiB. If possible, it is best to align any partitions contained in a ZVOL to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the {{ic|volblocksize}} to accommodate the data inside the ZVOL as necessary (though 8 KiB tends to be a good value for most file systems, even when using 4 KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity disks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the {{ic|volblocksize}} to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
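<br />
For example, to create a ZVOL with a 16k block size (size and names illustrative):<br />
<br />
# zfs create -V 100G -o volblocksize=16K <pool>/<zvol><br />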
<br />
=== I/O Scheduler ===<br />
<br />
When the pool is imported, for whole disk vdevs, the block device I/O scheduler is set to {{ic|zfs_vdev_scheduler}} [https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#zfs_vdev_scheduler]. The most common schedulers are: ''noop'', ''cfq'', ''bfq'', and ''deadline''.<br />
<br />
In some cases, the scheduler is not changeable using this method. Known schedulers that cannot be changed are: ''scsi_mq'' and ''none''. In these cases, the scheduler is unchanged and an error message can be reported to logs. [[Improving_performance#Changing_I/O_scheduler|Manually setting]] one of the common schedulers used by {{ic|zfs_vdev_scheduler}} can result in more consistent performance.<br />
<br />
== Creating datasets ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS-specific attributes to the dataset. For example, one could assign a quota limit to a specific dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset><br />
<br />
To see all the commands available in ZFS, see {{man|8|zfs|url=}} or {{man|8|zpool|url=}}.<br />
<br />
=== Native encryption ===<br />
<br />
ZFS offers the following supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-ccm}} will be used.<br />
<br />
The following keyformats are supported: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}.<br />
<br />
One can also specify/increase the default iterations of PBKDF2 when using {{ic|passphrase}} with {{ic|-o pbkdf2iters <n>}}, although it may increase the decryption time.<br />
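<br />
For example, to create a passphrase-encrypted dataset with an increased iteration count (the value is illustrative):<br />
<br />
# zfs create -o encryption=on -o keyformat=passphrase -o pbkdf2iters=1000000 <nameofzpool>/<nameofdataset><br />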
<br />
{{Note|<br />
* Native ZFS encryption has been made available in the stable 0.8.0 release or newer. Previously it was only available in development versions provided by packages like {{AUR|zfs-linux-git}}, {{AUR|zfs-dkms-git}} or other development builds. Users who were only using the development versions for the native encryption, may now switch to the stable releases if they wish.<br />
* To import a pool with keys, one needs to specify the {{ic|-l}} flag, without this flag encrypted datasets will be left unavailable until the keys are loaded. See [[#Importing a pool created by id]].}}<br />
<br />
To create a dataset including native encryption with a passphrase, use:<br />
<br />
# zfs create -o encryption=on -o keyformat=passphrase <nameofzpool>/<nameofdataset><br />
<br />
To use a key instead of using a passphrase:<br />
<br />
# dd if=/dev/random of=/path/to/key bs=1 count=32<br />
# zfs create -o encryption=on -o keyformat=raw -o keylocation=file:///path/to/key <nameofzpool>/<nameofdataset><br />
<br />
To verify the key location:<br />
# zfs get keylocation <nameofzpool>/<nameofdataset><br />
<br />
To change the key location:<br />
<br />
# zfs set keylocation=file:///path/to/key <nameofzpool>/<nameofdataset><br />
<br />
You can also manually load the keys by using one of the following commands:<br />
<br />
# zfs load-key <nameofzpool>/<nameofdataset> # load key for a specific dataset<br />
# zfs load-key -a # load all keys<br />
# zfs load-key -r zpool/dataset # load all keys in a dataset<br />
<br />
To mount the created encrypted dataset:<br />
<br />
# zfs mount <nameofzpool>/<nameofdataset><br />
<br />
==== Unlock at boot time ====<br />
It is possible to automatically unlock a pool's encrypted datasets at boot time by using a [[systemd]] unit. For example, create the following service to unlock any dataset of the pool named ''tank'':<br />
<br />
{{hc|/etc/systemd/system/zfskey-tank@.service|2=<br />
[Unit]<br />
Description=Load pool encryption keys<br />
#Before=systemd-user-sessions.service<br />
Before=zfs-mount.service<br />
After=zfs-import.target<br />
<br />
[Service]<br />
Type=oneshot<br />
RemainAfterExit=yes<br />
ExecStart=/usr/bin/bash -c '/usr/bin/zfs load-key %j/%i'<br />
<br />
[Install]<br />
WantedBy=zfs-mount.service<br />
}}<br />
<br />
[[Enable]]/[[start]] the service for each encrypted dataset, e.g. {{ic|# systemctl enable zfskey-tank@dataset}}.<br />
<br />
{{Note|Uncommenting {{ic|1=Before=systemd-user-sessions.service}} ensures that systemd-ask-password is invoked before the local IO devices are handed over to the [[desktop environment]].}}<br />
<br />
An alternative is to load all possible keys:<br />
<br />
{{hc|/etc/systemd/system/zfs-load-key.service|2=<br />
[Unit]<br />
Description=Load encryption keys<br />
Before=zfs-mount.service<br />
After=zfs-import.target<br />
<br />
[Service]<br />
Type=oneshot<br />
RemainAfterExit=yes<br />
ExecStart=/usr/bin/bash -c '/usr/bin/zfs load-key -a'<br />
<br />
[Install]<br />
WantedBy=zfs-mount.service<br />
}}<br />
<br />
[[Enable]]/[[start]] {{ic|zfs-load-key.service}}.<br />
<br />
=== Swap volume ===<br />
<br />
{{Warning|On systems with extremely high memory pressure, using a zvol for swap can result in lockup, regardless of how much swap is still available. This issue is currently being investigated in https://github.com/zfsonlinux/zfs/issues/7734}}<br />
<br />
ZFS does not allow the use of swapfiles, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained by the {{ic|getconf PAGESIZE}} command (default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8 GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
 -o logbias=throughput -o sync=always \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
=== Access Control Lists ===<br />
<br />
To use [[ACL]] on a ZFS pool:<br />
<br />
# zfs set acltype=posixacl <nameofzpool>/<nameofdataset><br />
# zfs set xattr=sa <nameofzpool>/<nameofdataset><br />
<br />
Setting {{ic|xattr}} is recommended for performance reasons [https://github.com/zfsonlinux/zfs/issues/170#issuecomment-27348094].<br />
<br />
It may be preferable to enable ACL on the zpool instead. Setting {{ic|1=aclinherit=passthrough}} may be wanted as the default mode is {{ic|restricted}} [https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gbaaz/index.html]:<br />
<br />
# zfs set aclinherit=passthrough <nameofzpool><br />
# zfs set acltype=posixacl <nameofzpool><br />
# zfs set xattr=sa <nameofzpool><br />
<br />
=== Databases ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This can often help with fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer to not use the ZIL, and in which case, data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
<br />
# systemctl mask tmp.mount<br />
<br />
== Troubleshooting ==<br />
<br />
=== Creating a zpool fails ===<br />
<br />
If the following error occurs, it can be fixed:<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads and writes on a drive. By reading from the disk in parallel with the zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
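<br />
Assuming the module is already loaded, the same parameter can also be adjusted at runtime through sysfs, without a reboot:<br />
<br />
# echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />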
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the SPL hostid. There are two solutions for this. Either place the SPL hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then [[regenerate the initramfs]] image, which will copy the hostid into it.<br />
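<br />
For example, the OpenZFS ''zgenhostid'' utility (see zgenhostid(8)) can write the current hostid to {{ic|/etc/hostid}}; a minimal sketch:<br />
<br />
# zgenhostid $(hostid)<br />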
<br />
=== Pool cannot be found while booting from SAS/SCSI devices ===<br />
<br />
In case you are booting from SAS/SCSI based devices, you might occasionally get boot problems where the pool you are trying to boot from cannot be found. A likely reason for this is that your devices are initialized too late in the boot process, so that ZFS cannot find any devices at the time it tries to assemble your pool.<br />
<br />
In this case you should force the scsi driver to wait for devices to come online before continuing. You can do this by putting this into {{ic|/etc/modprobe.d/zfs.conf}}:<br />
<br />
{{hc|1=/etc/modprobe.d/zfs.conf|2=<br />
options scsi_mod scan=sync<br />
}}<br />
<br />
Afterwards, [[regenerate the initramfs]].<br />
<br />
This works because the zfs hook copies the file {{ic|/etc/modprobe.d/zfs.conf}} into the initcpio at build time, so the setting is applied when the initramfs loads the module.<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During installation with the archiso, the network configuration could be different, generating a different hostid than the one in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then [[regenerate the initramfs]] in the normally booted system.<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses ZFS. Manually tell ZFS the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The value here is just an example:<br />
<br />
{{hc|$ hostid|<br />
0a0af0f8<br />
}}<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|1=spl.spl_hostid=0x0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check adding {{ic|1=zfs_force=1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
=== Pool resilvering stuck/restarting/slow? ===<br />
<br />
According to the ZFSonLinux GitHub, this has been a known issue since 2012: the ZFS-ZED daemon can cause the resilvering process to constantly restart, sometimes get stuck, and be generally slow on some hardware. The simplest mitigation is to stop {{ic|zfs-zed.service}} until the resilver completes.<br />
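<br />
A minimal mitigation sketch:<br />
<br />
# systemctl stop zfs-zed.service<br />
<br />
Once {{ic|zpool status}} shows that the resilver has completed, start the service again:<br />
<br />
# systemctl start zfs-zed.service<br />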
<br />
=== Fix slow boot caused by failed import of unavailable pools in the initramfs zpool.cache ===<br />
<br />
Your boot time can be significantly impacted if you update your initramfs (e.g. when doing a kernel update) while additional but non-permanently attached pools are imported, because these pools will get added to your initramfs zpool.cache and ZFS will attempt to import these extra pools on every boot, regardless of whether you have exported them and removed them from your regular zpool.cache.<br />
<br />
If you notice ZFS trying to import unavailable pools at boot, first run:<br />
<br />
$ zdb -C<br />
<br />
To check your zpool.cache for pools you do not want imported at boot. If this command shows one or more additional, currently unavailable pools, run:<br />
<br />
# zpool set cachefile=/etc/zfs/zpool.cache zroot<br />
<br />
To clear the zpool.cache of any pools other than the pool named zroot. Sometimes there is no need to refresh your zpool.cache; instead, all you need to do is [[regenerate the initramfs]].<br />
<br />
== Tips and tricks ==<br />
<br />
=== Embed the archzfs packages into an archiso ===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Complete [[Archiso#Build_the_ISO|Build the ISO]] to build the image.<br />
<br />
{{Note|If you later have problems running {{ic|modprobe zfs}}, include {{ic|linux-headers}} in {{ic|packages.x86_64}}.}}<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc.), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible per label: if, for example, no monthlies are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}} on it.<br />
<br />
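For example (dataset names are illustrative):<br />
<br />
# zfs set com.sun:auto-snapshot=false <pool>/tmp<br />
# zfs set com.sun:auto-snapshot:monthly=false <pool>/data<br />
<br />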
{{Note|zfs-auto-snapshot-git will not create snapshots during scrubbing ([[#Scrub|scrub]]{{Broken section link}}). It is possible to override this by [[Systemd#Editing provided units|editing the provided systemd unit]] and removing {{ic|--skip-scrub}} from the {{ic|ExecStart}} line. The consequences of doing so are not documented.}}<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup rotation scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
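<br />
A minimal replication sketch (snapshot, host, and pool names are hypothetical; this is the general {{ic|zfs send}}/{{ic|zfs receive}} pattern, not this package's configuration syntax):<br />
<br />
# zfs snapshot <pool>/data@replica1<br />
# zfs send <pool>/data@replica1 | ssh <backuphost> zfs receive <backuppool>/data<br />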
<br />
=== Creating a share ===<br />
<br />
ZFS has support for creating shares by SMB or [[NFS]].<br />
<br />
==== NFS ====<br />
<br />
Make sure [[NFS]] has been installed/configured; note that there is no need to edit the {{ic|/etc/exports}} file. For sharing over NFS, the services {{ic|nfs-server.service}} and {{ic|zfs-share.service}} should be [[start|started]].<br />
<br />
To make a pool available on the network:<br />
<br />
# zfs set sharenfs=on <nameofzpool><br />
<br />
To make a dataset available on the network:<br />
<br />
# zfs set sharenfs=on <nameofzpool>/<nameofdataset><br />
<br />
To enable read/write access for specific IP ranges:<br />
<br />
# zfs set sharenfs="rw=@192.168.1.100/24,rw=@10.0.0.0/24" <nameofzpool>/<nameofdataset><br />
<br />
To check if the dataset is exported successfully:<br />
<br />
{{hc|# showmount -e `hostname`|<br />
Export list for hostname:<br />
/path/of/dataset 192.168.1.100/24<br />
}}<br />
<br />
To view the current loaded exports state in more detail, use:<br />
<br />
{{hc|# exportfs -v|2=<br />
/path/of/dataset<br />
192.168.1.100/24(sync,wdelay,hide,no_subtree_check,mountpoint,sec=sys,rw,secure,no_root_squash,no_all_squash)<br />
}}<br />
<br />
==== SMB ====<br />
<br />
When sharing over SMB, configuring usershares in your {{ic|smb.conf}} will allow ZFS to set up and create the shares:<br />
<br />
{{hc|/etc/samba/smb.conf|2=<br />
[global]<br />
usershare path = /var/lib/samba/usershares<br />
usershare max shares = 100<br />
usershare allow guests = yes<br />
usershare owner only = no<br />
}}<br />
<br />
Create and set permissions on the usershare directory as root:<br />
<br />
# mkdir /var/lib/samba/usershares<br />
# chmod +t /var/lib/samba/usershares<br />
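<br />
With usershares in place, sharing a dataset over SMB is then analogous to the NFS case above, using the {{ic|sharesmb}} property:<br />
<br />
# zfs set sharesmb=on <nameofzpool>/<nameofdataset><br />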
<br />
=== Encryption in ZFS using dm-crypt ===<br />
<br />
ZFS on Linux 0.8 and later supports native encryption directly (see [[#Native encryption]]); alternatively, zpools can be created in dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can be created across multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, which makes compression ineffective, and even identical input produces different output (thanks to salting), which makes deduplication impossible. To reduce the unnecessary overhead, it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from a live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux but using the matching kernel modules directory name under the chroot's /lib/modules)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
[[Regenerate the initramfs]]. There should be no errors.<br />
<br />
=== Bind mount ===<br />
<br />
Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
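<br />
Alternatively, the same dependency can be expressed with an explicit mount unit (a sketch; the unit file name must match the {{ic|Where}} path):<br />
<br />
{{hc|/etc/systemd/system/srv-nfs4-music.mount|<nowiki><br />
[Unit]<br />
Requires=zfs-mount.service<br />
After=zfs-mount.service<br />
<br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
</nowiki>}}<br />
<br />
Then [[enable]] {{ic|srv-nfs4-music.mount}}.<br />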
<br />
=== Monitoring / Mailing on Events ===<br />
<br />
See [https://ramsdenj.com/2016/08/29/arch-linux-on-zfs-part-3-followup.html ZED: The ZFS Event Daemon] for more information.<br />
<br />
An email forwarder, such as [[S-nail]] (installed as part of {{Grp|base}}), is required to accomplish this. Test it to be sure it is working correctly.<br />
<br />
Uncomment the following in the configuration file:<br />
<br />
{{hc|/etc/zfs/zed.d/zed.rc|<nowiki><br />
ZED_EMAIL_ADDR="root"<br />
ZED_EMAIL_PROG="mailx"<br />
ZED_NOTIFY_VERBOSE=0<br />
ZED_EMAIL_OPTS="-s '@SUBJECT@' @ADDRESS@"<br />
</nowiki>}}<br />
<br />
Update 'root' in {{ic|1=ZED_EMAIL_ADDR="root"}} to the email address you want to receive notifications at.<br />
<br />
If you are keeping your mailrc in your home directory, you can tell mail to get it from there by setting {{ic|MAILRC}}:<br />
<br />
{{hc|/etc/zfs/zed.d/zed.rc|2=<br />
export MAILRC=/home/<user>/.mailrc<br />
}}<br />
<br />
This works because ZED sources this file, so {{ic|mailx}} sees this environment variable.<br />
<br />
If you want to receive an email no matter the state of your pool, you will want to set {{ic|1=ZED_NOTIFY_VERBOSE=1}}. You will need to do this temporarily, for testing.<br />
<br />
[[Start]] and [[enable]] {{ic|zfs-zed.service}}.<br />
<br />
With {{ic|1=ZED_NOTIFY_VERBOSE=1}}, you can test by running a scrub as root: {{ic|1=zpool scrub <pool-name>}}.<br />
<br />
=== Wrap shell commands in pre & post snapshots ===<br />
<br />
Since it is so cheap to make a snapshot, we can use this as a measure of safety for sensitive commands such as system and package upgrades. If we make a snapshot before, and one after, we can later diff these snapshots to find out what changed on the filesystem after the command executed. Furthermore, we can also roll back in case the outcome was not desired.<br />
<br />
E.g.:<br />
<br />
# zfs snapshot -r zroot@pre<br />
# pacman -Syu<br />
# zfs snapshot -r zroot@post<br />
# zfs diff zroot@pre zroot@post <br />
# zfs rollback zroot@pre<br />
<br />
A utility that automates the creation of pre and post snapshots around a shell command is [https://gist.github.com/erikw/eeec35be33e847c211acd886ffb145d5 znp].<br />
<br />
E.g.:<br />
<br />
# znp pacman -Syu<br />
# znp find / -name "something*" -delete<br />
<br />
and you would get snapshots created before and after the supplied command; the output of the command is also logged to a file for future reference, so it is clear which command created the diff seen in a pair of pre/post snapshots.<br />
<br />
=== Remote unlocking of ZFS encrypted root ===<br />
<br />
As of [https://github.com/archzfs/archzfs/pull/261 PR #261], {{ic|archzfs}} supports SSH unlocking of natively-encrypted ZFS datasets. This section describes how to use this feature, and is largely based on [[dm-crypt/Specialties#Remote unlocking (hooks: netconf, dropbear, tinyssh, ppp)]].<br />
<br />
#Install {{Pkg|mkinitcpio-netconf}} to provide hooks for setting up early user space networking.<br />
#Choose an SSH server to use in early user space. The options are {{Pkg|mkinitcpio-tinyssh}} or {{Pkg|mkinitcpio-dropbear}}, and are mutually exclusive.<br />
##If using {{Pkg|mkinitcpio-tinyssh}}, it is also recommended to install {{Pkg|tinyssh-convert}} or {{AUR|tinyssh-convert-git}}. This tool converts an existing OpenSSH hostkey to the TinySSH key format, preserving the key fingerprint and avoiding connection warnings. The TinySSH and Dropbear mkinitcpio install scripts will automatically convert existing hostkeys when generating a new initcpio image.<br />
#Decide whether to use an existing OpenSSH key or generate a new one (recommended) for the host that will be connecting to and unlocking the encrypted ZFS machine. Copy the public key into {{ic|/etc/tinyssh/root_key}} or {{ic|/etc/dropbear/root_key}}. When generating the initcpio image, this file will be added to {{ic|authorized_keys}} for the root user and is only valid in the initrd environment.<br />
#Add the {{ic|1=ip=}} [[kernel parameter]] to your boot loader configuration. The {{ic|ip}} string is [https://www.kernel.org/doc/Documentation/filesystems/nfs/nfsroot.txt highly configurable]. A simple DHCP example is shown below.{{bc|1=ip=:::::eth0:dhcp}}<br />
#Edit {{ic|/etc/mkinitcpio.conf}} to include the {{ic|netconf}}, {{ic|dropbear}} or {{ic|tinyssh}}, and {{ic|zfsencryptssh}} hooks before the {{ic|zfs}} hook:{{bc|1=HOOKS=(... netconf <tinyssh>{{!}}<dropbear> zfsencryptssh zfs ...)}}<br />
#[[Regenerate the initramfs]].<br />
#Reboot and try it out!<br />
<br />
====Changing the SSH server port====<br />
<br />
By default, {{Pkg|mkinitcpio-tinyssh}} and {{Pkg|mkinitcpio-dropbear}} listen on port {{ic|22}}. You may wish to change this.<br />
<br />
For '''TinySSH''', copy {{ic|/usr/lib/initcpio/hooks/tinyssh}} to {{ic|/etc/initcpio/hooks/tinyssh}}, and find/modify the following line in the {{ic|run_hook()}} function:<br />
<br />
{{hc|/etc/initcpio/hooks/tinyssh|<br />
/usr/bin/tcpserver -HRDl0 0.0.0.0 <new_port> /usr/sbin/tinysshd -v /etc/tinyssh/sshkeydir &<br />
}}<br />
<br />
For '''Dropbear''', copy {{ic|/usr/lib/initcpio/hooks/dropbear}} to {{ic|/etc/initcpio/hooks/dropbear}}, and find/modify the following line in the {{ic|run_hook()}} function:<br />
<br />
{{hc|/etc/initcpio/hooks/dropbear|<br />
/usr/sbin/dropbear -E -s -j -k -p <new_port><br />
}}<br />
<br />
[[Regenerate the initramfs]].<br />
<br />
==== Unlocking from a Windows machine using PuTTY/Plink ====<br />
<br />
First, we need to use {{ic|puttygen.exe}} to import and convert the OpenSSH key generated earlier into PuTTY's ''.ppk'' private key format. Let us call it {{ic|zfs_unlock.ppk}} for this example.<br />
<br />
The mkinitcpio-netconf process above does not set up a shell (nor do we need one). However, because there is no shell, PuTTY will immediately close after a successful connection. This can be disabled in the PuTTY SSH configuration (''Connection -> SSH -> [X] Do not start a shell or command at all''), but it still does not allow us to see stdout or enter the encryption passphrase. Instead, we use {{ic|plink.exe}} with the following parameters:<br />
<br />
plink.exe -ssh -l root -i c:\path\to\zfs_unlock.ppk <hostname><br />
<br />
The plink command can be put into a batch script for ease of use.<br />
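<br />
For example (a sketch; the batch file name, key path, and hostname are placeholders):<br />
<br />
{{hc|unlock_zfs.bat|<nowiki><br />
@echo off<br />
plink.exe -ssh -l root -i c:\path\to\zfs_unlock.ppk <hostname><br />
</nowiki>}}<br />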
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook -- The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]{{Dead link|2017|05|30}}<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]<br />
* [https://github.com/danboid/creating-ZFS-disks-under-Linux How to create cross platform ZFS disks under Linux]<br />
* [https://blog.heckel.xyz/2017/01/08/zfs-encryption-openzfs-zfs-on-linux/ How-To: Using ZFS Encryption at Rest in OpenZFS (ZFS on Linux, ZFS on FreeBSD, …)]</div>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
[[ru:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
* ZFS still resides in [[Arch User Repository]] and unofficial [[Unofficial user repositories #archzfs|archzfs]] repository.<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.}}<br />
<br />
== Installation ==<br />
=== General ===<br />
<br />
{{warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
* {{AUR|zfs-linux}} for [http://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for the hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for the hardened kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
<br />
These branches have (according to them) dependencies on the {{ic|zfs-utils}}, {{ic|spl-linux}}, {{ic|spl-utils-linux}} packages. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-linux}} or {{AUR|zfs-dkms}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of DKMS [[Dynamic Kernel Module Support]] to rebuild the ZFS modules automatically with every kernel upgrade. <br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
Note: Pacman does not take dependencies into consideration when rebuilding DKMS modules. This will result in build failures when pacman tries to rebuild DKMS modules after a kernel upgrade. See bug reports {{Bug|52901}} and {{Bug|53669}}.<br />
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
{{Note|Beginning with ZOL version 0.6.5.8 the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run.<br />
<br />
See [https://github.com/archzfs/archzfs/issues/72 https://github.com/archzfs/archzfs/issues/72] for more information.<br />
<br />
}}<br />
<br />
In order to mount zfs pools automatically on boot you need to enable the following services:<br />
<br />
# systemctl enable zfs-import-cache<br />
# systemctl enable zfs-mount<br />
<br />
== Creating a storage pool ==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to erase any old RAID configuration information. ([[Mdadm#Prepare the Devices]]) }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Warning|If you create zpools using device names (e.g. /dev/sda,/dev/sdb,...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUID can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUID and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and if desired, you can easily over provision SSD drives, and slightly over provision spindle drives to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition the all or part of the drive as a single partition. gdisk does not automatically name partitions so if partition labels are desired use gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUID are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters [[wikipedia:GUID_Partition_Table#Partition_entries]] allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUID that look like this. <br />
<br />
# ls -l /dev/disk/by-partlabel<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
# ls -l /dev/disk/by-partuuid<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced Format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
{{note|This section frequently goes out of date with updates to GRUB and ZFS. Consult the manual pages for the most up-to-date information.}}<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]] you must only enable features supported by GRUB otherwise GRUB will not be able to read the pool. GRUB 2.02 supports the read-write features {{ic|lz4_compress}}, {{ic|hole_birth}}, {{ic|embedded_data}}, {{ic|extensible_dataset}}, and {{ic|large_blocks}}; this is not suitable for all the features of ZFSonLinux 0.7.1, and must have unsupported features disabled.<br />
<br />
You can create a pool with the incompatible features disabled:<br />
<br />
# zpool create -o feature@multi_vdev_crash_dump=disabled \<br />
-o feature@large_dnode=disabled \<br />
-o feature@sha512=disabled \<br />
-o feature@skein=disabled \<br />
-o feature@edonr=disabled \<br />
$POOL_NAME $VDEVS<br />
<br />
When running the git version of ZFS on Linux, make sure to also add {{ic|1=-o feature@encryption=disabled}}.<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you need to import to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
# ###zpool import zfsdata # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PC's would not boot when a floppy disk was left in a machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
Many parameters are available for zfs file systems, you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. ZFS supports a few different algorithms, presently lz4 is the default. '''gzip''' is also available for seldom-written yet highly-compressable data; consult the man page for more details. Enable compression using the zfs command:<br />
# zfs set compression=on <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== SSD Caching ===<br />
<br />
You can also add SSD devices as a write intent log (external ZIL or SLOG) and also as a layer 2 adaptive replacement cache (l2arc). The process to add them is very similar to creating a new VDEV.<br />
<br />
All of the below references to device-id are the IDs from /dev/disk/by-id/*<br />
<br />
To add a ZIL:<br />
zpool add <pool> log <device-id><br />
<br />
or to add a mirrored ZIL:<br />
zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
<br />
To add an l2arc:<br />
zpool add <pool> cache <device-id><br />
<br />
or to add a mirrored l2arc:<br />
zpool add <pool> cache mirror <device-id-1> <device-id-2><br />
<br />
=== Database ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written to.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer to not use the ZIL, and in which case, data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (eg, with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default recordsize for ZVOLs is 8KiB already. If possible, it is best to align any partitions contained in a ZVOL to your recordsize (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''recordsize''' to accommodate the data inside the ZVOL as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity disks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the '''recordsize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use :<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including and read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow to use swapfiles, but users can use a ZFS volume (ZVOL) as swap. It is importart to set the ZVOL block size to match the system page size, which can be obtained by the {{ic|getconf PAGESIZE}} command (default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create a 8 GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o logbias=throughput -o sync=always\<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind the Hibernate hook must be loaded before filesystems, so using ZVOL as swap will not allow to use hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, set more fine-grained control as well by label, if, for example, no monthlies are to be kept on a snapshot, for example, set {{ic|1=com.sun:auto-snapshot:monthly=false}}.<br />
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots during scrubbing ([[#Scrub|scrub]]). It is possible to override this by editing provided systemd unit ([[Systemd#Editing provided units]]) and removing `--skip-scrub` from `ExecStart` line. Consequences not known, someone please edit.<br />
}}<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
== Troubleshooting ==<br />
=== Creating a zpool fails ===<br />
<br />
If the following error occurs then it can be fixed.<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# parted /dev/sda rm 1<br />
# dd if=/dev/zero of=/dev/sdb bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
Once cause for creation slowdown can be slow burst read writes on a drive. By reading from the disk in parallell to ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done for multiple drives by saving the above command for each drive to a file, one per line, and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run the zpool creation at the same time.<br />
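A minimal sketch, assuming a hypothetical {{ic|readers.txt}} listing one dd command per drive; backgrounding ''parallel'' lets the pool creation run concurrently with the reads:<br />
<br />
{{hc|readers.txt|<nowiki><br />
dd if=/dev/sda of=/dev/null<br />
dd if=/dev/sdb of=/dev/null<br />
dd if=/dev/sdc of=/dev/null<br />
</nowiki>}}<br />
<br />
# cat readers.txt | parallel &<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />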
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512 MiB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
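The limit can also be changed on a running system through the module parameter interface, and made persistent with an {{ic|options}} line in {{ic|/etc/modprobe.d/}}. A sketch, assuming a 512 MiB target (the value is in bytes):<br />
<br />
# echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />
<br />
{{hc|/etc/modprobe.d/zfs.conf|<nowiki><br />
options zfs zfs_arc_max=536870912<br />
</nowiki>}}<br />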
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the zpool create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}} and then regenerate the initramfs image, which will copy the hostid into it.<br />
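If {{ic|/etc/hostid}} does not exist yet, it can be created with the {{ic|zgenhostid}} helper shipped by recent OpenZFS releases (an assumption: older ZFS on Linux versions may lack this tool):<br />
<br />
# zgenhostid $(hostid)<br />
<br />
Then regenerate the initramfs image:<br />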
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool:<br />
<br />
# zpool import -a -f<br />
<br />
Now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following value is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0x0a0af0f8}}. Another solution is writing the hostid inside the initram image, see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
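With [[GRUB]], for example, the parameter can be made persistent by editing {{ic|/etc/default/grub}} and regenerating the configuration; a sketch using the example hostid from above:<br />
<br />
{{hc|/etc/default/grub|<nowiki><br />
GRUB_CMDLINE_LINUX="spl.spl_hostid=0x0a0af0f8"<br />
</nowiki>}}<br />
<br />
# grub-mkconfig -o /boot/grub/grub.cfg<br />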
<br />
Users can always ignore the check adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Finally, follow [[Archiso#Build_the_ISO|Build the ISO]] to build the ISO.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
==== Using dm-crypt ====<br />
The release version of ZFS on Linux does not support encryption directly, but zpools can be created in dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can be created across multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
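For non-root pools, the mapping can instead be created automatically at boot via [[Dm-crypt/System configuration#crypttab|crypttab]]. A sketch matching the plain-mode example above (device names are placeholders):<br />
<br />
{{hc|/etc/crypttab|<nowiki><br />
enc  /dev/sdX  /dev/sdZ  plain,cipher=twofish-xts-plain64,hash=sha512,size=512<br />
</nowiki>}}<br />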
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even the same input produces different output (thanks to salting), making deduplication impossible. To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
==== Native encryption ====<br />
To use native ZFS encryption, you will need a recent enough zfs package like {{AUR|zfs-linux-git}} 0.7.0.r26 or newer.<br />
<br />
To create an encrypted dataset just specify the encryption type and keyformat:<br />
# zfs create -o encryption=on -o keyformat=passphrase -o mountpoint=none pool/encr<br />
<br />
* Supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-ccm}} will be used.<br />
* Supported keyformats: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}<br />
You can also specify the number of PBKDF2 iterations with {{ic|-o pbkdf2iters <n>}}, which controls how long it takes to derive the encryption key from the passphrase.<br />
<br />
When importing a pool that contains encrypted datasets, ZFS will by default not decrypt them. To do so, use {{ic|-l}}:<br />
# zpool import -l pool<br />
You can also manually load the keys and then mount the encrypted dataset:<br />
# zfs load-key pool/dataset # load key for a specific dataset<br />
# zfs load-key -a # load all keys<br />
# zfs load-key -r zpool/dataset # load all keys recursively under a dataset<br />
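Instead of a passphrase, a dataset can use a raw key stored in a file. A minimal sketch, assuming a hypothetical key path and dataset name (raw keys must be exactly 32 bytes):<br />
<br />
# dd if=/dev/random of=/root/encr.key bs=32 count=1<br />
# zfs create -o encryption=on -o keyformat=raw -o keylocation=file:///root/encr.key pool/encr2<br />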
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH<br />
<br />
Use the version reported by {{ic|pacman -Qi linux}}, matching the kernel modules directory name under the chroot's {{ic|/lib/modules}}. This regenerates the module dependency information for the kernel version installed in the chroot.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Bind mount ===<br />
In this example, a bind mount from {{ic|/mnt/zfspool}} to {{ic|/srv/nfs4/music}} is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
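For reference, ''systemd-fstab-generator'' turns this line into a mount unit roughly equivalent to the following hand-written sketch (the unit file name must match the mount path):<br />
<br />
{{hc|/etc/systemd/system/srv-nfs4-music.mount|<nowiki><br />
[Unit]<br />
Requires=zfs-mount.service<br />
After=zfs-mount.service<br />
<br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
</nowiki>}}<br />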
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook -- The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]{{Dead link|2017|05|30}}<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=485487ZFS2017-08-16T03:22:47Z<p>Chungy: /* Native encryption */ Encryption has landed in master.</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
[[ru:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
* ZFS still resides in [[Arch User Repository]] and unofficial [[Unofficial user repositories #archzfs|archzfs]] repository.<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.}}<br />
<br />
== Installation ==<br />
=== General ===<br />
<br />
{{warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
* {{AUR|zfs-linux}} for [http://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for the hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for the hardened kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
<br />
These branches have (according to them) dependencies on the {{ic|zfs-utils}}, {{ic|spl-linux}}, {{ic|spl-utils-linux}} packages. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-linux}} or {{AUR|zfs-dkms}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of DKMS [[Dynamic Kernel Module Support]] to rebuild the ZFS modules automatically with every kernel upgrade. <br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
Note: Pacman does not take dependencies into consideration when rebuilding DKMS modules. This will result in build failures when pacman tries to rebuild DKMS modules after a kernel upgrade. See bug reports {{Bug|52901}} and {{Bug|53669}}.<br />
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
{{Note|Beginning with ZOL version 0.6.5.8 the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run.<br />
<br />
See [https://github.com/archzfs/archzfs/issues/72 https://github.com/archzfs/archzfs/issues/72] for more information.<br />
<br />
}}<br />
<br />
In order to mount zfs pools automatically on boot you need to enable the following services:<br />
<br />
# systemctl enable zfs-import-cache<br />
# systemctl enable zfs-mount<br />
<br />
== Creating a storage pool ==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to erase any old RAID configuration information. ([[Mdadm#Prepare the Devices]]) }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Warning|If you create zpools using device names (e.g. /dev/sda,/dev/sdb,...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUID can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUID and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and if desired, you can easily over provision SSD drives, and slightly over provision spindle drives to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition the all or part of the drive as a single partition. gdisk does not automatically name partitions so if partition labels are desired use gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUID are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters [[wikipedia:GUID_Partition_Table#Partition_entries]] allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUID that look like this. <br />
<br />
# ls -l /dev/disk/by-partlabel<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
# ls -l /dev/disk/by-partuuid<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced Format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
{{note|This section frequently goes out of date with updates to GRUB and ZFS. Consult the manual pages for the most up-to-date information.}}<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]] you must only enable features supported by GRUB otherwise GRUB will not be able to read the pool. GRUB 2.02 supports the read-write features {{ic|lz4_compress}}, {{ic|hole_birth}}, {{ic|embedded_data}}, {{ic|extensible_dataset}}, and {{ic|large_blocks}}; this is not suitable for all the features of ZFSonLinux 0.7.1, and must have unsupported features disabled.<br />
<br />
You can create a pool with the incompatible features disabled:<br />
<br />
# zpool create -o feature@multi_vdev_crash_dump=disabled \<br />
-o feature@large_dnode=disabled \<br />
-o feature@sha512=disabled \<br />
-o feature@skein=disabled \<br />
-o feature@edonr=disabled \<br />
$POOL_NAME $VDEVS<br />
<br />
When running the git version of ZFS on Linux, make sure to also add {{ic|1=-o feature@encryption=disabled}}.<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you need to import to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
# ###zpool import zfsdata # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PC's would not boot when a floppy disk was left in a machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
Many parameters are available for zfs file systems, you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. ZFS supports a few different algorithms, presently lz4 is the default. '''gzip''' is also available for seldom-written yet highly-compressable data; consult the man page for more details. Enable compression using the zfs command:<br />
# zfs set compression=on <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== SSD Caching ===<br />
<br />
You can also add SSD devices as a write intent log (external ZIL or SLOG) and also as a layer 2 adaptive replacement cache (l2arc). The process to add them is very similar to creating a new VDEV.<br />
<br />
All of the below references to device-id are the IDs from /dev/disk/by-id/*<br />
<br />
To add a ZIL:<br />
zpool add <pool> log <device-id><br />
<br />
or to add a mirrored ZIL:<br />
zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
<br />
To add an l2arc:<br />
zpool add <pool> cache <device-id><br />
<br />
or to add a mirrored l2arc:<br />
zpool add <pool> cache mirror <device-id-1> <device-id-2><br />
<br />
=== Database ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written to.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer to not use the ZIL, and in which case, data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (eg, with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default recordsize for ZVOLs is 8KiB already. If possible, it is best to align any partitions contained in a ZVOL to your recordsize (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''recordsize''' to accommodate the data inside the ZVOL as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity disks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the '''recordsize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use :<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including and read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow to use swapfiles, but users can use a ZFS volume (ZVOL) as swap. It is importart to set the ZVOL block size to match the system page size, which can be obtained by the {{ic|getconf PAGESIZE}} command (default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create a 8 GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o logbias=throughput -o sync=always\<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind the Hibernate hook must be loaded before filesystems, so using ZVOL as swap will not allow to use hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, set more fine-grained control as well by label, if, for example, no monthlies are to be kept on a snapshot, for example, set {{ic|1=com.sun:auto-snapshot:monthly=false}}.<br />
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots during scrubbing ([[#Scrub|scrub]]). It is possible to override this by editing provided systemd unit ([[Systemd#Editing provided units]]) and removing `--skip-scrub` from `ExecStart` line. Consequences not known, someone please edit.<br />
}}<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
== Troubleshooting ==<br />
=== Creating a zpool fails ===<br />
<br />
If the following error occurs then it can be fixed.<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# parted /dev/sda rm 1<br />
# dd if=/dev/zero of=/dev/sdb bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
Once cause for creation slowdown can be slow burst read writes on a drive. By reading from the disk in parallell to ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running <br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the zfs create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hosted. There are two solutions, for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader. For example, adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then regenerate the initramfs image. Which will copy the hostid into the initramfs image.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not there is some other problem.<br />
<br />
Reboot again, if the zfs pool refuses to mount it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number, once the hostid is coherent across the reboots the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. This one is just an example.<br />
% hostid<br />
0a0af0f8<br />
<br />
This number have to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0x0a0af0f8}}. Another solution is writing the hostid inside the initram image, see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted it should be replaced A.S.A.P. with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Complete [[Archiso#Build_the_ISO|Build the ISO]] to finally build the iso.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
==== Using dm-crypt ====<br />
ZFS on linux does not support encryption directly, but zpools can be created in dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their name is fixed. So you just need to change {{ic|zpool create}} commands to<br />
point to that names. The idea is configuring the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created in multiple devices (raid, mirroring, striping, ...), it is important all the devices are encrypted otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinicpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data has always high entropy making compression ineffective and even from the same input you get different output (thanks to salting) making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example to have an encrypted home: (the two passwords, encryption and login, must be the same)<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
==== Native encryption ====<br />
To use native ZFS encryption, you will need a recent enough zfs package like {{AUR|zfs-linux-git}} 0.7.0.r26 or newer.<br />
<br />
To create an encrypted dataset just specify the encryption type and keyformat:<br />
# zfs create -o encryption=on -o keyformat=passphrase -o mountpoint=none pool/encr<br />
<br />
* Supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-ccm}} will be used.<br />
* Supported keyformats: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}<br />
You can also specify iterations of PBKDF2 with {{ic|-o pbkdf2iters <n>}} (Time it takes to decrypt the key)<br />
<br />
When importing a pool that contains encrypted datasets, ZFS will not decrypt them by default. To load the keys during import, use {{ic|-l}}:<br />
# zpool import -l pool<br />
You can also manually load the keys and then mount the encrypted dataset:<br />
 # zfs load-key pool/dataset # load key for a specific dataset<br />
 # zfs load-key -a # load all keys<br />
 # zfs load-key -r pool/dataset # load all keys recursively under a dataset<br />
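<br />
Putting these options together, a sketch of a full round trip (the dataset name and iteration count are placeholders, not recommendations):<br />
<br />
 # zfs create -o encryption=aes-256-gcm -o keyformat=passphrase -o pbkdf2iters=1000000 pool/secret<br />
 # zfs load-key pool/secret<br />
 # zfs mount pool/secret<br />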
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from a live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable the [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database, and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If the two versions differ, run depmod (in the chroot) with the kernel version of the chroot installation, using the matching kernel modules directory name under the chroot's {{ic|/lib/modules}}:<br />
<br />
 # depmod -a 3.6.9-1-ARCH<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
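<br />
Before rebooting, leave the chroot and export the pool so the installed system can import it cleanly on the next boot (a sketch; {{ic|<pool>}} is a placeholder):<br />
<br />
 # exit<br />
 # umount /mnt/boot/efi /mnt/boot<br />
 # zpool export <pool><br />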
<br />
=== Bind mount ===<br />
Here, a bind mount from {{ic|/mnt/zfspool}} to {{ic|/srv/nfs4/music}} is created. The configuration ensures that the ZFS pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
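<br />
The same dependency can also be expressed with a hand-written mount unit instead of the fstab line. This is a minimal sketch assuming the paths above; note that the unit file name must match the mount point:<br />
<br />
{{hc|/etc/systemd/system/srv-nfs4-music.mount|<nowiki><br />
[Unit]<br />
Requires=zfs-mount.service<br />
After=zfs-mount.service<br />
<br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
</nowiki>}}<br />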
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook -- The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]{{Dead link|2017|05|30}}<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=485486ZFS2017-08-16T03:19:09Z<p>Chungy: /* GRUB-compatible pool creation */ Fix formating with the = sign.</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
[[ru:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
* ZFS still resides in [[Arch User Repository]] and unofficial [[Unofficial user repositories #archzfs|archzfs]] repository.<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.}}<br />
<br />
== Installation ==<br />
=== General ===<br />
<br />
{{warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
* {{AUR|zfs-linux}} for [http://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for the hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for the hardened kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
<br />
These packages declare dependencies on the {{ic|zfs-utils}}, {{ic|spl-linux}} and {{ic|spl-utils-linux}} packages. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-linux}} or {{AUR|zfs-dkms}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of [[Dynamic Kernel Module Support]] (DKMS) to rebuild the ZFS modules automatically with every kernel upgrade.<br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
{{Note|Pacman does not take dependencies into consideration when rebuilding DKMS modules. This results in build failures when pacman tries to rebuild DKMS modules after a kernel upgrade. See bug reports {{Bug|52901}} and {{Bug|53669}}.}}<br />
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live up to its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit of this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
{{Note|Beginning with ZOL version 0.6.5.8 the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run.<br />
<br />
See [https://github.com/archzfs/archzfs/issues/72 https://github.com/archzfs/archzfs/issues/72] for more information.<br />
<br />
}}<br />
<br />
In order to mount zfs pools automatically on boot you need to enable the following services:<br />
<br />
# systemctl enable zfs-import-cache<br />
# systemctl enable zfs-mount<br />
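<br />
After the next boot, you can check that the pool was imported and its datasets mounted (a quick sketch; output will vary):<br />
<br />
 # zpool list<br />
 $ zfs mount<br />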
<br />
== Creating a storage pool ==<br />
<br />
Use {{ic|parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the ZFS filesystem.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information ([[Mdadm#Prepare the Devices]]).}}<br />
<br />
{{Warning|For Advanced Format disks with a 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes choose an ashift value that is not ideal. Once the pool has been created, the only way to change the ashift is to recreate the pool. Using an ashift of 12 can also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool ZFS on Linux developers recommend] using device IDs when creating ZFS storage pools of fewer than 10 devices. To find the IDs:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Warning|If you create zpools using device names (e.g. /dev/sda,/dev/sdb,...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels, but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning, rather than using the whole disk for ZFS, offers two additional advantages: the OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and, if desired, you can easily over-provision SSDs and slightly over-provision spindle drives, so that different models with slightly different sector counts can be swapped in with {{ic|zpool replace}} in your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command {{ic|c}} to label them. Some reasons to prefer labels over UUIDs: labels are easy to control, labels can be chosen to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters ([[wikipedia:GUID_Partition_Table#Partition_entries]]), allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
 # ls -l /dev/disk/by-partlabel<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
 # ls -l /dev/disk/by-partuuid<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
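<br />
The same pattern applies to other vdev types. For example, a sketch of a two-way mirror (the device IDs here are placeholders for real IDs from {{ic|/dev/disk/by-id}}):<br />
<br />
 # zpool create -f -m /mnt/data bigdata \<br />
              mirror \<br />
              ata-DISKMODEL_SERIAL1 \<br />
              ata-DISKMODEL_SERIAL2<br />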
<br />
=== Advanced Format disks ===<br />
<br />
If Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because the disks report that size for backwards compatibility with legacy systems. This would result in degraded performance. To make sure the correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be passed (see the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
{{note|This section frequently goes out of date with updates to GRUB and ZFS. Consult the manual pages for the most up-to-date information.}}<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS and you are using [[GRUB]], you must enable only the features supported by GRUB, otherwise GRUB will not be able to read the pool. GRUB 2.02 supports the read-write features {{ic|lz4_compress}}, {{ic|hole_birth}}, {{ic|embedded_data}}, {{ic|extensible_dataset}}, and {{ic|large_blocks}}; this does not cover all the features of ZFSonLinux 0.7.1, so the unsupported ones must be disabled.<br />
<br />
You can create a pool with the incompatible features disabled:<br />
<br />
# zpool create -o feature@multi_vdev_crash_dump=disabled \<br />
-o feature@large_dnode=disabled \<br />
-o feature@sha512=disabled \<br />
-o feature@skein=disabled \<br />
-o feature@edonr=disabled \<br />
$POOL_NAME $VDEVS<br />
<br />
When running the git version of ZFS on Linux, make sure to also add {{ic|1=-o feature@encryption=disabled}}.<br />
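<br />
Afterwards, you can verify which features ended up enabled on the pool (a sketch, using the same {{ic|$POOL_NAME}} as above):<br />
<br />
 # zpool get all $POOL_NAME | grep feature@<br />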
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto-mount and you will need to import it to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
 # zpool import zfsdata # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}}, which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PCs would not boot with a floppy disk left in the drive. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
Many parameters are available for ZFS file systems; you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool. It can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that: transparent compression of data. ZFS supports a few different algorithms; presently lz4 is the default. '''gzip''' is also available for seldom-written yet highly-compressible data; consult the man page for more details. Enable compression using the zfs command:<br />
# zfs set compression=on <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
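<br />
To inspect only the properties discussed here rather than the full list (a sketch):<br />
<br />
 # zfs get atime,relatime,compression <pool><br />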
<br />
=== SSD Caching ===<br />
<br />
You can also add SSD devices as a write intent log (external ZIL, or SLOG) or as a layer 2 adaptive replacement cache (L2ARC). The process of adding them is very similar to creating a new VDEV.<br />
<br />
All references to ''device-id'' below are the IDs from {{ic|/dev/disk/by-id/}}.<br />
<br />
To add a ZIL:<br />
 # zpool add <pool> log <device-id><br />
<br />
or to add a mirrored ZIL:<br />
 # zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
To add an L2ARC:<br />
 # zpool add <pool> cache <device-id><br />
<br />
Cache devices cannot be mirrored; to use several, simply list them all and reads will be distributed across them:<br />
 # zpool add <pool> cache <device-id-1> <device-id-2><br />
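<br />
Log and cache devices can later be removed again with {{ic|zpool remove}}, identifying the device the same way it was added (a sketch):<br />
<br />
 # zpool remove <pool> <device-id><br />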
<br />
=== Database ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS has to write out a whole new 128KiB block each time only a few bytes of a record are changed.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases often sync their data files to the file system on their own transaction commits anyway. The end result is that ZFS commits data '''twice''' to the data disks, which can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily large sets of files or simply keeping your RAM free of idle data, you can generally improve the performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
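<br />
After the next boot, you can confirm that {{ic|/tmp}} is backed by the ZFS dataset rather than tmpfs (a sketch):<br />
<br />
 $ findmnt /tmp<br />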
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size for ZVOLs ({{ic|volblocksize}}) is already 8KiB. If possible, it is best to align any partitions contained in a ZVOL to the block size (current versions of fdisk and gdisk automatically align at 1MiB segments by default, which works), and file system block sizes to the same size. Other than this, you might tweak the block size to accommodate the data inside the ZVOL as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks. This can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the block size to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
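<br />
Alternatively, a [[systemd]] timer can drive the weekly scrub. The following is only a sketch under the assumption that you create both units yourself; {{ic|zfs-scrub@.service}} and {{ic|zfs-scrub@.timer}} are hypothetical unit names, not files shipped by any package, and the {{ic|ExecStart}} path may need adjusting:<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.service|<nowiki><br />
[Unit]<br />
Description=Scrub ZFS pool %i<br />
<br />
[Service]<br />
Type=oneshot<br />
ExecStart=/usr/bin/zpool scrub %i<br />
</nowiki>}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.timer|<nowiki><br />
[Unit]<br />
Description=Weekly scrub of ZFS pool %i<br />
<br />
[Timer]<br />
OnCalendar=weekly<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=timers.target<br />
</nowiki>}}<br />
<br />
Enable the timer for a given pool with {{ic|systemctl enable zfs-scrub@<pool>.timer}}.<br />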
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including any read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid in the archiso differs from the one in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating that the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool or adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that has already been created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but a ZFS volume (ZVOL) can be used as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8 GiB zfs volume:<br />
<br />
 # zfs create -V 8G -b $(getconf PAGESIZE) \<br />
              -o logbias=throughput -o sync=always \<br />
              -o primarycache=metadata \<br />
              -o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a separate partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from the [[AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc.), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, control can be made more fine-grained by label; for example, if no monthlies are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}} on it.<br />
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots during scrubbing ([[#Scrub|scrub]]). It is possible to override this by [[Systemd#Editing provided units|editing the provided systemd unit]] and removing {{ic|--skip-scrub}} from the {{ic|ExecStart}} line. The consequences of doing so are not documented.<br />
}}<br />
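<br />
For example (the dataset names are placeholders):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/scratch<br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/media<br />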
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from the [[AUR]] provides a Python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"grandfather-father-son"]] scheme. It can be configured to keep, for example, 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots.<br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
== Troubleshooting ==<br />
=== Creating a zpool fails ===<br />
<br />
If the following error occurs, it can usually be worked around:<br />
<br />
 the kernel failed to rescan the partition table: 16<br />
 cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made:<br />
<br />
 # parted /dev/sda rm 1<br />
 # parted /dev/sda rm 1<br />
 # dd if=/dev/zero of=/dev/sdb bs=512 count=1<br />
 # zpool labelclear /dev/sda<br />
<br />
A brute-force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads and writes on a drive. By reading from the disk in parallel with zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file, one per line, and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
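<br />
The limit can also be changed at runtime through the module parameter, without rebooting (a sketch; the value is in bytes):<br />
<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />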
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem:<br />
<br />
 /dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot, with the following lines appearing before initscript output:<br />
<br />
 ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the SPL hostid. There are two solutions for this. Either place the SPL hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}, or make sure that a hostid is present in {{ic|/etc/hostid}} and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
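<br />
If {{ic|/etc/hostid}} does not exist yet, it can be created before regenerating the image. The {{ic|zgenhostid}} helper is shipped with recent zfs-utils; this is an assumption, and on versions without it the file has to be written by hand:<br />
<br />
 # zgenhostid $(hostid)<br />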
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation from the archiso, the network configuration could be different, generating a different hostid than the one in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double-check that the pool is properly exported. Exporting the zpool clears the hostid marking ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. This one is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0x0a0af0f8}}. Another solution is writing the hostid inside the initram image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation of this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but this is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Complete [[Archiso#Build_the_ISO|Build the ISO]] to finally build the iso.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
==== Using dm-crypt ====<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while retaining all the advantages of ZFS, such as deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you only need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can span multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line must enable the keyboard for the password prompt, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce this unnecessary overhead, it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, for encryption and login, must be the same):<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
==== Native encryption ====<br />
{{Warning|Encryption in ZFS is not yet merged upstream, so do this at your own risk!}}<br />
To use native ZFS encryption, you will need a patched zfs package such as {{AUR|zfs-encryption-dkms-git}}.<br />
<br />
To create an encrypted dataset just specify the encryption type and keyformat:<br />
# zfs create -o encryption=on -o keyformat=passphrase -o mountpoint=none pool/encr<br />
<br />
* Supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-ccm}} will be used.<br />
* Supported keyformats: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}<br />
You can also specify the number of PBKDF2 iterations with {{ic|-o pbkdf2iters <n>}}, which controls how long it takes to derive the key from the passphrase.<br />
<br />
When importing a pool that contains encrypted datasets, ZFS will not decrypt them by default. To load the keys during import, use {{ic|-l}}:<br />
# zpool import -l pool<br />
You can also manually load the keys and then mount the encrypted dataset:<br />
 # zfs load-key pool/dataset # load key for a specific dataset<br />
 # zfs load-key -a # load all keys<br />
 # zfs load-key -r pool/dataset # load all keys recursively under a dataset<br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If the two versions differ, run depmod (in the chroot) with the kernel version of the chroot installation, using the matching kernel modules directory name under the chroot's {{ic|/lib/modules}}:<br />
<br />
 # depmod -a 3.6.9-1-ARCH<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Bind mount ===<br />
Here, a bind mount from {{ic|/mnt/zfspool}} to {{ic|/srv/nfs4/music}} is created. The configuration ensures that the ZFS pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook -- The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]{{Dead link|2017|05|30}}<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=485485ZFS2017-08-16T03:17:51Z<p>Chungy: /* GRUB-compatible pool creation */ Note to disable encryption support too.</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
[[ru:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
* ZFS still resides in [[Arch User Repository]] and unofficial [[Unofficial user repositories #archzfs|archzfs]] repository.<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.}}<br />
<br />
== Installation ==<br />
=== General ===<br />
<br />
{{warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
* {{AUR|zfs-linux}} for [http://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for the hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for the hardened kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
<br />
These packages declare dependencies on the {{ic|zfs-utils}}, {{ic|spl-linux}} and {{ic|spl-utils-linux}} packages. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-linux}} or {{AUR|zfs-dkms}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of [[Dynamic Kernel Module Support]] (DKMS) to rebuild the ZFS modules automatically with every kernel upgrade.<br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
{{Note|Pacman does not take dependencies into consideration when rebuilding DKMS modules. This results in build failures when pacman tries to rebuild DKMS modules after a kernel upgrade. See bug reports {{Bug|52901}} and {{Bug|53669}}.}}<br />
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon, execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
{{Note|Beginning with ZOL version 0.6.5.8, the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run. See [https://github.com/archzfs/archzfs/issues/72 archzfs issue #72] for more information.}}<br />
<br />
In order to mount zfs pools automatically on boot, you need to enable the following services:<br />
<br />
# systemctl enable zfs-import-cache<br />
# systemctl enable zfs-mount<br />
<br />
== Creating a storage pool ==<br />
<br />
Use {{ic|parted --list}} as root to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information. ([[Mdadm#Prepare the Devices]])}}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes use an ashift value that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool ZFS on Linux developers recommend] using device IDs when creating ZFS storage pools of fewer than 10 devices. To find the IDs:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The IDs should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Warning|If you create zpools using device names (e.g. /dev/sda,/dev/sdb,...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and, if desired, you can easily over-provision SSD drives, and slightly over-provision spindle drives, to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This provides considerable organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUIDs are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters [[wikipedia:GUID_Partition_Table#Partition_entries]], allowing large data pools to be labeled in an organized fashion.<br />
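<br />
For instance, a whole disk can be prepared as a single labeled ZFS partition with ''sgdisk'' (the scriptable counterpart to gdisk). This is only a sketch; {{ic|/dev/sdx}} and the label {{ic|zfsdata1}} are placeholders to adapt:<br />
<br />
 # sgdisk --zap-all /dev/sdx<br />
 # sgdisk --new=1:0:0 --typecode=1:bf00 --change-name=1:zfsdata1 /dev/sdx<br />
<br />
Here {{ic|1=--new=1:0:0}} spans the largest free block, {{ic|bf00}} is the Solaris root type code conventionally used for ZFS, and the name set with {{ic|--change-name}} will appear under {{ic|/dev/disk/by-partlabel}}.<br />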
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
 # ls -l /dev/disk/by-partlabel<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
 # ls -l /dev/disk/by-partuuid<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. RAID-Z is a special implementation of RAID 5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced Format disks ===<br />
<br />
If Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of backwards compatibility with legacy systems. This would result in degraded performance. To make sure the correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
raidz1-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
{{note|This section frequently goes out of date with updates to GRUB and ZFS. Consult the manual pages for the most up-to-date information.}}<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]], you must only enable features supported by GRUB, otherwise GRUB will not be able to read the pool. GRUB 2.02 supports the read-write features {{ic|lz4_compress}}, {{ic|hole_birth}}, {{ic|embedded_data}}, {{ic|extensible_dataset}}, and {{ic|large_blocks}}; this does not cover all the features of ZFSonLinux 0.7.1, so any unsupported features must be disabled when the pool is created.<br />
<br />
You can create a pool with the incompatible features disabled:<br />
<br />
# zpool create -o feature@multi_vdev_crash_dump=disabled \<br />
-o feature@large_dnode=disabled \<br />
-o feature@sha512=disabled \<br />
-o feature@skein=disabled \<br />
-o feature@edonr=disabled \<br />
$POOL_NAME $VDEVS<br />
<br />
When running the git version of ZFS on Linux, make sure to also add {{ic|-o feature@encryption=disabled}}.<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto-mount and you will need to import it to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
# zpool import zfsdata    # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}}, which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PCs would not boot when a floppy disk was left in the machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
Many parameters are available for zfs file systems; you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool, and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. ZFS supports a few different algorithms; presently lz4 is the default. '''gzip''' is also available for seldom-written yet highly-compressible data; consult the man page for more details. Enable compression using the zfs command:<br />
# zfs set compression=on <pool><br />
<br />
Other zfs options can be displayed, again using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== SSD Caching ===<br />
<br />
You can add SSD devices as a write intent log (external ZIL, or SLOG) and as a layer 2 adaptive replacement cache (L2ARC). The process to add them is very similar to creating a new VDEV.<br />
<br />
All of the below references to ''device-id'' are the IDs from {{ic|/dev/disk/by-id/*}}.<br />
<br />
To add a ZIL:<br />
 # zpool add <pool> log <device-id><br />
<br />
or to add a mirrored ZIL:<br />
 # zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
To add an L2ARC:<br />
 # zpool add <pool> cache <device-id><br />
<br />
Cache devices cannot be mirrored; to use more than one, simply add the additional devices:<br />
 # zpool add <pool> cache <device-id-1> <device-id-2><br />
<br />
=== Database ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS must allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size for ZVOLs (the {{ic|volblocksize}} property) is already 8KiB. If possible, it is best to align any partitions contained in a ZVOL to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the {{ic|volblocksize}} to accommodate the data inside the ZVOL as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
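<br />
As a minimal sketch, a ZVOL with an explicit block size can be created like this ({{ic|<pool>/vmdisk}} and the 32G size are arbitrary placeholders):<br />
<br />
 # zfs create -V 32G -o volblocksize=8K <pool>/vmdisk<br />
<br />
The resulting block device appears as {{ic|/dev/zvol/<pool>/vmdisk}} and can be partitioned and formatted like any other disk.<br />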
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, which can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the {{ic|volblocksize}} to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807] for details.<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS-specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt made to import an un-exported storage pool will result in an error stating that the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair, by either exporting the pool or adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool:<br />
<br />
# zpool export bigdata<br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8 GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o logbias=throughput -o sync=always \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow you to use the hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible by label; for example, if no monthlies are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}}.<br />
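<br />
For example (pool and dataset names are placeholders):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />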
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots during scrubbing ([[#Scrub|scrub]]). It is possible to override this by editing the provided systemd unit ([[Systemd#Editing provided units]]) and removing {{ic|--skip-scrub}} from the {{ic|ExecStart}} line. The consequences of doing so are not known.}}<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[AUR]] provides a Python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots.<br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
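<br />
Independent of this package, the underlying replication mechanism can be sketched with plain commands (the snapshot, host and pool names are placeholders):<br />
<br />
 # zfs snapshot <pool>/<dataset>@monday<br />
 # zfs send <pool>/<dataset>@monday | ssh <remotehost> zfs receive <backuppool>/<dataset><br />
<br />
A later transfer can use {{ic|zfs send -i}} with the previous snapshot to send only the incremental changes.<br />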
<br />
== Troubleshooting ==<br />
=== Creating a zpool fails ===<br />
<br />
If the following error occurs, it can be fixed.<br />
<br />
 the kernel failed to rescan the partition table: 16<br />
 cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# parted /dev/sda rm 1<br />
# dd if=/dev/zero of=/dev/sdb bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute-force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads/writes on a drive. By reading from the disk in parallel with zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run the zpool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
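<br />
The limit can usually also be changed at runtime through the module parameter, without rebooting (assuming the zfs module is loaded; the value is in bytes):<br />
<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />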
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use the {{ic|-f}} argument with the ''zpool create'' command.<br />
<br />
=== No hostid found ===<br />
<br />
This error occurs at boot, with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}} and then regenerate the initramfs image, which will copy the hostid into the initramfs image:<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force-import the zpool:<br />
<br />
# zpool import -a -f<br />
<br />
Now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double-check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses ZFS. Manually tell ZFS the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0x0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Complete [[Archiso#Build_the_ISO|Build the ISO]] to build the ISO.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
==== Using dm-crypt ====<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted; otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even from the same input you get different output (thanks to salting), making deduplication impossible. To reduce the unnecessary overhead, it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, for encryption and login, must be the same):<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
==== Native encryption ====<br />
{{Warning|Encryption in ZFS is not yet merged upstream, so do this at your own risk!}}<br />
To use native ZFS encryption, you will need a patched zfs package such as {{AUR|zfs-encryption-dkms-git}}.<br />
<br />
To create an encrypted dataset, just specify the encryption type and keyformat:<br />
# zfs create -o encryption=on -o keyformat=passphrase -o mountpoint=none pool/encr<br />
<br />
* Supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-ccm}} will be used.<br />
* Supported keyformats: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}<br />
You can also specify the number of PBKDF2 iterations with {{ic|-o pbkdf2iters <n>}}, which affects the time it takes to unlock the dataset with the passphrase.<br />
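<br />
As a sketch, a stronger key-derivation setting can be combined with an explicit cipher at creation time (the dataset name and iteration count are arbitrary):<br />
<br />
 # zfs create -o encryption=aes-256-gcm -o keyformat=passphrase -o pbkdf2iters=1000000 pool/encr<br />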
<br />
When importing a pool that contains encrypted datasets, ZFS will by default not decrypt these datasets. To load the keys during import, use {{ic|-l}}:<br />
# zpool import -l pool<br />
You can also manually load the keys and then mount the encrypted dataset:<br />
# zfs load-key pool/dataset # load key for a specific dataset<br />
# zfs load-key -a # load all keys<br />
# zfs load-key -r zpool/dataset # load all keys recursively under a dataset<br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from a live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux but using the matching kernel modules directory name under the chroot's /lib/modules)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Bind mount ===<br />
Here, a bind mount from {{ic|/mnt/zfspool}} to {{ic|/srv/nfs4/music}} is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook -- The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]{{Dead link|2017|05|30}}<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]</div>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
[[ru:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
* ZFS still resides in [[Arch User Repository]] and unofficial [[Unofficial user repositories #archzfs|archzfs]] repository.<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.}}<br />
<br />
== Installation ==<br />
=== General ===<br />
<br />
{{warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
Install from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
* {{AUR|zfs-linux}} for [http://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-linux-lts-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for LTS kernels.<br />
* {{AUR|zfs-linux-hardened}} for stable releases for the hardened kernels.<br />
* {{AUR|zfs-linux-hardened-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases for the hardened kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
<br />
These branches have (according to them) dependencies on the {{ic|zfs-utils}}, {{ic|spl-linux}}, {{ic|spl-utils-linux}} packages. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-linux}} or {{AUR|zfs-dkms}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of DKMS [[Dynamic Kernel Module Support]] to rebuild the ZFS modules automatically with every kernel upgrade. <br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
Note: Pacman does not take dependencies into consideration when rebuilding DKMS modules. This will result in build failures when pacman tries to rebuild DKMS modules after a kernel upgrade. See bug reports {{Bug|52901}} and {{Bug|53669}}.<br />
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
{{Note|Beginning with ZOL version 0.6.5.8 the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run.<br />
<br />
See [https://github.com/archzfs/archzfs/issues/72 https://github.com/archzfs/archzfs/issues/72] for more information.<br />
<br />
}}<br />
<br />
In order to mount zfs pools automatically on boot you need to enable the following services:<br />
<br />
# systemctl enable zfs-import-cache<br />
# systemctl enable zfs-mount<br />
<br />
== Creating a storage pool ==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to erase any old RAID configuration information. ([[Mdadm#Prepare_the_Devices]]) }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Warning|If you create zpools using device names (e.g. /dev/sda,/dev/sdb,...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUID can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUID and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and if desired, you can easily over provision SSD drives, and slightly over provision spindle drives to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition the all or part of the drive as a single partition. gdisk does not automatically name partitions so if partition labels are desired use gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUID are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters [[wikipedia:GUID_Partition_Table#Partition_entries]] allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUID that look like this. <br />
<br />
# ls -l /dev/disk/by-partlabel<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
# ls -l /dev/disk/by-partuuid<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced Format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
{{note|This section frequently goes out of date with updates to GRUB and ZFS. Consult the manual pages for the most up-to-date information.}}<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]] you must only enable features supported by GRUB otherwise GRUB will not be able to read the pool. GRUB 2.02 supports the read-write features {{ic|lz4_compress}}, {{ic|hole_birth}}, {{ic|embedded_data}}, {{ic|extensible_dataset}}, and {{ic|large_blocks}}; this is not suitable for all the features of ZFSonLinux 0.7.0, and must have unsupported features disabled.<br />
<br />
You can create a pool with the incompatible features disabled:<br />
<br />
# zpool create -o feature@multi_vdev_crash_dump=disabled \<br />
-o feature@large_dnode=disabled \<br />
-o feature@sha512=disabled \<br />
-o feature@skein=disabled \<br />
-o feature@edonr=disabled \<br />
$POOL_NAME $VDEVS<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you need to import to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
# ###zpool import zfsdata # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PC's would not boot when a floppy disk was left in a machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
Many parameters are available for zfs file systems, you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that: transparent compression of data. ZFS supports a few different algorithms; presently lz4 is the default. '''gzip''' is also available for seldom-written yet highly-compressible data; consult the man page for more details. Enable compression using the zfs command:<br />
# zfs set compression=on <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== SSD Caching ===<br />
<br />
You can also add SSD devices as a write intent log (external ZIL, or SLOG) and as a layer 2 adaptive replacement cache (L2ARC). The process to add them is very similar to creating a new VDEV.<br />
<br />
All of the below references to device-id are the IDs from /dev/disk/by-id/*<br />
<br />
To add a ZIL:<br />
zpool add <pool> log <device-id><br />
<br />
or to add a mirrored ZIL:<br />
zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
<br />
To add an l2arc:<br />
zpool add <pool> cache <device-id><br />
<br />
Cache devices cannot be configured as a mirror; to stripe the cache across multiple devices, simply add them all:<br />
 zpool add <pool> cache <device-id-1> <device-id-2><br />
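<br />
Log and cache devices can later be removed from the pool again with {{ic|zpool remove}}:<br />
 zpool remove <pool> <device-id><br />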
<br />
=== Database ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This often helps reduce fragmentation and speed up file access, at the cost that ZFS has to allocate a new 128KiB block each time only a few bytes of an existing block are changed.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is committed to the file system only once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g., with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size (the {{ic|volblocksize}} property) for ZVOLs is 8KiB already. If possible, it is best to align any partitions contained in a ZVOL to your ZVOL block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the ZVOL as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
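<br />
For example, a minimal sketch of creating a ZVOL with an explicit block size ({{ic|volblocksize}} can only be set at creation time; the ''vmdisk'' name is just an illustration):<br />
 # zfs create -V 32G -o volblocksize=8K <pool>/vmdisk<br />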
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the '''volblocksize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807] for details.<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time, abruptly dropping the system into the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not support the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8 GiB ZFS volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
 -o logbias=throughput -o sync=always \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
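<br />
Verify that the new swap device is active:<br />
 $ swapon --show<br />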
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a separate partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
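<br />
For example, to keep two years of monthly snapshots instead of one, the monthly task could invoke the script along these lines (a sketch; check the cron file installed by the package for its exact invocation):<br />
 zfs-auto-snapshot --label=monthly --keep=24 //<br />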
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, control can be applied per label; for example, if no monthly snapshots are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}} on it.<br />
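<br />
These are ordinary ZFS properties set with {{ic|zfs set}}, for example:<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />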
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots during scrubbing ([[ZFS#Scrub|scrub]]). It is possible to override this by editing the provided systemd unit ([[Systemd#Editing_provided_units]]) and removing {{ic|--skip-scrub}} from the {{ic|ExecStart}} line. The consequences of doing so are not documented.<br />
}}<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a Python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots.<br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
== Troubleshooting ==<br />
=== Creating a zpool fails ===<br />
<br />
If the following error occurs, it can be fixed as described below.<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
 # parted /dev/sda rm 1<br />
 # parted /dev/sda rm 9<br />
 # dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
 # zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the pool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads/writes on a drive. By reading from the disk in parallel with the pool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running <br />
<br />
# cat $FILE | parallel<br />
<br />
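For example, for three drives, {{ic|$FILE}} would contain:<br />
 dd if=/dev/sda of=/dev/null<br />
 dd if=/dev/sdb of=/dev/null<br />
 dd if=/dev/sdc of=/dev/null<br />
<br />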
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
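<br />
The same limit can also be adjusted at runtime through the module parameter (note that lowering it may not immediately shrink an already-populated ARC):<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />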
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error may occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the ''zpool create'' command.<br />
<br />
=== No hostid found ===<br />
<br />
This error occurs at boot, with the following line appearing before the initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the SPL hostid. There are two solutions for this. Either place the SPL hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}} and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses ZFS. Manually tell ZFS the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid (the value below is just an example):<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0x0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Follow [[Archiso#Build_the_ISO|Build the ISO]] to build the ISO.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
==== Using dm-crypt ====<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
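<br />
For a non-root pool, the mapping can instead be created at boot via {{ic|/etc/crypttab}}. A minimal sketch matching the plain-mode example above (the field values are assumptions; verify the option names against {{ic|crypttab(5)}}):<br />
{{hc|/etc/crypttab|<nowiki><br />
enc  /dev/sdX  /dev/sdZ  plain,hash=sha512,cipher=twofish-xts-plain64,size=512<br />
</nowiki>}}<br />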
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and, thanks to salting, even identical input produces different output, making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example to have an encrypted home: (the two passwords, encryption and login, must be the same)<br />
<br />
# zfs create -o compression=off -o dedup=off -o mountpoint=/home/<username> <zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
==== Native encryption ====<br />
{{Warning|Encryption in ZFS is not yet merged upstream, so do this at your own risk!}}<br />
To use native ZFS encryption, you will need a patched zfs package like {{AUR|zfs-encryption-dkms-git}}.<br />
<br />
To create an encrypted dataset, just specify the encryption type and keyformat:<br />
# zfs create -o encryption=on -o keyformat=passphrase -o mountpoint=none pool/encr<br />
<br />
* Supported encryption options: {{ic|aes-128-ccm}}, {{ic|aes-192-ccm}}, {{ic|aes-256-ccm}}, {{ic|aes-128-gcm}}, {{ic|aes-192-gcm}} and {{ic|aes-256-gcm}}. When encryption is set to {{ic|on}}, {{ic|aes-256-ccm}} will be used.<br />
* Supported keyformats: {{ic|passphrase}}, {{ic|raw}}, {{ic|hex}}<br />
You can also specify the number of PBKDF2 iterations with {{ic|-o pbkdf2iters <n>}} (this affects how long it takes to unlock the key).<br />
<br />
When importing a pool that contains encrypted datasets, ZFS will by default not decrypt these datasets. To load their keys during import, use {{ic|-l}}:<br />
# zpool import -l pool<br />
You can also manually load the keys and then mount the encrypted dataset:<br />
# zfs load-key pool/dataset # load key for a specific dataset<br />
# zfs load-key -a # load all keys<br />
 # zfs load-key -r zpool/dataset # recursively load all keys under a dataset<br />
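<br />
To unmount an encrypted dataset and unload its key from memory again (assuming the patched package follows the upstream OpenZFS interface):<br />
 # zfs unmount pool/encr<br />
 # zfs unload-key pool/encr<br />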
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from a live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux but using the matching kernel modules directory name under the chroot's /lib/modules)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Bind mount ===<br />
Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
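<br />
Alternatively, the same bind mount can be written directly as a mount unit; a minimal sketch (the unit file name must be the systemd-escaped form of the {{ic|Where&#61;}} path):<br />
{{hc|/etc/systemd/system/srv-nfs4-music.mount|<nowiki><br />
[Unit]<br />
Requires=zfs-mount.service<br />
After=zfs-mount.service<br />
<br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Install]<br />
WantedBy=multi-user.target<br />
</nowiki>}}<br />
<br />
Enable it like any other unit with {{ic|systemctl enable srv-nfs4-music.mount}}.<br />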
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook -- The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]{{Dead link|2017|05|30}}<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]</div>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
[[zh-hans:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
{{Note|Due to potential legal incompatibilities between CDDL license of ZFS code and GPL of the Linux kernel ([https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ ],[[wikipedia:Common_Development_and_Distribution_License#GPL_compatibility|CDDL-GPL]],[[wikipedia:ZFS#Linux|ZFS in Linux]]) - ZFS development is not supported by the kernel.<br />
<br />
As a result:<br />
* ZFS still resides in [[Arch User Repository]] and unofficial [[Unofficial user repositories #archzfs|archzfs]] repository.<br />
* ZFSonLinux project must keep up with Linux kernel versions. After making stable ZFSonLinux release - Arch ZFS maintainers release them.<br />
* This situation sometimes locks down the normal rolling update process by unsatisfied dependencies because the new kernel version, proposed by update, is unsupported by ZFSonLinux.}}<br />
<br />
== Installation ==<br />
=== General ===<br />
<br />
{{warning|Unless you use the [[dkms]] versions of these packages, the ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from [[Unofficial user repositories#archzfs|archzfs]] repo if your current kenel is newer.}}<br />
<br />
Install from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
* {{AUR|zfs-linux}} for [http://zfsonlinux.org/ stable] releases.<br />
* {{AUR|zfs-linux-git}} for [https://github.com/zfsonlinux/zfs/releases development] releases (with support of newer kernel versions).<br />
* {{AUR|zfs-linux-lts}} for stable releases for LTS kernels.<br />
* {{AUR|zfs-dkms}} for versions with dynamic kernel module support.<br />
<br />
These branches have (according to them) dependencies on the {{ic|zfs-utils}}, {{ic|spl-linux}}, {{ic|spl-utils-linux}} packages. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-linux}} or {{AUR|zfs-dkms}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of DKMS [[Dynamic Kernel Module Support]] to rebuild the ZFS modules automatically with every kernel upgrade. <br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
Note: Pacman does not take dependencies into consideration when rebuilding DKMS modules. This will result in build failures when pacman tries to rebuild DKMS modules after a kernel upgrade. See bug reports {{Bug|52901}} and {{Bug|53669}}.<br />
<br />
== Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
== Configuration ==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
=== Automatic Start ===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
{{Note|Beginning with ZOL version 0.6.5.8 the ZFS service unit files have been changed so that you need to explicitly enable any ZFS services you want to run.<br />
<br />
See [https://github.com/archzfs/archzfs/issues/72 https://github.com/archzfs/archzfs/issues/72] for more information.<br />
<br />
}}<br />
<br />
In order to mount zfs pools automatically on boot you need to enable the following services:<br />
<br />
# systemctl enable zfs-import-cache<br />
# systemctl enable zfs-mount<br />
<br />
== Creating a storage pool ==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to erase any old RAID configuration information. ([[Mdadm#Prepare_the_Devices]]) }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Warning|If you create zpools using device names (e.g. /dev/sda,/dev/sdb,...) ZFS might not be able to detect zpools intermittently on boot.}}<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUID can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUID and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and if desired, you can easily over provision SSD drives, and slightly over provision spindle drives to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition the all or part of the drive as a single partition. gdisk does not automatically name partitions so if partition labels are desired use gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUID are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters [[wikipedia:GUID_Partition_Table#Partition_entries]] allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUID that look like this. <br />
<br />
# ls -l /dev/disk/by-partlabel<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
# ls -l /dev/disk/by-partuuid<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced Format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
{{note|This section frequently goes out of date with updates to GRUB and ZFS. Consult the manual pages for the most up-to-date information.}}<br />
<br />
By default, ''zpool create'' enables all features on a pool. If {{ic|/boot}} resides on ZFS when using [[GRUB]] you must only enable features supported by GRUB otherwise GRUB will not be able to read the pool. GRUB 2.02 supports the read-write features {{ic|lz4_compress}}, {{ic|hole_birth}}, {{ic|embedded_data}}, {{ic|extensible_dataset}}, and {{ic|large_blocks}}; this is suitable for all the features of ZFSonLinux 0.6.5.x, but not for the 0.7.0 branch provided by the git packages.<br />
<br />
When using the git-based packages (0.7.0) and GRUB 2.02, you can create a pool with the incompatible features disabled:<br />
<br />
# zpool create -o feature@multi_vdev_crash_dump=disabled \<br />
-o feature@large_dnode=disabled \<br />
-o feature@sha512=disabled \<br />
-o feature@skein=disabled \<br />
-o feature@edonr=disabled \<br />
$POOL_NAME $VDEVS<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you need to import to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
# ###zpool import zfsdata # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PC's would not boot when a floppy disk was left in a machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
<br />
Many parameters are available for zfs file systems, you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. ZFS supports a few different algorithms, presently lz4 is the default. '''gzip''' is also available for seldom-written yet highly-compressable data; consult the man page for more details. Enable compression using the zfs command:<br />
# zfs set compression=on <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== SSD Caching ===<br />
<br />
You can also add SSD devices as a write intent log (external ZIL or SLOG) and also as a layer 2 adaptive replacement cache (l2arc). The process to add them is very similar to creating a new VDEV.<br />
<br />
All of the below references to device-id are the IDs from /dev/disk/by-id/*<br />
<br />
To add a ZIL:<br />
zpool add <pool> log <device-id><br />
<br />
or to add a mirrored ZIL:<br />
zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
<br />
To add an l2arc:<br />
zpool add <pool> cache <device-id><br />
<br />
or to add a mirrored l2arc:<br />
zpool add <pool> cache mirror <device-id-1> <device-id-2><br />
<br />
=== Database ===<br />
<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written to.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer to not use the ZIL, and in which case, data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (eg, with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default recordsize for ZVOLs is 8KiB already. If possible, it is best to align any partitions contained in a ZVOL to your recordsize (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''recordsize''' to accommodate the data inside the ZVOL as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity disks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the '''recordsize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use :<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including and read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Exporting a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Renaming a zpool ===<br />
<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow to use swapfiles, but users can use a ZFS volume (ZVOL) as swap. It is importart to set the ZVOL block size to match the system page size, which can be obtained by the {{ic|getconf PAGESIZE}} command (default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8 GiB ZFS volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o logbias=throughput -o sync=always \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible per label: for example, if no monthly snapshots are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}}.<br />
<br />
{{Note|zfs-auto-snapshot-git will not create snapshots during scrubbing ([[ZFS#Scrub|scrub]]). It is possible to override this by [[Systemd#Editing_provided_units|editing the provided systemd unit]] and removing {{ic|--skip-scrub}} from the {{ic|ExecStart}} line.<br />
}}<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
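<br />
As a rough sketch of what such replication looks like under the hood (the pool, dataset, snapshot, and host names below are hypothetical):<br />
<br />
# zfs snapshot bigdata/data@backup-today<br />
# zfs send bigdata/data@backup-today | ssh backuphost zfs receive backuppool/data<br />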
<br />
== Troubleshooting ==<br />
=== Creating a zpool fails ===<br />
<br />
If the following error occurs when creating a zpool, it can be fixed:<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# parted /dev/sda rm 2<br />
# dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst read/writes on a drive. By reading from the disk in parallel to zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512 MiB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
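<br />
If the zfs module is already loaded, the limit can usually also be changed at runtime, assuming the module exposes the standard {{ic|zfs_arc_max}} parameter under sysfs:<br />
<br />
# echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />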
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error may occur when attempting to create a zfs filesystem:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
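<br />
One way to create {{ic|/etc/hostid}} is with the {{ic|zgenhostid}} utility shipped with recent OpenZFS releases (it may not be available in older versions); passing the current hostid keeps the value stable:<br />
<br />
# zgenhostid $(hostid)<br />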
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation from the archiso, the network configuration could be different, generating a different hostid from the one in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again; if the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses ZFS. Manually tell ZFS the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0x0a0af0f8}}. Another solution is writing the hostid into the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Complete [[Archiso#Build_the_ISO|Build the ISO]] to finally build the iso.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while retaining all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, which makes compression ineffective, and even identical input produces different output (thanks to salting), which makes deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux but using the matching kernel modules directory name under the chroot's /lib/modules)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Bind mount ===<br />
Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
<br />
== See also ==<br />
<br />
* [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ Aaron Toponce's 17-part blog on ZFS]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html FreeBSD Handbook -- The Z File System]<br />
* [https://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]{{Dead link|2017|05|30}}<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ How Pingdom uses ZFS to back up 5TB of MySQL data every day]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=449192ZFS2016-09-03T10:06:46Z<p>Chungy: /* GRUB-compatible pool creation */ Latest GRUB supports all the features of the latest ZFS stable</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. <br />
<br />
Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum 256 Quadrillion [[Wikipedia:Zettabyte|Zettabytes]] storage with no limit on number of filesystems (datasets) or files[http://docs.oracle.com/cd/E19253-01/819-5461/zfsover-2/index.html]. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
=== General ===<br />
Install {{AUR|zfs-linux-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository. This package has {{AUR|zfs-utils-linux-git}} and {{AUR|spl-linux-git}} as dependencies, the latter of which in turn has {{AUR|spl-utils-linux-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-linux-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#archzfs|archzfs]] repository. A script to build ZFS and its dependencies automatically can be found [[#Automated build script|here]]{{Broken section link}}.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#archzfs|archzfs]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
{{Tip| You can [[downgrade]] your linux version to the one from the [[Unofficial user repositories#archzfs|archzfs]] repo if your current kernel is newer.}}<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-git}}{{Broken package link|{{aur-mirror|zfs-git}}}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
=== DKMS ===<br />
<br />
Users can make use of [[Dynamic Kernel Module Support|DKMS]] to rebuild the ZFS modules automatically with every kernel upgrade.<br />
<br />
Install {{AUR|zfs-dkms}} or {{AUR|zfs-dkms-git}} and apply the post-install instructions given by these packages.<br />
<br />
{{Tip|Add an {{ic|IgnorePkg}} entry to [[pacman.conf]] to prevent these packages from upgrading when doing a regular update.}}<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs), which can be simple files like {{ic|~/zfs0.img}}, {{ic|~/zfs1.img}}, {{ic|~/zfs2.img}}, etc., with no possibility of real data loss, are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
{{Note|If {{ic|zfs-mount.service}} fails on boot due to running before the kernel module is loaded, you may have to manually enable the {{ic|zfs-import-cache.service}}. <br />
<br />
# systemctl enable zfs-import-cache.service<br />
<br />
}}<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic|# parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to erase any old RAID configuration information. ([[Mdadm#Prepare_the_Devices]]) }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?], [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the ids of the drives to add to the zpool. The [https://github.com/zfsonlinux/zfs/wiki/faq#selecting-dev-names-when-creating-a-pool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the ids:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages: the OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and, if desired, you can easily over-provision SSD drives (and slightly over-provision spindle drives) to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUIDs are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters ([[wikipedia:GUID_Partition_Table#Partition_entries]]), allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
# ls -l /dev/disk/by-partlabel<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
# ls -l /dev/disk/by-partuuid<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
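<br />
For instance, a single ZFS partition spanning a whole disk, with a GPT partition label, could also be created non-interactively with sgdisk (a sketch; the device and the {{ic|zfsdata1}} label are hypothetical, and the {{ic|bf01}} type code is an assumption, being the gdisk code commonly used for ZFS partitions):<br />
<br />
# sgdisk -n 1:0:0 -t 1:bf01 -c 1:zfsdata1 /dev/sdX<br />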
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool, taken from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [https://github.com/zfsonlinux/zfs/wiki/faq#advanced-format-disks ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
By default, ''zpool'' will enable all features on a pool. If {{ic|/boot}} resides on ZFS and you are using [[GRUB]], you must only enable features that GRUB supports (at least read-only), otherwise GRUB will not be able to read the pool. As of GRUB 2.02.beta3, GRUB supports all features in ZFS-on-Linux 0.6.5.7. However, the Git master branch of ZoL contains one extra feature, {{ic|large_dnodes}}, that is not yet supported by GRUB.<br />
<br />
This example line is only necessary if you are using the Git branch of ZoL:<br />
<br />
# zpool create -f -d \<br />
-o feature@async_destroy=enabled \<br />
-o feature@empty_bpobj=enabled \<br />
-o feature@lz4_compress=enabled \<br />
-o feature@spacemap_histogram=enabled \<br />
-o feature@enabled_txg=enabled \<br />
-o feature@hole_birth=enabled \<br />
-o feature@bookmarks=enabled \<br />
-o feature@filesystem_limits=enabled \<br />
-o feature@embedded_data=enabled \<br />
-o feature@large_blocks=enabled \<br />
<pool_name> <vdevs><br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto-mount and you will need to import it to bring the pool back. Take care to avoid the most obvious solution:<br />
<br />
# zpool import zfsdata # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}}, which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PCs would not boot when a floppy disk was left in the machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with:<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool, and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. ZFS supports a few different algorithms; presently lz4 is the default. '''gzip''' is also available for seldom-written yet highly-compressible data; consult the man page for more details. Enable compression using the zfs command:<br />
# zfs set compression=on <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== SSD Caching ===<br />
You can also add SSD devices as a write intent log (external ZIL or SLOG), or as a layer 2 adaptive replacement cache (L2ARC). The process to add them is very similar to creating a new VDEV.<br />
<br />
All of the below references to ''device-id'' are the ids from {{ic|/dev/disk/by-id/}}.<br />
<br />
To add a ZIL:<br />
<br />
# zpool add <pool> log <device-id><br />
<br />
or to add a mirrored ZIL:<br />
<br />
# zpool add <pool> log mirror <device-id-1> <device-id-2><br />
<br />
To add an l2arc:<br />
<br />
# zpool add <pool> cache <device-id><br />
<br />
Note that cache devices cannot be mirrored or be part of a raidz configuration; to use more than one, simply add them all and reads will be distributed across them:<br />
<br />
# zpool add <pool> cache <device-id-1> <device-id-2><br />
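<br />
A log or cache device added this way can later be detached again; {{ic|zpool remove}} works for these auxiliary vdev types:<br />
<br />
# zpool remove <pool> <device-id><br />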
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help reduce fragmentation and speed up file access, at the cost that ZFS has to allocate a new 128KiB block each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size ({{ic|volblocksize}}) for ZVOLs is 8 KiB already. If possible, it is best to align any partitions contained in a ZVOL to your block size (current versions of fdisk and gdisk by default automatically align at 1 MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the ZVOL as necessary (though 8 KiB tends to be a good value for most file systems, even when using 4 KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity blocks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the '''volblocksize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
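<br />
As a sketch, a sparse ZVOL with a larger block size could be created as follows (the names and sizes are hypothetical; {{ic|-s}} makes the volume sparse and {{ic|-b}} sets the volblocksize):<br />
<br />
# zfs create -V 100G -b 16K -s <pool>/<zvol><br />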
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid in the archiso is different from the one in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating that the storage pool is in use by another system. This error can appear at boot time, abruptly dropping the system into the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but a ZFS volume (ZVOL) can be used as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4 KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8 GiB ZFS volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind the Hibernate hook must be loaded before filesystems, so using ZVOL as swap will not allow to use hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible per label: for example, if no monthly snapshots are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}}.<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following error occurs when creating a zpool, it can be fixed:<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# parted /dev/sda rm 2<br />
# dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst read/writes on a drive. By reading from the disk in parallel to zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512 MiB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error may occur when attempting to create a zfs filesystem:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation from the archiso, the network configuration could be different, generating a different hostid from the one in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again; if the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses ZFS. Manually tell ZFS the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0x0a0af0f8}}. Another solution is writing the hostid into the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [https://github.com/zfsonlinux/zfs/wiki/faq#performance-considerations 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#archzfs|archzfs]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[archzfs]<br />
Server = http://archzfs.com/$repo/x86_64<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-linux}} group to the list of packages to be installed (the {{ic|archzfs}} repository provides packages for the x86_64 architecture only).<br />
<br />
{{hc|~/archlive/packages.x86_64|<br />
...<br />
archzfs-linux<br />
}}<br />
<br />
Complete [[Archiso#Build_the_ISO|Build the ISO]] to finally build the iso.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while retaining all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, which makes compression ineffective, and even identical input produces different output (thanks to salting), which makes deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from a live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable the [[Unofficial_user_repositories#archzfs|archzfs]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-archiso-linux'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux but using the matching kernel modules directory name under the chroot's /lib/modules)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Bindmount ===<br />
Here, a bind mount from {{ic|/mnt/zfspool}} to {{ic|/srv/nfs4/music}} is created. The configuration ensures that the zfs pool is ready before the bind mount is created.<br />
<br />
==== fstab ====<br />
See [http://www.freedesktop.org/software/systemd/man/systemd.mount.html systemd.mount] for more information on how systemd converts fstab into mount unit files with [http://www.freedesktop.org/software/systemd/man/systemd-fstab-generator.html systemd-fstab-generator].<br />
<br />
{{hc|/etc/fstab|<nowiki><br />
/mnt/zfspool /srv/nfs4/music none bind,defaults,nofail,x-systemd.requires=zfs-mount.service 0 0<br />
</nowiki>}}<br />
<br />
==== systemd mount unit ====<br />
<br />
If it is not possible to bind mount a directory residing on ZFS onto another directory using fstab, because fstab is read before the zfs pool is ready, a systemd mount unit can be used for the bind mount instead. The name of the mount unit must correspond to the directory given after "Where", with slashes replaced by minuses. See [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdAndBindMounts SystemdAndBindMounts] and [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdBindMountUnits SystemdBindMountUnits] for more details.<br />
{{hc|srv-nfs4-music.mount|<nowiki><br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Unit]<br />
DefaultDependencies=no<br />
Conflicts=umount.target<br />
Before=local-fs.target umount.target<br />
After=zfs-mount.service<br />
Requires=zfs-mount.service<br />
ConditionPathIsDirectory=/mnt/zfspool<br />
<br />
[Install]<br />
WantedBy=local-fs.target<br />
</nowiki>}}<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [https://github.com/zfsonlinux/zfs/wiki/faq ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
* [https://www.linuxquestions.org/questions/linux-from-scratch-13/%5Bhow-to%5D-add-zfs-to-the-linux-kernel-4175514510/ Tutorial on adding the modules to a custom kernel]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=402062ZFS2015-09-28T05:23:56Z<p>Chungy: /* General */ Recommend "compression=on" to agree with defaults ZFS developers decide on</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
=== General ===<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as dependencies, the latter of which in turn has {{AUR|spl-utils-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. A script to build ZFS and its dependencies automatically can be found [[#Automated build script|here]].<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-git}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live up to its "zero administration" name, the zfs daemon must be loaded at startup. A benefit of this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon, execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic|# parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information. ([[Mdadm#Prepare_the_Devices]]) }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandleAdvacedFormatDrives 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool ZFS on Linux developers recommend] using device IDs when creating ZFS storage pools of fewer than 10 devices. To find the IDs:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The IDs should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels, but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and if desired, you can easily over-provision SSD drives, and slightly over-provision spindle drives to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUIDs are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters ([[wikipedia:GUID_Partition_Table#Partition_entries]]), allowing large data pools to be labeled in an organized fashion.<br />
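<br />
For example, a hypothetical sketch using ''sgdisk'' (the scriptable tool from the same gptfdisk package) creating a single whole-disk partition labeled zfsdata1; the device and label are placeholders, and bf01 is the Solaris/ZFS type code used by gdisk:<br />
<br />
 # sgdisk --new=1:0:0 --change-name=1:zfsdata1 --typecode=1:bf01 /dev/sdX<br />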
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
 # ls -l /dev/disk/by-partlabel<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
 # ls -l /dev/disk/by-partuuid<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. RAID-Z is a special implementation of RAID-5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backward compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandleAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
By default, ''zpool'' will enable all features on a pool. If {{ic|/boot}} resides on ZFS and [[GRUB]] is used, you must enable only those features supported by GRUB (read-only features, plus {{ic|lz4_compress}} as of version 2.02.beta2). Otherwise GRUB will not be able to read the pool.<br />
<br />
# zpool create -d \<br />
-o feature@async_destroy=enabled \<br />
-o feature@empty_bpobj=enabled \<br />
-o feature@lz4_compress=enabled \<br />
-o feature@spacemap_histogram=enabled \<br />
-o feature@enabled_txg=enabled \<br />
<pool_name> <vdevs><br />
<br />
{{Tip|As of September 2015, GRUB's development tree supports {{ic|extensible_dataset}}, {{ic|hole_birth}}, {{ic|embedded_data}}, and {{ic|large_blocks}}, making it viable to use a pool with all features enabled, either at create time or by using {{ic|zpool upgrade <pool_name>}}, if {{AUR|grub-git}} is installed.}}<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount, and you will need to import it to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
# ###zpool import zfsdata # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}}, which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PCs would not boot when a floppy disk was left in a machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool, and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that: transparent compression of data. ZFS supports a few different algorithms; presently lz4 is the default. '''gzip''' is also available for seldom-written yet highly-compressible data; consult the man page for more details. Enable compression using the zfs command:<br />
# zfs set compression=on <pool><br />
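<br />
To verify that compression is effective, the achieved ratio can be queried, for example:<br />
 # zfs get compressratio <pool><br />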
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases often sync their data files to the file system on their own transaction commits anyway. The end result is that ZFS commits data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g., with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size for ZVOLs ({{ic|volblocksize}}) is 8KiB already. If possible, it is best to align any partitions contained in a ZVOL to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the ZVOL as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the '''volblocksize''' to 16k or 32k can help reduce this footprint drastically, as sketched below.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807] for details.<br />
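<br />
As an illustration, a hypothetical sketch creating a sparse 100GiB ZVOL with a 16KiB block size (the pool and volume names are placeholders; {{ic|volblocksize}} can only be set at creation time):<br />
<br />
 # zfs create -s -V 100G -o volblocksize=16K <pool>/<zvolname><br />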
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
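<br />
To list all datasets along with their space usage and mount points:<br />
 # zfs list<br />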
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
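<br />
Alternatively, the scrub can be scheduled with a [[systemd]] timer instead of cron; a minimal sketch, with unit names chosen purely for illustration:<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.service|<nowiki><br />
[Unit]<br />
Description=Scrub ZFS pool %i<br />
<br />
[Service]<br />
Type=oneshot<br />
ExecStart=/usr/bin/zpool scrub %i<br />
</nowiki>}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.timer|<nowiki><br />
[Unit]<br />
Description=Weekly scrub of ZFS pool %i<br />
<br />
[Timer]<br />
OnCalendar=weekly<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=timers.target<br />
</nowiki>}}<br />
<br />
Enable it per pool with {{ic|systemctl enable zfs-scrub@<pool>.timer}}.<br />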
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid in the archiso differs from that of the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not support swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained by the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible per label: if, for example, no monthlies are to be kept on a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}}, as shown below.<br />
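<br />
For example, with a placeholder dataset name:<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />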
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
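<br />
Such replication ultimately pipes a snapshot stream from one machine to the other; a minimal sketch of the underlying commands, with host, pool and dataset names as placeholders:<br />
<br />
 # zfs send <pool>/<dataset>@<snapshot> | ssh <remote-host> zfs receive <remotepool>/<dataset><br />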
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following error occurs when creating a zpool, it can be fixed.<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
 # parted /dev/sda rm 1<br />
 # dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
 # zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads and writes on a drive. By reading from the disk in parallel with ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running <br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
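<br />
The limit can also be changed at runtime via the module parameter; a sketch (the change is not persistent across reboots, and an already-filled ARC shrinks only gradually):<br />
<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />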
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the ''zpool create'' command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. One is to place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then regenerate the initramfs image, which will copy the hostid into it.<br />
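<br />
{{ic|/etc/hostid}} stores the hostid as four binary bytes rather than text; a minimal sketch writing the example hostid 0x00bab10c on a little-endian system:<br />
<br />
 # printf '\x0c\xb1\xba\x00' > /etc/hostid<br />
<br />
Then regenerate the initramfs image:<br />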
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. This one is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[demz-repo-core]<br />
Server = http://demizerone.com/$repo/$arch<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-git}} group to the list of packages to be installed:<br />
<br />
{{hc|~/archlive/packages.both|<br />
...<br />
archzfs-git<br />
}}<br />
<br />
Complete [[Archiso#Build_the_ISO|Build the ISO]] to build the ISO.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while retaining all the advantages of ZFS, such as deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you only need to change the {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line must enable the keyboard for the password prompt, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead, it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from a live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable the [[Unofficial_user_repositories#demz-repo-archiso|demz-repo-archiso]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-git'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Automated build script ===<br />
<br />
{{Deletion|The wiki isn't the place to maintain massive script dumps}}<br />
<br />
The following script may be used to build ZFS and its dependencies automatically.<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
When ZFS is used as a data drive and boot support is not needed, these two shell scripts will build and remove all zfs packages. The only requirements are {{pkg|sudo}}, {{pkg|git}}, and answering a couple of prompts. On each kernel upgrade you remove ZFS with {{ic|zfsun.sh}}, update, and install ZFS with {{ic|zfsbuild.sh}}.<br />
{{hc|~/build/zfspkg/zfsbuild.sh|<nowiki><br />
#!/usr/bin/bash<br />
#<br />
# 2015-07-17 zfsbuild.sh by severach for AUR 4<br />
# 2015-08-08 AUR4 -> AUR, added git pull, safer AUR 3.5 update folder<br />
# Adapted from ZFS Builder by graysky<br />
# place this in a user home folder.<br />
# I recommend ~/build/zfspkg/. Do not name the folder 'zfs'.<br />
<br />
# 1 to add conflicts=(linux>,linux<) which offers automatic removal on upgrade.<br />
# Manual removal with zfsun.sh is preferred.<br />
_opt_AutoRemove=0<br />
_opt_ZFSPool='zfsdata'<br />
#_opt_ZFSbyid='/dev/disk/by-partlabel'<br />
_opt_ZFSbyid='/dev/disk/by-id'<br />
# '' for manual answer to prompts. --noconfirm to go ahead and do it all.<br />
_opt_AutoInstall='' #--noconfirm'<br />
<br />
# Multiprocessor compile enabled!<br />
# Huuuuuuge performance improvement. Watch in htop.<br />
# An E3-1245 can peg all 8 processors.<br />
#1 [|||||||||||||||||||||||||96.2%]<br />
#2 [|||||||||||||||||||||||||97.6%]<br />
#3 [|||||||||||||||||||||||||95.7%]<br />
#4 [|||||||||||||||||||||||||96.7%]<br />
#5 [|||||||||||||||||||||||||95.7%]<br />
#6 [|||||||||||||||||||||||||97.1%]<br />
#7 [|||||||||||||||||||||||||98.6%]<br />
#8 [|||||||||||||||||||||||||96.2%]<br />
#Mem[||| 596/31974MB]<br />
#Swp[ 0/0MB]<br />
<br />
set -u<br />
set -e<br />
<br />
if [ "${EUID}" -eq 0 ]; then<br />
echo "This script must NOT be run as root"<br />
sleep 1<br />
exit 1<br />
fi<br />
<br />
for i in 'sudo' 'git'; do<br />
command -v "${i}" >/dev/null 2>&1 || {<br />
echo "I require ${i} but it's not installed. Aborting." 1>&2<br />
exit 1; }<br />
done<br />
<br />
cd "$(dirname "$0")"<br />
OPWD="$(pwd)"<br />
for cwpackage in 'spl-utils-git' 'spl-git' 'zfs-utils-git' 'zfs-git'; do<br />
#cower -dc -f "${cwpackage}"<br />
if [ -d "${cwpackage}" -a ! -d "${cwpackage}/.git" ]; then<br />
echo "${cwpackage}: Convert AUR3.5 to AUR4"<br />
cd "${cwpackage}"<br />
git clone "https://aur.archlinux.org/${cwpackage}.git/" "${cwpackage}.temp"<br />
cd "${cwpackage}.temp"<br />
mv '.git' ..<br />
cd ..<br />
rm -rf "${cwpackage}.temp"<br />
cd ..<br />
fi<br />
if [ -d "${cwpackage}" ]; then<br />
echo "${cwpackage}: Update local copy"<br />
cd "${cwpackage}"<br />
git fetch<br />
git reset --hard 'origin/master'<br />
git pull # this line was missed in previous versions<br />
else<br />
echo "${cwpackage}: Clone to new folder"<br />
git clone "https://aur.archlinux.org/${cwpackage}.git/" <br />
cd "${cwpackage}"<br />
fi<br />
sed -i -e 's:^\s\+make$:'"& -s -j $(nproc):g" 'PKGBUILD'<br />
if [ "${_opt_AutoRemove}" -ne 0 ]; then<br />
sed -i -e 's:^conflicts=(.*$: &\n_kernelversionsmall="`uname -r | cut -d - -f 1`"\nconflicts+=("linux>${_kernelversionsmall}" "linux<${_kernelversionsmall}")\n:g' 'PKGBUILD'<br />
fi<br />
if ! makepkg -sCcfi ${_opt_AutoInstall}; then<br />
cd "${OPWD}"<br />
break<br />
fi<br />
#rm -rf 'zfs' 'spl'<br />
cd "${OPWD}"<br />
done<br />
# 'set -e' is enabled above, so test the command directly rather than inspecting $?<br />
if command -v fsck.zfs >/dev/null 2>&1; then<br />
sudo mkinitcpio -p 'linux' # Stores fsck.zfs into the initrd image. I don't know why it would be needed.<br />
fi<br />
#sudo zpool import "${_opt_ZFSPool}" # Don't do this or zpool will mount via /dev/sd?, which you won't like!<br />
sudo zpool import -d "${_opt_ZFSbyid}" "${_opt_ZFSPool}"<br />
sudo zpool status<br />
sudo -k<br />
</nowiki>}}<br />
<br />
{{hc|~/build/zfspkg/zfsun.sh|<nowiki><br />
#!/usr/bin/bash<br />
<br />
# 2015-07-17 zfs uninstaller by severach for AUR4<br />
# Removing ZFS forgets to unmount the pools, which might be desirable if you're<br />
# running ZFS on the root file system.<br />
<br />
_opt_ZFSFolder='/home/zfsdata/foo'<br />
_opt_ZFSPool='zfsdata'<br />
<br />
if [ "${EUID}" -ne 0 ]; then<br />
echo 'Must be root, try sudo !!'<br />
sleep 1<br />
exit 1<br />
fi<br />
<br />
systemctl stop 'smbd.service' # Active shares can lock the mount. You might want to stop nfs too.<br />
zpool export "${_opt_ZFSPool}" # zpool import no longer works with drives that were zfs umount<br />
if [ ! -d "${_opt_ZFSFolder}" ]; then<br />
echo "${_opt_ZFSPool} exported"<br />
pacman -Rc 'spl-utils-git' # This works even if some are already removed.<br />
#pacman -R 'zfs-utils-git' 'spl-git' 'spl-utils-git' 'zfs-git'<br />
else<br />
echo "ZFS didn't unmount"<br />
fi<br />
systemctl start 'smbd.service'<br />
</nowiki>}}<br />
<br />
=== Bindmount ===<br />
<br />
It is not possible to bind mount a directory residing on zfs onto another directory using fstab, because the fstab is read before the zfs pool is ready. To overcome this limitation, a systemd mount unit can be used for the bind mount, as in the following example. Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The unit configuration ensures that the zfs pool is ready before the bind mount is created. The name of the mount unit must correspond to the directory given after "Where", with slashes replaced by minuses. See [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdAndBindMounts SystemdAndBindMounts] and [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdBindMountUnits SystemdBindMountUnits] for more details.<br />
{{hc|srv-nfs4-music.mount|<nowiki><br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Unit]<br />
DefaultDependencies=no<br />
Conflicts=umount.target<br />
Before=local-fs.target umount.target<br />
After=zfs-mount.service<br />
Requires=zfs-mount.service<br />
ConditionPathIsDirectory=/mnt/zfspool<br />
<br />
[Install]<br />
WantedBy=local-fs.target<br />
</nowiki>}}<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=399732ZFS2015-09-13T10:25:26Z<p>Chungy: /* GRUB-compatible pool creation */ Update notice about GRUB; it didn't really change, but ZoL gained a feature and useful to know it still works</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
=== General ===<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as dependencies; {{AUR|spl-git}} in turn has {{AUR|spl-utils-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. A script to build ZFS and its dependencies automatically can be found [[#Automated build script|here]].<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-git}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic|# parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information. See [[Mdadm#Prepare_the_Devices]].}}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes use an ashift value that is not ideal. Once the pool has been created, the only way to change the ashift value is to recreate the pool. Using an ashift of 12 also decreases available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandleAdvacedFormatDrives 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device IDs when creating ZFS storage pools of less than 10 devices. To find the IDs, simply:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels, but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages: the OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and, if desired, you can easily over-provision SSD drives (and slightly over-provision spindle drives) to ensure that different models with slightly different sector counts can {{ic|zpool replace}} into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUIDs are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters ([[wikipedia:GUID_Partition_Table#Partition_entries]]), allowing large data pools to be labeled in an organized fashion.<br />
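<br />
For example, a single partition spanning the whole disk can be created and labeled non-interactively with ''sgdisk'' (shipped in the same package as gdisk); the device name and label here are placeholders:<br />
<br />
# sgdisk -n 1:0:0 -c 1:zfsdata1 /dev/sdX<br />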
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
# ls -l /dev/disk/by-partlabel<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
# ls -l /dev/disk/by-partuuid<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandleAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
By default, ''zpool'' will enable all features on a pool. If {{ic|/boot}} resides on ZFS and GRUB is used, you must enable only those features, read-only or not, that [[GRUB]] supports (as of version 2.02.beta2, this includes {{ic|lz4_compress}}); otherwise GRUB will not be able to read the pool.<br />
<br />
# zpool create -d \<br />
-o feature@async_destroy=enabled \<br />
-o feature@empty_bpobj=enabled \<br />
-o feature@lz4_compress=enabled \<br />
-o feature@spacemap_histogram=enabled \<br />
-o feature@enabled_txg=enabled \<br />
<pool_name> <vdevs><br />
<br />
{{Tip|As of September 2015, GRUB's development tree supports {{ic|extensible_dataset}}, {{ic|hole_birth}}, {{ic|embedded_data}}, and {{ic|large_blocks}}, making it viable to use a pool with all features enabled, either at create time or by using {{ic|zpool upgrade <pool_name>}}, if {{AUR|grub-git}} is installed.}}<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you will need to import it to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
# zpool import zfsdata    # Do not do this! Always use -d as shown below.<br />
<br />
This will import your pools using {{ic|/dev/sd?}} device names, which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PCs would not boot when a floppy disk was left in the drive. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help reduce fragmentation and speed up file access, at the cost that ZFS has to allocate a new 128KiB block each time only a few bytes of a file are changed.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases often sync their data files to the file system on their own transaction commits anyway. The end result is that ZFS commits data '''twice''' to the data disks, which can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
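<br />
The resulting values can be verified afterwards with {{ic|zfs get}}, for example:<br />
<br />
# zfs get recordsize,primarycache,logbias <pool>/postgres<br />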
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== ZVOLs ===<br />
<br />
ZFS volumes (ZVOLs) can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size ({{ic|volblocksize}}) for ZVOLs is already 8KiB. If possible, it is best to align any partitions contained in a ZVOL to that block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the block size to accommodate the data inside the ZVOL as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
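<br />
As a sketch with placeholder names, the block size is fixed at creation time via the {{ic|volblocksize}} property (or the {{ic|-b}} shorthand used in [[#Swap volume]]):<br />
<br />
# zfs create -V 32G -o volblocksize=8K <pool>/<volume><br />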
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a ZVOL gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks; this can drastically increase the space requirements of a ZVOL, requiring 2× or more physical storage capacity than the ZVOL's logical capacity. Setting the {{ic|volblocksize}} to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
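<br />
Alternatively, a [[systemd]] timer can drive the weekly scrub instead of cron. The following is a minimal sketch; the {{ic|zfs-scrub@}} unit names are examples and are not shipped by any package:<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.service|<nowiki><br />
[Unit]<br />
Description=Scrub ZFS pool %i<br />
<br />
[Service]<br />
Type=oneshot<br />
ExecStart=/usr/bin/zpool scrub %i<br />
</nowiki>}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.timer|<nowiki><br />
[Unit]<br />
Description=Weekly scrub of ZFS pool %i<br />
<br />
[Timer]<br />
OnCalendar=weekly<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=timers.target<br />
</nowiki>}}<br />
<br />
Enable the timer for a given pool:<br />
<br />
# systemctl enable zfs-scrub@<pool>.timer<br />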
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can occur at boot time, abruptly dropping the system to the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the ZVOL data.<br />
<br />
{{warning|While a ZVOL creates a block device and allows you to use it as swap, swap on ZVOLs can deadlock and should be avoided. See [https://clusterhq.com/2014/09/11/zfs-on-linux-runtime-stability/ ZFS on Linux Runtime Stability] and [https://github.com/zfsonlinux/zfs/issues/1274 ZFS Swap Lockup]}}<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
{{Out of date|The hibernate hook is deprecated. Does the limitation still apply?}}<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible per label: if, for example, no monthlies are to be kept for a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}} on it.<br />
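<br />
These properties are set with ''zfs set'', for example (placeholder dataset name):<br />
<br />
# zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
# zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />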
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
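<br />
The same mechanism can also be used manually; as a sketch with placeholder names, a snapshot can be replicated to another host over SSH:<br />
<br />
# zfs snapshot <pool>/<dataset>@backup<br />
# zfs send <pool>/<dataset>@backup | ssh <remotehost> zfs receive <backuppool>/<dataset><br />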
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following error occurs then it can be fixed.<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads/writes on a drive. By reading from the disk in parallel with ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running <br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MiB)<br />
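<br />
The limit can also be changed on a running system through the module parameter; for example, to apply the same 512MiB cap without rebooting (note that the ARC may shrink only gradually):<br />
<br />
# echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />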
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the ''zpool create'' command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}} and then regenerate the initramfs image, which will copy the hostid into it.<br />
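<br />
Recent OpenZFS releases ship a {{ic|zgenhostid}} helper for this; a minimal example, which writes the current hostid to {{ic|/etc/hostid}}:<br />
<br />
# zgenhostid $(hostid)<br />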
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. This one is just an example.<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository:<br />
<br />
{{hc|~/archlive/pacman.conf|<nowiki><br />
...<br />
[demz-repo-core]<br />
Server = http://demizerone.com/$repo/$arch<br />
</nowiki>}}<br />
<br />
Add the {{ic|archzfs-git}} group to the list of packages to be installed:<br />
<br />
{{hc|~/archlive/packages.both|<br />
...<br />
archzfs-git<br />
}}<br />
<br />
Complete [[Archiso#Build the ISO|Build the ISO]] to finally build the iso.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while keeping all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
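<br />
For pools that are not needed at early boot, the mapping can instead be created automatically via {{ic|/etc/crypttab}}. The following line is only a sketch matching the plain dm-crypt example above; the device and key paths are placeholders, and the exact option set should be verified against the crypttab(5) man page:<br />
<br />
{{hc|/etc/crypttab|<nowiki><br />
enc  /dev/sdX  /dev/sdZ  plain,cipher=twofish-xts-plain64,size=512,hash=sha512<br />
</nowiki>}}<br />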
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even the same input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example to have an encrypted home: (the two passwords, encryption and login, must be the same)<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from a live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#demz-repo-archiso|demz-repo-archiso]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-git'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Automated build script ===<br />
<br />
{{Deletion|The wiki isn't the place to maintain massive script dumps}}<br />
<br />
The following script may be used to build ZFS and its dependencies automatically.<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
When ZFS is used as a data drive and boot support is not needed, these two shell scripts will build and remove all zfs packages. The only requirements are {{pkg|sudo}}, {{pkg|git}}, and answering a couple of prompts. On each kernel upgrade you remove ZFS with {{ic|zfsun.sh}}, update, and install ZFS with {{ic|zfsbuild.sh}}.<br />
{{hc|~/build/zfspkg/zfsbuild.sh|<nowiki><br />
#!/usr/bin/bash<br />
#<br />
# 2015-07-17 zfsbuild.sh by severach for AUR 4<br />
# 2015-08-08 AUR4 -> AUR, added git pull, safer AUR 3.5 update folder<br />
# Adapted from ZFS Builder by graysky<br />
# place this in a user home folder.<br />
# I recommend ~/build/zfspkg/. Do not name the folder 'zfs'.<br />
<br />
# 1 to add conflicts=(linux>,linux<) which offers automatic removal on upgrade.<br />
# Manual removal with zfsun.sh is preferred.<br />
_opt_AutoRemove=0<br />
_opt_ZFSPool='zfsdata'<br />
#_opt_ZFSbyid='/dev/disk/by-partlabel'<br />
_opt_ZFSbyid='/dev/disk/by-id'<br />
# '' for manual answer to prompts. --noconfirm to go ahead and do it all.<br />
_opt_AutoInstall='' #--noconfirm'<br />
<br />
# Multiprocessor compile enabled!<br />
# Huuuuuuge performance improvement. Watch in htop.<br />
# An E3-1245 can peg all 8 processors.<br />
#1 [|||||||||||||||||||||||||96.2%]<br />
#2 [|||||||||||||||||||||||||97.6%]<br />
#3 [|||||||||||||||||||||||||95.7%]<br />
#4 [|||||||||||||||||||||||||96.7%]<br />
#5 [|||||||||||||||||||||||||95.7%]<br />
#6 [|||||||||||||||||||||||||97.1%]<br />
#7 [|||||||||||||||||||||||||98.6%]<br />
#8 [|||||||||||||||||||||||||96.2%]<br />
#Mem[||| 596/31974MB]<br />
#Swp[ 0/0MB]<br />
<br />
set -u<br />
set -e<br />
<br />
if [ "${EUID}" -eq 0 ]; then<br />
echo "This script must NOT be run as root"<br />
sleep 1<br />
exit 1<br />
fi<br />
<br />
for i in 'sudo' 'git'; do<br />
command -v "${i}" >/dev/null 2>&1 || {<br />
echo "I require ${i} but it's not installed. Aborting." 1>&2<br />
exit 1; }<br />
done<br />
<br />
cd "$(dirname "$0")"<br />
OPWD="$(pwd)"<br />
for cwpackage in 'spl-utils-git' 'spl-git' 'zfs-utils-git' 'zfs-git'; do<br />
#cower -dc -f "${cwpackage}"<br />
if [ -d "${cwpackage}" -a ! -d "${cwpackage}/.git" ]; then<br />
echo "${cwpackage}: Convert AUR3.5 to AUR4"<br />
cd "${cwpackage}"<br />
git clone "https://aur.archlinux.org/${cwpackage}.git/" "${cwpackage}.temp"<br />
cd "${cwpackage}.temp"<br />
mv '.git' ..<br />
cd ..<br />
rm -rf "${cwpackage}.temp"<br />
cd ..<br />
fi<br />
if [ -d "${cwpackage}" ]; then<br />
echo "${cwpackage}: Update local copy"<br />
cd "${cwpackage}"<br />
git fetch<br />
git reset --hard 'origin/master'<br />
git pull # this line was missed in previous versions<br />
else<br />
echo "${cwpackage}: Clone to new folder"<br />
git clone "https://aur.archlinux.org/${cwpackage}.git/" <br />
cd "${cwpackage}"<br />
fi<br />
sed -i -e 's:^\s\+make$:'"& -s -j $(nproc):g" 'PKGBUILD'<br />
if [ "${_opt_AutoRemove}" -ne 0 ]; then<br />
sed -i -e 's:^conflicts=(.*$: &\n_kernelversionsmall="`uname -r | cut -d - -f 1`"\nconflicts+=("linux>${_kernelversionsmall}" "linux<${_kernelversionsmall}")\n:g' 'PKGBUILD'<br />
fi<br />
if ! makepkg -sCcfi ${_opt_AutoInstall}; then<br />
cd "${OPWD}"<br />
break<br />
fi<br />
#rm -rf 'zfs' 'spl'<br />
cd "${OPWD}"<br />
done<br />
if command -v 'fsck.zfs' >/dev/null 2>&1; then<br />
sudo mkinitcpio -p 'linux' # Stores fsck.zfs into the initrd image. I don't know why it would be needed.<br />
fi<br />
#sudo zpool import "${_opt_ZFSPool}" # Don't do this or zpool will mount via /dev/sd?, which you won't like!<br />
sudo zpool import -d "${_opt_ZFSbyid}" "${_opt_ZFSPool}"<br />
sudo zpool status<br />
sudo -k<br />
</nowiki>}}<br />
<br />
{{hc|~/build/zfspkg/zfsun.sh|<nowiki><br />
#!/usr/bin/bash<br />
<br />
# 2015-07-17 zfs uninstaller by severach for AUR4<br />
# Removing ZFS forgets to unmount the pools, which might be desirable if you're<br />
# running ZFS on the root file system.<br />
<br />
_opt_ZFSFolder='/home/zfsdata/foo'<br />
_opt_ZFSPool='zfsdata'<br />
<br />
if [ "${EUID}" -ne 0 ]; then<br />
echo 'Must be root, try sudo !!'<br />
sleep 1<br />
exit 1<br />
fi<br />
<br />
systemctl stop 'smbd.service' # Active shares can lock the mount. You might want to stop nfs too.<br />
zpool export "${_opt_ZFSPool}" # zpool import no longer works with drives that were zfs umount<br />
if [ ! -d "${_opt_ZFSFolder}" ]; then<br />
echo "${_opt_ZFSPool} exported"<br />
pacman -Rc 'spl-utils-git' # This works even if some are already removed.<br />
#pacman -R 'zfs-utils-git' 'spl-git' 'spl-utils-git' 'zfs-git'<br />
else<br />
echo "ZFS didn't unmount"<br />
fi<br />
systemctl start 'smbd.service'<br />
</nowiki>}}<br />
<br />
=== Bindmount ===<br />
<br />
It is not possible to bind mount a directory residing on ZFS onto another directory using fstab, because fstab is read before the ZFS pool is ready. To overcome this limitation, a systemd mount unit can be used for the bind mount, as in the following example. Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The unit configuration ensures that the zfs pool is ready before the bind mount is created. The name of the mount unit must correspond to the path given after "Where", with slashes replaced by minuses. See [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdAndBindMounts] and [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdBindMountUnits] for more details.<br />
{{hc|srv-nfs4-music.mount|<nowiki><br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Unit]<br />
DefaultDependencies=no<br />
Conflicts=umount.target<br />
Before=local-fs.target umount.target<br />
After=zfs-mount.service<br />
Requires=zfs-mount.service<br />
ConditionPathIsDirectory=/mnt/zfspool<br />
<br />
[Install]<br />
WantedBy=local-fs.target<br />
</nowiki>}}<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=Talk:ZFS&diff=397981Talk:ZFS2015-09-03T02:49:48Z<p>Chungy: /* Automatic snapshots */</p>
<hr />
<div>== resume hook ==<br />
I think there is a typo in the page: it should state ''resume hook'' instead of hibernate, but the limitation still applies. Can anyone confirm that the resume hook must appear before filesystems? [[User:Ezzetabi|Ezzetabi]] ([[User talk:Ezzetabi|talk]]) 09:49, 18 August 2015 (UTC)<br />
<br />
== Automatic build script ==<br />
<br />
I'm fine with deleting the scripts. I only posted it because graysky's script never worked for me. Long stuff like this would be useful if the ArchWiki featured roll-up text. [[User:Severach|Severach]] ([[User talk:Severach|talk]]) 10:07, 9 August 2015 (UTC)<br />
<br />
:I'd suggest to maintain it in a github repo. You get better versioning, syntax highlighting, cloning, etc. -- [[User:Alad|Alad]] ([[User talk:Alad|talk]]) 12:46, 9 August 2015 (UTC)<br />
<br />
::...or an [https://help.github.com/articles/about-gists/#anonymous-gists anonymous gist] if you don't have nor want to create a GitHub account. — [[User:Kynikos|Kynikos]] ([[User talk:Kynikos|talk]]) 08:40, 10 August 2015 (UTC)<br />
<br />
== Automatic snapshots ==<br />
{{AUR|zfs-auto-snapshot-git}} seems to have disappeared from the AUR. I haven't been able to find any information on why it was deleted; does anyone know? In any case, it should probably be removed from this page.<br />
[[User:Warai otoko|warai otoko]] ([[User talk:Warai otoko|talk]]) 03:21, 2 September 2015 (UTC)<br />
<br />
:On further inspection, looks like it may have gotten lost in the transition to AUR4. It should be resubmitted if we want to continue recommending it here; I've found it useful, at any rate. [[User:Warai otoko|Warai otoko]] ([[User talk:Warai otoko|talk]]) 04:43, 2 September 2015 (UTC)<br />
<br />
:: I've recreated it. I use this script as well. --[[User:Chungy|Chungy]] ([[User talk:Chungy|talk]]) 02:49, 3 September 2015 (UTC)</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=380336ZFS2015-06-28T08:28:23Z<p>Chungy: Split a few long lines to avoid horizontal scrollbars and improve readability.</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
=== General ===<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as dependencies; {{AUR|spl-git}} in turn has {{AUR|spl-utils-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. A script to build ZFS and its dependencies automatically can be found [[#Automated build script|here]].<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-git}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live up to its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon, execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
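<br />
As a concrete sketch for a hypothetical pool named {{ic|bigdata}}, the whole sequence is:<br />
<br />
 # zpool set cachefile=/etc/zfs/zpool.cache bigdata<br />
 # systemctl enable zfs.target<br />
 # systemctl start zfs.target<br />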
<br />
==Create a storage pool==<br />
<br />
Use {{ic|parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information. See [[Mdadm#Prepare the Devices]].}}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handle Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool ZFS on Linux developers recommend] using device IDs when creating ZFS storage pools of fewer than 10 devices. To list the IDs:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The IDs should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUIDs can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels, but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUIDs and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and, if desired, you can easily over-provision SSD drives, and slightly over-provision spindle drives, to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This gives a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition all or part of the drive as a single partition. gdisk does not automatically name partitions, so if partition labels are desired, use the gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUIDs are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters ([[wikipedia:GUID_Partition_Table#Partition_entries]]), allowing large data pools to be labeled in an organized fashion.<br />
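<br />
As a minimal sketch of such a session (the device name {{ic|/dev/sdX}} and the label {{ic|zfsdata1}} are placeholders), a whole disk can be dedicated to ZFS and labeled like so:<br />
<br />
 # gdisk /dev/sdX<br />
 Command (? for help): n    (create a new partition; accept the defaults to span the disk)<br />
 Command (? for help): c    (change a partition's name)<br />
 Enter name: zfsdata1<br />
 Command (? for help): w    (write the table to disk and exit)<br />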
<br />
Drives partitioned with GPT have labels and UUIDs that look like this:<br />
<br />
 # ls -l /dev/disk/by-partlabel<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
 # ls -l /dev/disk/by-partuuid<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
 lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. RAID-Z is a special implementation of RAID 5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about RAID-Z.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backward compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata \<br />
raidz \<br />
ata-ST3000DM001-9YN166_S1F0KDGY \<br />
ata-ST3000DM001-9YN166_S1F0JKRR \<br />
ata-ST3000DM001-9YN166_S1F0KBP8 \<br />
ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
By default, ''zpool'' will enable all features on a pool. If {{ic|/boot}} resides on ZFS and [[GRUB]] is used, you must enable only those features supported by GRUB (read-only or otherwise; {{ic|lz4_compress}} as of version 2.02.beta2). Otherwise GRUB will not be able to read the pool.<br />
<br />
# zpool create -d \<br />
-o feature@async_destroy=enabled \<br />
-o feature@empty_bpobj=enabled \<br />
-o feature@lz4_compress=enabled \<br />
-o feature@spacemap_histogram=enabled \<br />
-o feature@enabled_txg=enabled \<br />
<pool_name> <vdevs><br />
<br />
{{Tip|As of June 2015, GRUB's development tree supports {{ic|extensible_dataset}}, {{ic|hole_birth}}, and {{ic|embedded_data}}, making it viable to use a pool with all features enabled, either at create time or by using {{ic|zpool upgrade <pool_name>}}, if {{AUR|grub-git}} is installed.}}<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you will need to import it to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
 # zpool import zfsdata    # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}}, which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PCs would not boot when a floppy disk was left in the machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool, and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
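<br />
{{ic|zfs get}} also accepts a comma-separated list of property names if only a few are of interest, for example:<br />
<br />
 # zfs get atime,compression <pool><br />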
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
<br />
zvols can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size for zvols (the '''volblocksize''' property) is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a zvol gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a zvol, requiring 2× or more physical storage capacity than the zvol's logical capacity. Setting the '''volblocksize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807] for details.<br />
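<br />
As a sketch (pool and volume names are placeholders), the block size can be chosen when the zvol is created; it cannot be changed afterwards:<br />
<br />
 # zfs create -V 100G -o volblocksize=16K <pool>/<zvol><br />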
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including any read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc.), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Likewise, more fine-grained control is possible per label; if, for example, no monthlies are to be kept on a snapshot, set {{ic|1=com.sun:auto-snapshot:monthly=false}}.<br />
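<br />
For example (pool and dataset names are placeholders):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />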
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a Python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots.<br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following error occurs, it can be worked around as described below.<br />
<br />
 the kernel failed to rescan the partition table: 16<br />
 cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
 # parted /dev/sda rm 1<br />
 # dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
 # zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
One cause for creation slowdown can be slow burst reads and writes on a drive. By reading from the disk in parallel to ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running <br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
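<br />
As a concrete sketch (the drive names and the file path are placeholders), the read commands can be generated and dispatched like this:<br />
<br />
 # for d in /dev/sda /dev/sdb /dev/sdc; do echo "dd if=$d of=/dev/null"; done > /tmp/reads<br />
 # parallel < /tmp/reads &<br />
 # zpool create -f -m <mount> <pool> raidz <ids><br />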
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise-class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
 zfs.zfs_arc_max=536870912 # (for 512 MiB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
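<br />
The limit can also be applied through the module options rather than the kernel command line; a sketch, assuming a 512 MiB cap (if the zfs module is loaded from the initramfs, the image may need to be regenerated for this to take effect):<br />
<br />
{{hc|/etc/modprobe.d/zfs.conf|<nowiki><br />
options zfs zfs_arc_max=536870912<br />
</nowiki>}}<br />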
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. One is to place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then regenerate the initramfs image, which will copy the hostid into the initramfs image:<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again; if the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid; the value below is just an example.<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation of this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository and add the {{ic|archzfs-git}} group to the list of packages to be installed:<br />
$ echo archzfs-git >> ~/archlive/packages.both<br />
<br />
Complete [[Archiso#Build the ISO|Build the ISO]] to finally build the ISO.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created in dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can be created in multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
--key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
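<br />
After a reboot, the mapping must be recreated with the same parameters before the pool can be imported; a sketch reusing the names from the example above:<br />
<br />
 # cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 \<br />
 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
 # zpool import -d /dev/mapper zroot<br />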
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even from the same input you get different output (thanks to salting), making deduplication impossible. To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from the live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable the [[Unofficial_user_repositories#demz-repo-archiso|demz-repo-archiso]] repository inside the live system as usual, sync the pacman package database, and install the ''archzfs-git'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If the versions are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Automated build script ===<br />
<br />
The following shell script may be used to build ZFS and its dependencies automatically, including downloading the packages from the AUR. The build order in the script is important due to nested dependencies. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
=== Bindmount ===<br />
<br />
It is not possible to bind mount a directory residing on zfs onto another directory using fstab, because fstab is read before the zfs pool is ready. To overcome this limitation, a systemd mount unit can be used for the bind mount, as in the following example. Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The unit configuration ensures that the zfs pool is ready before the bind mount is created. The name of the mount unit must correspond to the path given after "Where", with slashes replaced by dashes. See [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdAndBindMounts SystemdAndBindMounts] and [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdBindMountUnits SystemdBindMountUnits] for more details.<br />
{{hc|srv-nfs4-music.mount|<nowiki><br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Unit]<br />
DefaultDependencies=no<br />
Conflicts=umount.target<br />
Before=local-fs.target umount.target<br />
After=zfs-mount.service<br />
Requires=zfs-mount.service<br />
ConditionPathIsDirectory=/mnt/zfspool<br />
<br />
[Install]<br />
WantedBy=local-fs.target<br />
</nowiki>}}<br />
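<br />
Assuming the unit file is placed in {{ic|/etc/systemd/system}}, it can then be started and enabled like any other unit:<br />
<br />
 # systemctl start srv-nfs4-music.mount<br />
 # systemctl enable srv-nfs4-music.mount<br />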
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog series on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungy
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
=== General ===<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as a dependency, which in turn has {{AUR|spl-utils-git}} as dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. A script to build ZFS and its dependencies automatically can be found [[#Automated build script|here]].<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-git}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to erase any old RAID configuration information. ([[Mdadm#Prepare_the_Devices)]] }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handles Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
# ls -lh /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
{{Style|Missing references to [[Persistent block device naming]], it is useless to explain the differences (or even what they are) here.}}<br />
<br />
Disk labels and UUID can also be used for ZFS mounts by using [[GUID Partition Table|GPT]] partitions. ZFS drives have labels but Linux is unable to read them at boot. Unlike [[Master Boot Record|MBR]] partitions, GPT partitions directly support both UUID and labels independent of the format inside the partition. Partitioning rather than using the whole disk for ZFS offers two additional advantages. The OS does not generate bogus partition numbers from whatever unpredictable data ZFS has written to the partition sector, and if desired, you can easily over provision SSD drives, and slightly over provision spindle drives to ensure that different models with slightly different sector counts can zpool replace into your mirrors. This is a lot of organization and control over ZFS using readily available tools and techniques at almost zero cost.<br />
<br />
Use [[GUID Partition Table|gdisk]] to partition the all or part of the drive as a single partition. gdisk does not automatically name partitions so if partition labels are desired use gdisk command "c" to label the partitions. Some reasons you might prefer labels over UUID are: labels are easy to control, labels can be titled to make the purpose of each disk in your arrangement readily apparent, and labels are shorter and easier to type. These are all advantages when the server is down and the heat is on. GPT partition labels have plenty of space and can store most international characters [[wikipedia:GUID_Partition_Table#Partition_entries]] allowing large data pools to be labeled in an organized fashion.<br />
<br />
Drives partitioned with GPT have labels and UUID that look like this. <br />
<br />
# ls -l /dev/disk/by-partlabel<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata1 -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 zfsdata2 -> ../../sdc1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 zfsl2arc -> ../../sda1<br />
<br />
# ls -l /dev/disk/by-partuuid<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 148c462c-7819-431a-9aba-5bf42bb5a34e -> ../../sdd1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:59 4f95da30-b2fb-412b-9090-fc349993df56 -> ../../sda1<br />
# lrwxrwxrwx 1 root root 10 Apr 30 01:44 e5ccef58-5adf-4094-81a7-3bac846a885f -> ../../sdc1<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, than the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
By default, ''zpool'' will enable all features on a pool. If {{ic|/boot}} resides on ZFS and when using [[GRUB]], you must only enable read-only, or non-read-only features supported by GRUB ({{ic|lz4_compress}} as of version 2.02.beta2). Otherwise GRUB will not be able to read the pool.<br />
<br />
# zpool create -d \<br />
-o feature@async_destroy=enabled \<br />
-o feature@empty_bpobj=enabled \<br />
-o feature@lz4_compress=enabled \<br />
-o feature@spacemap_histogram=enabled \<br />
-o feature@enabled_txg=enabled \<br />
<pool_name> <vdevs><br />
<br />
{{Tip|As of June 2015, GRUB's development tree supports {{ic|extensible_dataset}}, {{ic|hole_birth}}, and {{ic|embedded_data}}, making it viable to use a pool with all features enabled, either at create time or by using {{ic|zpool upgrade <pool_name>}}, if {{AUR|grub-git}} is installed.}}<br />
<br />
=== Importing a pool created by id ===<br />
<br />
Eventually a pool may fail to auto mount and you need to import to bring your pool back. Take care to avoid the most obvious solution.<br />
<br />
# ###zpool import zfsdata # Do not do this! Always use -d<br />
<br />
This will import your pools using {{ic|/dev/sd?}} which will lead to problems the next time you rearrange your drives. This may be as simple as rebooting with a USB drive left in the machine, which harkens back to a time when PC's would not boot when a floppy disk was left in a machine. Adapt one of the following commands to import your pool so that pool imports retain the persistence they were created with.<br />
<br />
# zpool import -d /dev/disk/by-id zfsdata<br />
# zpool import -d /dev/disk/by-partlabel zfsdata<br />
# zpool import -d /dev/disk/by-partuuid zfsdata<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems, you can view a full list with {{ic|zfs get all <pool>}}. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help fragmentation and file access, at the cost that ZFS has to allocate a new 128KiB block each time only a few bytes of a large file are overwritten.<br />
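The current value can be inspected like any other property before tuning:<br />
<br />
 # zfs get recordsize <pool><br />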
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
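As a quick sanity check, the resulting properties can be queried together:<br />
<br />
 # zfs get recordsize,primarycache,logbias <pool>/postgres<br />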
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g., with {{ic|fsync}} or {{ic|O_SYNC}}) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
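To verify, {{ic|systemctl is-enabled}} reports {{ic|masked}} for a masked unit:<br />
<br />
{{hc|$ systemctl is-enabled tmp.mount|masked}}<br />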
<br />
=== zvols ===<br />
<br />
zvols can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size for zvols ({{ic|volblocksize}}, the zvol counterpart of '''recordsize''') is already 8KiB. If possible, it is best to align any partitions contained in a zvol to this block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and to match file system block sizes to the same size. Other than this, you might tweak the block size to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a zvol gets its own parity, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a zvol, requiring 2× or more physical storage capacity than the zvol's logical capacity. Setting the block size to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
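For example, a minimal sketch of creating a zvol with a larger block size (the names and sizes are placeholders; the block size of an existing zvol cannot be changed after creation):<br />
<br />
 # zfs create -V 100G -b 32K <pool>/<zvol><br />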
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
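The resulting datasets and their properties can be reviewed at a glance with, for example:<br />
<br />
 # zfs list -o name,quota,mountpoint<br />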
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
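As an alternative to cron on a systemd system, a timer can drive the scrub. A minimal sketch using a templated service and timer pair (the unit names are illustrative and not shipped by any package):<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.service|<nowiki><br />
[Unit]<br />
Description=Scrub ZFS pool %i<br />
<br />
[Service]<br />
Type=oneshot<br />
ExecStart=/usr/bin/zpool scrub %i<br />
</nowiki>}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub@.timer|<nowiki><br />
[Unit]<br />
Description=Weekly scrub of ZFS pool %i<br />
<br />
[Timer]<br />
OnCalendar=weekly<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=timers.target<br />
</nowiki>}}<br />
<br />
Enable the timer per pool, substituting the pool name for the instance:<br />
<br />
 # systemctl enable zfs-scrub@<pool>.timer<br />
 # systemctl start zfs-scrub@<pool>.timer<br />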
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating that the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow swap files to be used, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the {{ic|getconf PAGESIZE}} command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
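Once active, the zvol appears alongside any other swap devices:<br />
<br />
 $ swapon --show<br />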
<br />
Keep in mind that the hibernate hook must be loaded before the filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ====<br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|1=com.sun:auto-snapshot=false}} on it. Finer-grained control is also available per label: if, for example, no monthly snapshots are to be kept for a dataset, set {{ic|1=com.sun:auto-snapshot:monthly=false}} on it.<br />
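For example (the dataset names are placeholders):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/scratch<br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/data<br />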
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [[wikipedia:Backup_rotation_scheme#Grandfather-father-son|"Grandfather-father-son"]] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following errors occur during pool creation, they can be fixed.<br />
<br />
 the kernel failed to rescan the partition table: 16<br />
 cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made (adjust the device name, and repeat for every drive in the pool):<br />
<br />
 # parted /dev/sda rm 1<br />
 # parted /dev/sda rm 9<br />
 # dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
 # zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads and writes on a drive. By reading from the disk in parallel with ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file, one per line, and running<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
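Putting this together, a rough sketch (the device names and the file path are examples; adjust them to your drives):<br />
<br />
 # for d in sda sdb sdc sdd; do echo "dd if=/dev/$d of=/dev/null"; done > /tmp/reads<br />
 # parallel < /tmp/reads &<br />
 # zpool create -f -m <mount> <pool> raidz <ids><br />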
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
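The limit can also be changed at runtime, without a reboot, through the module's parameter file (assuming the zfs module is loaded):<br />
<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />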
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error can occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use {{ic|-f}} with the {{ic|zpool create}} command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding {{ic|1=spl.spl_hostid=0x00bab10c}}.<br />
<br />
The other solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
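If {{ic|/etc/hostid}} does not exist yet, recent OpenZFS releases ship the {{ic|zgenhostid}} utility to create it; verify that your zfs-utils version provides it before relying on this:<br />
<br />
 # zgenhostid $(hostid)<br />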
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid from the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double-check that the pool is properly exported. Exporting the zpool clears the hostid marking ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool still refuses to mount, the hostid is not yet correctly set in the early boot phase, and this confuses zfs. Manually tell zfs the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is to write the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation of this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but this is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
Follow the [[Archiso]] steps for creating a fully functional Arch Linux live CD/DVD/USB image.<br />
<br />
Enable the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository and add the {{ic|archzfs-git}} group to the list of packages to be installed:<br />
$ echo archzfs-git >> ~/archlive/packages.both<br />
<br />
Complete the [[Archiso#Build the ISO|Build the ISO]] steps to build the ISO.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even the same input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
To get into the ZFS filesystem from live system for maintenance, there are two options:<br />
<br />
# Build custom archiso with ZFS as described in [[#Embed the archzfs packages into an archiso]].<br />
# Boot the latest official archiso and bring up the network. Then enable [[Unofficial_user_repositories#demz-repo-archiso|demz-repo-archiso]] repository inside the live system as usual, sync the pacman package database and install the ''archzfs-git'' package.<br />
<br />
To start the recovery, load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation, i.e. the version gathered from {{ic|pacman -Qi linux}}:<br />
<br />
 # depmod -a 3.6.9-1-ARCH<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Automated build script ===<br />
<br />
The following script may be used to build ZFS and its dependencies automatically.<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages, with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
=== Bindmount ===<br />
<br />
It is not possible to bind mount a directory residing on ZFS onto another directory using fstab, because fstab is read before the zfs pool is ready. To overcome this limitation, a systemd mount unit can be used for the bind mount, as in the following example. Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The unit configuration ensures that the zfs pool is ready before the bind mount is created. The name of the mount unit must correspond to the directory mentioned after "Where", with slashes replaced by dashes. See [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdAndBindMounts SystemdAndBindMounts] and [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdBindMountUnits SystemdBindMountUnits] for more details.<br />
{{hc|srv-nfs4-music.mount|<nowiki><br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Unit]<br />
DefaultDependencies=no<br />
Conflicts=umount.target<br />
Before=local-fs.target umount.target<br />
After=zfs-mount.service<br />
Requires=zfs-mount.service<br />
ConditionPathIsDirectory=/mnt/zfspool<br />
<br />
[Install]<br />
WantedBy=local-fs.target<br />
</nowiki>}}<br />
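After placing the unit under {{ic|/etc/systemd/system}}, enable and start it like any other unit:<br />
<br />
 # systemctl enable srv-nfs4-music.mount<br />
 # systemctl start srv-nfs4-music.mount<br />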
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungy
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
=== General ===<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as a dependency, which in turn has {{AUR|spl-utils-git}} as dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. A script to build ZFS and its dependencies automatically can be found [[#Automated build script|here]].<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-git}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to erase any old RAID configuration information. (https://wiki.archlinux.org/index.php/Mdadm#Prepare_the_Devices) }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handles Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, than the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
By default, ''zpool'' will enable all features on a pool. If {{ic|/boot}} resides on ZFS and when using [[GRUB]], you must only enable read-only, or non-read-only features supported by GRUB ({{ic|lz4_compress}} as of version 2.02.beta2). Otherwise GRUB will not be able to read the pool.<br />
<br />
# zpool create -d \<br />
-o feature@async_destroy=enabled \<br />
-o feature@empty_bpobj=enabled \<br />
-o feature@lz4_compress=enabled \<br />
-o feature@spacemap_histogram=enabled \<br />
-o feature@enabled_txg=enabled \<br />
<pool_name> <vdevs><br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems, you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written to.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer to not use the ZIL, and in which case, data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (eg, with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
<br />
zvols can suffer from the same block size-related issues as RDBMSes, but it's worth noting that the default recordsize for zvols is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your recordsize (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''recordsize''' to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a zvol gets its own parity disks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a zvol, requiring 2× or more physical storage capacity than the zvol's logical capacity. Setting the '''recordsize''' to 16k or 32k can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use :<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including and read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow to use swapfiles, but users can use a ZFS volume (ZVOL) as swap. It is importart to set the ZVOL block size to match the system page size, which can be obtained by the <tt>getconf PAGESIZE</tt> command (default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the zvol data.<br />
<br />
Create a 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind the Hibernate hook must be loaded before filesystems, so using ZVOL as swap will not allow to use hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, set more fine-grained control as well by label, if, for example, no monthlies are to be kept on a snapshot, for example, set <tt>com.sun:auto-snapshot:monthly=false</tt>.<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following error occurs then it can be fixed.<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# parted /dev/sda rm 1<br />
# dd if=/dev/zero of=/dev/sdb bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
Once cause for creation slowdown can be slow burst read writes on a drive. By reading from the disk in parallell to ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running <br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zfs create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hosted. There are two solutions, for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader. For example, adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image. Which will copy the hostid into the initramfs image.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid that marks ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and this confuses zfs. Manually tell zfs the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The value below is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but this is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb {{!}} grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required. <br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package and copy the ''releng'' template to a working directory:<br />
<br />
# pacman -S archiso<br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|pacman.conf}} file to include the {{ic|demz-repo-core}} repository:<br />
[demz-repo-core]<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Import the repository maintainer's key into your pacman keyring and locally sign it after verifying its fingerprint using the [http://demizerone.com/ author's website]:<br />
# pacman-key -r 0EE7A126<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Add the {{ic|archzfs-git}} group to the list of packages to install:<br />
# echo archzfs-git >> /root/media/packages.both<br />
<br />
Add other packages to {{ic|packages.both}}, {{ic|packages.i686}} or {{ic|packages.x86_64}} as desired and create the image.<br />
 # cd /root/media<br />
 # ./build.sh -v<br />
<br />
The resulting image will be output to the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while retaining all the advantages of ZFS, like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created across multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection<br />
might be partially lost.<br />
<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
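<br />
For a root filesystem pool unlocked by the ''encrypt'' hook, matching [[kernel parameters]] are also needed to recreate the plain-mode mapping at boot. A sketch for the example above, assuming the hook's standard {{ic|cryptdevice}}/{{ic|cryptkey}}/{{ic|crypto}} arguments ({{ic|crypto}} takes {{ic|hash:cipher:keysize:offset:skip}}; the {{ic|cryptkey}} offset and size are in bytes, so a 512-bit key is 64 bytes). Verify against [[Dm-crypt]] before relying on it:<br />
<br />
 cryptdevice=/dev/sdX:enc cryptkey=/dev/sdZ:0:64 crypto=sha512:twofish-xts-plain64:512:0:0<br />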
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical inputs produce different outputs (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Automated build script ===<br />
<br />
The following script may be used to build ZFS and its dependencies automatically.<br />
<br />
The build order is important due to nested dependencies. One can automate the entire process, including downloading the packages, with the shell script below. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
=== Bindmount ===<br />
<br />
It is not possible to bind mount a directory residing on ZFS onto another directory using fstab, because fstab is read before the zfs pool is ready. To overcome this limitation, a systemd mount unit can be used for the bind mount, as in the following example. Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The unit configuration ensures that the zfs pool is ready before the bind mount is created. The name of the mount unit must match the directory given after "Where", with slashes replaced by dashes. See [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdAndBindMounts SystemdAndBindMounts] and [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdBindMountUnits SystemdBindMountUnits] for more details.<br />
{{hc|srv-nfs4-music.mount|<nowiki><br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Unit]<br />
DefaultDependencies=no<br />
Conflicts=umount.target<br />
Before=local-fs.target umount.target<br />
After=zfs-mount.service<br />
Requires=zfs-mount.service<br />
ConditionPathIsDirectory=/mnt/zfspool<br />
<br />
[Install]<br />
WantedBy=local-fs.target<br />
</nowiki>}}<br />
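<br />
Then enable the unit so the bind mount is created at boot:<br />
<br />
 # systemctl enable srv-nfs4-music.mount<br />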
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=Install_Arch_Linux_on_ZFS&diff=369225Install Arch Linux on ZFS2015-04-10T18:02:30Z<p>Chungy: /* Create the root zpool */</p>
<hr />
<div>[[Category:Getting and installing Arch]]<br />
[[ja:ZFS に Arch Linux をインストール]]<br />
{{Related articles start}}<br />
{{Related|ZFS}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
This article details the steps required to install Arch Linux onto a root ZFS filesystem. This article supplements the [[Beginners' guide]].<br />
<br />
== Installation ==<br />
<br />
See [[ZFS#Installation]] for installing the ZFS packages. If installing Arch Linux onto ZFS from the archiso, it would be easier to use the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
=== Embedding archzfs into archiso ===<br />
<br />
See [[ZFS#Embed_the_archzfs_packages_into_an_archiso|ZFS]] article.<br />
<br />
== Partition the destination drive ==<br />
<br />
Review [[Beginners' guide#Prepare_the_storage_drive]] for information on determining the partition table type to use for ZFS. ZFS supports GPT and MBR partition tables.<br />
<br />
ZFS manages its own partitions, so only a basic partition table scheme is required. The partition that will contain the ZFS filesystem should be of the type {{ic|bf00}}, or "Solaris Root".<br />
<br />
=== Partition scheme ===<br />
<br />
Here is an example, using MBR, of a basic partition scheme that could be employed for your ZFS root setup:<br />
<br />
{{bc|<nowiki><br />
Part Size Type<br />
---- ---- -------------------------<br />
1 512M Ext boot partition (8300)<br />
2 XXXG Solaris Root (bf00)</nowiki><br />
}}<br />
<br />
Here is an example using GPT. The BIOS boot partition contains the bootloader.<br />
<br />
{{bc|<nowiki><br />
Part Size Type<br />
---- ---- -------------------------<br />
1     2M    BIOS boot partition (ef02)<br />
2     512M  Ext boot partition (8300)<br />
3     XXXG  Solaris Root (bf00)</nowiki><br />
}}<br />
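<br />
As a sketch, this GPT layout could be created non-interactively with {{ic|sgdisk}} (the device {{ic|/dev/sdX}} is a placeholder; adjust sizes as needed and double check the target disk before running):<br />
<br />
 # sgdisk -n 1:0:+2M -t 1:ef02 /dev/sdX<br />
 # sgdisk -n 2:0:+512M -t 2:8300 /dev/sdX<br />
 # sgdisk -n 3:0:0 -t 3:bf00 /dev/sdX<br />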
<br />
An additional partition may be required depending on your hardware and chosen bootloader. Consult [[Beginners' guide#Install_and_configure_a_bootloader]] for more info.<br />
<br />
{{Tip|Bootloaders with support for ZFS are described in [[#Install and configure the bootloader]].}}<br />
{{Warning|Several GRUB bugs ([https://savannah.gnu.org/bugs/?42861 bug #42861], [https://github.com/zfsonlinux/grub/issues/5 zfsonlinux/grub/issues/5]) prevent or complicate installing it on ZFS partitions, use of a separate boot partition is recommended}}<br />
<br />
== Format the destination disk ==<br />
<br />
Format the boot partition as well as any other system partitions. Do not do anything to the Solaris partition nor to the BIOS boot partition. ZFS will manage the first, and your bootloader the second.<br />
<br />
== Setup the ZFS filesystem ==<br />
<br />
First, make sure the ZFS modules are loaded,<br />
<br />
# modprobe zfs<br />
<br />
=== Create the root zpool ===<br />
<br />
# zpool create zroot /dev/disk/by-id/''id-to-partition''<br />
<br />
{{Warning|Always use id names when working with ZFS, otherwise import errors will occur.}}<br />
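<br />
For example, using an id of the kind listed in [[ZFS#Create a storage pool]] (illustrative only; substitute the id of your own Solaris Root partition, e.g. {{ic|-part2}} in the MBR scheme above):<br />
<br />
 # zpool create zroot /dev/disk/by-id/ata-ST3000DM001-9YN166_S1F0JKRR-part2<br />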
<br />
{{Warning|The zpool command will normally activate all features. If you use ZFS for the /boot directory and plan on using GRUB, it is necessary to enable only features compatible with GRUB:}}<br />
<br />
 # zpool create -d \<br />
       -o feature@async_destroy=enabled \<br />
       -o feature@empty_bpobj=enabled \<br />
       -o feature@lz4_compress=enabled \<br />
       -o feature@spacemap_histogram=enabled \<br />
       -o feature@enabled_txg=enabled \<br />
       zroot /dev/disk/by-id/''id-to-partition''<br />
<br />
=== Create necessary filesystems ===<br />
<br />
If so desired, sub-filesystem mount points such as {{ic|/home}} and {{ic|/root}} can be created with the following commands:<br />
<br />
# zfs create zroot/home -o mountpoint=/home<br />
# zfs create zroot/root -o mountpoint=/root<br />
<br />
Note that if you want to use other datasets for system directories ({{ic|/var}} or {{ic|/etc}} included) your system will not boot unless they are listed in {{ic|/etc/fstab}}! We will address that at the appropriate time in this tutorial.<br />
<br />
=== Swap partition ===<br />
<br />
See [[ZFS#Swap volume]].<br />
<br />
=== Configure the root filesystem ===<br />
<br />
First, set the mount point of the root filesystem:<br />
<br />
# zfs set mountpoint=/ zroot<br />
<br />
and optionally, any sub-filesystems:<br />
<br />
# zfs set mountpoint=/home zroot/home<br />
# zfs set mountpoint=/root zroot/root<br />
<br />
and if you have separate datasets for system directories (e.g. {{ic|/var}} or {{ic|/usr}}):<br />
<br />
# zfs set mountpoint=legacy zroot/usr<br />
# zfs set mountpoint=legacy zroot/var<br />
<br />
and put them in {{ic|/etc/fstab}}<br />
{{hc|/etc/fstab|<br />
# <file system> <dir> <type> <options> <dump> <pass><br />
zroot/usr /usr zfs defaults,noatime 0 0<br />
zroot/var /var zfs defaults,noatime 0 0}}<br />
<br />
Set the bootfs property on the descendant root filesystem so the boot loader knows where to find the operating system.<br />
<br />
# zpool set bootfs=zroot zroot<br />
<br />
Export the pool,<br />
<br />
# zpool export zroot<br />
<br />
{{Warning|Do not skip this, otherwise you will be required to use {{ic|-f}} when importing your pools. This unloads the imported pool.}}<br />
{{Note|This might fail if you added a swap partition above. Turn it off with the ''swapoff'' command first.}}<br />
<br />
Finally, re-import the pool,<br />
<br />
# zpool import -d /dev/disk/by-id -R /mnt zroot<br />
<br />
{{Note|{{ic|-d}} is not the actual device id, but the {{ic|/dev/disk/by-id}} directory containing the symbolic links.}}<br />
<br />
If there is an error in this step, you can export the pool to redo the command. The ZFS filesystem is now ready to use.<br />
<br />
Be sure to bring the zpool.cache file into your new system. This is required later for the ZFS daemon to start.<br />
<br />
# cp /etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache<br />
<br />
If you do not have {{ic|/etc/zfs/zpool.cache}}, create it:<br />
<br />
# zpool set cachefile=/etc/zfs/zpool.cache zroot<br />
<br />
== Install and configure Arch Linux ==<br />
<br />
Follow the steps in the [[Beginners' guide]]; it will be noted where special consideration must be taken for ZFSonLinux.<br />
<br />
* First mount any boot or system partitions using the mount command.<br />
<br />
* Install the base system.<br />
<br />
* The procedure described in [[Beginners' guide#Generate an fstab]] is usually overkill for ZFS. ZFS usually auto mounts its own partitions, so ZFS partitions are not needed in the {{ic|fstab}} file, unless the user made datasets of system directories. To generate {{ic|fstab}} entries for the remaining (non-ZFS) filesystems, use:<br />
# genfstab -U -p /mnt | grep boot >> /mnt/etc/fstab<br />
<br />
* Edit the {{ic|/etc/fstab}}:<br />
<br />
{{Note|<br />
* If you chose to create datasets for system directories, keep them in this {{ic|fstab}}! Comment out the lines for the {{ic|/}}, {{ic|/root}}, and {{ic|/home}} mountpoints, rather than deleting them. You may need those UUIDs later if something goes wrong.<br />
* Anyone who just stuck with the guide's directions can delete everything except for the swap file and the boot/EFI partition. It is conventional to replace the swap's UUID with {{ic|/dev/zvol/zroot/swap}}.<br />
}}<br />
<br />
* When creating the initial ramdisk, first edit {{ic|/etc/mkinitcpio.conf}} and add {{ic|zfs}} before {{ic|filesystems}}. Also, move the {{ic|keyboard}} hook before {{ic|zfs}} so you can type in the console if something goes wrong. You may also remove fsck (if you are not using Ext3 or Ext4). Your {{ic|HOOKS}} line should look something like this:<br />
HOOKS="base udev autodetect modconf block keyboard zfs filesystems"<br />
<br />
* Regenerate the initramfs with the command:<br />
# mkinitcpio -p linux<br />
<br />
== Install and configure the bootloader ==<br />
<br />
=== For BIOS motherboards ===<br />
<br />
Follow [[GRUB#BIOS_systems_2]] to install GRUB onto your disk. {{ic|grub-mkconfig}} does not properly detect the ZFS filesystem, so it is necessary to edit {{ic|grub.cfg}} manually:<br />
<br />
{{hc|/boot/grub/grub.cfg|<nowiki><br />
set timeout=2<br />
set default=0<br />
<br />
# (0) Arch Linux<br />
menuentry "Arch Linux" {<br />
set root=(hd0,msdos1)<br />
linux /vmlinuz-linux zfs=zroot rw<br />
initrd /initramfs-linux.img<br />
}<br />
</nowiki>}}<br />
<br />
If you did not create a separate /boot partition, the kernel and initrd paths have to be in the following format:<br />
<br />
/dataset/@/actual/path <br />
<br />
Example:<br />
<br />
linux /@/boot/vmlinuz-linux zfs=zroot rw<br />
initrd /@/boot/initramfs-linux.img<br />
=== For UEFI motherboards ===<br />
<br />
Use {{ic|EFISTUB}} and {{ic|rEFInd}} for the UEFI boot loader. See [[Beginners' guide#For UEFI motherboards]]. The kernel parameters in {{ic|refind_linux.conf}} for ZFS should include {{ic|1=zfs=bootfs}} or {{ic|1=zfs=zroot}} so the system can boot from ZFS. The {{ic|root}} and {{ic|rootfstype}} parameters are not needed.<br />
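<br />
A minimal {{ic|refind_linux.conf}} might look like the following (a sketch; the menu label is arbitrary):<br />
<br />
{{hc|/boot/refind_linux.conf|<nowiki><br />
"Boot Arch Linux" "zfs=zroot rw"</nowiki>}}<br />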
<br />
== Unmount and restart ==<br />
<br />
We are almost done!<br />
# exit<br />
# umount /mnt/boot<br />
# zfs umount -a<br />
# zpool export zroot<br />
Now reboot.<br />
<br />
{{Warning|If you do not properly export the zpool, the pool will refuse to import in the ramdisk environment and you will be stuck at the busybox terminal.}}<br />
<br />
== After the first boot ==<br />
<br />
If everything went fine up to this point, your system will boot. Once.<br />
For your system to be able to reboot without issues, you need to enable the {{ic|zfs.target}} to auto mount the pools and set the hostid.<br />
<br />
For each pool you want automatically mounted execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
Enable the target with [[systemd]]:<br />
# systemctl enable zfs.target<br />
<br />
When running ZFS on root, the machine's hostid will not be available at the time of mounting the root filesystem. There are two solutions to this. You can either place your spl hostid in the [[kernel parameters]] in your boot loader, for example by adding {{ic|<nowiki>spl.spl_hostid=0x00bab10c</nowiki>}}; to get your number, use the {{ic|hostid}} command.<br />
<br />
The other, and suggested, solution is to make sure that there is a hostid in {{ic|/etc/hostid}}, and then regenerate the initramfs image, which will copy the hostid into it. To write the hostid file safely, you need to use a small C program:<br />
<br />
 #include <stdio.h><br />
 #include <errno.h><br />
 #include <unistd.h><br />
<br />
 int main() {<br />
     int res;<br />
     res = sethostid(gethostid());<br />
     if (res != 0) {<br />
         switch (errno) {<br />
             case EACCES:<br />
                 fprintf(stderr, "Error! No permission to write the"<br />
                                 " file used to store the host ID.\n"<br />
                                 "Are you root?\n");<br />
                 break;<br />
             case EPERM:<br />
                 fprintf(stderr, "Error! The calling process's effective"<br />
                                 " user or group ID is not the same as"<br />
                                 " its corresponding real ID.\n");<br />
                 break;<br />
             default:<br />
                 fprintf(stderr, "Unknown error.\n");<br />
         }<br />
         return 1;<br />
     }<br />
     return 0;<br />
 }<br />
<br />
Copy it, save it as {{ic|writehostid.c}} and compile it with {{ic|gcc -o writehostid writehostid.c}}; finally, execute it and regenerate the initramfs image:<br />
<br />
# ./writehostid<br />
# mkinitcpio -p linux<br />
<br />
You can now delete the two files {{ic|writehostid.c}} and {{ic|writehostid}}. Your system should work and reboot properly now.<br />
<br />
== See also ==<br />
<br />
* [https://github.com/dajhorn/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem HOWTO install Ubuntu to a Native ZFS Root]<br />
* [http://lildude.co.uk/zfs-cheatsheet ZFS cheatsheet]<br />
* [http://www.funtoo.org/wiki/ZFS_Install_Guide Funtoo ZFS install guide]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=369224ZFS2015-04-10T17:57:46Z<p>Chungy: /* Create a storage pool */ GRUB-compatible info</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management – zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
=== General ===<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as a dependency, which in turn has {{AUR|spl-utils-git}} as dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. A script to build ZFS and its dependencies automatically can be found [[#Automated build script|here]].<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Root on ZFS ===<br />
<br />
When performing an Arch install on ZFS, {{AUR|zfs-git}} and its dependencies can be installed in the archiso environment as outlined in the previous section.<br />
<br />
It may be useful to prepare a [[#Embed the archzfs packages into an archiso|customized archiso]] with ZFS support builtin. For a much more detailed guide on installing Arch with ZFS as its root file system, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic|parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information; see [[Mdadm#Prepare the Devices]].}}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handles Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the ids of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the ids, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
=== GRUB-compatible pool creation ===<br />
<br />
As of ZoL 0.6.4, many new features are available to ZFS, but at least as of GRUB 1:2.02.beta2-5, the only non-read-only compatible feature supported by GRUB is lz4_compress. By default, the zpool command will enable all features on a pool and create one that GRUB cannot read. If your /boot directory will be on ZFS, it is necessary to carefully control the creation to only enable features that either GRUB supports, or at least are read-only compatible:<br />
<br />
 # zpool create -d \<br />
       -o feature@async_destroy=enabled \<br />
       -o feature@empty_bpobj=enabled \<br />
       -o feature@lz4_compress=enabled \<br />
       -o feature@spacemap_histogram=enabled \<br />
       -o feature@enabled_txg=enabled \<br />
       <pool_name> <vdevs><br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
<br />
zvols can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size for zvols ({{ic|volblocksize}}) is already 8KiB. If possible, it is best to align any partitions contained in a zvol to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the block size to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a zvol gets its own parity disks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a zvol, requiring 2× or more physical storage capacity than the zvol's logical capacity. Setting the '''volblocksize''' to 16k or 32k can help reduce this footprint drastically, as sketched below.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
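<br />
For example, a sketch of creating a zvol with a larger block size to reduce this overhead (size and names are illustrative; {{ic|volblocksize}} can only be set at creation time):<br />
<br />
 # zfs create -V 100G -o volblocksize=16K <pool>/<zvol><br />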
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the zvol data.<br />
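<br />
On x86_64, for example:<br />
<br />
 $ getconf PAGESIZE<br />
 4096<br />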
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a separate partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, set more fine-grained control by label; if, for example, no monthly snapshots are to be kept on a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt>.<br />
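<br />
For example (dataset names are illustrative):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/scratch<br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/home<br />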
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following error occurs then it can be fixed.<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is because [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# parted /dev/sda rm 1<br />
# dd if=/dev/zero of=/dev/sdb bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
Once cause for creation slowdown can be slow burst read writes on a drive. By reading from the disk in parallell to ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running <br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zfs create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hosted. There are two solutions, for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader. For example, adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image. Which will copy the hostid into the initramfs image.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double-check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation of this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required.<br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package and copy the ''releng'' template to a working directory:<br />
<br />
# pacman -S archiso<br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|pacman.conf}} file to include the {{ic|demz-repo-core}} repository:<br />
[demz-repo-core]<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Import the repository maintainer's key into your pacman keyring and locally sign it after verifying its fingerprint using the [http://demizerone.com/ author's website]:<br />
# pacman-key -r 0EE7A126<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Add the {{ic|archzfs-git}} group to the list of packages to install:<br />
# echo archzfs-git >> /root/media/packages.both<br />
<br />
Add other packages to {{ic|packages.both}}, {{ic|packages.i686}} or {{ic|packages.x86_64}} as desired and create the image.<br />
# ./build.sh -v<br />
<br />
The resulting image will be output to the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while keeping all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created across multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even the same input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
=== Automated build script ===<br />
<br />
The following script may be used to build ZFS and its dependencies automatically.<br />
<br />
The build order of the packages is important due to nested dependencies. The only requirements for the script to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
=== Bindmount ===<br />
<br />
It is not possible to bind mount a directory residing on zfs onto another directory using fstab, because fstab is read before the zfs pool is ready. To overcome this limitation, a systemd mount unit can be used for the bind mount, as in the following example. Here a bind mount from /mnt/zfspool to /srv/nfs4/music is created. The unit configuration ensures that the zfs pool is ready before the bind mount is created. The name of the mount unit must match the directory mentioned after "Where", with slashes replaced by minuses. See [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdAndBindMounts] and [http://utcc.utoronto.ca/~cks/space/blog/linux/SystemdBindMountUnits] for more details.<br />
{{hc|srv-nfs4-music.mount|<nowiki><br />
[Mount]<br />
What=/mnt/zfspool<br />
Where=/srv/nfs4/music<br />
Type=none<br />
Options=bind<br />
<br />
[Unit]<br />
DefaultDependencies=no<br />
Conflicts=umount.target<br />
Before=local-fs.target umount.target<br />
After=zfs-mount.service<br />
Requires=zfs-mount.service<br />
ConditionPathIsDirectory=/mnt/zfspool<br />
<br />
[Install]<br />
WantedBy=local-fs.target<br />
</nowiki>}}<br />
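After creating the unit, reload systemd and enable the mount so it is created at boot (standard systemd usage; the unit name matches the example above):<br />
<br />
 # systemctl daemon-reload<br />
 # systemctl enable srv-nfs4-music.mount<br />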
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=342035ZFS2014-10-26T23:27:51Z<p>Chungy: /* Swap volume */ Recommending sync=always was misguided. Swap data doesn't matter following a crash and Linux will sync appropriately on its own time.</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with the GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as a dependency, which in turn has {{AUR|spl-utils-git}} as dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon, execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic|# parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information; see [[Mdadm#Prepare the Devices]].}}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handles Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the ids of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the ids, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#Does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool, and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed, again using the zfs command:<br />
# sudo zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS has to allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve the performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
<br />
zvols can suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size ({{ic|volblocksize}}) for zvols is 8KiB already. If possible, it is best to align any partitions contained in a zvol to that block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a zvol gets its own parity disks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a zvol, requiring 2× or more physical storage capacity than the zvol's logical capacity. Setting the '''volblocksize''' to 16k or 32k can help reduce this footprint drastically.<br />
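For example, a zvol with a 16KiB block size could be created as follows (a sketch; the pool and volume names are placeholders):<br />
<br />
 # zfs create -V 100G -b 16K <pool>/<zvol><br />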
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807 for details]<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time, abruptly dropping the system to the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool:<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Another option useful for keeping the system running well in low-memory situations is not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
 # swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a separate partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, more fine-grained control is available per label; if, for example, no monthly snapshots are to be kept for a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt> on it.<br />
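For example (the pool and dataset names are placeholders):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />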
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
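Manually, such replication follows the usual {{ic|zfs send}}/{{ic|zfs receive}} pattern; a minimal sketch (the snapshot, host, and pool names are hypothetical):<br />
<br />
 # zfs snapshot <pool>/<dataset>@backup<br />
 # zfs send <pool>/<dataset>@backup | ssh backuphost zfs receive backuppool/<dataset><br />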
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following error occurs, it can be worked around:<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is that [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than 1 second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
 # parted /dev/sda rm 1<br />
 # dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
 # zpool labelclear /dev/sda<br />
<br />
A brute-force creation can be attempted over and over again, and with some luck the zpool creation will take less than 1 second.<br />
One cause of creation slowdown can be slow burst reads/writes on a drive. By reading from the disk in parallel to zpool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file on separate lines and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zpool create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool:<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double-check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation of this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required.<br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package.<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file, adding these lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file, adding these lines (TODO, correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while keeping all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created across multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even the same input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=341427ZFS2014-10-24T07:13:59Z<p>Chungy: /* zvols */ Information about zvols+AF+raidz</p>
<hr />
<div>[[Category:File systems]]<br />
[[ja:ZFS]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as a dependency, which in turn has {{AUR|spl-utils-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages, with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
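<br />
For example, for a pool named ''bigdata'' (the name used in the examples below), set the cache file and read the property back to confirm; {{ic|zpool get}} is the standard way to inspect a pool property:<br />
<br />
 # zpool set cachefile=/etc/zfs/zpool.cache bigdata<br />
 # zpool get cachefile bigdata<br />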
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic|# parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all of the devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information first. See [[Mdadm#Prepare the Devices]].}}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handles Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool ZFS on Linux developers recommend] using device IDs when creating ZFS storage pools of less than 10 devices. To find the IDs, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
The full list of options can be displayed again using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS would have to allocate a new 128KiB block each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
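<br />
To double-check only the properties touched above rather than reading the full {{ic|zfs get all}} listing, {{ic|zfs get}} accepts a comma-separated property list (dataset path as in the examples above):<br />
 # zfs get recordsize,primarycache,logbias <pool>/postgres<br />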
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
<br />
zvols can suffer from the same block size-related issues as RDBMSes, but it's worth noting that the default block size for zvols (the '''volblocksize''' property) is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
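<br />
For example, a zvol intended to hold an ext4 filesystem could be created with an explicit block size and the filesystem's block size matched to it; a sketch only, with the 10G size and the ''vol0'' name chosen purely for illustration:<br />
<br />
 # zfs create -V 10G -b 8192 <pool>/vol0<br />
 # mkfs.ext4 -b 4096 /dev/zvol/<pool>/vol0<br />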
<br />
==== RAIDZ and Advanced Format physical disks ====<br />
<br />
Each block of a zvol gets its own parity blocks, and if you have physical media with logical block sizes of 4096B, 8192B, or so on, the parity needs to be stored in whole physical blocks, and this can drastically increase the space requirements of a zvol, requiring 2× or more physical storage capacity than the zvol's logical capacity. Setting the '''volblocksize''' to 16K or 32K can help reduce this footprint drastically.<br />
<br />
See [https://github.com/zfsonlinux/zfs/issues/1807 ZFS on Linux issue #1807] for details.<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
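<br />
To review the datasets and the quota just applied ({{ic|zfs list}} and {{ic|zfs get}} are standard commands; {{ic|-r}} lists children recursively):<br />
<br />
 $ zfs list -r <nameofzpool><br />
 $ zfs get quota <nameofzpool>/<nameofdataset><br />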
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, because the hostid in the archiso differs from the one in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can occur at boot time, abruptly abandoning the system in the busybox console and requiring an archiso for an emergency repair, by either exporting the pool or adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool:<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not support swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create an 8GiB ZFS volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/zroot/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
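<br />
To confirm the volume is active as swap (a quick check; {{ic|/proc/swaps}} lists the swap devices currently in use):<br />
<br />
 $ cat /proc/swaps<br />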
<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the hibernate function to be used. If you need hibernation, keep a dedicated partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine; otherwise, any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, the control can be made more fine-grained by label; if, for example, no monthlies are to be kept for a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt>.<br />
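<br />
For example (the dataset path is a placeholder; the property names are the ones described above):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />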
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZPool creation fails ===<br />
If the following error occurs, it can be fixed:<br />
<br />
# the kernel failed to rescan the partition table: 16<br />
# cannot label 'sdc': try using parted(8) and then provide a specific slice: -1<br />
<br />
One reason this can occur is that [https://github.com/zfsonlinux/zfs/issues/2582 ZFS expects pool creation to take less than one second]. This is a reasonable assumption under ordinary conditions, but in many situations it may take longer. Each drive will need to be cleared again before another attempt can be made.<br />
<br />
# parted /dev/sda rm 1<br />
# dd if=/dev/zero of=/dev/sda bs=512 count=1<br />
# zpool labelclear /dev/sda<br />
<br />
A brute force creation can be attempted over and over again, and with some luck the ZPool creation will take less than 1 second.<br />
One cause for creation slowdown can be slow burst reads/writes on a drive. By reading from the disk in parallel with the ZPool creation, it may be possible to increase burst speeds.<br />
<br />
# dd if=/dev/sda of=/dev/null<br />
<br />
This can be done with multiple drives by saving the above command for each drive to a file, one command per line, and running:<br />
<br />
# cat $FILE | parallel<br />
<br />
Then run ZPool creation at the same time.<br />
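<br />
Putting the pieces together, a sketch of the whole workaround; the drive list and file name are illustrative only, and GNU parallel executes each line of its standard input as a command:<br />
<br />
 # for d in /dev/sda /dev/sdb /dev/sdc; do echo "dd if=$d of=/dev/null"; done > /tmp/readers<br />
 # parallel < /tmp/readers &<br />
 # zpool create -f -m <mount> <pool> raidz <ids><br />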
<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
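<br />
The parameter can also be changed at runtime through the module's sysfs interface; a sketch, assuming the zfs module is already loaded (an already-populated ARC shrinks only as data is evicted, so the effect is not instantaneous):<br />
<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />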
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zpool create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. One is to place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code> and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The value below is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initramfs image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required.<br />
<br />
To embed {{ic|zfs}} in the archiso, install the {{ic|archiso}} package from an existing install:<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file adding these lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file adding these lines (TODO, correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be read in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created in dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to point the {{ic|zpool create}} commands at those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created in multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection might be partially lost.<br />
<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
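<br />
After a reboot the mapping must be recreated before the pool can be imported; a sketch for a non-root pool, reusing the cryptsetup parameters from above and pointing {{ic|zpool import}} at {{ic|/dev/mapper}}:<br />
<br />
 # cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
 # zpool import -d /dev/mapper zroot<br />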
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical inputs produce different outputs (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead, it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=325554ZFS2014-07-18T01:08:03Z<p>Chungy: /* Database */ Correction: sync=disabled is the *only* property that disables ZIL, logbias only indicates a preference</p>
<hr />
<div>[[Category:File systems]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as a dependency, which in turn has {{AUR|spl-utils-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages, with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic|# parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all of the devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information first. See [[Mdadm#Prepare the Devices]].}}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems; this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handles Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool ZFS on Linux developers recommend] using device IDs when creating ZFS storage pools of less than 10 devices. To find the IDs, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
The full list of options can be displayed again using the zfs command:<br />
# zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS would have to allocate a new 128KiB block each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS to prefer not to use the ZIL, in which case data is only committed to the file system once. Setting this for non-database file systems, or for pools with configured log devices, can actually ''negatively'' impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
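<br />
To confirm that the unit is masked and that the dataset actually backs /tmp after a reboot, something like the following can be used:<br />
 $ systemctl is-enabled tmp.mount<br />
 $ findmnt /tmp<br />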
<br />
=== zvols ===<br />
zvols might suffer from the same block size-related issues as RDBMSes, but it is worth noting that the equivalent property for zvols is '''volblocksize''', and its default is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''volblocksize''' to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
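<br />
A minimal sketch of creating a zvol with an explicit block size and putting a file system on it (the volume name and size here are hypothetical):<br />
 # zfs create -V 32G -o volblocksize=8K <pool>/myvol<br />
 # mkfs.ext4 /dev/zvol/<pool>/myvol<br />
<br />
mkfs.ext4 will use its default 4KiB file system blocks here, which is the combination described above.<br />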
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
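<br />
The resulting dataset hierarchy and the applied quota can then be reviewed, for example:<br />
 $ zfs list -r <nameofzpool><br />
 $ zfs get quota <nameofzpool>/<nameofdataset><br />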
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
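<br />
On a systemd-based system, a [[Systemd/Timers|timer]] can serve the same purpose as the cron entry. A minimal sketch, using hypothetical unit names (again, replace {{ic|<pool>}}):<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub.service|<nowiki><br />
[Unit]<br />
Description=Scrub ZFS pool<br />
<br />
[Service]<br />
Type=oneshot<br />
ExecStart=/usr/bin/zpool scrub <pool><br />
</nowiki>}}<br />
<br />
{{hc|/etc/systemd/system/zfs-scrub.timer|<nowiki><br />
[Unit]<br />
Description=Weekly ZFS scrub<br />
<br />
[Timer]<br />
OnCalendar=weekly<br />
Persistent=true<br />
<br />
[Install]<br />
WantedBy=timers.target<br />
</nowiki>}}<br />
<br />
Enable and start the timer with {{ic|systemctl enable --now zfs-scrub.timer}}.<br />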
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including any read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid in the archiso differs from that of the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating that the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool:<br />
<br />
# zpool export bigdata<br />
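<br />
On the destination system the pool can then be imported again, optionally restricting the device scan to the persistent names (shown here with the example pool):<br />
 # zpool import -d /dev/disk/by-id bigdata<br />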
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that has already been created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
 # mkswap -f /dev/zvol/<pool>/swap<br />
 # swapon /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind that the hibernation hook must be loaded before the filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine; otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc.), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the {{ic|--keep}} parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, more fine-grained control is possible per label: if, for example, no monthly snapshots are to be kept on a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt>.<br />
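<br />
For example, hypothetically excluding a scratch dataset entirely, and another dataset from monthly snapshots only:<br />
 # zfs set com.sun:auto-snapshot=false <pool>/scratch<br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/data<br />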
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a Python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "grandfather-father-son"] scheme. It can be configured to keep, for example, 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots.<br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise-class storage systems! To adjust the ARC size to, for example, 512MiB, add the following to the [[Kernel parameters]] list:<br />
<br />
 zfs.zfs_arc_max=536870912<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
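<br />
The limit can usually also be changed on a running system via the module parameter, and the current ARC figures can be read from the kstats (a sketch, no reboot required):<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />
 $ grep c_max /proc/spl/kstat/zfs/arcstats<br />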
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zpool create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the SPL hostid. There are two solutions for this. One is to place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code> and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
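<br />
If <code>/etc/hostid</code> does not exist yet, it can be written first. A sketch, assuming the {{ic|zgenhostid}} utility shipped with recent OpenZFS releases:<br />
 # zgenhostid $(hostid)<br />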
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool:<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation with the archiso, the network configuration could be different, generating a different hostid than the one in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double-check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The following is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0x0a0af0f8}}. Another solution is writing the hostid inside the initram image; see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb &#124; grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required. <br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, install the {{ic|archiso}} package.<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file, adding these lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file, adding these lines (TODO: correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} with fixed names, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and to import the zpools from there. Since zpools can be created across multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted; otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and that even identical input produces different output (thanks to salting), making deduplication impossible. To reduce the unnecessary overhead, it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS, which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>
<hr />
<div>[[Category:File systems]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as a dependency, which in turn has {{AUR|spl-utils-git}} as dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needed sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all device have been used in a software RAID set it is paramount to erase any old RAID configuration information. (https://wiki.archlinux.org/index.php/Mdadm#Prepare_the_Devices) }}<br />
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handles Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, than the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems, you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# sudo zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written to.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this, is that ZFS will be committing data '''twice''' to the data disks and it can severely impact performance. You can tell ZFS to not use the ZIL, and in which case data is only committed to the file system once. Disabling the ZIL for non-database file systems or for pools with configured log devices (eg, with SSDs) can actually negatively impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you would like to use ZFS to store your /tmp directory, which may be useful for storing arbitrarily-large sets of files or simply keeping your RAM free of idle data, you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (eg, with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe application-side data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected. Please note this does ''not'' affect the integrity of ZFS itself, only the possibility that data an application expects on-disk may not have actually been written out following a crash.<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents some kinds of privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
zvols might suffer from the same block size-related issues as RDBMSes, but it's worth noting that the default recordsize for zvols is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your recordsize (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''recordsize''' to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use :<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including and read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow to use swapfiles, but users can use a ZFS volume (ZVOL) as swap. It is importart to set the ZVOL block size to match the system page size, which can be obtained by the <tt>getconf PAGESIZE</tt> command (default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create a 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/zroot/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind the Hibernate hook must be loaded before filesystems, so using ZVOL as swap will not allow to use hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, set more fine-grained control as well by label, if, for example, no monthlies are to be kept on a snapshot, for example, set <tt>com.sun:auto-snapshot:monthly=false</tt>.<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zfs create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hosted. There are two solutions, for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader. For example, adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image. Which will copy the hostid into the initramfs image.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use,<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not there is some other problem.<br />
<br />
Reboot again, if the zfs pool refuses to mount it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number, once the hostid is coherent across the reboots the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. This one is just an example.<br />
% hostid<br />
0a0af0f8<br />
<br />
This number have to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initram image, see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted it should be replaced A.S.A.P. with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea make an installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required. <br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package.<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file adding those lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file adding those lines (TODO, correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More informations about the process can be read in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while keeping all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed. So you just need to change the {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created from multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
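<br />
After changing the HOOKS line, regenerate the initramfs so that the new hooks are actually included:<br />
<br />
 # mkinitcpio -p linux<br />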
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, for encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If the two versions differ, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will generate the correct module dependency information for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=Python&diff=325551Python2014-07-18T00:53:14Z<p>Chungy: /* Old versions */ add Python 3.3, update support list</p>
<hr />
<div>[[Category:Programming language]]<br />
[[de:Python]]<br />
[[zh-CN:Python]]<br />
{{Related articles start}}<br />
{{Related|Python Package Guidelines}}<br />
{{Related|mod_python}}<br />
{{Related|Python VirtualEnv}}<br />
{{Related articles end}}<br />
[http://www.python.org Python] "is a remarkably powerful dynamic programming language that is used in a wide variety of application domains. Python is often compared to Tcl, Perl, Ruby, Scheme or Java."<br />
<br />
== Installation ==<br />
<br />
There are currently two versions of Python: Python 3 (which is the default) and the older Python 2.<br />
<br />
=== Python 3 ===<br />
<br />
Python 3 is the latest version of the language, and is '''incompatible with Python 2'''. The language is mostly the same, but many details, especially how built-in objects like dictionaries and strings work, have changed considerably, and a lot of deprecated features have finally been removed. Also, the standard library has been reorganized in a few prominent places. For an overview of the differences, visit [http://wiki.python.org/moin/Python2orPython3 Python2orPython3] and their relevant [http://getpython3.com/diveintopython3/porting-code-to-python-3-with-2to3.html chapter] in Dive into Python 3.<br />
<br />
To install the latest version of Python 3, [[pacman|install]] the {{Pkg|python}} package from the [[official repositories]].<br />
<br />
If you would like to build the latest RC/betas from source, visit [http://www.python.org/download/ Python Downloads]. The [[Arch User Repository]] also contains good [[PKGBUILD]]s. If you do decide to build the RC, note that the binary (by default) installs to {{ic|/usr/local/bin/python3.x}}.<br />
<br />
=== Python 2 ===<br />
<br />
To install the latest version of Python 2, [[pacman|install]] the {{Pkg|python2}} package from the [[official repositories]].<br />
<br />
Python 2 will happily run alongside Python 3. You need to specify '''python2''' in order to run this version.<br />
<br />
Any program requiring Python 2 needs to point to {{ic|/usr/bin/python2}}, instead of {{ic|/usr/bin/python}}, which points to Python 3.<br />
<br />
To do so, open the program or script in a text editor and change the first line.<br />
<br />
The line will show one of the following:<br />
#!/usr/bin/env python<br />
or<br />
#!/usr/bin/python<br />
<br />
In both cases, just change {{ic|python}} to {{ic|python2}} and the program will then use Python 2 instead of Python 3.<br />
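<br />
For a single script, this edit can also be done non-interactively; a rough sketch using sed (keep a backup of the script first):<br />
<br />
 $ sed -i '1s/python$/python2/' myScript.py<br />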
<br />
Another way to force the use of python2 without altering the scripts is to call it explicitly with python2, i.e.<br />
{{bc|python2 myScript.py}}<br />
<br />
Finally, you may not be able to control the script calls, but there is a way to trick the environment. It only works if the scripts use {{ic|#!/usr/bin/env python}}; it will not work with {{ic|#!/usr/bin/python}}. This trick relies on {{ic|env}} searching for the first corresponding entry in the PATH variable.<br />
First create a dummy folder.<br />
$ mkdir ~/bin<br />
Then add to it a 'python' symlink pointing to python2, along with one for the config script.<br />
$ ln -s /usr/bin/python2 ~/bin/python<br />
$ ln -s /usr/bin/python2-config ~/bin/python-config<br />
Finally put the new folder ''at the beginning'' of your PATH variable.<br />
$ export PATH=~/bin:$PATH<br />
Note that this change is not permanent and is only active in the current terminal session.<br />
To check which python interpreter is being used by {{ic|env}}, use the following command:<br />
$ which python<br />
<br />
A similar approach in tricking the environment, which also relies on {{ic|#!/usr/bin/env python}} to be called by the script in question, is to use a [[Virtualenv]]. When a Virtualenv is activated, the Python executable pointed to by {{ic|$PATH}} will be the one the Virtualenv was installed with. Therefore, if the Virtualenv is installed with Python 2, {{ic|python}} will refer to Python 2. To start, [[pacman|install]] {{Pkg|python2-virtualenv}}.<br />
<br />
Then create the Virtualenv.<br />
$ virtualenv2 venv # Creates a directory, venv/, containing the Virtualenv<br />
Activate the Virtualenv, which will update {{ic|$PATH}} to point at Python 2. Note that this activation is only active for the current terminal session.<br />
$ source venv/bin/activate<br />
The desired script should then run using Python 2.<br />
<br />
== Dealing with version problems in build scripts ==<br />
<br />
Many projects' build scripts assume {{ic|python}} to be Python 2, and that would eventually result in an error - typically complaining that {{ic|print 'foo'}} is invalid syntax. Luckily, many of them call {{ic|python}} in the {{ic|$PATH}} instead of hardcoding {{ic|#!/usr/bin/python}} in the shebang line, and the Python scripts are all contained within the project tree. So, instead of modifying the build scripts manually, there is an easy workaround. Just create {{ic|/usr/local/bin/python}} with content like this:<br />
<br />
{{hc|/usr/local/bin/python|<nowiki><br />
#!/bin/bash<br />
script=$(readlink -f -- "$1")<br />
case "$script" in (/path/to/project1/*|/path/to/project2/*|/path/to/project3*)<br />
exec python2 "$@"<br />
;;<br />
esac<br />
<br />
exec python3 "$@"<br />
</nowiki>}}<br />
<br />
Where {{ic|<nowiki>/path/to/project1/*|/path/to/project2/*|/path/to/project3*</nowiki>}} is a list of patterns separated by {{ic|<nowiki>|</nowiki>}} matching all project trees.<br />
<br />
Don't forget to make it executable:<br />
<br />
# chmod +x /usr/local/bin/python<br />
<br />
Afterwards scripts within the specified project trees will be run with Python 2.<br />
<br />
== Integrated development environments ==<br />
<br />
{{Deletion|Can just point to [[List of applications#Integrated_development_environments]] that we could integrate with the items not yet there.}}<br />
There are some Python specific IDEs available in the [[official repositories]].<br />
<br />
=== Eclipse===<br />
<br />
Eclipse supports both Python 2.x and 3.x series by using the [[Eclipse#PyDev|PyDev]] extension.<br />
<br />
===Eric===<br />
For the latest Python 3 compatible version, [[pacman|install]] the {{Pkg|eric}} package.<br />
<br />
Version 4 of Eric is Python 2 compatible and can be installed with the {{Pkg|eric4}} package.<br />
<br />
These IDEs can also handle [[Ruby]].<br />
<br />
===IEP===<br />
<br />
IEP is an interactive Python IDE (in the style of MATLAB) with basic debugging capabilities, especially suitable for scientific computing. It is provided by the package {{AUR|iep}}.<br />
<br />
===Ninja===<br />
<br />
The Ninja IDE is provided by the package {{Pkg|ninja-ide}}.<br />
<br />
===Spyder===<br />
<br />
Spyder (previously known as Pydee) is a powerful interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. It focuses on scientific computations, providing a MATLAB-like environment. It can be installed with the package {{AUR|spyder}}.<br />
<br />
===PyCharm===<br />
PyCharm is an intelligent Python IDE with code assistance and analysis, for productive Python development on all levels.<br />
The community edition is available for free as {{AUR|pycharm-community}}.<br />
<br />
== Getting easy_install ==<br />
<br />
The easy_install tool is available in the package {{Pkg|python-setuptools}}.<br />
<br />
== Getting completion in Python shell ==<br />
<br />
Copy this into Python's interactive shell:<br />
{{hc|/usr/bin/python|2=<br />
import rlcompleter<br />
import readline<br />
readline.parse_and_bind("tab: complete")<br />
}}<br />
[http://algorithmicallyrandom.blogspot.com.es/2009/09/tab-completion-in-python-shell-how-to.html Source]<br />
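<br />
To enable completion automatically for every interactive session, save the same three lines to a file and point the {{ic|PYTHONSTARTUP}} environment variable at it, e.g. in your shell profile (the file name here is arbitrary):<br />
<br />
 $ export PYTHONSTARTUP=~/.pythonrc.py<br />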
<br />
== Widget bindings ==<br />
<br />
The following [[Wikipedia:widget toolkit|widget toolkit]] bindings are available:<br />
* {{App|TkInter|Tk bindings|http://wiki.python.org/moin/TkInter|standard module}}<br />
* {{App|pyQt|[[Qt]] bindings|http://www.riverbankcomputing.co.uk/software/pyqt/intro|{{Pkg|python2-pyqt4}} {{Pkg|python2-pyqt5}} {{Pkg|python-pyqt4}} {{Pkg|python-pyqt5}}}}<br />
* {{App|pySide|[[Qt]] bindings|http://www.pyside.org/|{{AUR|python2-pyside}} {{AUR|python-pyside}}}}<br />
* {{App|pyGTK|[[GTK+|GTK+ 2]] bindings|http://www.pygtk.org/|{{Pkg|pygtk}}}}<br />
* {{App|PyGObject|[[GTK+|GTK+ 2/3]] bindings via GObject Introspection|https://wiki.gnome.org/PyGObject/|{{Pkg|python2-gobject2}} {{Pkg|python2-gobject}} {{Pkg|python-gobject2}} {{Pkg|python-gobject}}}}<br />
* {{App|wxPython|wxWidgets bindings|http://wxpython.org/|{{Pkg|wxpython}}}}<br />
To use these with Python, you may need to install the associated widget kits.<br />
<br />
== Old versions ==<br />
<br />
Old versions of Python are available via the [[AUR]] and may be useful for historical curiosity, old applications that do not run on current versions, or for testing Python programs intended to run on a distribution that ships an older version (e.g. RHEL 5.x has Python 2.4, Ubuntu 12.04 has Python 3.2):<br />
* {{AUR|python15}}: Python 1.5.2<br />
* {{AUR|python16}}: Python 1.6.1<br />
* {{AUR|python24}}: Python 2.4.6<br />
* {{AUR|python25}}: Python 2.5.6<br />
* {{AUR|python26}}: Python 2.6.8<br />
* {{AUR|python30}}: Python 3.0.1<br />
* {{AUR|python31}}: Python 3.1.5<br />
* {{AUR|python32}}: Python 3.2.5<br />
* {{AUR|python33}}: Python 3.3.5<br />
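<br />
Once installed, these interpreters coexist with the system Python and are run via their versioned binary names (a sketch; the exact binary name depends on the individual AUR package):<br />
<br />
 $ python2.6 myscript.py<br />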
<br />
As of July 2014, Python upstream only supports Python 2.7, 3.2, 3.3, and 3.4 for security fixes. Using older versions for Internet-facing applications or untrusted code may be dangerous and is not recommended.<br />
<br />
Extra modules/libraries for old versions of Python may be found in the AUR by searching for "python" plus the version without the decimal point, e.g. "python26" for 2.6 modules.<br />
<br />
== Tips and tricks ==<br />
<br />
[http://ipython.org/ IPython] is an enhanced Python command line, available in the official repositories as {{Pkg|ipython}} and {{Pkg|ipython2}}.<br />
<br />
== See also ==<br />
<br />
* [http://shop.oreilly.com/product/9780596158071.do Learning Python] is one of the most comprehensive, up to date, and well-written books on Python available today.<br />
* [http://www.diveintopython.net/ Dive Into Python] is an excellent (free) resource, but perhaps for more advanced readers and [http://diveintopython3.ep.io/ has been updated for Python 3].<br />
* [http://www.swaroopch.com/notes/Python A Byte of Python] is a book suitable for users new to Python (and scripting in general).<br />
* [http://learnpythonthehardway.org Learn Python The Hard Way] is an exercise-driven introduction to programming.<br />
* [http://facts.learnpython.org facts.learnpython.org] is another site for learning Python.<br />
* [http://stephensugden.com/crash_into_python/ Crash into Python] Also known as ''Python for Programmers with 3 Hours'', this guide gives experienced developers from other languages a crash course on Python.<br />
* [http://www.apress.com/book/view/9781590598726 Beginning Game Development with Python and Pygame: From Novice to Professional] for games<br />
<br />
== For fun ==<br />
<br />
Try the following snippets from Python's interactive shell:<br />
<br />
>>> import this<br />
<br />
>>> from __future__ import braces<br />
<br />
>>> import antigravity</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=325444ZFS2014-07-17T06:36:54Z<p>Chungy: /* General */ relatime is now available in all the ZFS packages, given the release of ZoL 0.6.3</p>
<hr />
<div>[[Category:File systems]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Experimenting with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the CDDL, and thus incompatible with GPL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|zfs-utils-git}} and {{AUR|spl-git}} as dependencies; the latter in turn has {{AUR|spl-utils-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages, with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Experimenting with ZFS ==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Experimenting with ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
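<br />
As a minimal sketch of such a throwaway setup (file names and sizes are arbitrary; file VDEVs must be given as absolute paths):<br />
<br />
 $ truncate -s 2G /home/<user>/zfs0.img /home/<user>/zfs1.img<br />
 # zpool create testpool mirror /home/<user>/zfs0.img /home/<user>/zfs1.img<br />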
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic start===<br />
<br />
For ZFS to live up to its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
{{Note|If some or all devices have been used in a software RAID set, it is paramount to erase any old RAID configuration information first; see [[Mdadm#Prepare the Devices]] and the example below.}}<br />
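<br />
For example, leftover mdadm metadata on a member device can be cleared with the following (double-check the device first, as this is destructive):<br />
<br />
 # mdadm --zero-superblock /dev/disk/by-id/<id><br />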
<br />
{{Warning|For Advanced Format Disks with 4KB sector size, an ashift of 12 is recommended for best performance. Advanced Format disks emulate a sector size of 512 bytes for compatibility with legacy systems, this causes ZFS to sometimes use an ashift option number that is not ideal. Once the pool has been created, the only way to change the ashift option is to recreate the pool. Using an ashift of 12 would also decrease available capacity. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?], [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives 1.15 How does ZFS on Linux handles Advanced Format disks?], and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].}}<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool ZFS on Linux developers recommend] using device IDs when creating ZFS storage pools of less than 10 devices. To find the IDs, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The IDs should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Advanced format disks ===<br />
<br />
In case Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
=== Verifying pool creation ===<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed, again using the zfs command:<br />
 # zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases are often syncing their data files to the file system on their own transaction commits anyway. The end result of this is that ZFS will be committing data '''twice''' to the data disks, and it can severely impact performance. You can tell ZFS not to use the ZIL, in which case data is only committed to the file system once. Disabling the ZIL for non-database file systems or for pools with configured log devices (eg, with SSDs) can actually negatively impact the performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you plan on using ZFS to store your /tmp directory (which may be useful for storing arbitrarily-large sets of files, or simply keeping your RAM free of idle data), you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (eg, with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected:<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents any privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
zvols might suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size for zvols ({{ic|volblocksize}}) is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the block size to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
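<br />
As an illustrative sketch (names and sizes are placeholders), a zvol holding an ext4 file system with 4KiB blocks could be created and formatted as follows; {{ic|-b}} sets the zvol block size:<br />
<br />
 # zfs create -V 32G -b 8K <pool>/<volname><br />
 # mkfs.ext4 -b 4096 /dev/zvol/<pool>/<volname><br />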
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including any read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an unexported storage pool will result in an error stating that the storage pool is in use by another system. This error can be produced at boot time, abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair, by either exporting the pool or adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a different mount point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
# swapon /dev/zvol/zroot/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a partition for it.<br />
<br />
Make sure to unmount all ZFS filesystems before rebooting the machine, otherwise any ZFS pools will refuse to be imported:<br />
# zfs umount -a<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, finer-grained control is available per label; if, for example, no monthly snapshots are to be kept on a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt>. Both use the standard property syntax, as shown below.<br />
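<br />
For example (pool and dataset names are placeholders):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/<dataset><br />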
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
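<br />
With ZFS on Linux, the ARC limit can usually also be changed on a running system through the module parameter interface; a sketch, assuming the zfs module is loaded and exposes its parameters under /sys:<br />
<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />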
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zfs create command.<br />
<br />
=== No hostid found ===<br />
<br />
This error occurs at boot, with the following lines appearing before the initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image, which will copy the hostid into it.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
# zpool import -a -f<br />
<br />
now export the pool:<br />
<br />
# zpool export <pool><br />
<br />
To see the available pools, use:<br />
<br />
# zpool status<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
# mkinitcpio -p linux<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. (The value below is just an example.)<br />
% hostid<br />
0a0af0f8<br />
<br />
This number has to be added to the [[kernel parameters]] as {{ic|spl.spl_hostid&#61;0a0af0f8}}. Another solution is writing the hostid inside the initramfs image, see the [[Installing Arch Linux on ZFS#After the first boot|installation guide]] explanation about this.<br />
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
=== Devices have different sector alignment ===<br />
<br />
Once a drive has become faulted, it should be replaced as soon as possible with an identical drive.<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -f<br />
<br />
but in this instance, the following error is produced:<br />
<br />
cannot replace ata-ST3000DM001-9YN166_S1F0KDGY with ata-ST3000DM001-1CH166_W1F478BD: devices have different sector alignment<br />
<br />
ZFS uses the ashift option to adjust for physical block size. When replacing the faulted disk, ZFS is attempting to use {{ic|<nowiki>ashift=12</nowiki>}}, but the faulted disk is using a different ashift (probably {{ic|<nowiki>ashift=9</nowiki>}}) and this causes the resulting error. <br />
<br />
For Advanced Format Disks with 4KB blocksize, an ashift of 12 is recommended for best performance. See [http://zfsonlinux.org/faq.html#PerformanceConsideration 1.10 What’s going on with performance?] and [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ZFS and Advanced Format disks].<br />
<br />
Use zdb to find the ashift of the zpool: {{ic|zdb | grep ashift}}, then use the {{ic|-o}} argument to set the ashift of the replacement drive:<br />
<br />
# zpool replace bigdata ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-1CH166_W1F478BD -o ashift=9 -f<br />
<br />
Check the zpool status for confirmation:<br />
<br />
{{hc|# zpool status -v|<br />
pool: bigdata<br />
state: DEGRADED<br />
status: One or more devices is currently being resilvered. The pool will<br />
continue to function, possibly in a degraded state.<br />
action: Wait for the resilver to complete.<br />
scan: resilver in progress since Mon Jun 16 11:16:28 2014<br />
10.3G scanned out of 5.90T at 81.7M/s, 20h59m to go<br />
2.57G resilvered, 0.17% done<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata DEGRADED 0 0 0<br />
raidz1-0 DEGRADED 0 0 0<br />
replacing-0 OFFLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY OFFLINE 0 0 0<br />
ata-ST3000DM001-1CH166_W1F478BD ONLINE 0 0 0 (resilvering)<br />
ata-ST3000DM001-9YN166_S1F0JKRR ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1 ONLINE 0 0 0<br />
<br />
errors: No known data errors}}<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required.<br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package.<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file adding those lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file adding those lines (TODO, correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on Linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while keeping all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed. So you just need to change the {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created from multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted, otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, for encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If the two versions differ, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will generate the correct module dependency information for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=315099ZFS2014-05-14T08:33:47Z<p>Chungy: /* Database */ zfs logbias</p>
<hr />
<div>[[Category:File systems]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Playing with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the GPL-incompatible CDDL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|spl-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages, with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Playing with ZFS==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Playing_with_ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon, execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device IDs when creating ZFS storage pools of less than 10 devices. To find the IDs:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The IDs should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, ZFS's automated sector size detection might detect 512 bytes, because such disks report 512-byte sectors for backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool, and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available for ZFSonLinux HEAD snapshots (which is normally installed on non-LTS kernels), to be released as a new feature of 0.6.3. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time hasn't been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other zfs options can be displayed again, using the zfs command:<br />
 # zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS has to allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
If your pool has no configured log devices, ZFS reserves space on the pool's data disks for its intent log (the ZIL). ZFS uses this for crash recovery, but databases often sync their data files to the file system on their own transaction commits anyway. The end result is that ZFS commits data '''twice''' to the data disks, which can severely impact performance. You can tell ZFS to not use the ZIL, in which case data is only committed to the file system once. Disabling the ZIL for non-database file systems or for pools with configured log devices (e.g. with SSDs) can actually negatively impact performance, so beware:<br />
# zfs set logbias=throughput <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K \<br />
-o primarycache=metadata \<br />
-o mountpoint=/var/lib/postgres \<br />
-o logbias=throughput \<br />
<pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
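<br />
If such properties were set on a general-purpose dataset by mistake, they can be reverted to their inherited defaults with {{ic|zfs inherit}}. A minimal sketch, where the dataset name is only an example:<br />
<br />
 # zfs inherit recordsize <pool>/home<br />
 # zfs inherit primarycache <pool>/home<br />
 # zfs inherit logbias <pool>/home<br />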
<br />
=== /tmp ===<br />
If you plan on using ZFS to store your /tmp directory (which may be useful for storing arbitrarily-large sets of files, or simply keeping your RAM free of idle data), you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe data consistency consequences (never disable sync for a database!), files in /tmp are unlikely to be important enough for this to matter:<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents any privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
zvols might suffer from the same block size-related issues as RDBMSes, but it's worth noting that the default recordsize for zvols is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your recordsize (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''recordsize''' to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
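<br />
For example, a minimal sketch of creating a zvol whose block size matches a 4KiB ext4 file system. The name and size are only examples; on ZFS on Linux the zvol block size is controlled by the {{ic|volblocksize}} property:<br />
<br />
 # zfs create -V 32G -o volblocksize=4K <pool>/vol<br />
 # mkfs.ext4 -b 4096 /dev/zvol/<pool>/vol<br />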
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including any read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an unexported storage pool will result in an error stating that the storage pool is in use by another system. This error can occur at boot time, abruptly dropping the system to the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind that the hibernate hook must be loaded before the filesystems hook, so using a ZVOL as swap will not allow the use of the hibernation function. If you need hibernation, keep a partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, more fine-grained control is available per label: if, for example, no monthly snapshots are to be kept on a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt> on it.<br />
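<br />
For example, with placeholder dataset names:<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/scratch<br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/data<br />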
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a Python service that takes daily snapshots of a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise-class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
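<br />
Assuming the module exposes its parameters under {{ic|/sys/module/zfs/parameters}}, the limit can also be changed at runtime; a sketch, not persistent across reboots:<br />
<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />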
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zpool create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image, which will copy the hostid into it.<br />
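<br />
A minimal sketch of the second solution, reusing the current hostid. This assumes a little-endian machine, since {{ic|/etc/hostid}} stores the four bytes in native byte order, and the value shown is only an example:<br />
<br />
 # hostid<br />
 0a0af0f8<br />
 # printf '\xf8\xf0\x0a\x0a' > /etc/hostid<br />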
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
{{bc|# zpool import -a -f}}<br />
<br />
now export the pool:<br />
<br />
{{bc|# zpool export <pool>}}<br />
<br />
To see the available pools, use,<br />
<br />
{{bc|# zpool status}}<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso, the network configuration could be different, generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
{{bc|# mkinitcpio -p linux}}<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid that marks ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The one below is just an example.<br />
% hostid<br />
0a0af0f8<br />
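<br />
Then, as a sketch, pass that value via the [[kernel parameters]], using the same format shown in [[#No hostid found]]:<br />
<br />
 spl.spl_hostid=0x0a0af0f8<br />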
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} to the [[kernel parameters]], but this is not advisable as a permanent solution.<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required. <br />
<br />
To embed {{ic|zfs}} in the archiso, install the {{ic|archiso}} package from an existing install:<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file, adding these lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file, adding these lines (TODO: correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction, it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change {{ic|zpool create}} commands to<br />
point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted; otherwise the protection<br />
might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even the same input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will make the correct kernel modules available for the kernel version installed in the chroot.<br />
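<br />
As a sketch, the version can also be derived automatically. This assumes the stock {{ic|linux}} package, whose kernel release is the package version with the {{ic|-ARCH}} suffix appended:<br />
<br />
 # depmod -a "$(pacman -Q linux | cut -d' ' -f2)-ARCH"<br />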
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog series on ZFS, which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=315084ZFS2014-05-14T05:15:01Z<p>Chungy: /* Tuning */ more typos</p>
<hr />
<div>[[Category:File systems]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Playing with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the GPL-incompatible CDDL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|spl-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages, with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needs sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Playing with ZFS==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Playing_with_ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools by reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon, execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
Having identified the list of drives, it is now time to get the IDs of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device IDs when creating ZFS storage pools of less than 10 devices. To find the IDs:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The IDs should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If Advanced Format disks are used, which have a native sector size of 4096 bytes instead of 512 bytes, ZFS's automated sector size detection might detect 512 bytes, because such disks report 512-byte sectors for backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (see the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default, but for most users it represents superfluous writes to the zpool, and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available for ZFSonLinux HEAD snapshots (which is normally installed on non-LTS kernels), to be released as a new feature of 0.6.3. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time hasn't been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other zfs options can be displayed again, using the zfs command:<br />
 # zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS has to allocate new 128KiB blocks each time only a few bytes are written.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K -o primarycache=metadata -o mountpoint=/var/lib/postgres <pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
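<br />
If such properties were set on a general-purpose dataset by mistake, they can be reverted to their inherited defaults with {{ic|zfs inherit}}. A minimal sketch, where the dataset name is only an example:<br />
<br />
 # zfs inherit recordsize <pool>/home<br />
 # zfs inherit primarycache <pool>/home<br />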
<br />
=== /tmp ===<br />
If you plan on using ZFS to store your /tmp directory (which may be useful for storing arbitrarily-large sets of files, or simply keeping your RAM free of idle data), you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe data consistency consequences (never disable sync for a database!), files in /tmp are unlikely to be important enough for this to matter:<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents any privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
zvols might suffer from the same block size-related issues as RDBMSes, but it's worth noting that the default recordsize for zvols is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your recordsize (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''recordsize''' to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
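<br />
For example, a minimal sketch of creating a zvol whose block size matches a 4KiB ext4 file system. The name and size are only examples; on ZFS on Linux the zvol block size is controlled by the {{ic|volblocksize}} property:<br />
<br />
 # zfs create -V 32G -o volblocksize=4K <pool>/vol<br />
 # mkfs.ext4 -b 4096 /dev/zvol/<pool>/vol<br />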
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including any read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid is different in the archiso than it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an unexported storage pool will result in an error stating that the storage pool is in use by another system. This error can occur at boot time, abruptly dropping the system to the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in two steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as a swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind that the hibernate hook must be loaded before the filesystems hook, so using a ZVOL as swap will not allow the use of the hibernation function. If you need hibernation, keep a partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set {{ic|com.sun:auto-snapshot&#61;false}} on it. More fine-grained control is also possible per label: if, for example, no monthly snapshots are to be kept for a dataset, set {{ic|com.sun:auto-snapshot:monthly&#61;false}}.<br />
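<br />
For example, to exclude a dataset (names are placeholders):<br />
<br />
# zfs set com.sun:auto-snapshot=false <pool>/<dataset><br />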
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a Python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots.<br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
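<br />
The underlying mechanism can also be driven by hand. A minimal sketch, assuming a snapshot named {{ic|@monday}} exists and that {{ic|<remotehost>}} (a placeholder) is reachable over SSH with ZFS installed:<br />
<br />
# zfs send <pool>/<dataset>@monday | ssh <remotehost> zfs receive <remotepool>/<dataset><br />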
<br />
==Troubleshooting==<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise-class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512 MiB)<br />
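<br />
The same limit can also be set with a module option instead of a kernel parameter (536870912 bytes = 512 MiB); if the zfs module is loaded from the initramfs, regenerate the image afterwards so the change takes effect at early boot:<br />
<br />
{{hc|/etc/modprobe.d/zfs.conf|<nowiki>options zfs zfs_arc_max=536870912</nowiki>}}<br />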
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zpool:<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zpool create command.<br />
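<br />
For example, when creating a pool ({{ic|<pool>}} and {{ic|<ids>}} are placeholders, as elsewhere in this article):<br />
<br />
# zpool create -f <pool> raidz <ids><br />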
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. One is to place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code> and then regenerate the initramfs image, which will copy the hostid into it:<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
{{bc|# zpool import -a -f}}<br />
<br />
now export the pool:<br />
<br />
{{bc|# zpool export <pool>}}<br />
<br />
To see the available pools, use:<br />
<br />
{{bc|# zpool status}}<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly from the network setup. During installation from the archiso, the network configuration may differ from that of the new installation, producing a different hostid than the one contained in the new installation. Once the ZFS filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export the pool as described above, and then rebuild the ramdisk in the normally booted system:<br />
<br />
{{bc|# mkinitcpio -p linux}}<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double-check that the pool is properly exported. Exporting the zpool clears the hostid that marks ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the ZFS pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase, and this confuses ZFS. Manually tell ZFS the correct number; once the hostid is consistent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The value below is just an example:<br />
% hostid<br />
0a0af0f8<br />
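<br />
Then pass this value as the spl hostid in the [[kernel parameters]], as described in [[#No hostid found]] (note the {{ic|0x}} prefix):<br />
<br />
spl.spl_hostid=0x0a0af0f8<br />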
<br />
Users can always ignore the check by adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required.<br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package.<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file, adding these lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file, adding these lines (TODO: correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed, and create the image:<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on Linux does not support encryption directly, but zpools can be created on dm-crypt block devices. Since the zpool is created on the plain-text abstraction, it is possible to have the data encrypted while keeping all the advantages of ZFS, like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}}, and their names are fixed, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there. Since zpools can be created on multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted; otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine, but if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even identical input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead, it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, for encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
# pacman-key --lsign-key 0EE7A126<br />
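<br />
If the above command fails because the key is not yet in the keyring, it may first need to be fetched from a keyserver before it can be signed, then re-run the signing command:<br />
<br />
# pacman-key -r 0EE7A126<br />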
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This regenerates the module dependency information for the kernel version installed in the chroot, so the correct kernel modules can be loaded.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog series on ZFS, which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungy
<hr />
<div>[[Category:File systems]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Playing with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the GPL incompatible CDDL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|spl-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needed sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Playing with ZFS==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Playing_with_ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, than the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that to include into the pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems, you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available for ZFSonLinux HEAD snapshots (which is normally installed on non-LTS kernels), to be released as a new feature of 0.6.3. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time hasn't been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed again, using the zfs command:<br />
# sudo zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly refered to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of file being written. This can often help fragmentation and file access, at the cost that ZFS would have to allocate new 128KiB blocks each time only a few bytes are written to.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accomidate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K -o primarycache=metadata -o mountpoint=/var/lib/postgres <pool>/postgres<br />
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you plan on using ZFS to store your /tmp directory (which may be useful for storing arbitrarily-large sets of files, or simply keeping your RAM free of idle data), you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (eg, with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected:<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents any privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
zvols might suffer from the same block size-related issues as RDBMSes, but it's worth noting that the default recordsize for zvols is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your recordsize (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the '''recordsize''' to accomidate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use :<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including and read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow to use swapfiles, but users can use a ZFS volume (ZVOL) as swap. It is importart to set the ZVOL block size to match the system page size, which can be obtained by the <tt>getconf PAGESIZE</tt> command (default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create a 8GiB zfs volume:<br />
<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind the Hibernate hook must be loaded before filesystems, so using ZVOL as swap will not allow to use hibernate function. If you need hibernate, keep a partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, set more fine-grained control as well by label, if, for example, no monthlies are to be kept on a snapshot, for example, set <tt>com.sun:auto-snapshot:monthly=false</tt>.<br />
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots. <br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zfs create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hosted. There are two solutions, for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader. For example, adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image. Which will copy the hostid into the initramfs image.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
{{bc|# zpool import -a -f}}<br />
<br />
now export the pool:<br />
<br />
{{bc|# zpool export <pool>}}<br />
<br />
To see the available pools, use,<br />
<br />
{{bc|# zpool status}}<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
{{bc|# mkinitcpio -p linux}}<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not there is some other problem.<br />
<br />
Reboot again, if the zfs pool refuses to mount it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number, once the hostid is coherent across the reboots the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. This one is just an example.<br />
% hostid<br />
0a0af0f8<br />
<br />
Users can always ignore the check adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea make an installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required. <br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package.<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file adding those lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file adding those lines (TODO, correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More informations about the process can be read in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on linux does not support encryption directly, but zpools can be created in dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their name is fixed. So you just need to change {{ic|zpool create}} commands to<br />
point to that names. The idea is configuring the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created in multiple devices (raid, mirroring, striping, ...), it is important all the devices are encrypted otherwise the protection<br />
might be partially lost.<br />
<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinicpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data has always high entropy making compression ineffective and even from the same input you get different output (thanks to salting) making deduplication impossible.<br />
To reduce the unnecessary overhead it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example to have an encrypted home: (the two passwords, encryption and login, must be the same)<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=315082ZFS2014-05-14T04:59:50Z<p>Chungy: /* Tuning */</p>
<hr />
<div>[[Category:File systems]]<br />
{{Related articles start}}<br />
{{Related|File systems}}<br />
{{Related|Playing with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the GPL incompatible CDDL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Installation==<br />
<br />
Install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository. This package has {{AUR|spl-git}} as a dependency. SPL (Solaris Porting Layer) is a Linux Kernel module implementing Solaris APIs for ZFS compatibility.<br />
<br />
{{Note|1=The zfs-git package replaces the original zfs package from [[AUR]]. ZFSonLinux.org is slow to make stable releases and kernel API changes broke stable builds of ZFSonLinux for Arch. Changes submitted to the master branch of the ZFSonLinux repository are regression tested and therefore considered stable.}}<br />
<br />
For users that desire ZFS builds from stable releases, {{AUR|zfs-lts}} is available from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.<br />
<br />
{{warning|The ZFS and SPL kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the [[Unofficial user repositories#demz-repo-core|demz-repo-core]] repository.}}<br />
<br />
Test the installation by issuing {{ic|zpool status}} on the command line. If an "insmod" error is produced, try {{ic|depmod -a}}.<br />
<br />
=== Archiso ===<br />
<br />
For installing Arch Linux into a ZFS root filesystem, install {{AUR|zfs-git}} from the [[Arch User Repository]] or the [[Unofficial user repositories#demz-repo-archiso|demz-repo-archiso]] repository.<br />
<br />
See [[Installing Arch Linux on ZFS]] for more information.<br />
<br />
=== Automated build script ===<br />
<br />
The build order of the above is important due to nested dependencies. One can automate the entire process, including downloading the packages with the following shell script. The only requirements for it to work are:<br />
*{{pkg|sudo}} - Note that your user needed sudo rights to {{ic|/usr/bin/clean-chroot-manager}} for the script below to work.<br />
*{{pkg|rsync}} - Needed for moving over the build files.<br />
*{{AUR|cower}} - Needed to grab sources from the AUR.<br />
*{{AUR|clean-chroot-manager}} - Needed to build in a clean chroot and add packages to a local repo.<br />
<br />
Be sure to add the local repo to {{ic|/etc/pacman.conf}} like so:<br />
{{hc|$ tail /etc/pacman.conf|<nowiki><br />
[chroot_local]<br />
SigLevel = Optional TrustAll<br />
Server = file:///path/to/localrepo/defined/below<br />
</nowiki>}}<br />
<br />
{{hc|~/bin/build_zfs|<nowiki><br />
#!/bin/bash<br />
#<br />
# ZFS Builder by graysky<br />
#<br />
<br />
# define the temp space for building here<br />
WORK='/scratch'<br />
<br />
# create this dir and chown it to your user<br />
# this is the local repo which will store your zfs packages<br />
REPO='/var/repo'<br />
<br />
# Add the following entry to /etc/pacman.conf for the local repo<br />
#[chroot_local]<br />
#SigLevel = Optional TrustAll<br />
#Server = file:///path/to/localrepo/defined/above<br />
<br />
for i in rsync cower clean-chroot-manager; do<br />
command -v $i >/dev/null 2>&1 || {<br />
echo "I require $i but it's not installed. Aborting." >&2<br />
exit 1; }<br />
done<br />
<br />
[[ -f ~/.config/clean-chroot-manager.conf ]] &&<br />
. ~/.config/clean-chroot-manager.conf || exit 1<br />
<br />
[[ ! -d "$REPO" ]] &&<br />
echo "Make the dir for your local repo and chown it: $REPO" && exit 1<br />
<br />
[[ ! -d "$WORK" ]] &&<br />
echo "Make a work directory: $WORK" && exit 1<br />
<br />
cd "$WORK"<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
[[ -d $i ]] && rm -rf $i<br />
cower -d $i<br />
done<br />
<br />
for i in spl-utils-git spl-git zfs-utils-git zfs-git; do<br />
cd "$WORK/$i"<br />
sudo ccm s<br />
done<br />
<br />
rsync -auvxP "$CHROOTPATH/root/repo/" "$REPO"<br />
</nowiki>}}<br />
<br />
==Playing with ZFS==<br />
<br />
Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like {{ic|~/zfs0.img}} {{ic|~/zfs1.img}} {{ic|~/zfs2.img}} etc. with no possibility of real data loss are encouraged to see the [[Playing_with_ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.target<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.target<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
Having identified the list of drives, it is now time to get the id's of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, than the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
<br />
=== General ===<br />
Many parameters are available for zfs file systems; you can view a full list with <code>zfs get all <pool></code>. Two common ones to adjust are '''atime''' and '''compression'''.<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
As an alternative to turning off atime completely, '''relatime''' is available for ZFSonLinux HEAD snapshots (which are normally installed on non-LTS kernels), to be released as a new feature of 0.6.3. This brings the default ext4/xfs atime semantics to ZFS, where access time is only updated if the modified time or changed time changes, or if the existing access time has not been updated within the past 24 hours. It is a compromise between atime=off and atime=on. This property ''only'' takes effect if '''atime''' is '''on''':<br />
# zfs set relatime=on <pool><br />
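<br />
To verify the current settings (assuming a ZFS version that already ships the '''relatime''' property):<br />
 $ zfs get atime,relatime <pool><br />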
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed, again using the zfs command:<br />
 # zfs get all <pool><br />
<br />
=== Database ===<br />
ZFS, unlike most other file systems, has a variable record size, or what is commonly referred to as a block size. By default, the recordsize on ZFS is 128KiB, which means it will dynamically allocate blocks of any size from 512B to 128KiB depending on the size of the file being written. This can often help with fragmentation and file access, at the cost that ZFS has to allocate a new 128KiB block each time only a few bytes of a file are changed.<br />
<br />
Most RDBMSes work in 8KiB-sized blocks by default. Although the block size is tunable for [[MySQL|MySQL/MariaDB]], [[PostgreSQL]], and [[Oracle]], all three of them use an 8KiB block size ''by default''. For both performance concerns and keeping snapshot differences to a minimum (for backup purposes, this is helpful), it is usually desirable to tune ZFS instead to accommodate the databases, using a command such as:<br />
# zfs set recordsize=8K <pool>/postgres<br />
<br />
These RDBMSes also tend to implement their own caching algorithm, often similar to ZFS's own ARC. In the interest of saving memory, it is best to simply disable ZFS's caching of the database's file data and let the database do its own job:<br />
# zfs set primarycache=metadata <pool>/postgres<br />
<br />
These can also be done at file system creation time, for example:<br />
# zfs create -o recordsize=8K -o primarycache=metadata -o mountpoint=/var/lib/postgres <pool>/postgres<br />
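<br />
The resulting values can be verified afterwards:<br />
 $ zfs get recordsize,primarycache <pool>/postgres<br />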
<br />
Please note: these kinds of tuning parameters are ideal for specialized applications like RDBMSes. You can easily ''hurt'' ZFS's performance by setting these on a general-purpose file system such as your /home directory.<br />
<br />
=== /tmp ===<br />
If you plan on using ZFS to store your /tmp directory (which may be useful for storing arbitrarily-large sets of files, or simply keeping your RAM free of idle data), you can generally improve performance of certain applications writing to /tmp by disabling file system sync. This causes ZFS to ignore an application's sync requests (e.g. with <code>fsync</code> or <code>O_SYNC</code>) and return immediately. While this has severe data consistency consequences (never disable sync for a database!), files in /tmp are less likely to be important and affected:<br />
# zfs set sync=disabled <pool>/tmp<br />
<br />
Additionally, for security purposes, you may want to disable '''setuid''' and '''devices''' on the /tmp file system, which prevents any privilege-escalation attacks or the use of device nodes:<br />
# zfs set setuid=off <pool>/tmp<br />
# zfs set devices=off <pool>/tmp<br />
<br />
Combining all of these for a create command would be as follows:<br />
# zfs create -o setuid=off -o devices=off -o sync=disabled -o mountpoint=/tmp <pool>/tmp<br />
<br />
Please note, also, that if you want /tmp on ZFS, you will need to mask (disable) [[systemd]]'s automatic tmpfs-backed /tmp, else ZFS will be unable to mount your dataset at boot-time or import-time:<br />
# systemctl mask tmp.mount<br />
<br />
=== zvols ===<br />
zvols might suffer from the same block size-related issues as RDBMSes, but it is worth noting that the default block size (the '''volblocksize''' property) for zvols is 8KiB already. If possible, it is best to align any partitions contained in a zvol to your block size (current versions of fdisk and gdisk by default automatically align at 1MiB segments, which works), and file system block sizes to the same size. Other than this, you might tweak the block size to accommodate the data inside the zvol as necessary (though 8KiB tends to be a good value for most file systems, even when using 4KiB blocks on that level).<br />
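<br />
As a minimal sketch, a hypothetical 10GiB zvol named {{ic|vm0}} could be created with a matching 8KiB block size like this:<br />
 # zfs create -V 10G -b 8K <pool>/vm0<br />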
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
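<br />
To list all datasets along with their space usage and mount points:<br />
 $ zfs list<br />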
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid in the archiso is different from the one in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can occur at boot time, abruptly dropping the system to the busybox console and requiring an archiso to do an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
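<br />
On the destination system, running {{ic|zpool import}} without arguments lists the pools available for import; the pool can then be imported by name:<br />
 # zpool import bigdata<br />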
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
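<br />
To activate the swap volume immediately, without rebooting:<br />
 # swapon /dev/zvol/<pool>/swap<br />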
<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a separate partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
==== ZFS Automatic Snapshot Service for Linux ==== <br />
<br />
The {{AUR|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, more fine-grained control is possible per label: if, for example, no monthlies are to be kept on a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt>.<br />
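<br />
For example, to keep no monthly snapshots of a hypothetical scratch dataset:<br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/scratch<br />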
<br />
==== ZFS Snapshot Manager ====<br />
<br />
The {{AUR|zfs-snap-manager}} package from [[Arch User Repository|AUR]] provides a Python service that takes daily snapshots from a configurable set of ZFS datasets and cleans them out in a [http://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son "Grandfather-father-son"] scheme. It can be configured to e.g. keep 7 daily, 5 weekly, 3 monthly and 2 yearly snapshots.<br />
<br />
The package also supports configurable replication to other machines running ZFS by means of {{ic|zfs send}} and {{ic|zfs receive}}. If the destination machine runs this package as well, it could be configured to keep these replicated snapshots for a longer time. This allows a setup where a source machine has only a few daily snapshots locally stored, while on a remote storage server a much longer retention is available.<br />
<br />
==Troubleshooting==<br />
=== ZFS is using too much RAM ===<br />
<br />
By default, ZFS caches file operations ([[wikipedia:Adaptive replacement cache|ARC]]) using up to two-thirds of available system memory on the host. Remember, ZFS is designed for enterprise class storage systems! To adjust the ARC size, add the following to the [[Kernel parameters]] list:<br />
<br />
zfs.zfs_arc_max=536870912 # (for 512MB)<br />
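<br />
If the module is already loaded, the limit can usually also be adjusted at runtime (a sketch, assuming the ZFS on Linux module parameter interface under sysfs):<br />
 # echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max<br />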
<br />
For a more detailed description, as well as other configuration options, see [http://wiki.gentoo.org/wiki/ZFS#ARC gentoo-wiki:zfs#arc].<br />
<br />
=== Does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the <code>zpool create</code> command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image, which will copy the hostid into it.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
{{bc|# zpool import -a -f}}<br />
<br />
now export the pool:<br />
<br />
{{bc|# zpool export <pool>}}<br />
<br />
To see the available pools, use,<br />
<br />
{{bc|# zpool status}}<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
{{bc|# mkinitcpio -p linux}}<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The value below is just an example:<br />
% hostid<br />
0a0af0f8<br />
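<br />
The recorded value can then be passed to the spl module on every boot via the [[kernel parameters]], as described in [[#No hostid found]]:<br />
 spl.spl_hostid=0x0a0af0f8<br />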
<br />
Users can always ignore the check adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required.<br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package.<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file, adding these lines:<br />
spl-utils-git<br />
spl-git<br />
zfs-utils-git<br />
zfs-git<br />
<br />
Edit the {{ic|pacman.conf}} file, adding these lines (TODO, correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on linux does not support encryption directly, but zpools can be created in dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created across multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted; otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
In the case of a root filesystem pool, the {{ic|mkinitcpio.conf}} HOOKS line will enable the keyboard for the password, create the devices, and load the pools. It will contain something like:<br />
<br />
HOOKS="... keyboard encrypt zfs ..."<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
Creating encrypted zpools works fine. But if you need encrypted directories, for example to protect your users' homes, ZFS loses some functionality.<br />
<br />
ZFS will see the encrypted data, not the plain-text abstraction, so compression and deduplication will not work. The reason is that encrypted data always has high entropy, making compression ineffective, and even the same input produces different output (thanks to salting), making deduplication impossible.<br />
To reduce the unnecessary overhead, it is possible to create a sub-filesystem for each encrypted directory and use [[eCryptfs]] on it.<br />
<br />
For example, to have an encrypted home (the two passwords, encryption and login, must be the same):<br />
# zfs create -o compression=off \<br />
-o dedup=off \<br />
-o mountpoint=/home/<username> \<br />
<zpool>/<username><br />
# useradd -m <username><br />
# passwd <username><br />
# ecryptfs-migrate-home -u <username><br />
<log in user and complete the procedure with ecryptfs-unwrap-passphrase><br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Add the archzfs maintainer's PGP key to the local (installer image) trust:<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs-git<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [[Installing Arch Linux on ZFS]]<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=304604ZFS2014-03-15T12:02:49Z<p>Chungy: /* Swap volume */ compression doesn't negatively affect swap (quite the opposite), and the secondarycache option was somewhat silly.</p>
<hr />
<div>[[Category:File systems]]<br />
{{Related articles start}}<br />
{{Related|ZFS Installation}}<br />
{{Related|Playing with ZFS}}<br />
{{Related|Installing Arch Linux on ZFS}}<br />
{{Related|ZFS on FUSE}}<br />
{{Related articles end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], a maximum [[Wikipedia:Exabyte|16 Exabyte]] file size, and a maximum [[Wikipedia:Zettabyte|256 Zettabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the GPL-incompatible CDDL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
ZOL is a project funded by the [https://www.llnl.gov/ Lawrence Livermore National Laboratory] to develop a native Linux kernel module for its massive storage requirements and super computers.<br />
<br />
==Playing with ZFS==<br />
The rest of this article cover basic setup and usage of ZFS on physical block devices (HDD and SSD for example). Users wishing to experiment with ZFS on ''virtual block devices'' (known in ZFS terms as VDEVs) which can be simple files like ~/zfs0.img ~/zfs1.img ~/zfs2.img etc. with no possibility of real data loss are encouraged to see the [[Playing_with_ZFS]] article. Common tasks like building a RAIDZ array, purposefully corrupting data and recovering it, snapshotting datasets, etc. are covered.<br />
<br />
==Installation==<br />
The requisite packages are available in the AUR and in an unofficial repo. Details are provided on the [[ZFS Installation]] article.<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
{{Note|The following section is ONLY needed if users wish to install their root filesystem to a ZFS volume. Users wishing to have a data partition with ZFS do NOT need to read the next section.}}<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount the zpool in {{ic|/etc/fstab}}; the zfs daemon can import and mount zfs pools automatically. The daemon mounts the zfs pools reading the file {{ic|/etc/zfs/zpool.cache}}.<br />
<br />
For each pool you want automatically mounted by the zfs daemon execute:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is neither necessary nor recommended to partition the drives before creating the zfs filesystem.<br />
<br />
Having identified the list of drives, it is now time to get the ids of the drives to add to the zpool. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of fewer than 10 devices. To find the ids, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, then the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions to include in the pool. Get them from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
In case Advanced Format disks are used which have a native sector size of 4096 bytes instead of 512 bytes, the automated sector size detection algorithm of ZFS might detect 512 bytes because of the backwards compatibility with legacy systems. This would result in degraded performance. To make sure a correct sector size is used, the {{ic|<nowiki>ashift=12</nowiki>}} option should be used (See the [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives ZFS on Linux FAQ]). The full command would in this case be:<br />
<br />
# zpool create -f -o ashift=12 -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that the pool is mounted. Using {{ic|# zpool status}} will show that the pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot the machine to ensure that the ZFS pool is mounted at boot. It is best to deal with all errors before transferring data.<br />
<br />
== Tuning ==<br />
Although many knobs are available on a zfs pool, there are two major ones users can consider:<br />
*atime<br />
*compression<br />
<br />
Atime is enabled by default but for most users, it represents superfluous writes to the zpool and it can be disabled using the zfs command:<br />
# zfs set atime=off <pool><br />
<br />
Compression is just that, transparent compression of data. Consult the man page for various options. A recent advancement is the lz4 algorithm which offers excellent compression and performance. Enable it (or any other) using the zfs command:<br />
# zfs set compression=lz4 <pool><br />
<br />
Other options for zfs can be displayed, again using the zfs command:<br />
 # zfs get all <pool><br />
<br />
== Usage ==<br />
<br />
Users can optionally create a dataset under the zpool as opposed to manually creating directories under the zpool. Datasets allow for an increased level of control (quotas for example) in addition to snapshots. To be able to create and mount a dataset, a directory of the same name must not pre-exist in the zpool. To create a dataset, use:<br />
<br />
# zfs create <nameofzpool>/<nameofdataset><br />
<br />
It is then possible to apply ZFS specific attributes to the dataset. For example, one could assign a quota limit to a specific directory within a dataset:<br />
<br />
# zfs set quota=20G <nameofzpool>/<nameofdataset>/<directory><br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub the pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in the root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of the ZFS pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about the ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of the pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid in the archiso is different from the one in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can occur at boot time, abruptly dropping the system to the busybox console and requiring an archiso to do an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Rename a Zpool ===<br />
Renaming a zpool that is already created is accomplished in 2 steps:<br />
<br />
# zpool export oldname<br />
# zpool import oldname newname<br />
<br />
=== Setting a Different Mount Point ===<br />
The mount point for a given zpool can be moved at will with one command:<br />
# zfs set mountpoint=/foo/bar poolname<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow the use of swap files, but users can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced and not caching the zvol data.<br />
<br />
Create an 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o primarycache=metadata \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, edit {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when/if swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
Keep in mind that the hibernate hook must be loaded before filesystems, so using a ZVOL as swap will not allow the use of the hibernate function. If you need hibernation, keep a separate partition for it.<br />
<br />
=== Automatic snapshots ===<br />
<br />
The {{aur|zfs-auto-snapshot-git}} package from [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc), giving quick and convenient snapshotting of all ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. Optionally adjust the --keep parameter from the defaults depending on how far back the snapshots are to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, more fine-grained control is possible per label: if, for example, no monthlies are to be kept on a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt>.<br />
<br />
==Troubleshooting==<br />
<br />
=== does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the <code>zpool create</code> command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hostid. There are two solutions for this. Either place the spl hostid in the [[kernel parameters]] in the boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image, which will copy the hostid into it.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
==== Unexported pool ====<br />
<br />
If the new installation does not boot because the zpool cannot be imported, chroot into the installation and properly export the zpool. See [[ZFS#Emergency chroot repair with archzfs]].<br />
<br />
Once inside the chroot environment, load the ZFS module and force import the zpool,<br />
<br />
{{bc|# zpool import -a -f}}<br />
<br />
now export the pool:<br />
<br />
{{bc|# zpool export <pool>}}<br />
<br />
To see the available pools, use,<br />
<br />
{{bc|# zpool status}}<br />
<br />
It is necessary to export a pool because of the way ZFS uses the hostid to track the system the zpool was created on. The hostid is generated partly based on the network setup. During the installation in the archiso the network configuration could be different generating a different hostid than the one contained in the new installation. Once the zfs filesystem is exported and then re-imported in the new installation, the hostid is reset. See [http://osdir.com/ml/zfs-discuss/2011-06/msg00227.html Re: Howto zpool import/export automatically? - msg#00227].<br />
<br />
If ZFS complains about "pool may be in use" after every reboot, properly export pool as described above, and then rebuild ramdisk in normally booted system:<br />
<br />
{{bc|# mkinitcpio -p linux}}<br />
<br />
==== Incorrect hostid ====<br />
<br />
Double check that the pool is properly exported. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, it means the hostid is not yet correctly set in the early boot phase and it confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down the hostid. The value below is just an example:<br />
% hostid<br />
0a0af0f8<br />
<br />
Users can always ignore the check adding {{ic|zfs_force&#61;1}} in the [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
== Tips and tricks ==<br />
<br />
===Embed the archzfs packages into an archiso===<br />
<br />
It is a good idea to make installation media with the needed software included. Otherwise, the latest archiso installation media burned to a CD or a USB key is required.<br />
<br />
To embed {{ic|zfs}} in the archiso, from an existing install, download the {{ic|archiso}} package.<br />
<br />
# pacman -S archiso<br />
<br />
Start the process: <br />
# cp -r /usr/share/archiso/configs/releng /root/media<br />
<br />
Edit the {{ic|packages.x86_64}} file, adding these lines:<br />
spl-utils<br />
spl<br />
zfs-utils<br />
zfs<br />
<br />
Edit the {{ic|pacman.conf}} file, adding these lines (TODO, correctly embed keys in the installation media?):<br />
[demz-repo-archiso]<br />
SigLevel = Never<br />
Server = <nowiki>http://demizerone.com/$repo/$arch</nowiki><br />
<br />
Add other packages in {{ic|packages.both}}, {{ic|packages.i686}}, or {{ic|packages.x86_64}} if needed and create the image.<br />
# ./build.sh -v<br />
<br />
The image will be in the {{ic|/root/media/out}} directory.<br />
<br />
More information about the process can be found in [http://kroweer.wordpress.com/2011/09/07/creating-a-custom-arch-linux-live-usb/ this guide] or in the [[Archiso]] article.<br />
<br />
If installing onto a UEFI system, see [[Unified Extensible Firmware Interface#Create UEFI bootable USB from ISO]] for creating UEFI compatible installation media.<br />
<br />
=== Encryption in ZFS on linux ===<br />
<br />
ZFS on linux does not support encryption directly, but zpools can be created in dm-crypt block devices. Since the zpool is created on the plain-text<br />
abstraction it is possible to have the data encrypted while having all the advantages of ZFS like deduplication, compression, and data robustness.<br />
<br />
dm-crypt, possibly via LUKS, creates devices in {{ic|/dev/mapper}} and their names are fixed, so you just need to change the {{ic|zpool create}} commands to point to those names. The idea is to configure the system to create the {{ic|/dev/mapper}} block devices and import the zpools from there.<br />
Since zpools can be created across multiple devices (raid, mirroring, striping, ...), it is important that all the devices are encrypted; otherwise the protection might be partially lost.<br />
<br />
For example, an encrypted zpool can be created using plain dm-crypt (without LUKS) with:<br />
<br />
# cryptsetup --hash=sha512 --cipher=twofish-xts-plain64 --offset=0 --key-file=/dev/sdZ --key-size=512 open --type=plain /dev/sdX enc<br />
# zpool create zroot /dev/mapper/enc<br />
<br />
Since the {{ic|/dev/mapper/enc}} name is fixed, no import errors will occur.<br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into the ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up the network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install a text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
SigLevel = Required<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import the pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount the boot partitions (if any):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into the ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check the kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, run depmod (in the chroot) with the correct kernel version of the chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in the chroot installation.<br />
<br />
Regenerate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]<br />
* [http://royal.pingdom.com/2013/06/04/zfs-backup/ Pingdom details how it backs up 5TB of MySQL data every day with ZFS]<br />
<br />
; Aaron Toponce has authored a 17-part blog on ZFS which is an excellent read.<br />
# [https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/ VDEVs]<br />
# [https://pthree.org/2012/12/05/zfs-administration-part-ii-raidz/ RAIDZ Levels]<br />
# [https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/ The ZFS Intent Log]<br />
# [https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/ The ARC]<br />
# [https://pthree.org/2012/12/10/zfs-administration-part-v-exporting-and-importing-zpools/ Import/export zpools]<br />
# [https://pthree.org/2012/12/11/zfs-administration-part-vi-scrub-and-resilver/ Scrub and Resilver]<br />
# [https://pthree.org/2012/12/12/zfs-administration-part-vii-zpool-properties/ Zpool Properties]<br />
# [https://pthree.org/2012/12/13/zfs-administration-part-viii-zpool-best-practices-and-caveats/ Zpool Best Practices]<br />
# [https://pthree.org/2012/12/14/zfs-administration-part-ix-copy-on-write/ Copy on Write]<br />
# [https://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/ Creating Filesystems]<br />
# [https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ Compression and Deduplication]<br />
# [https://pthree.org/2012/12/19/zfs-administration-part-xii-snapshots-and-clones/ Snapshots and Clones]<br />
# [https://pthree.org/2012/12/20/zfs-administration-part-xiii-sending-and-receiving-filesystems/ Send/receive Filesystems]<br />
# [https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/ ZVOLs]<br />
# [https://pthree.org/2012/12/31/zfs-administration-part-xv-iscsi-nfs-and-samba/ iSCSI, NFS, and Samba]<br />
# [https://pthree.org/2013/01/02/zfs-administration-part-xvi-getting-and-setting-properties/ Get/Set Properties]<br />
# [https://pthree.org/2013/01/03/zfs-administration-part-xvii-best-practices-and-caveats/ ZFS Best Practices]</div>Chungyhttps://wiki.archlinux.org/index.php?title=Python&diff=298982Python2014-02-19T20:53:16Z<p>Chungy: /* Old versions */ Python 2.6 reached EOL with the 2.6.9 release in October 2013; still to be updated in AUR</p>
<hr />
<div>[[Category:Programming language]]<br />
[[de:Python]]<br />
[[zh-CN:Python]]<br />
{{Related articles start}}<br />
{{Related|Python Package Guidelines}}<br />
{{Related|mod_python}}<br />
{{Related|Python VirtualEnv}}<br />
{{Related articles end}}<br />
[http://www.python.org Python] "is a remarkably powerful dynamic programming language that is used in a wide variety of application domains. Python is often compared to Tcl, Perl, Ruby, Scheme or Java."<br />
<br />
==Installation==<br />
<br />
There are currently two versions of Python: Python 3 (which is the default) and the older Python 2.<br />
<br />
===Python 3===<br />
<br />
Python 3 is the latest version of the language, and is '''incompatible with Python 2'''. The language is mostly the same, but many details, especially how built-in objects like dictionaries and strings work, have changed considerably, and a lot of deprecated features have finally been removed. Also, the standard library has been reorganized in a few prominent places. For an overview of the differences, visit [http://wiki.python.org/moin/Python2orPython3 Python2orPython3] and their relevant [http://getpython3.com/diveintopython3/porting-code-to-python-3-with-2to3.html chapter] in Dive into Python 3.<br />
<br />
To install the latest version of Python 3, [[pacman|install]] the {{Pkg|python}} package from the [[Official Repositories|official repositories]].<br />
<br />
If you would like to build the latest RC/betas from source, visit [http://www.python.org/download/ Python Downloads]. The [[Arch User Repository]] also contains good [[PKGBUILD]]s. If you do decide to build the RC, note that the binary (by default) installs to {{ic|/usr/local/bin/python3.x}}.<br />
<br />
===Python 2===<br />
To install the latest version of Python 2, [[pacman|install]] the {{Pkg|python2}} package from the [[Official Repositories|official repositories]].<br />
<br />
Python 2 will happily run alongside Python 3. You need to specify '''python2''' in order to run this version.<br />
<br />
Any program requiring Python 2 needs to point to {{ic|/usr/bin/python2}}, instead of {{ic|/usr/bin/python}}, which points to Python 3.<br />
<br />
To do so, open the program or script in a text editor and change the first line.<br />
<br />
The line will show one of the following:<br />
#!/usr/bin/env python<br />
or<br />
#!/usr/bin/python<br />
<br />
In both cases, just change {{ic|python}} to {{ic|python2}} and the program will then use Python 2 instead of Python 3.<br />
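<br />
For example, the first line would become:<br />
 #!/usr/bin/env python2<br />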
<br />
Another way to force the use of python2 without altering the scripts is to call it explicitly with python2, i.e.<br />
{{bc|python2 myScript.py}}<br />
<br />
Finally, you may not be able to control the script calls, but there is a way to trick the environment. It only works if the scripts use {{ic|#!/usr/bin/env python}}; it will not work with {{ic|#!/usr/bin/python}}. This trick relies on {{ic|env}} searching for the first corresponding entry in the PATH variable.<br />
First create a dummy folder.<br />
$ mkdir ~/bin<br />
Then add a symlink 'python' to python2 and the config scripts in it.<br />
$ ln -s /usr/bin/python2 ~/bin/python<br />
$ ln -s /usr/bin/python2-config ~/bin/python-config<br />
Finally put the new folder ''at the beginning'' of your PATH variable.<br />
$ export PATH=~/bin:$PATH<br />
Note that this change is not permanent and is only active in the current terminal session.<br />
To check which python interpreter is being used by {{ic|env}}, use the following command:<br />
$ which python<br />
<br />
A similar approach to tricking the environment, which also relies on the script in question using {{ic|#!/usr/bin/env python}}, is to use a [[Virtualenv]]. When a Virtualenv is activated, the Python executable pointed to by {{ic|$PATH}} will be the one the Virtualenv was installed with. Therefore, if the Virtualenv is installed with Python 2, {{ic|python}} will refer to Python 2. To start, [[pacman|install]] {{pkg|python2-virtualenv}}.<br />
# pacman -S python2-virtualenv<br />
Then create the Virtualenv.<br />
$ virtualenv2 venv # Creates a directory, venv/, containing the Virtualenv<br />
Activate the Virtualenv, which will update {{ic|$PATH}} to point at Python 2. Note that this activation is only active for the current terminal session.<br />
$ source venv/bin/activate<br />
The desired script should then run using Python 2.<br />
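<br />
When done, leave the Virtualenv and restore the original {{ic|$PATH}} with:<br />
 $ deactivate<br />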
<br />
==Dealing with version problem in build scripts==<br />
Many projects' build scripts assume {{ic|python}} to be Python 2, and that would eventually result in an error - typically complaining that {{ic|print 'foo'}} is invalid syntax. Luckily, many of them call {{ic|python}} in the {{ic|$PATH}} instead of hardcoding {{ic|#!/usr/bin/python}} in the shebang line, and the Python scripts are all contained within the project tree. So, instead of modifying the build scripts manually, there is an easy workaround. Just create {{ic|/usr/local/bin/python}} with content like this:<br />
<br />
{{hc|/usr/local/bin/python|<nowiki><br />
#!/bin/bash<br />
# Resolve the path of the script passed as the first argument<br />
# (covers invocations like "python foo.py").<br />
script=`readlink -f -- "$1"`<br />
case "$script" in (/path/to/project1/*|/path/to/project2/*|/path/to/project3*)<br />
    exec python2 "$@"<br />
    ;;<br />
esac<br />
<br />
# Also check the second argument, to cover invocations with one<br />
# interpreter option before the script name (e.g. "python -O foo.py").<br />
script=`readlink -f -- "$2"`<br />
case "$script" in (/path/to/project1/*|/path/to/project2/*|/path/to/project3*)<br />
    exec python2 "$@"<br />
    ;;<br />
esac<br />
<br />
# Everything else runs with the default Python 3.<br />
exec python3 "$@"<br />
</nowiki><br />
}}<br />
<br />
Where {{ic|<nowiki>/path/to/project1/*|/path/to/project2/*|/path/to/project3*</nowiki>}} is a list of patterns separated by {{ic|<nowiki>|</nowiki>}} matching all project trees.<br />
<br />
Don't forget to make it executable:<br />
<br />
# chmod +x /usr/local/bin/python<br />
<br />
Afterwards, scripts within the specified project trees will be run with Python 2.<br />
<br />
==Integrated Development Environments==<br />
There are some IDEs for Python available in the [[Official Repositories|official repositories]].<br />
<br />
===Eclipse===<br />
<br />
Eclipse supports both Python 2.x and 3.x series by using the [[Eclipse#PyDev|PyDev]] extension.<br />
<br />
===Eric===<br />
For the latest Python 3 compatible version, [[pacman|install]] the {{Pkg|eric}} package.<br />
<br />
Version 4 of Eric is Python 2 compatible and can be installed with the {{Pkg|eric4}} package.<br />
<br />
These IDEs can also handle [[Ruby]].<br />
<br />
===IEP===<br />
<br />
IEP is an interactive Python IDE (in the style of e.g. MATLAB) with basic debugging capabilities and is especially suitable for scientific computing. It is provided by the package {{AUR|iep}}.<br />
<br />
===Ninja===<br />
<br />
The Ninja IDE is provided by the package {{Pkg|ninja-ide}}.<br />
<br />
===Spyder===<br />
<br />
Spyder (previously known as Pydee) is a powerful interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. It focuses on scientific computing, providing a MATLAB-like environment. It can be installed with the package {{AUR|spyder}}.<br />
<br />
== Getting easy_install ==<br />
<br />
The easy_install tool is available in the package {{Pkg|python-setuptools}}.<br />
<br />
== Getting completion in Python shell ==<br />
<br />
Copy this into Python's interactive shell:<br />
{{hc|/usr/bin/python|<nowiki><br />
import rlcompleter<br />
import readline<br />
readline.parse_and_bind("tab: complete")<br />
</nowiki><br />
}}<br />
[http://algorithmicallyrandom.blogspot.com.es/2009/09/tab-completion-in-python-shell-how-to.html Source]<br />
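<br />
To enable completion in every interactive session automatically, the same three lines can be saved to a file referenced by the {{ic|PYTHONSTARTUP}} environment variable, e.g. exported from {{ic|~/.bashrc}} (the file name is arbitrary):<br />
<br />
 $ export PYTHONSTARTUP=~/.pythonrc.py<br />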
<br />
==Widget bindings ==<br />
The following [[wikipedia:widget toolkit|widget toolkit]] bindings are available:<br />
*{{App|TkInter|Tk bindings|http://wiki.python.org/moin/TkInter|standard module}}<br />
*{{App|pyQt|[[Qt]] bindings|http://www.riverbankcomputing.co.uk/software/pyqt/intro|{{Pkg|python2-pyqt4}} {{Pkg|python2-pyqt5}} {{Pkg|python-pyqt4}} {{Pkg|python-pyqt5}}}}<br />
*{{App|pySide|[[Qt]] bindings|http://www.pyside.org/|{{AUR|python2-pyside}} {{AUR|python-pyside}}}}<br />
*{{App|pyGTK|[[GTK+|GTK+ 2]] bindings|http://www.pygtk.org/|{{Pkg|pygtk}}}}<br />
*{{App|PyGObject|[[GTK+|GTK+ 2/3]] bindings via GObject Introspection|https://wiki.gnome.org/PyGObject/|{{Pkg|python2-gobject2}} {{Pkg|python2-gobject}} {{Pkg|python-gobject2}} {{Pkg|python-gobject}}}}<br />
*{{App|wxPython|wxWidgets bindings|http://wxpython.org/|{{Pkg|wxpython}}}}<br />
To use these with Python, you may need to install the associated widget kits.<br />
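<br />
As a quick sanity check that a binding is installed and working, a minimal TkInter window can be created as follows (a sketch; on Python 3 the module is named {{ic|tkinter}}):<br />
<br />
{{hc|hello_tk.py|<nowiki><br />
import Tkinter  # use "import tkinter" on Python 3<br />
<br />
root = Tkinter.Tk()<br />
root.title("TkInter test")<br />
Tkinter.Label(root, text="It works!").pack()<br />
root.mainloop()<br />
</nowiki><br />
}}<br />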
<br />
==Old versions==<br />
Old versions of Python are available via the [[Arch User Repository|AUR]] and may be useful for historical curiosity, old applications that do not run on current versions, or for testing Python programs intended to run on a distribution that ships with an older version (e.g., RHEL 5.x has Python 2.4, and Ubuntu 12.04 has Python 3.2):<br />
*{{AUR|python15}}: Python 1.5.2<br />
*{{AUR|python16}}: Python 1.6.1<br />
*{{AUR|python24}}: Python 2.4.6<br />
*{{AUR|python25}}: Python 2.5.6<br />
*{{AUR|python26}}: Python 2.6.8<br />
*{{AUR|python30}}: Python 3.0.1<br />
*{{AUR|python31}}: Python 3.1.5<br />
*{{AUR|python32}}: Python 3.2.3<br />
<br />
As of February 2014, Python upstream only supports Python 2.7, 3.1, 3.2, and 3.3 for security fixes. Using older versions for Internet-facing applications or untrusted code may be dangerous and is not recommended.<br />
<br />
Extra modules/libraries for old versions of Python may be found in the AUR by searching for "python" plus the version without the decimal point, e.g. "python26" for Python 2.6 modules.<br />
<br />
==More Resources==<br />
* [http://shop.oreilly.com/product/9780596158071.do Learning Python] is one of the most comprehensive, up-to-date, and well-written books on Python available today.<br />
* [http://www.diveintopython.net/ Dive Into Python] is an excellent free resource, though perhaps best suited to more advanced readers; it [http://diveintopython3.ep.io/ has been updated for Python 3].<br />
* [http://www.swaroopch.com/notes/Python A Byte of Python] is a book suitable for users new to Python (and scripting in general).<br />
* [http://learnpythonthehardway.org Learn Python The Hard Way] is an exercise-driven introduction to programming.<br />
* [http://facts.learnpython.org facts.learnpython.org] is a simple site for learning Python.<br />
* [http://stephensugden.com/crash_into_python/ Crash into Python], also known as ''Python for Programmers with 3 Hours'', gives experienced developers from other languages a crash course on Python.<br />
* [http://www.apress.com/book/view/9781590598726 Beginning Game Development with Python and Pygame: From Novice to Professional] covers game development with Pygame.<br />
<br />
==For Fun==<br />
Try the following snippets from Python's interactive shell:<br />
<br />
>>> import this<br />
<br />
>>> from __future__ import braces<br />
<br />
>>> import antigravity</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=265623ZFS2013-07-09T20:12:07Z<p>Chungy: /* Usage */ auto-snapshot stuff</p>
<hr />
<div>[[Category:File systems]]<br />
{{Article summary start}}<br />
{{Article summary text|This page provides basic guidelines for installing the native ZFS Linux kernel module.}}<br />
{{Article summary heading|Related}}<br />
{{Article summary wiki|Installing Arch Linux on ZFS}}<br />
{{Article summary wiki|ZFS on FUSE}}<br />
{{Article summary end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], and a maximum [[Wikipedia:Exabyte|16 Exabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"], ZFS is stable, fast, secure, and future-proof. Being licensed under the GPL-incompatible CDDL, ZFS cannot be distributed along with the Linux kernel. This restriction, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
==Installation==<br />
<br />
===Building from AUR===<br />
<br />
The ZFS kernel module is available in the [[AUR]] via {{aur|zfs}}.<br />
<br />
{{note|The ZFS and SPL (Solaris Porting Layer, a Linux kernel module which provides many of the Solaris kernel APIs) kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the archzfs repository.}}<br />
<br />
Should you wish to update the core/linux package before the AUR/zfs and AUR/spl packages' dependency lists are updated, a possible workaround is to remove (uninstall) the spl and zfs packages (the respective modules and filesystem may stay in use), update the core/linux package, and then build and install the spl and zfs packages, editing each PKGBUILD so that the core/linux version number in the "depends" section matches the updated version. Finally, the system may be rebooted. Note that this only applies when ZFS is not used for the root filesystem.<br />
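<br />
A rough sketch of that sequence, assuming the packages are built from their AUR directories (spl before zfs, after editing the PKGBUILDs as described):<br />
<br />
 # pacman -R zfs spl<br />
 # pacman -Syu<br />
 $ makepkg -si<br />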
<br />
===Unofficial repository===<br />
<br />
For fast and effortless installation and updates, the [http://demizerone.com/archzfs "archzfs"] signed repository is available to add to your {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-core]<br />
Server = http://demizerone.com/$repo/$arch</nowiki><br />
}}<br />
<br />
The repository and packages are signed with the maintainer's PGP key which is verifiable here: http://demizerone.com. This key is not trusted by any of the Arch Linux master keys, so it will need to be locally signed before use. See [[pacman-key]].<br />
<br />
Add the maintainer's key,<br />
<br />
# pacman-key -r 0EE7A126<br />
<br />
and locally sign to add it to the system's trust database,<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Once the key has been signed, it is now possible to update the package database,<br />
<br />
# pacman -Syyu<br />
<br />
and install ZFS packages:<br />
<br />
# pacman -S archzfs<br />
<br />
===Archiso tracking repository===<br />
<br />
ZFS can easily be used from within the archiso live environment by using the special archiso tracking repository for ZFS. This repository makes it easy to install Arch Linux on a root ZFS filesystem, or to mount ZFS pools from within an archiso live environment using an up-to-date live medium. To use this repository from the live environment, add the following server line to pacman.conf:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki><br />
}}<br />
<br />
This repository and packages are also signed, so the key must be locally signed following the steps listed in the previous section before use. For a guide on how to install Arch Linux on to a root ZFS filesystem, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straightforward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
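<br />
For example, creating a dataset with compression enabled under an existing pool takes a single command (the pool and dataset names are placeholders):<br />
<br />
 # zfs create -o compression=on <pool>/data<br />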
<br />
===mkinitcpio hook===<br />
<br />
If you are using ZFS on your root filesystem, then you will need to add the ZFS hook to [[Mkinitcpio|mkinitcpio.conf]]; if you are not using ZFS for your root filesystem, then you do not need to add the ZFS hook.<br />
<br />
You will need to change your [[kernel parameters]] to include the dataset you want to boot. You can use <code>zfs=bootfs</code> to use the ZFS bootfs (set via <code>zpool set bootfs=rpool/ROOT/arch rpool</code>) or you can set the [[kernel parameters]] to <code>zfs=<pool>/<dataset></code> to boot directly from a ZFS dataset.<br />
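<br />
For instance, a kernel command line booting directly from the dataset in the example above might contain the following (a sketch; the remaining parameters depend on your setup):<br />
<br />
 zfs=rpool/ROOT/arch rw<br />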
<br />
To see all available options for the ZFS hook:<br />
<br />
$ mkinitcpio -H zfs<br />
<br />
To use the mkinitcpio hook, you will need to add <code>zfs</code> to your <code>HOOKS</code> in <code>/etc/mkinitcpio.conf</code>:<br />
<br />
{{hc|/etc/mkinitcpio.conf|<br />
...<br />
HOOKS<nowiki>="base udev autodetect modconf encrypt zfs filesystems usbinput"</nowiki><br />
...<br />
}}<br />
<br />
{{note|It is not necessary to use the "fsck" hook with ZFS. ZFS automatically fixes any errors that occur within the filesystem. However, if the hook is required for another filesystem used on the system, such as ext4, the current ZFS packaging implementation does not yet properly handle fsck requests from mkinitcpio and an error is produced when generating a new ramdisk.}}<br />
<br />
It is important to place the zfs hook after any hooks which are needed to prepare the drive before it is mounted. For example, if your ZFS volume is encrypted, you will need to place encrypt before the zfs hook to unlock it first.<br />
<br />
Recreate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live up to its "zero administration" name, the zfs daemon must be loaded at startup. A benefit of this is that it is not necessary to mount your zpool in {{ic|/etc/fstab}}; the daemon imports and mounts ZFS pools automatically by reading {{ic|/etc/zfs/zpool.cache}}, so any pool that should be mounted automatically must be recorded in that file.<br />
<br />
Set a pool to be automatically mounted by the zfs daemon:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.service<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.service<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary to partition your drives before creating the ZFS filesystem; this will be done automatically. However, if you feel the need to completely wipe a drive before creating the filesystem, this can easily be done with the dd command.<br />
<br />
# dd if=/dev/zero of=/dev/<device><br />
<br />
It should not have to be stated, but be careful with this command!<br />
<br />
Once you have the list of drives, it is time to get the IDs of the drives you will be using. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool ZFS on Linux developers recommend] using device IDs when creating ZFS storage pools of fewer than 10 devices. To find the IDs of your devices, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The IDs should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, the pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool. Change it to whatever you like.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. RAID-Z is a special implementation of RAID 5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about RAID-Z.<br />
<br />
* '''ids''': The names of the drives or partitions that you want to include into your pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that your pool is mounted. Using {{ic|# zpool status}} will show that your pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
        NAME                                       STATE     READ WRITE CKSUM<br />
        bigdata                                    ONLINE       0     0     0<br />
          raidz1-0                                 ONLINE       0     0     0<br />
            ata-ST3000DM001-9YN166_S1F0KDGY-part1  ONLINE       0     0     0<br />
            ata-ST3000DM001-9YN166_S1F0JKRR-part1  ONLINE       0     0     0<br />
            ata-ST3000DM001-9YN166_S1F0KBP8-part1  ONLINE       0     0     0<br />
            ata-ST3000DM001-9YN166_S1F0JTM1-part1  ONLINE       0     0     0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot your computer to make sure your ZFS pool is mounted at boot. It is best to deal with all errors before transferring your data.<br />
<br />
== Usage ==<br />
<br />
To see all the commands available in ZFS, use:<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub your pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in your root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of your ZFS storage pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about your ZFS pool, including read/write errors, use:<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of your pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso, as the hostid differs between the archiso and the booted system. The zpool command will refuse to import any storage pool that has not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempt to import an un-exported storage pool will result in an error stating that the storage pool is in use by another system. This error can also occur at boot time, abruptly dropping the system to the busybox console and requiring an archiso for an emergency repair, either by exporting the pool or by adding {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]].<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not support swap files, but you can use a ZFS volume (ZVOL) as swap. It is important to set the ZVOL block size to match the system page size, which can be obtained with the <tt>getconf PAGESIZE</tt> command (the default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced, turning off compression, and not caching the zvol data.<br />
<br />
Create an 8 GiB ZFS volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) \<br />
-o compression=off \<br />
-o primarycache=metadata \<br />
-o secondarycache=none \<br />
-o sync=always \<br />
-o com.sun:auto-snapshot=false <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
<br />
To make it permanent, you need to edit your {{ic|/etc/fstab}}. ZVOLs support discard, which can potentially help ZFS's block allocator and reduce fragmentation for all other datasets when your swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
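<br />
Once the {{ic|fstab}} entry is in place, the new swap volume can be activated immediately (it will also be activated automatically on subsequent boots):<br />
<br />
 # swapon -a<br />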
<br />
=== Automatic snapshots ===<br />
<br />
The {{aur|zfs-auto-snapshot-git}} package from the [[Arch User Repository|AUR]] provides a shell script to automate the management of snapshots, with each named by date and label (hourly, daily, etc.), giving quick and convenient snapshotting of all your ZFS datasets. The package also installs cron tasks for quarter-hourly, hourly, daily, weekly, and monthly snapshots. You may want to adjust the {{ic|--keep}} parameter from the defaults, depending on how far back you would like to go (the monthly script by default keeps data for up to a year).<br />
<br />
To prevent a dataset from being snapshotted at all, set <tt>com.sun:auto-snapshot=false</tt> on it. Likewise, you can exercise more fine-grained control by label; if, for example, you do not want monthly snapshots of a dataset, set <tt>com.sun:auto-snapshot:monthly=false</tt> on it.<br />
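<br />
Both properties are set with ordinary {{ic|zfs set}} commands, for example (the dataset names are placeholders; the swap volume from the previous section is a sensible candidate to exclude):<br />
<br />
 # zfs set com.sun:auto-snapshot=false <pool>/swap<br />
 # zfs set com.sun:auto-snapshot:monthly=false <pool>/data<br />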
<br />
==Troubleshooting==<br />
<br />
=== does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zpool create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the SPL hostid. There are two solutions for this. You can either place your spl hostid in the [[kernel parameters]] in your boot loader, for example by adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code> and then regenerate the initramfs image, which will copy the hostid into it.<br />
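<br />
One way to write the current hostid into <code>/etc/hostid</code> (a sketch, assuming a little-endian machine such as x86_64, since the file stores the four bytes in native byte order):<br />
<br />
 # printf $(hostid | sed 's/\(..\)\(..\)\(..\)\(..\)/\\x\4\\x\3\\x\2\\x\1/') > /etc/hostid<br />
<br />
Then regenerate the initramfs image:<br />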
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
You can always ignore the check by adding {{ic|zfs_force&#61;1}} to your [[kernel parameters]], but this is not advisable as a permanent solution.<br />
<br />
First of all, double-check that you actually exported the pool correctly. Exporting the zpool clears the hostid marking the ownership, so during the first boot the zpool should mount correctly. If it does not, there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, your hostid is not yet correctly set in the early boot phase and this confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down your hostid (the output below is just an example):<br />
% hostid<br />
0a0af0f8<br />
Follow the previous section to set it.<br />
<br />
== Tips and tricks ==<br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into your ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up your network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install your favorite text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add the archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import your pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount your boot partitions (if you have them):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into your ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check your kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If the two differ, run depmod (in the chroot) with the correct kernel version of your chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This ensures the correct kernel modules can be loaded for the kernel version installed in your chroot installation.<br />
<br />
Regenerate your ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=265611ZFS2013-07-09T19:58:25Z<p>Chungy: /* Swap volume */ Take out the guesswork of your pagesize</p>
<hr />
<div>[[Category:File systems]]<br />
{{Article summary start}}<br />
{{Article summary text|This page provides basic guidelines for installing the native ZFS Linux kernel module.}}<br />
{{Article summary heading|Related}}<br />
{{Article summary wiki|Installing Arch Linux on ZFS}}<br />
{{Article summary wiki|ZFS on FUSE}}<br />
{{Article summary end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], and a maximum [[Wikipedia:Exabyte|16 Exabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the GPL incompatible CDDL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
==Installation==<br />
<br />
===Building from AUR===<br />
<br />
The ZFS kernel module is available in the [[AUR]] via {{aur|zfs}}.<br />
<br />
{{note|The ZFS and SPL (Solaris Porting Layer is a Linux kernel module which provides many of the Solaris kernel APIs) kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the archzfs repository.}}<br />
<br />
Should you wish to update the core/linux package before the AUR/zfs and AUR/spl packages' dependency lists are updated, a possible work-around is to remove (uninstall) spl and zfs packages (the respective modules and file system may stay in-use), update the core/linux package, build + install zfs and spl packages - just do not forget to edit PKGBUILD and correct the core/linux version number in "depends" section to match the updated version). Finally, the system may be rebooted. [ This is only for the situation, when ZFS is not used for root filesystem. ]<br />
<br />
===Unofficial repository===<br />
<br />
For fast and effortless installation and updates, the [http://demizerone.com/archzfs "archzfs"] signed repository is available to add to your {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-core]<br />
Server = http://demizerone.com/$repo/$arch</nowiki><br />
}}<br />
<br />
The repository and packages are signed with the maintainer's PGP key which is verifiable here: http://demizerone.com. This key is not trusted by any of the Arch Linux master keys, so it will need to be locally signed before use. See [[pacman-key]].<br />
<br />
Add the maintainer's key,<br />
<br />
# pacman-key -r 0EE7A126<br />
<br />
and locally sign to add it to the system's trust database,<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Once the key has been signed, it is now possible to update the package database,<br />
<br />
# pacman -Syyu<br />
<br />
and install ZFS packages:<br />
<br />
# pacman -S archzfs<br />
<br />
===Archiso tracking repository===<br />
<br />
ZFS can easily be used from within the archiso live environment by using the special archiso tracking repository for ZFS. This repository makes it easy to install Arch Linux on a root ZFS filesystem, or to mount ZFS pools from within an archiso live environment using an up-to-date live medium. To use this repository from the live environment, add the following server line to pacman.conf:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki><br />
}}<br />
<br />
This repository and packages are also signed, so the key must be locally signed following the steps listed in the previous section before use. For a guide on how to install Arch Linux on to a root ZFS filesystem, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===mkinitramfs hook===<br />
<br />
If you are using ZFS on your root filesystem, then you will need to add the ZFS hook to [[Mkinitcpio|mkinitcpio.conf]]; if you are not using ZFS for your root filesystem, then you do not need to add the ZFS hook.<br />
<br />
You will need to change your [[kernel parameters]] to include the dataset you want to boot. You can use <code>zfs=bootfs</code> to use the ZFS bootfs (set via <code>zpool set bootfs=rpool/ROOT/arch rpool</code>) or you can set the [[kernel parameters]] to <code>zfs=<pool>/<dataset></code> to boot directly from a ZFS dataset.<br />
<br />
To see all available options for the ZFS hook:<br />
<br />
$ mkinitcpio -H zfs<br />
<br />
To use the mkinitcpio hook, you will need to add <code>zfs</code> to your <code>HOOKS</code> in <code>/etc/mkinitcpio.conf</code>:<br />
<br />
{{hc|/etc/mkinitcpio.conf|<br />
...<br />
HOOKS<nowiki>="base udev autodetect modconf encrypt zfs filesystems usbinput"</nowiki><br />
...<br />
}}<br />
<br />
{{note|It is not necessary to use the "fsck" hook with ZFS. ZFS automatically fixes any errors that occur within the filesystem. However, if the hook is required for another filesystem used on the system, such as ext4, the current ZFS packaging implementation does not yet properly handle fsck requests from mkinitcpio and an error is produced when generating a new ramdisk.}}<br />
<br />
It is important to place this after any hooks which are needed to prepare the drive before it is mounted. For example, if your ZFS volume is encrypted, then you will need to place encrypt before the zfs hook to unlock it first.<br />
<br />
Recreate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount your zpool in {{ic|/etc/fstab}}; the zfs daemon imports and mounts one zfs pool automatically. The daemon mounts a zfs pool reading the file {{ic|/etc/zfs/zpool.cache}}, so the zfs pool that you want to automatically mounted must write the file there.<br />
<br />
Set a pool as to be automatically mounted by the zfs daemon:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.service<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.service<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary to partition your drives before creating the zfs filesystem, this will be done automatically. However, if you feel the need to completely wipe your drive before creating the filesystem, this can be easily done with the dd command.<br />
<br />
# dd if=/dev/zero of=/dev/<device><br />
<br />
It should not have to be stated, but be careful with this command!<br />
<br />
Once you have the list of drives, it is now time to get the id's of the drives you will be using. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's for your device, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, than your pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool. Change it to whatever you like.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that you want to include into your pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that you pool is mounted. Using {{ic|# zpool status}} will show that your pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot your computer to make sure your ZFS pool is mounted at boot. It is best to deal with all errors before transferring your data.<br />
<br />
== Usage ==<br />
<br />
To see all the commands available in ZFS, use :<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub your pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in your root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of your ZFS storage pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about your ZFS pool, including and read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of your pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow to use swapfiles, but you can use a ZFS volume (ZVOL) as swap. It is importart to set the ZVOL block size to match the system page size, which can be obtained by the <tt>getconf PAGESIZE</tt> command (default on x86_64 is 4KiB). Other options useful for keeping the system running well in low-memory situations are keeping it always synced, turning off compression, and not caching the zvol data.<br />
<br />
Create a 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b $(getconf PAGESIZE) -o compression=off -o primarycache=metadata -o secondarycache=none -o sync=always <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
<br />
To make it permament you need to edit your {{ic|/etc/fstab}}. ZVOLs support discard, which can potentionally help ZFS's block allocator and reduce fragmentation for all other datasets when your swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
==Troubleshooting==<br />
<br />
=== does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zfs create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hosted. There are two solutions, for this. You can either place your spl hostid in the [[kernel parameters]] in your boot loader. For example, adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image. Which will copy the hostid into the initramfs image.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
You can always ignore the check adding {{ic|zfs_force&#61;1}} in your [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
First of all double check you actually exported the pool correctly. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not there is some other problem.<br />
<br />
Reboot again, if the zfs pool refuses to mount it means your hostid is not yet correctly set in the early boot phase and it confuses zfs. So you have to manually tell zfs the correct number, once the hostid is coherent across the reboots the zpool will mount correctly.<br />
<br />
Boot using zfs_force and write down your hostid. This one is just an example.<br />
% hostid<br />
0a0af0f8<br />
Follow the previous section to set it.<br />
<br />
== Tips and tricks ==<br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into your ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up your network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install your favorite text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import your pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount your boot partitions (if you have them):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into your ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check your kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If they are different, you will need to run depmod (in the chroot) with the correct kernel version of your chroot installation:<br />
<br />
# depmod -a 3.6.9-1-ARCH (version gathered from pacman -Qi linux)<br />
<br />
This will load the correct kernel modules for the kernel version installed in your chroot installation.<br />
<br />
Regenerate your ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]</div>Chungyhttps://wiki.archlinux.org/index.php?title=ZFS&diff=262586ZFS2013-06-12T23:10:39Z<p>Chungy: /* Swap partition */</p>
<hr />
<div>[[Category:File systems]]<br />
{{Article summary start}}<br />
{{Article summary text|This page provides basic guidelines for installing the native ZFS Linux kernel module.}}<br />
{{Article summary heading|Related}}<br />
{{Article summary wiki|Installing Arch Linux on ZFS}}<br />
{{Article summary wiki|ZFS on FUSE}}<br />
{{Article summary end}}<br />
<br />
[[Wikipedia:ZFS|ZFS]] is an advanced filesystem created by [[Wikipedia:Sun Microsystems|Sun Microsystems]] (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include: pooled storage (integrated volume management -- zpool), [[Wikipedia:Copy-on-write|Copy-on-write]], [[Wikipedia:Snapshot (computer storage)|snapshots]], data integrity verification and automatic repair (scrubbing), [[Wikipedia:RAID-Z|RAID-Z]], and a maximum [[Wikipedia:Exabyte|16 Exabyte]] volume size. ZFS is licensed under the [[Wikipedia:CDDL|Common Development and Distribution License]] (CDDL).<br />
<br />
Described as [http://web.archive.org/web/20060428092023/http://www.sun.com/2004-0914/feature/ "The last word in filesystems"] ZFS is stable, fast, secure, and future-proof. Being licensed under the GPL incompatible CDDL, it is not possible for ZFS to be distributed along with the Linux Kernel. This requirement, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with [http://zfsonlinux.org/ zfsonlinux.org] (ZOL).<br />
<br />
==Installation==<br />
<br />
===Building from AUR===<br />
<br />
The ZFS kernel module is available in the [[AUR]] via {{aur|zfs}}.<br />
<br />
{{note|The ZFS and SPL (Solaris Porting Layer is a Linux kernel module which provides many of the Solaris kernel APIs) kernel modules are tied to a specific kernel version. It would not be possible to apply any kernel updates until updated packages are uploaded to AUR or the archzfs repository.}}<br />
<br />
Should you wish to update the core/linux package before the AUR/zfs and AUR/spl packages' dependency lists are updated, a possible work-around is to remove (uninstall) spl and zfs packages (the respective modules and file system may stay in-use), update the core/linux package, build + install zfs and spl packages - just do not forget to edit PKGBUILD and correct the core/linux version number in "depends" section to match the updated version). Finally, the system may be rebooted. [ This is only for the situation, when ZFS is not used for root filesystem. ]<br />
<br />
===Unofficial repository===<br />
<br />
For fast and effortless installation and updates, the [http://demizerone.com/archzfs "archzfs"] signed repository is available to add to your {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-core]<br />
Server = http://demizerone.com/$repo/$arch</nowiki><br />
}}<br />
<br />
The repository and packages are signed with the maintainer's PGP key which is verifiable here: http://demizerone.com. This key is not trusted by any of the Arch Linux master keys, so it will need to be locally signed before use. See [[pacman-key]].<br />
<br />
Add the maintainer's key,<br />
<br />
# pacman-key -r 0EE7A126<br />
<br />
and locally sign to add it to the system's trust database,<br />
<br />
# pacman-key --lsign-key 0EE7A126<br />
<br />
Once the key has been signed, it is now possible to update the package database,<br />
<br />
# pacman -Syy<br />
<br />
and install ZFS packages:<br />
<br />
# pacman -S archzfs<br />
<br />
===Archiso tracking repository===<br />
<br />
ZFS can easily be used from within the archiso live environment by using the special archiso tracking repository for ZFS. This repository makes it easy to install Arch Linux on a root ZFS filesystem, or to mount ZFS pools from within an archiso live environment using an up-to-date live medium. To use this repository from the live environment, add the following server line to pacman.conf:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki><br />
}}<br />
<br />
This repository and packages are also signed, so the key must be locally signed following the steps listed in the previous section before use. For a guide on how to install Arch Linux on to a root ZFS filesystem, see [[Installing Arch Linux on ZFS]].<br />
<br />
==Configuration==<br />
<br />
ZFS is considered a "zero administration" filesystem by its creators; therefore, configuring ZFS is very straight forward. Configuration is done primarily with two commands: {{ic|zfs}} and {{ic|zpool}}.<br />
<br />
===mkinitramfs hook===<br />
<br />
If you are using ZFS on your root filesystem, then you will need to add the ZFS hook to [[Mkinitcpio|mkinitcpio.conf]]; if you are not using ZFS for your root filesystem, then you do not need to add the ZFS hook.<br />
<br />
You will need to change your [[kernel parameters]] to include the dataset you want to boot. You can use <code>zfs=bootfs</code> to use the ZFS bootfs (set via <code>zpool set bootfs=rpool/ROOT/arch rpool</code>) or you can set the [[kernel parameters]] to <code>zfs=<pool>/<dataset></code> to boot directly from a ZFS dataset.<br />
<br />
To see all available options for the ZFS hook:<br />
<br />
$ mkinitcpio -H zfs<br />
<br />
To use the mkinitcpio hook, you will need to add <code>zfs</code> to your <code>HOOKS</code> in <code>/etc/mkinitcpio.conf</code>:<br />
<br />
{{hc|/etc/mkinitcpio.conf|<br />
...<br />
HOOKS<nowiki>="base udev autodetect modconf encrypt zfs filesystems usbinput"</nowiki><br />
...<br />
}}<br />
<br />
{{note|It is not necessary to use the "fsck" hook with ZFS. ZFS automatically fixes any errors that occur within the filesystem. However, if the hook is required for another filesystem used on the system, such as ext4, the current ZFS packaging implementation does not yet properly handle fsck requests from mkinitcpio and an error is produced when generating a new ramdisk.}}<br />
<br />
It is important to place this after any hooks which are needed to prepare the drive before it is mounted. For example, if your ZFS volume is encrypted, then you will need to place encrypt before the zfs hook to unlock it first.<br />
<br />
Recreate the ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
===Automatic Start===<br />
<br />
For ZFS to live by its "zero administration" namesake, the zfs daemon must be loaded at startup. A benefit to this is that it is not necessary to mount your zpool in {{ic|/etc/fstab}}; the zfs daemon imports and mounts one zfs pool automatically. The daemon mounts a zfs pool reading the file {{ic|/etc/zfs/zpool.cache}}, so the zfs pool that you want to automatically mounted must write the file there.<br />
<br />
Set a pool as to be automatically mounted by the zfs daemon:<br />
# zpool set cachefile=/etc/zfs/zpool.cache <pool><br />
<br />
==Systemd==<br />
<br />
Enable the service so it is automatically started at boot time:<br />
<br />
# systemctl enable zfs.service<br />
<br />
To manually start the daemon:<br />
<br />
# systemctl start zfs.service<br />
<br />
==Create a storage pool==<br />
<br />
Use {{ic| # parted --list}} to see a list of all available drives. It is not necessary to partition your drives before creating the zfs filesystem, this will be done automatically. However, if you feel the need to completely wipe your drive before creating the filesystem, this can be easily done with the dd command.<br />
<br />
# dd if=/dev/zero of=/dev/<device><br />
<br />
It should not have to be stated, but be careful with this command!<br />
<br />
Once you have the list of drives, it is now time to get the id's of the drives you will be using. The [http://zfsonlinux.org/faq.html#WhatDevNamesShouldIUseWhenCreatingMyPool zfs on Linux developers recommend] using device ids when creating ZFS storage pools of less than 10 devices. To find the id's for your device, simply:<br />
<br />
$ ls -lah /dev/disk/by-id/<br />
<br />
The ids should look similar to the following:<br />
<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd<br />
lrwxrwxrwx 1 root root 9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb<br />
<br />
Now, finally, create the ZFS pool:<br />
<br />
# zpool create -f -m <mount> <pool> raidz <ids><br />
<br />
* '''create''': subcommand to create the pool.<br />
<br />
* '''-f''': Force creating the pool. This is to overcome the "EFI label error". See [[#does not contain an EFI label]].<br />
<br />
* '''-m''': The mount point of the pool. If this is not specified, than your pool will be mounted to {{ic|/<pool>}}.<br />
<br />
* '''pool''': This is the name of the pool. Change it to whatever you like.<br />
<br />
* '''raidz''': This is the type of virtual device that will be created from the pool of devices. Raidz is a special implementation of raid5. See [https://blogs.oracle.com/bonwick/entry/raid_z Jeff Bonwick's Blog -- RAID-Z] for more information about raidz.<br />
<br />
* '''ids''': The names of the drives or partitions that you want to include into your pool. Get it from {{ic|/dev/disk/by-id}}.<br />
<br />
Here is an example for the full command:<br />
<br />
# zpool create -f -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY ata-ST3000DM001-9YN166_S1F0JKRR ata-ST3000DM001-9YN166_S1F0KBP8 ata-ST3000DM001-9YN166_S1F0JTM1<br />
<br />
If the command is successful, there will be no output. Using the {{ic|$ mount}} command will show that you pool is mounted. Using {{ic|# zpool status}} will show that your pool has been created.<br />
<br />
{{hc|# zpool status|<br />
pool: bigdata<br />
state: ONLINE<br />
scan: none requested<br />
config:<br />
<br />
NAME STATE READ WRITE CKSUM<br />
bigdata ONLINE 0 0 0<br />
-0 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KDGY-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JKRR-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0KBP8-part1 ONLINE 0 0 0<br />
ata-ST3000DM001-9YN166_S1F0JTM1-part1 ONLINE 0 0 0<br />
<br />
errors: No known data errors<br />
}}<br />
<br />
At this point it would be good to reboot your computer to make sure your ZFS pool is mounted at boot. It is best to deal with all errors before transferring your data.<br />
<br />
== Usage ==<br />
<br />
To see all the commands available in ZFS, use :<br />
<br />
$ man zfs<br />
<br />
or:<br />
<br />
$ man zpool<br />
<br />
=== Scrub ===<br />
<br />
ZFS pools should be scrubbed at least once a week. To scrub your pool:<br />
<br />
# zpool scrub <pool><br />
<br />
To do automatic scrubbing once a week, set the following line in your root crontab:<br />
<br />
{{hc|# crontab -e|<br />
...<br />
30 19 * * 5 zpool scrub <pool><br />
...<br />
}}<br />
<br />
Replace {{ic|<pool>}} with the name of your ZFS storage pool.<br />
<br />
=== Check zfs pool status ===<br />
<br />
To print a nice table with statistics about your ZFS pool, including and read/write errors, use<br />
<br />
# zpool status -v<br />
<br />
=== Destroy a storage pool ===<br />
<br />
ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool:<br />
<br />
# zpool destroy <pool><br />
<br />
And now when checking the status:<br />
<br />
{{hc|# zpool status|no pools available}}<br />
<br />
To find the name of your pool, see [[#Check zfs pool status]].<br />
<br />
=== Export a storage pool ===<br />
<br />
If a storage pool is to be used on another system, it will first need to be exported. It is also necessary to export a pool if it has been imported from the archiso as the hostid is different in the archiso as it is in the booted system. The zpool command will refuse to import any storage pools that have not been exported. It is possible to force the import with the {{ic|-f}} argument, but this is considered bad form.<br />
<br />
Any attempts made to import an un-exported storage pool will result in an error stating the storage pool is in use by another system. This error can be produced at boot time abruptly abandoning the system in the busybox console and requiring an archiso to do an emergency repair by either exporting the pool, or adding the {{ic|zfs_force&#61;1}} to the kernel boot parameters (which is not ideal). See [[#On boot the zfs pool does not mount stating: "pool may be in use from other system"]]<br />
<br />
To export a pool,<br />
<br />
# zpool export bigdata<br />
<br />
=== Swap volume ===<br />
<br />
ZFS does not allow to use swapfiles, but you can use a ZFS volume (ZVOL) as swap. It is importart to set the ZVOL block size to match the system page size, for x86_64 systems that is 4k. Other options useful for keeping the system running well in low-memory situations are keeping it always synced, turning off compression, and not caching the zvol data.<br />
<br />
Create a 8GiB zfs volume:<br />
<br />
# zfs create -V 8G -b 4K -o compression=off -o primarycache=metadata -o secondarycache=none -o sync=always <pool>/swap<br />
<br />
Prepare it as swap partition:<br />
<br />
# mkswap -f /dev/zvol/<pool>/swap<br />
<br />
To make it permament you need to edit your {{ic|/etc/fstab}}. ZVOLs support discard, which can potentionally help ZFS's block allocator and reduce fragmentation for all other datasets when your swap is not full.<br />
<br />
Add a line to {{ic|/etc/fstab}}:<br />
<br />
/dev/zvol/<pool>/swap none swap discard 0 0<br />
<br />
==Troubleshooting==<br />
<br />
=== does not contain an EFI label ===<br />
<br />
The following error will occur when attempting to create a zfs filesystem,<br />
<br />
/dev/disk/by-id/<id> does not contain an EFI label but it may contain partition<br />
<br />
The way to overcome this is to use <code>-f</code> with the zfs create command.<br />
<br />
=== No hostid found ===<br />
<br />
An error that occurs at boot with the following lines appearing before initscript output:<br />
<br />
ZFS: No hostid found on kernel command line or /etc/hostid.<br />
<br />
This warning occurs because the ZFS module does not have access to the spl hosted. There are two solutions, for this. You can either place your spl hostid in the [[kernel parameters]] in your boot loader. For example, adding <code>spl.spl_hostid=0x00bab10c</code>.<br />
<br />
The other solution is to make sure that there is a hostid in <code>/etc/hostid</code>, and then regenerate the initramfs image. Which will copy the hostid into the initramfs image.<br />
<br />
# mkinitcpio -p linux<br />
<br />
=== On boot the zfs pool does not mount stating: "pool may be in use from other system" ===<br />
<br />
You can always ignore the check adding {{ic|zfs_force&#61;1}} in your [[kernel parameters]], but it is not advisable as a permanent solution.<br />
<br />
First of all double check you actually exported the pool correctly. Exporting the zpool clears the hostid marking the ownership. So during the first boot the zpool should mount correctly. If it does not there is some other problem.<br />
<br />
Reboot again. If the zfs pool refuses to mount, your hostid is not yet correctly set in the early boot phase, which confuses zfs. Manually tell zfs the correct number; once the hostid is coherent across reboots, the zpool will mount correctly.<br />
<br />
Boot using {{ic|zfs_force&#61;1}} and write down your hostid (the value below is just an example):<br />
<br />
 % hostid<br />
 0a0af0f8<br />
<br />
Follow [[#No hostid found]] to set it.<br />
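<br />
With the example value above, the corresponding kernel parameter would be <code>spl.spl_hostid=0x0a0af0f8</code>.<br />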
<br />
== Tips and tricks ==<br />
<br />
=== Emergency chroot repair with archzfs ===<br />
<br />
Here is how to use the archiso to get into your ZFS filesystem for maintenance.<br />
<br />
Boot the latest archiso and bring up your network:<br />
<br />
# wifi-menu<br />
# ip link set eth0 up<br />
<br />
Test the network connection:<br />
<br />
# ping google.com<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
(optional) Install your favorite text editor:<br />
<br />
# pacman -S vim<br />
<br />
Add the archzfs archiso repository to {{ic|pacman.conf}}:<br />
<br />
{{hc|/etc/pacman.conf|<nowiki><br />
[demz-repo-archiso]<br />
Server = http://demizerone.com/$repo/$arch</nowiki>}}<br />
<br />
Sync the pacman package database:<br />
<br />
# pacman -Syy<br />
<br />
Install the ZFS package group:<br />
<br />
# pacman -S archzfs<br />
<br />
Load the ZFS kernel modules:<br />
<br />
# modprobe zfs<br />
<br />
Import your pool:<br />
<br />
# zpool import -a -R /mnt<br />
<br />
Mount your boot partitions (if you have them):<br />
<br />
# mount /dev/sda2 /mnt/boot<br />
# mount /dev/sda1 /mnt/boot/efi<br />
<br />
Chroot into your ZFS filesystem:<br />
<br />
# arch-chroot /mnt /bin/bash<br />
<br />
Check your kernel version:<br />
<br />
# pacman -Qi linux<br />
# uname -r<br />
<br />
uname will show the kernel version of the archiso. If the two differ, you will need to run depmod (in the chroot) with the kernel version of your chroot installation, as reported by {{ic|pacman -Qi linux}}:<br />
<br />
 # depmod -a 3.6.9-1-ARCH<br />
<br />
This will generate the correct module dependency information for the kernel version installed in your chroot installation.<br />
<br />
Regenerate your ramdisk:<br />
<br />
# mkinitcpio -p linux<br />
<br />
There should be no errors.<br />
<br />
== See also ==<br />
<br />
* [http://zfsonlinux.org/ ZFS on Linux]<br />
* [http://zfsonlinux.org/faq.html ZFS on Linux FAQ]<br />
* [http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html FreeBSD Handbook -- The Z File System]<br />
* [http://docs.oracle.com/cd/E19253-01/819-5461/index.html Oracle Solaris ZFS Administration Guide]<br />
* [http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide Solaris Internals -- ZFS Troubleshooting Guide]</div>Chungyhttps://wiki.archlinux.org/index.php?title=Python&diff=233974Python2012-11-06T00:51:04Z<p>Chungy: Notes on old versions of Python available via the AUR</p>
<hr />
<div>[[Category:Programming language]]<br />
[[zh-CN:Python]]<br />
{{Article summary start}}<br />
{{Article summary text|This article explains how to install and configure Python.}}<br />
{{Article summary heading|Related}}<br />
{{Article summary wiki|Python Package Guidelines}}<br />
{{Article summary wiki|mod_python}}<br />
{{Article summary wiki|Python VirtualEnv}}<br />
{{Article summary end}}<br />
<br />
[http://www.python.org Python] "is a remarkably powerful dynamic programming language that is used in a wide variety of application domains. Python is often compared to Tcl, Perl, Ruby, Scheme or Java."<br />
<br />
==Installation==<br />
<br />
There are currently two versions of Python: Python 3 (which is the default) and the older Python 2.<br />
<br />
===Python 3===<br />
<br />
Python 3 is the latest version of the language, and is '''incompatible with Python 2'''. The language is mostly the same, but many details, especially how built-in objects like dictionaries and strings work, have changed considerably, and a lot of deprecated features have finally been removed. Also, the standard library has been reorganized in a few prominent places. For an overview of the differences, visit [http://wiki.python.org/moin/Python2orPython3 Python2orPython3] and the relevant [http://diveintopython3.ep.io/porting-code-to-python-3-with-2to3.html chapter] in Dive into Python 3.<br />
<br />
To install the latest version of Python 3, [[pacman|install]] the {{Pkg|python}} package from the [[Official Repositories|official repositories]].<br />
<br />
If you would like to build the latest RC/betas from source, visit [http://www.python.org/download/ Python Downloads]. The [[Arch User Repository]] also contains good [[PKGBUILD]]s. If you do decide to build the RC, note that the binary (by default) installs to {{ic|/usr/local/bin/python3.x}}.<br />
<br />
===Python 2===<br />
To install the latest version of Python 2, [[pacman|install]] the {{Pkg|python2}} package from the [[Official Repositories|official repositories]].<br />
<br />
Python 2 will happily run alongside Python 3. You need to specify '''python2''' in order to run this version.<br />
<br />
Any program requiring Python 2 needs to point to {{ic|/usr/bin/python2}}, instead of {{ic|/usr/bin/python}}.<br />
<br />
To do so, open the program or script in a text editor and change the first line.<br />
<br />
The line will show one of the following:<br />
#!/usr/bin/env python<br />
or<br />
#!/usr/bin/python<br />
<br />
In both cases, just change {{ic|python}} to {{ic|python2}} and the program will then use Python 2 instead of Python 3.<br />
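<br />
If many scripts need the change, a one-liner such as the following can do it (a sketch; it assumes the shebang is the first line and ends in {{ic|python}}):<br />
<br />
 $ sed -i '1s/python$/python2/' myScript.py<br />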
<br />
Another way to force the use of python2 without altering the scripts is to call it explicitly with python2, i.e.<br />
{{bc|python2 myScript.py}}<br />
<br />
Finally, you may not be able to control the script calls, but there is a way to trick the environment. It only works if the scripts use {{ic|#!/usr/bin/env python}}; it will not work with {{ic|#!/usr/bin/python}}. The trick relies on {{ic|env}} picking the first matching entry in the PATH variable.<br />
First create a dummy folder.<br />
$ mkdir ~/bin<br />
Then add a symlink 'python' to python2 in it.<br />
$ ln -s /usr/bin/python2 ~/bin/python<br />
Finally put the new folder ''at the beginning'' of your PATH variable.<br />
$ export PATH=~/bin:$PATH<br />
Note that this change is not permanent and is only active in the current terminal session.<br />
To check which python interpreter is being used by {{ic|env}}, use the following command:<br />
$ which python<br />
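If the trick is active, the output should be the symlink in your home directory (e.g. {{ic|/home/''user''/bin/python}}) rather than {{ic|/usr/bin/python}}.<br />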
<br />
==Integrated Development Environments==<br />
There are some IDEs for Python available in the [[Official Repositories|official repositories]].<br />
<br />
===Eric===<br />
For the latest Python 3 compatible version, [[pacman|install]] the {{Pkg|eric}} package.<br />
<br />
Version 4 of Eric is Python 2 compatible and can be installed with the {{Pkg|eric4}} package.<br />
<br />
These IDEs can also handle [[Ruby]].<br />
<br />
===Ninja===<br />
<br />
The Ninja IDE is provided by the package {{Pkg|ninja-ide}}.<br />
<br />
===Spyder===<br />
<br />
Spyder (previously known as Pydee) is a powerful interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. It focuses on scientific computation, providing a MATLAB-like environment. It can be installed with the {{AUR|spyder}} package.<br />
<br />
== Getting easy_install ==<br />
<br />
The easy_install tool is available in the package {{Pkg|python-distribute}} or {{Pkg|python2-distribute}}.<br />
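<br />
For example, to install a package into your user site directory ({{ic|somepackage}} is a placeholder name):<br />
<br />
 $ easy_install --user somepackage<br />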
<br />
==Misc. Uses==<br />
Python is excellent for emergency situations, or when you simply need to work with files without writing any Perl, [[Bash]], or C.<br />
Here are some functions that are often helpful ''(by no means a comprehensive list)'':<br />
<br />
;open: open a file<br />
;urllib.request.urlopen: open a web page ({{ic|urllib.urlopen}} in Python 2), {{ic|import urllib.request}} first<br />
;urllib.request.urlretrieve: download a file (to a temporary file by default; a target path can be specified; {{ic|urllib.urlretrieve}} in Python 2), {{ic|import urllib.request}} first<br />
;os.listdir: list all files in a given directory, {{ic|import os}} first<br />
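<br />
For example, in a Python 3 interactive session (the URL and file names below are just placeholders):<br />
<br />
 >>> import os<br />
 >>> import urllib.request<br />
 >>> urllib.request.urlretrieve('http://example.com/file.txt', '/tmp/file.txt')<br />
 >>> print(open('/tmp/file.txt').read())<br />
 >>> os.listdir('/tmp')<br />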
<br />
You will find that Python is used frequently, in applications ranging from GIMP to Sage. Python's scalability, popularity, and ease of coding make it an important addition to any desktop.<br />
<br />
==Widget bindings ==<br />
The following [[wikipedia:widget toolkit|widget toolkit]] bindings are available:<br />
*{{App|[[TkInter]]|[[Tk]] bindings|http://wiki.python.org/moin/TkInter|standard module}}<br />
*{{App|[[pyQt]]|[[Qt]] bindings|http://www.riverbankcomputing.co.uk/software/pyqt/intro|{{Pkg|pyqt}}}}<br />
*{{App|[[pySide]]|[[Qt]] bindings|http://www.pyside.org/|{{Pkg|python2-pyside}}}}<br />
*{{App|[[pyGTK]]|[[GTK+]] bindings|http://www.pygtk.org/|{{Pkg|pygtk}}}}<br />
*{{App|[[wxPython]]|[[wxWidgets]] bindings|http://wxpython.org/|{{Pkg|wxpython}}}}<br />
To use these with Python, you may need to install the associated widget kits.<br />
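<br />
As a quick check that a binding works, here is a minimal TkInter example (Python 3, where the standard module is named {{ic|tkinter}}):<br />
<br />
 >>> import tkinter<br />
 >>> root = tkinter.Tk()<br />
 >>> tkinter.Label(root, text='Hello, Arch!').pack()<br />
 >>> root.mainloop()<br />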
<br />
==Old versions==<br />
Old versions of Python are available via the [[Arch User Repository|AUR]] and may be useful for historical curiosity, for old applications that do not run on current versions, or for testing Python programs intended to run on a distribution that ships an older version (e.g., RHEL 5.x has Python 2.4 and Ubuntu 12.04 has Python 3.2):<br />
*{{AUR|python15}}: Python 1.5.2<br />
*{{AUR|python16}}: Python 1.6.1<br />
*{{AUR|python24}}: Python 2.4.6<br />
*{{AUR|python25}}: Python 2.5.6<br />
*{{AUR|python26}}: Python 2.6.8<br />
*{{AUR|python30}}: Python 3.0.1<br />
*{{AUR|python31}}: Python 3.1.5<br />
*{{AUR|python32}}: Python 3.2.3<br />
<br />
As of November 2012, Python upstream only supports Python 2.6, 2.7, 3.1, 3.2, and 3.3 for security fixes. Using older versions for Internet-facing applications or untrusted code may be dangerous and is not recommended.<br />
<br />
Extra modules/libraries for old versions of Python may be found in the AUR by searching for ''python'' plus the version without the decimal point, e.g. searching for "python26" for Python 2.6 modules.<br />
<br />
==More Resources==<br />
* [http://shop.oreilly.com/product/9780596158071.do Learning Python] is one of the most comprehensive, up-to-date, and well-written books on Python available today.<br />
* [http://www.diveintopython.net/ Dive Into Python] is an excellent free resource, aimed at more advanced readers, and [http://diveintopython3.ep.io/ has been updated for Python 3].<br />
* [http://www.swaroopch.com/notes/Python A Byte of Python] is a book suitable for users new to Python (and scripting in general).<br />
* [http://learnpythonthehardway.org Learn Python The Hard Way] is a popular introduction to programming.<br />
* [http://facts.learnpython.org facts.learnpython.org] is a nice site for learning Python.<br />
* [http://stephensugden.com/crash_into_python/ Crash into Python], also known as ''Python for Programmers with 3 Hours'', gives experienced developers from other languages a crash course on Python.<br />
* [http://www.apress.com/book/view/9781590598726 Beginning Game Development with Python and Pygame: From Novice to Professional] covers game development.<br />
<br />
==For Fun==<br />
Try the following snippets from Python's interactive shell:<br />
<br />
>>> import this<br />
<br />
>>> from __future__ import braces<br />
<br />
>>> import antigravity</div>Chungy