ZFS

ZFS is an advanced filesystem created by Sun Microsystems (now owned by Oracle) and released for OpenSolaris in November 2005. Features of ZFS include pooled storage (integrated volume management -- zpool), copy-on-write, snapshots, data integrity verification and automatic repair (scrubbing), RAID-Z, and a maximum 16 exabyte file size. ZFS is licensed under the Common Development and Distribution License (CDDL).

Described as "The last word in filesystems", ZFS is stable, fast, secure, and future-proof. Because it is licensed under the GPL-incompatible CDDL, ZFS cannot be distributed along with the Linux kernel. This restriction, however, does not prevent a native Linux kernel module from being developed and distributed by a third party, as is the case with zfsonlinux.org.

Installation

The ZFS kernel module is available in the AUR via zfs.

Configuration

ZFS is considered a "zero administration" filesystem by its creators, so configuring it is very straightforward. Configuration is done primarily with two commands: zfs and zpool.
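
Both commands have their own set of subcommands; zpool operates on whole storage pools, while zfs operates on the datasets (filesystems, snapshots, and volumes) inside them. As a minimal sketch, once a pool exists (bigdata is the pool name used in the examples later in this article), it can be inspected with:

 # zpool list bigdata
 # zfs list bigdata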

initramfs hook

Should you wish to use ZFS on your root filesystem, the zfs hook must be added to the HOOKS list in mkinitcpio.conf.

To see all available options for the ZFS hook:

 $ mkinitcpio -H zfs

Now edit the mkinitcpio config,

/etc/mkinitcpio.conf
...
HOOKS="base udev autodetect pata scsi sata filesystems usbinput fsck zfs"
...

Make sure zfs is placed last in the list; otherwise errors such as #No hostid found can occur when the module is loaded.

Recreate the ramdisk

 # mkinitcpio -p linux

There should be no errors.

Add zfs to DAEMONS list

For ZFS to live up to its "zero administration" namesake, the zfs daemon must be started at boot. A benefit of this is that it is not necessary to mount your zpool in /etc/fstab; the zfs daemon imports and mounts your ZFS pool automatically.

/etc/rc.conf
...
DAEMONS=(... @syslog-ng zfs dbus ...)
...

Now start the daemon if it is not already running:

 # rc.d start zfs
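
If the daemon is not running, a pool can also be imported and mounted by hand, which is roughly what the daemon does at boot; a minimal sketch, assuming a pool named bigdata:

 # zpool import bigdata
 # zfs mount -a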

Prepare your drives

If you are using storage drives larger than 2 TB, you must partition them with gdisk, which creates GPT partition tables. gdisk is available in the [extra] repository in the gptfdisk package.

Use # parted --list to see a list of all available drives. If any of the storage drives you plan to use show Error: /dev/<device>: unrecognised disk label when listed by GNU Parted, creating the pool will fail. The partition tables of the drives showing this error will need to be recreated.

Create a GPT partition table with a single partition spanning the entire disk, using the default "Linux filesystem" partition type. The ZFS on Linux developers recommend using the entire disk. See the ZFS on Linux FAQ.
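
As a rough sketch, the same layout can be created non-interactively with sgdisk (shipped in the same gptfdisk package); /dev/sdX below is a placeholder for the target drive and will be wiped:

 # sgdisk --zap-all /dev/sdX
 # sgdisk --new=1:0:0 --typecode=1:8300 /dev/sdX

In interactive gdisk, the equivalent is o (create a new GPT), n (new partition, accepting the defaults), and w (write the table).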

Once the partition table is written, listing your storage drives with GNU Parted should produce output similar to the following:

# parted --list
...

Model: ATA ST3000DM001-9YN1 (scsi)
Disk /dev/sdb: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name              Flags
 1      1049kB  3001GB  3001GB               Linux filesystem

...

Create a storage pool

The ZFS on Linux developers recommend using device IDs when creating ZFS storage pools of fewer than 10 devices. To find the IDs for your devices, simply run:

 $ ls -lah /dev/disk/by-id/

The IDs should look similar to the following:

 lrwxrwxrwx 1 root root  9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JKRR -> ../../sdc
 lrwxrwxrwx 1 root root 10 Aug 12 05:30 ata-ST3000DM001-9YN166_S1F0JKRR-part1 -> ../../sdc1
 lrwxrwxrwx 1 root root  9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0JTM1 -> ../../sde
 lrwxrwxrwx 1 root root 10 Aug 12 05:30 ata-ST3000DM001-9YN166_S1F0JTM1-part1 -> ../../sde1
 lrwxrwxrwx 1 root root  9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KBP8 -> ../../sdd
 lrwxrwxrwx 1 root root 10 Aug 12 05:30 ata-ST3000DM001-9YN166_S1F0KBP8-part1 -> ../../sdd1
 lrwxrwxrwx 1 root root  9 Aug 12 16:26 ata-ST3000DM001-9YN166_S1F0KDGY -> ../../sdb
 lrwxrwxrwx 1 root root 10 Aug 12 05:30 ata-ST3000DM001-9YN166_S1F0KDGY-part1 -> ../../sdb1

Finally, create the ZFS pool:

 # zpool create -m <mount> <pool> raidz <ids>

or as an example

 # zpool create -m /mnt/data bigdata raidz ata-ST3000DM001-9YN166_S1F0KDGY-part1 ata-ST3000DM001-9YN166_S1F0JKRR-part1 ata-ST3000DM001-9YN166_S1F0KBP8-part1 ata-ST3000DM001-9YN166_S1F0JTM1-part1
  • create: The subcommand to create the pool.
  • pool: The name of the pool. Change it to whatever you like.
  • -m: The mount point of the pool. If this is not specified, the pool will be mounted to /<pool>.
  • raidz: The type of virtual device (vdev) that will be created from the listed devices. RAID-Z is a variation on RAID-5; see Jeff Bonwick's Blog -- RAID-Z for more information. Other vdev types, such as mirror, can be used instead; see the sketch after this list.
  • ids: The names of the drives or partitions that you want to include in your pool, taken from /dev/disk/by-id.
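
For example, a two-way mirror of the first two drives would be created with the same command; a sketch using two of the IDs from above:

 # zpool create -m /mnt/data bigdata mirror ata-ST3000DM001-9YN166_S1F0KDGY-part1 ata-ST3000DM001-9YN166_S1F0JKRR-part1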

If the command is successful, there will be no output. Running the $ mount command will show that your pool is mounted. Running # zpool status will show that your pool has been created:

# zpool status
  pool: bigdata
 state: ONLINE
 scan: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        bigdata                                    ONLINE       0     0     0
          raidz1-0                                 ONLINE       0     0     0
            ata-ST3000DM001-9YN166_S1F0KDGY-part1  ONLINE       0     0     0
            ata-ST3000DM001-9YN166_S1F0JKRR-part1  ONLINE       0     0     0
            ata-ST3000DM001-9YN166_S1F0KBP8-part1  ONLINE       0     0     0
            ata-ST3000DM001-9YN166_S1F0JTM1-part1  ONLINE       0     0     0

errors: No known data errors
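
To confirm where the pool has been mounted, zfs list can also be used; the MOUNTPOINT column should show the mount point given with -m (here /mnt/data):

 # zfs list bigdata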

At this point it would be good to reboot your computer to make sure your ZFS pool is mounted at boot. It is best to deal with all errors before transferring your data.

Usage

To see all the commands available in ZFS, use

 $ man zfs

or

 $ man zpool

Scrub

ZFS pools should be scrubbed at least once a week. To scrub your pool:

 # zpool scrub <pool>

To scrub automatically once a week, add the following line to the root crontab:

# crontab -e
...
30 19 * * 5 zpool scrub <pool>
...

Replace <pool> with the name of your ZFS storage pool.

Check zfs pool status

To print a nice table with statistics about your ZFS pool, including any read/write errors, use

 # zpool status -v

Destroy a storage pool

ZFS makes it easy to destroy a mounted storage pool, removing all metadata about the ZFS device. This command destroys any data contained in the pool.

 # zpool destroy <pool>

and now when checking the status

# zpool status
no pools available

To find the name of your pool, see #Check zfs pool status.

Troubleshooting

No hostid found

This error appears at boot, with the following line printed before the initscript output:

 ZFS: No hostid found on kernel command line or /etc/hostid.

This error occurs because the zfs hook in /etc/mkinitcpio.conf is run before the filesystems hook. Move the zfs hook after filesystems like so:

/etc/mkinitcpio.conf
...
HOOKS="base udev autodetect pata scsi sata filesystems usbinput fsck zfs"
...

and regenerate the ramdisk

 # mkinitcpio -p linux

Reboot to verify that the changes are correct.

Tips and tricks

See also