Software RAID and LVM
This article provides an example of how to install and configure Arch Linux with software RAID or Logical Volume Manager (LVM).
The combination of RAID and LVM provides numerous features with few caveats compared to just using RAID.
- 1 Preface
- 2 Introduction
- 3 Installation
- 3.1 Load kernel modules
- 3.2 Prepare the hard drives
- 3.3 RAID installation
- 3.4 LVM installation
- 3.5 Update RAID configuration
- 3.6 Prepare hard drive
- 3.7 Configure system
- 3.8 Conclusion
- 3.9 Install Grub on the Alternate Boot Drives
- 3.10 Archive your Filesystem Partition Scheme
- 4 Management
- 5 Mounting from a Live CD
- 6 Removing device, stop using the array
- 7 Adding a device to the array
- 8 Troubleshooting
- 9 Benchmarking
- 10 Additional Resources
Although RAID and LVM may seem like analogous technologies, they each present unique features.
Redundant Array of Independent Disks (RAID) is designed to prevent data loss in the event of a hard disk failure. There are different levels of RAID. RAID 0 (striping) is not really RAID at all, because it provides no redundancy. It does, however, provide a speed benefit. RAID 0 is sometimes chosen for swap on a desktop system, where the speed increase is considered worth the risk of a system crash if one of the drives fails. On a server, a RAID 1 or RAID 5 array is more appropriate. The size of a RAID 0 array block device is the size of the smallest component partition times the number of component partitions.
RAID 1 is the most straightforward RAID level: straight mirroring. As with other RAID levels, it only makes sense if the partitions are on different physical disk drives. If one of those drives fails, the block device provided by the RAID array will continue to function as normal. The example in this article uses RAID 1 for the boot and swap partitions. Note that RAID 1 is the only option for the boot partition, because bootloaders (which read the boot partition) do not understand RAID, but a RAID 1 component partition can be read as a normal partition. The size of a RAID 1 array block device is the size of the smallest component partition.
RAID 5 requires 3 or more physical drives, and provides the redundancy of RAID 1 combined with the speed and size benefits of RAID 0. RAID 5 uses striping, like RAID 0, but also stores parity blocks distributed across each member disk. In the event of a failed disk, these parity blocks are used to reconstruct the data on a replacement disk. RAID 5 can withstand the loss of one member disk.
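The sizing rules above can be sketched as a small shell function. This is only an illustration: `raid_capacity` is a hypothetical helper, not part of mdadm, the RAID 5 rule (smallest member times one less than the member count) is the usual single-parity formula rather than something stated above, and real arrays lose a little extra space to metadata.

```shell
#!/bin/sh
# Approximate usable capacity of an array (hypothetical helper, not an mdadm tool).
# Arguments: RAID level, then the size of each component partition (same unit).
raid_capacity() {
    level=$1
    shift
    count=$#
    smallest=$1
    # The smallest component limits every level described above.
    for size in "$@"; do
        [ "$size" -lt "$smallest" ] && smallest=$size
    done
    case $level in
        0) echo $((smallest * count)) ;;   # striping: all members add up
        1) echo "$smallest" ;;             # mirroring: one member's worth
        5)
            # This article treats three members as the minimum for RAID 5.
            if [ "$count" -lt 3 ]; then
                echo "RAID 5 needs at least 3 members" >&2
                return 1
            fi
            echo $((smallest * (count - 1))) ;;  # one member's worth goes to parity
        *)
            echo "unsupported level: $level" >&2
            return 1 ;;
    esac
}

# Sizes in MB, as in the partition table used later in this article.
raid_capacity 1 100 100 100
raid_capacity 5 97900 97900 97900
```

With three 97900 MB members, the RAID 5 line reports two members' worth of space, which is why adding a fourth disk later gains one further member's worth.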
RAID does not provide a guarantee that your data is safe. If there is a fire, if your computer is stolen or if you have multiple hard drive failures, RAID will not protect your data. Therefore it is important to make backups. Whether you use tape drives, DVDs, CD-ROMs or another computer, keep a current copy of your data outside your computer (and preferably offsite). Get into the habit of making regular backups. You can also divide the data on your computer into current and archived directories. Then back up the current data frequently, and the archived data occasionally.
LVM (Logical Volume Management) makes use of the device-mapper feature of the Linux kernel to provide a system of partitions that is independent of the underlying disk's layout. This means that you can extend and shrink partitions (subject to the filesystem you use allowing this) and add or remove partitions without worrying about whether you have enough contiguous space on a particular disk. You also avoid the problems of running fdisk on a disk that is in use (and wondering whether the kernel is using the old or new partition table), and you do not have to move other partitions out of the way.
This is strictly an ease-of-management issue: it does not provide any additional security. However, it sits nicely with the other two technologies we are using.
Note that LVM is not used for the boot partition, because of the bootloader problem.
This article uses an example with three similar 1TB SATA hard drives. The article assumes that the drives are accessible as Template:Filename, Template:Filename, and Template:Filename. If you are using IDE drives, for maximum performance make sure that each drive is a master on its own separate channel.
|LVM Logical Volumes||Template:Codeline||Template:Codeline||Template:Codeline||Template:Codeline|
|LVM Volume Groups||Template:Filename|
Many tutorials treat the swap space differently, either by creating a separate RAID1 array or an LVM logical volume. Creating the swap space on a separate array is not intended to provide additional redundancy, but instead to prevent a corrupt swap space from rendering the system inoperable, which is more likely to happen when the swap space is located on the same partition as the root directory.
MBR vs. GPT
The widespread Master Boot Record (MBR) partitioning scheme, dating from the early 1980s, imposes limitations that affect the use of modern hardware. GUID Partition Table (GPT) is a newer standard for the layout of the partition table, based on the UEFI specification derived from Intel. Although GPT is a significant improvement over MBR, it requires the extra step of creating an additional partition at the beginning of each disk for GRUB2 (see: GPT specific instructions).
This tutorial will use SYSLINUX instead of GRUB2. GRUB2 when used in conjunction with GPT requires an additional BIOS Boot Partition. Additionally, the 2011.08.19 Arch Linux installer does not support GRUB2.
GRUB2 supports the current default style of metadata created by mdadm (i.e. 1.2) when combined with an initramfs, which in Arch Linux is built with mkinitcpio. SYSLINUX only supports version 1.0, and therefore requires the Template:Codeline option.
Some boot loaders (e.g. GRUB, LILO) will not support any 1.x metadata versions, and instead require the older version, 0.90. If you would like to use one of those boot loaders make sure to add the option Template:Codeline to the Template:Codeline array during RAID installation.
Obtain the latest installation media and boot the Arch Linux installer as outlined in the Beginners' Guide, or alternatively, in the Official Arch Linux Install Guide. Follow the directions outlined there until you have configured your network.
Load kernel modules
Enter another TTY terminal by typing Template:Keypress+Template:Keypress. Load the appropriate RAID (e.g. Template:Filename, Template:Filename, Template:Filename, Template:Filename, Template:Filename) and LVM (i.e. Template:Filename) modules. The following example makes use of RAID1 and RAID5.
# modprobe raid1
# modprobe raid5
# modprobe dm-mod
Prepare the hard drives
The boot partition must be RAID1, because GRUB does not have RAID drivers. Any other level will prevent your system from booting. Additionally, if there is a problem with one boot partition, the boot loader can boot normally from the other two partitions in the Template:Codeline array. Finally, the partition you boot from must not be striped (i.e. RAID5, RAID0).
Since most disk partitioning software does not support GPT (i.e. Template:Package Official, Template:Package Official) you will need to install Template:Package Official to set the partition type of the boot loader partitions.
Refresh the package list:
# pacman -Syy
Install Template:Package Official:
# pacman -S gdisk
Partition hard drives
Name   Flags   Part Type   FS Type [Label]   Size (MB)
-------------------------------------------------------------------------------
sda1   Boot    Primary     linux_raid_m        100.00   # /boot
sda2           Primary     linux_raid_m       2000.00   # /swap
sda3           Primary     linux_raid_m      97900.00   # /
Open Template:Codeline with the first hard drive:
# gdisk /dev/sda
and type the following commands at the prompt:
- Add a new partition: Template:Keypress
- Select the default partition number: Template:Keypress
- Use the default for the first sector: Template:Keypress
- For Template:Filename and Template:Filename type the appropriate size in MB (i.e. Template:Codeline and Template:Codeline). For Template:Filename just hit Template:Keypress to select the remainder of the disk.
- Select Template:Codeline as the partition type: Template:Codeline
- Write the table to disk and exit: Template:Keypress
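The same layout could also be created non-interactively with sgdisk, which ships in the same package as gdisk. The following is a minimal sketch that only prints the commands rather than running them. It assumes the sizes from the table above (100 MB, 2000 MB, remainder of the disk) and the gdisk type code fd00 for Linux RAID. `plan_partitions` is a hypothetical helper name.

```shell
#!/bin/sh
# Dry run: print the sgdisk commands for one disk instead of executing them.
# Sizes are taken from the partition table above; fd00 is the gdisk type
# code for Linux RAID. plan_partitions is a hypothetical helper.
plan_partitions() {
    disk=$1
    # --new takes partnum:start:end; 0 means "use the default" for start or end
    echo "sgdisk --new=1:0:+100M --typecode=1:fd00 $disk"
    echo "sgdisk --new=2:0:+2000M --typecode=2:fd00 $disk"
    echo "sgdisk --new=3:0:0 --typecode=3:fd00 $disk"
}

plan_partitions /dev/sda
```

Review the printed commands before running any of them as root, since they rewrite the partition table of the named disk.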
Clone partitions with sgdisk
# sgdisk --backup=table /dev/sda
# sgdisk --load-backup=table /dev/sdb
# sgdisk --load-backup=table /dev/sdc
After creating the physical partitions, you are ready to set up the Template:Codeline, Template:Codeline, and Template:Codeline arrays with Template:Codeline. It is an advanced tool for RAID management that will be used to create a Template:Filename within the installation environment.
# mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sd[abc]3
# mdadm --create /dev/md1 --level=1 --raid-devices=3 /dev/sd[abc]2
# mdadm --create /dev/md2 --level=1 --raid-devices=3 --metadata=1.0 /dev/sd[abc]1
After you create a RAID volume, it will synchronize the contents of the physical partitions within the array. You can monitor the progress by refreshing the output of Template:Filename ten times per second with:
# watch -n .1 cat /proc/mdstat
Further information about the arrays is accessible with:
# mdadm --misc --detail /dev/md[012] | less
Once synchronization is complete the Template:Codeline line should read Template:Codeline. Each device in the table at the bottom of the output should read Template:Codeline or Template:Codeline in the Template:Codeline column. Template:Codeline means each device is actively in the array.
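As a complement to reading the output by eye, the member-status field in /proc/mdstat (for example `[UUU]` when all members are up, `[UU_]` when one is missing) can be checked with a short script. This is a rough sketch: `degraded_arrays` is a hypothetical helper, and the sample input below is made up for illustration rather than captured from a real system.

```shell
#!/bin/sh
# Print the name of every md array whose member-status field contains "_",
# i.e. at least one member is missing or not yet in sync.
# Reads mdstat-formatted text on standard input (hypothetical helper).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the array on this line
        name != "" && /\[[U_]+\]/ {            # status field such as [UU_]
            if ($0 ~ /\[[U_]*_[U_]*\]/) print name
            name = ""
        }
    '
}

# Illustrative sample only; on a real system use: degraded_arrays < /proc/mdstat
degraded_arrays <<'EOF'
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid1 sdc1[2] sdb1[1] sda1[0]
      102388 blocks super 1.0 [3/3] [UUU]

md0 : active raid5 sdb3[1] sda3[0]
      195796992 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
EOF
```

An array that is still performing its initial synchronization may also show `_` for a member, so an empty result is the state to wait for.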
This section will convert the two RAIDs into physical volumes (PVs), then combine those PVs into a volume group (VG). The VG will then be divided into logical volumes (LVs) that act like physical partitions (e.g. Template:Codeline, Template:Codeline, Template:Codeline). If any of this is unclear, read the LVM Introduction section first.
Create physical volumes
Make the RAIDs accessible to LVM by converting them into physical volumes (PVs):
# pvcreate /dev/md0
Confirm that LVM has added the PVs with:
# pvdisplay
Create the volume group
The next step is to create a volume group (VG) with the first PV:
# vgcreate VolGroupArray /dev/md0
Confirm that LVM has added the VG with:
# vgdisplay
Create logical volumes
Now we need to create logical volumes (LVs) on the VG, much like we would normally prepare a hard drive. In this example we will create separate Template:Codeline, Template:Codeline, Template:Codeline, Template:Codeline LVs. The LVs will be accessible as Template:Filename or Template:Filename.
Create a Template:Codeline LV:
# lvcreate -L 20G VolGroupArray -n lvroot
Create a Template:Codeline LV:
# lvcreate -L 15G VolGroupArray -n lvvar
Create a Template:Codeline LV that takes up the remainder of space in the VG:
# lvcreate -l +100%FREE VolGroupArray -n lvhome
Confirm that LVM has created the LVs with:
# lvdisplay
Update RAID configuration
Since the installer builds the initrd using Template:Filename in the target system, you should update that file with your RAID configuration. The original file can simply be deleted because it contains comments on how to fill it correctly, and that is something mdadm can do automatically for you. So let us delete the original and have mdadm create you a new one with the current setup:
# mdadm --examine --scan > /mnt/etc/mdadm.conf
Prepare hard drive
Follow the directions outlined in the Installation section until you reach the Prepare Hard Drive section. Skip the first two steps and navigate to the Manually Configure block devices, filesystems and mountpoints page. Remember to only configure the PVs (e.g. Template:Filename) and not the actual disks (e.g. Template:Filename).
- Add the Template:Codeline module to the Template:Codeline list in Template:Filename.
- Add the Template:Codeline and Template:Codeline hooks to the Template:Codeline list in Template:Filename after Template:Codeline.
Once it is complete you can safely reboot your machine:
# reboot
Install Grub on the Alternate Boot Drives
Once you have successfully booted your new system for the first time, you will want to install Grub onto the other two disks (or on the other disk if you have only 2 HDDs) so that, in the event of disk failure, the system can be booted from another drive. Log in to your new system as root and do:
# grub
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdc
grub> root (hd0,0)
grub> setup (hd0)
grub> quit
Archive your Filesystem Partition Scheme
Now that you are done, it is worth taking a moment to archive the partition layout of each of your drives. This makes it much easier to replace or rebuild a disk if one fails. You can do this with the sfdisk tool and the following steps:
# mkdir /etc/partitions
# sfdisk --dump /dev/sda >/etc/partitions/disc0.partitions
# sfdisk --dump /dev/sdb >/etc/partitions/disc1.partitions
# sfdisk --dump /dev/sdc >/etc/partitions/disc2.partitions
For LVM management, please have a look at LVM
Mounting from a Live CD
If you want to mount your RAID partition from a Live CD, use
# mdadm --assemble /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdc3
(or whatever mdX and drives apply to you)
Removing device, stop using the array
You can remove a device from the array after you mark it as faulty.
# mdadm --fail /dev/md0 /dev/sdxx
Then you can remove it from the array.
# mdadm -r /dev/md0 /dev/sdxx
To remove a device permanently (for example, if you want to use it individually from now on), issue the two commands described above and then:
# mdadm --zero-superblock /dev/sdxx
After this you can use the disk as you did before creating the array.
Stop using an array:
- Unmount the target array
- Repeat the three commands described at the beginning of this section on each device.
- Stop the array with:
# mdadm --stop /dev/md0
- Remove the corresponding line from /etc/mdadm.conf
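The order of the commands in this section can be summarized with a small dry-run helper that prints them for one member device without executing anything. `md_remove_plan` is a hypothetical name used only for this sketch, and the device names in the usage line are examples.

```shell
#!/bin/sh
# Print, in order, the commands from this section for one member device.
# Pass "permanent" as the third argument to include wiping the RAID
# superblock. Nothing is executed (hypothetical helper).
md_remove_plan() {
    array=$1
    member=$2
    mode=${3:-temporary}
    echo "mdadm --fail $array $member"     # mark the member as faulty
    echo "mdadm -r $array $member"         # remove it from the array
    if [ "$mode" = "permanent" ]; then
        echo "mdadm --zero-superblock $member"
    fi
}

md_remove_plan /dev/md0 /dev/sdc3 permanent
```

Printing first and running the lines by hand keeps a typo in the device name from failing the wrong member.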
Adding a device to the array
Adding new devices with mdadm can be done on a running system with the devices mounted. Partition the new device "/dev/sdx" using the same layout as one of those already in the arrays, e.g. "/dev/sda".
# sfdisk -d /dev/sda > table
# sfdisk /dev/sdx < table
Assemble the RAID arrays if they are not already assembled:
# mdadm --assemble /dev/md2 /dev/sda1 /dev/sdb1 /dev/sdc1
# mdadm --assemble /dev/md1 /dev/sda2 /dev/sdb2 /dev/sdc2
# mdadm --assemble /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdc3
First, add the new device as a spare device to all of the arrays. This assumes that you have followed the guide and use separate arrays for /boot (RAID 1, /dev/md2), swap (RAID 1, /dev/md1) and root (RAID 5, /dev/md0).
# mdadm --add /dev/md2 /dev/sdx1
# mdadm --add /dev/md1 /dev/sdx2
# mdadm --add /dev/md0 /dev/sdx3
This should not take long for mdadm to do. Check the progress with:
# cat /proc/mdstat
Check that the device has been added with the command:
# mdadm --misc --detail /dev/md0
It should be listed as a Spare Device.
Tell mdadm to grow the arrays from 3 devices to 4 (or however many devices you want to use):
# mdadm --grow -n 4 /dev/md1
# mdadm --grow -n 4 /dev/md2
# mdadm --grow -n 4 /dev/md0
This will probably take several hours. You need to wait for it to finish before you can continue. Check the progress in /proc/mdstat. The RAID 1 arrays should automatically sync /boot and swap, but you need to install Grub on the MBR of the new device manually (see Install Grub on the Alternate Boot Drives).
The rest of this guide will explain how to resize the underlying LVM and filesystem on the RAID 5 array.
If you have encrypted your LVM volumes with LUKS, you need to resize the LUKS volume first. Otherwise, ignore this step.
# cryptsetup luksOpen /dev/md0 cryptedlvm
# cryptsetup resize cryptedlvm
Activate the LVM volume groups:
# vgscan
# vgchange -ay
Resize the LVM Physical Volume /dev/md0 (or e.g. /dev/mapper/cryptedlvm if using LUKS) to take up all the available space on the array. You can list them with the command "pvdisplay".
# pvresize /dev/md0
Resize the logical volume you wish to allocate the new space to. You can list the logical volumes with "lvdisplay". Assuming you want to give all of it to the /home volume created earlier:
# lvresize -l +100%FREE /dev/VolGroupArray/lvhome
To resize the filesystem so that it uses the new space, use the tool appropriate to the filesystem. For ext2, ext3 and ext4 the usual tool is resize2fs, which has superseded the older ext2online and ext2resize tools. The commands below assume the filesystem is unmounted, because e2fsck must not be run on a mounted filesystem.
You should check the filesystem before resizing.
# e2fsck -f /dev/VolGroupArray/lvhome
# resize2fs /dev/VolGroupArray/lvhome
Read the manuals for lvresize and resize2fs if you want to customize the sizes for the volumes.
If you get an error about "invalid raid superblock magic" when you reboot and you have additional hard drives other than the ones you installed to, check that your hard drive order is correct. During installation, your RAID devices may be hdd, hde and hdf, but during boot they may be hda, hdb and hdc. Adjust your kernel line in /boot/grub/menu.lst accordingly. This is what happened to me anyway.
Recovering from a broken or missing drive in the raid
You might also get the above-mentioned error when one of the drives breaks for whatever reason. In that case you will have to force the RAID to start even with one disk short. Type this (change where needed):
# mdadm --manage /dev/md0 --run
Now you should be able to mount it again with something like this (if you had it in fstab):
# mount /dev/md0
Now the RAID should be working again and available to use, although with one disk short. To add a replacement, partition the new disk as described above in Partition hard drives. Once that is done you can add the new disk to the RAID by doing:
# mdadm --manage --add /dev/md0 /dev/sdd1
If you type:
# cat /proc/mdstat
you will probably see that the RAID is now active and rebuilding.
You also might want to update your /etc/mdadm.conf file by typing:
# mdadm --examine --scan > /etc/mdadm.conf
Those should be all the steps required to recover your RAID. It certainly worked for me when I had lost a drive due to partition table corruption.
There are several tools for benchmarking a RAID. The most notable improvement is the speed increase when multiple threads are reading from the same RAID volume.
Tiobench specifically benchmarks these performance improvements by measuring fully-threaded I/O on the disk.
Bonnie++ tests database type access to one or more files, and creation, reading, and deleting of small files which can simulate the usage of programs such as Squid, INN, or Maildir format e-mail. The enclosed ZCAV program tests the performance of different zones of a hard drive without writing any data to the disk.
Template:Codeline should NOT be used to benchmark a RAID, because it provides very inconsistent results.
- LVM2 Resource Page on SourceWare.org
- RAID/Software on the Gentoo Wiki
- Software RAID Install on the Gentoo Wiki
- Software RAID in the new Linux 2.4 kernel, Part 1 and Part 2 in the Gentoo Linux Docs
- Linux RAID wiki entry on The Linux Kernel Archives
- Arch Linux software RAID installation guide on Linux 101
- Chapter 15: Redundant Array of Independent Disks (RAID) of Red Hat Enterprise Linux 6 Documentation
- Linux-RAID FAQ on the Linux Documentation Project
- Linux/Fedora: Encrypt /home and swap over RAID with dm-crypt by Justin Wells
RAID & LVM
- Setup Arch Linux on top of raid, LVM2 and encrypted partitions by Yannick Loth
- RAID vs. LVM on Stack Overflow
- What is better LVM on RAID or RAID on LVM? on Server Fault
- Managing RAID and LVM with Linux (v0.5) by Gregory Gulik
- Gentoo Linux x86 with Software Raid and LVM2 Quick Install Guide
- 2011-09-08 - Arch Linux - LVM & RAID (1.2 metadata) + SYSLINUX
- 2011-08-28 - Arch Linux - GRUB and GRUB2
- 2011-08-03 - Arch Linux - Can't install grub2 on software RAID
- 2011-07-29 - Gentoo - Use RAID metadata 1.2 in boot and root partition
- 2011-04-20 - Arch Linux - Software RAID and LVM questions
- 2011-03-12 - Arch Linux - Some newbie questions about installation, LVM, grub, RAID