Difference between revisions of "Software RAID and LVM"
[[Category:Getting and installing Arch]]
[[Category:File systems]]
{{Related articles start}}
{{Related|RAID}}
{{Related|LVM}}
{{Related|Installing with Fake RAID}}
{{Related|Convert a single drive system to RAID}}
{{Related articles end}}
This article provides an example of how to install and configure Arch Linux with software RAID and the Logical Volume Manager (LVM). The combination of [[RAID]] and [[LVM]] provides numerous features with few caveats compared to using RAID alone.

== Introduction ==

{{Warning|Be sure to review the [[RAID]] article and be aware of all applicable warnings, particularly if you select RAID5.}}
Although [[RAID]] and [[LVM]] may seem like analogous technologies, each presents unique features. This article uses an example with three similar 1TB SATA hard drives, and assumes that the drives are accessible as {{ic|/dev/sda}}, {{ic|/dev/sdb}}, and {{ic|/dev/sdc}}. If you are using IDE drives, for maximum performance make sure that each drive is a master on its own separate channel.
{{Tip|It is good practice to ensure that only the drives involved in the installation are attached while performing the installation.}}
{| border="1" width="100%" style="text-align:center;"
|-
! LVM Logical Volumes
| {{ic|/}} || {{ic|/var}} || {{ic|/swap}} || {{ic|/home}}
|-
! LVM Volume Groups
| colspan="4" | {{ic|/dev/VolGroupArray}}
|-
! RAID Arrays
| colspan="2" | {{ic|/dev/md0}}
| colspan="2" | {{ic|/dev/md1}}
|-
! Physical Partitions
| colspan="2" | {{ic|/dev/sda1}} {{ic|/dev/sdb1}} {{ic|/dev/sdc1}}
| colspan="2" | {{ic|/dev/sda2}} {{ic|/dev/sdb2}} {{ic|/dev/sdc2}}
|-
! Hard Drives
| colspan="4" | {{ic|/dev/sda}} {{ic|/dev/sdb}} {{ic|/dev/sdc}}
|}
=== Swap space ===

{{Note|If you want extra performance, just let the kernel use distinct swap partitions, as it does striping by default.}}

Many tutorials treat the swap space differently, either by creating a separate RAID1 array or an LVM logical volume. Creating the swap space on a separate array is not intended to provide additional redundancy, but instead to prevent a corrupt swap space from rendering the system inoperable, which is more likely to happen when the swap space is located on the same partition as the root directory.
=== MBR vs. GPT ===

{{Wikipedia|GUID Partition Table}}
The widespread [[Master Boot Record]] (MBR) partitioning scheme, dating from the early 1980s, imposes limitations that affect the use of modern hardware. The [[GUID Partition Table]] (GPT) is a newer standard for the layout of the partition table, based on the [[Wikipedia:Unified Extensible Firmware Interface|UEFI]] specification derived from Intel. Although GPT provides a significant improvement over MBR, it does require the additional step of creating a partition at the beginning of each disk for GRUB2 (see: [[GRUB2#GPT specific instructions|GPT specific instructions]]).
=== Boot loader ===

This tutorial will use [[Syslinux|SYSLINUX]] instead of [[GRUB]]. When used in conjunction with [[GUID Partition Table|GPT]], GRUB requires an additional [[GRUB#GPT specific instructions|BIOS Boot Partition]]. Additionally, the [[DeveloperWiki:2011.08.19|2011.08.19]] Arch Linux installer does not support GRUB.

GRUB supports the default style of metadata currently created by mdadm (i.e. 1.2) when combined with an initramfs, which in Arch Linux is generated by [[mkinitcpio]]. SYSLINUX only supports version 1.0, and therefore requires the {{ic|<nowiki>--metadata=1.0</nowiki>}} option.

Some boot loaders (e.g. [[GRUB Legacy]], [[LILO]]) do not support any 1.x metadata versions and instead require the older version, 0.90. If you would like to use one of those boot loaders, make sure to add the option {{ic|<nowiki>--metadata=0.90</nowiki>}} to the {{ic|/boot}} array during [[#RAID installation|RAID installation]].
== Installation ==

Obtain the latest installation media and boot the Arch Linux installer as outlined in the [[Beginners' Guide#Preparation|Beginners' Guide]] or, alternatively, in the [[Official Arch Linux Install Guide#Pre-Installation|Official Arch Linux Install Guide]]. Follow the directions outlined there until you have [[Beginners Guide#Configure Network (netinstall)|configured your network]].
==== Load kernel modules ====

Enter another TTY terminal by typing {{ic|Alt}}+{{ic|F2}}. Load the appropriate RAID (e.g. {{ic|raid0}}, {{ic|raid1}}, {{ic|raid5}}, {{ic|raid6}}, {{ic|raid10}}) and LVM (i.e. {{ic|dm-mod}}) modules. The following example makes use of RAID1 and RAID5.
 # modprobe raid1
 # modprobe raid5
 # modprobe dm-mod
=== Prepare the hard drives ===

{{Note|If your hard drives are already prepared and all you want to do is activate RAID and LVM, jump to [[Installing_with_Software_RAID_or_LVM#Activate_existing_RAID_devices_and_LVM_volumes|Activate existing RAID devices and LVM volumes]]. This can be achieved with alternative partitioning software (see: [http://yannickloth.be/blog/2010/08/01/installing-archlinux-with-software-raid1-encrypted-filesystem-and-lvm2/ Article]).}}

Each hard drive will have a 100MB {{ic|/boot}} partition, a 2048MB {{ic|/swap}} partition, and a {{ic|/}} partition that takes up the remainder of the disk.

The boot partition must be RAID1, because GRUB does not have RAID drivers; any other level will prevent your system from booting. Additionally, if there is a problem with one boot partition, the boot loader can boot normally from the other two partitions in the {{ic|/boot}} array. Finally, the partition you boot from must not be striped (i.e. RAID5, RAID0).
==== Install gdisk ====

Since most disk partitioning software (i.e. fdisk and sfdisk) does not support GPT, you will need to install {{Pkg|gptfdisk}} to set the partition type of the boot loader partitions.
Update the pacman database:
 $ pacman-db-upgrade

Refresh the package list:
 $ pacman -Syy

Install '''gptfdisk'''.
==== Partition hard drives ====

We will use {{ic|gdisk}} to create three partitions on each of the three hard drives (i.e. {{ic|/dev/sda}}, {{ic|/dev/sdb}}, {{ic|/dev/sdc}}):

 Name    Flags   Part Type   FS Type        [Label]   Size (MB)
 ---------------------------------------------------------------
 sda1    Boot    Primary     linux_raid_m                100.00   # /boot
 sda2            Primary     linux_raid_m               2000.00   # /swap
 sda3            Primary     linux_raid_m              97900.00   # /
Open {{ic|gdisk}} with the first hard drive:
 # gdisk /dev/sda
and type the following commands at the prompt:
# Add a new partition: {{ic|n}}
# Select the default partition number: {{ic|Enter}}
# Use the default for the first sector: {{ic|Enter}}
# For {{ic|sda1}} and {{ic|sda2}} type the appropriate size (i.e. {{ic|+100M}} and {{ic|+2048M}}). For {{ic|sda3}} just hit {{ic|Enter}} to select the remainder of the disk.
# Select {{ic|Linux RAID}} as the partition type: {{ic|fd00}}
# Write the table to disk and exit: {{ic|w}}
Repeat this process for {{ic|/dev/sdb}} and {{ic|/dev/sdc}}, or use the alternate {{ic|sgdisk}} method below. You may need to reboot to allow the kernel to recognize the new tables.

{{Note|Make sure to create the exact same partitions on each disk. If a group of partitions of different sizes is assembled to create a RAID partition, it will work, but ''the resulting array will be limited by the size of the smallest partition'', leaving the extra space to waste.}}
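The interactive steps above can also be scripted with {{ic|sgdisk}}. This is a sketch, not part of the original instructions: {{ic|-n num:start:end}} creates a partition (0 means "use the default"), and {{ic|-t num:code}} sets its type code.

```shell
# Sketch: create the three example partitions non-interactively on one drive.
# fd00 is the GPT type code for Linux RAID, matching the gdisk steps above.
sgdisk -n 1:0:+100M  -t 1:fd00 /dev/sda   # /boot
sgdisk -n 2:0:+2048M -t 2:fd00 /dev/sda   # /swap
sgdisk -n 3:0:0      -t 3:fd00 /dev/sda   # / (remainder of the disk)
```

Run as root against each drive, or create the layout once and clone it as shown in the next section.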
==== Clone partitions with sgdisk ====

If you are using GPT, then you can use {{ic|sgdisk}} to clone the partition table from {{ic|/dev/sda}} to the other two hard drives:
 # sgdisk --backup=table /dev/sda
 # sgdisk --load-backup=table /dev/sdb
 # sgdisk --load-backup=table /dev/sdc
+ | |||
+ | {{Note| When using this method to clone the partition table of an active drive onto a replacement drive for the same system (e.g. RAID drive replacement), use {{ic| sgdisk -G /dev/<newDrive>}} to re-randomise the GUID of the disk and partitions to ensure they are unique.}} | ||
=== RAID installation ===

After creating the physical partitions, you are ready to set up the '''/boot''', '''/swap''', and '''/''' arrays with {{ic|mdadm}}. It is an advanced tool for RAID management that will be used to create the {{ic|/etc/mdadm.conf}} within the installation environment.

Create the '''/''' array at {{ic|/dev/md0}}:
 # mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sd[abc]3
Create the '''/swap''' array at {{ic|/dev/md1}}:
 # mdadm --create /dev/md1 --level=1 --raid-devices=3 /dev/sd[abc]2

{{Note|If you plan on installing a boot loader that does not support the 1.x metadata versions, use {{ic|<nowiki>--metadata=0.90</nowiki>}} instead of {{ic|<nowiki>--metadata=1.0</nowiki>}} in the following command.}}

Create the '''/boot''' array at {{ic|/dev/md2}}:
 # mdadm --create /dev/md2 --level=1 --raid-devices=3 --metadata=1.0 /dev/sd[abc]1
==== Synchronization ====

{{Tip|If you want to avoid the initial resync with new hard drives, add the {{ic|--assume-clean}} flag.}}
After you create a RAID volume, it will synchronize the contents of the physical partitions within the array. You can monitor the progress by refreshing the output of {{ic|/proc/mdstat}} ten times per second with:
 # watch -n .1 cat /proc/mdstat
{{Tip|Follow the synchronization in another TTY terminal by typing {{ic|Alt+F3}} and then executing the above command.}}

Further information about the arrays is accessible with:
Once synchronization is complete, the {{ic|State}} line should read {{ic|clean}}. Each device in the table at the bottom of the output should read {{ic|spare}} or {{ic|active sync}} in the {{ic|State}} column. {{ic|active sync}} means each device is actively in the array.

{{Note|Since the RAID synchronization is transparent to the file system, you can proceed with the installation and reboot your computer when necessary.}}
==== Scrubbing ====

It is good practice to regularly run data [http://en.wikipedia.org/wiki/Data_scrubbing scrubbing] to check for and fix errors.

{{Note|Depending on the size/configuration of the array, a scrub may take multiple hours to complete.}}

To initiate a data scrub:
 # echo check > /sys/block/md0/md/sync_action

As with many tasks relating to mdadm, the status of the scrub can be queried:
 # cat /proc/mdstat

Example:
{{hc|$ cat /proc/mdstat|<nowiki>
Personalities : [raid6] [raid5] [raid4] [raid1]
md0 : active raid1 sdb1[0] sdc1[1]
      3906778112 blocks super 1.2 [2/2] [UU]
      [>....................]  check =  4.0% (158288320/3906778112) finish=386.5min speed=161604K/sec
      bitmap: 0/30 pages [0KB], 65536KB chunk
</nowiki>}}

To stop a currently running data scrub safely:
 # echo idle > /sys/block/md0/md/sync_action

When the scrub is complete, admins may check how many blocks (if any) have been flagged as bad:
 # cat /sys/block/md0/md/mismatch_cnt

The check operation scans the drives for bad sectors and automatically repairs them. If it finds good sectors that contain bad data (i.e. the data in a sector does not agree with what the data from another disk indicates it should be; for example, the parity block plus the other data blocks would suggest that this data block is incorrect), no action is taken, but the event is logged (see below). This "do nothing" behavior allows admins to inspect the data in the sector and the data that would be produced by rebuilding the sector from the redundant information, and to pick the correct data to keep.
===== General notes on scrubbing =====

{{Note|Users may alternatively echo '''repair''' to {{ic|/sys/block/md0/md/sync_action}}, but this is ill-advised: if a mismatch in the data is encountered, it will be automatically updated to be consistent. The danger is that we really do not know whether it is the parity or the data block that is correct (or which data block in the case of RAID1). It is luck-of-the-draw whether or not the operation gets the right data instead of the bad data.}}

It is a good idea to set up a cron job as root to schedule a periodic scrub. See {{AUR|raid-check}}, which can assist with this.
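As an illustration of such a cron job, a system crontab entry might look like the following sketch. The schedule and the {{ic|md0}} device name are assumptions; adjust them to your setup.

```shell
# /etc/cron.d/raid-scrub — hypothetical entry: start a check of /dev/md0
# at 03:00 on the first day of every month, as root.
# (minute hour day-of-month month day-of-week user command)
0 3 1 * * root echo check > /sys/block/md0/md/sync_action
```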
===== RAID1 and RAID10 notes on scrubbing =====

Because RAID1 and RAID10 writes in the kernel are unbuffered, an array can have non-0 mismatch counts even when the array is healthy. These non-0 counts will only exist in transient data areas, where they do not pose a problem. However, we cannot tell the difference between a non-0 count that is just in transient data and a non-0 count that signifies a real problem, so scrubbing is a source of false positives for RAID1 and RAID10 arrays. It is nevertheless recommended to scrub regularly in order to catch and correct any bad sectors that might be present in the devices.
=== LVM installation ===

This section will convert the two RAIDs into physical volumes (PVs), combine those PVs into a volume group (VG), and then divide the VG into logical volumes (LVs) that will act like physical partitions (e.g. {{ic|/}}, {{ic|/var}}, {{ic|/home}}). If you did not understand that, make sure you read the [[LVM#Introduction|LVM Introduction]] section.
==== Create physical volumes ====

Make the RAIDs accessible to LVM by converting them into physical volumes (PVs) using the following command. Repeat this action for each of the RAID arrays created above.
 # pvcreate /dev/md0

{{Note|This might fail if you are creating PVs on an existing volume group. If so, you might want to add the {{ic|-ff}} option.}}
Confirm that LVM has added the PVs with:
==== Create the volume group ====

The next step is to create a volume group (VG) on the PVs.
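Continuing the example, the VG used by the rest of this article could be created from both RAID-backed PVs like so. This is a sketch: the {{ic|VolGroupArray}} name matches the name assumed in the following sections, and the PV list assumes both arrays were converted with {{ic|pvcreate}} as above.

```shell
# Create a volume group named VolGroupArray spanning both PVs.
vgcreate VolGroupArray /dev/md0 /dev/md1

# Confirm that the VG was created and shows the expected size.
vgdisplay
```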
==== Create logical volumes ====

Now we need to create logical volumes (LVs) on the VG, much like we would normally [[Beginners Guide#Prepare the storage drive|prepare a hard drive]]. In this example we will create separate {{ic|/}}, {{ic|/var}}, {{ic|/swap}}, and {{ic|/home}} LVs. The LVs will be accessible as {{ic|/dev/mapper/VolGroupArray-<lvname>}} or {{ic|/dev/VolGroupArray/<lvname>}}.

Create a '''/''' LV:
 # lvcreate -L 20G VolGroupArray -n lvroot

Create a '''/var''' LV:
 # lvcreate -L 15G VolGroupArray -n lvvar

{{Note|If you would like to add the swap space to the LVM, create a {{ic|/swap}} LV with the {{ic|-C y}} option, which creates a contiguous volume, so that your swap space does not get split over one or more disks or over non-contiguous physical extents:
 # lvcreate -C y -L 2G VolGroupArray -n lvswap
}}

Create a '''/home''' LV that takes up the remainder of the space in the VG:
 # lvcreate -l +100%FREE VolGroupArray -n lvhome
 # lvdisplay

{{Tip|You can start out with relatively small logical volumes and expand them later if needed. For simplicity, leave some free space in the volume group so there is room for expansion.}}
=== Update RAID configuration ===

Since the installer builds the initrd using {{ic|/etc/mdadm.conf}} in the target system, you should update that file with your RAID configuration. The original file can simply be deleted, because it only contains comments on how to fill it in correctly, and that is something mdadm can do automatically for you. So let us delete the original and have mdadm create a new one with the current setup:
 # mdadm --examine --scan > /etc/mdadm.conf
=== Prepare hard drive ===

Follow the directions outlined in the [[Beginners' Guide#Installation|Installation]] section until you reach the ''Prepare Hard Drive'' section. Skip the first two steps and navigate to the ''Manually Configure block devices, filesystems and mountpoints'' page. Remember to only configure the LVs (e.g. {{ic|/dev/mapper/VolGroupArray-lvhome}}) and '''not''' the actual disks (e.g. {{ic|/dev/sda1}}).

{{Warning|{{ic|mkfs.xfs}} will not align the chunk size and stripe size for optimum performance (see: [http://www.linuxpromagazine.com/Issues/2009/108/RAID-Performance Optimum RAID]).}}
=== Configure system ===

{{Warning|Follow the steps in the [[LVM#Important|LVM Important]] section before proceeding with the installation.}}

==== mkinitcpio.conf ====
[[mkinitcpio]] can use a hook to assemble the arrays on boot. For more information see [[mkinitcpio#Using RAID|mkinitcpio Using RAID]].

# Add the {{ic|dm_mod}} module to the {{ic|MODULES}} list in {{ic|/etc/mkinitcpio.conf}}.
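After the edits described here, the relevant lines of {{ic|/etc/mkinitcpio.conf}} might look like the following sketch. The exact hook list is an assumption; consult the [[mkinitcpio]] article for the hooks appropriate to your setup.

```shell
# /etc/mkinitcpio.conf — hypothetical excerpt
MODULES="dm_mod"
# The RAID-assembly and LVM hooks must come before the filesystems hook.
HOOKS="base udev autodetect block mdadm_udev lvm2 filesystems keyboard fsck"
```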
=== Conclusion ===

Once the installation is complete, you can safely reboot your machine:
 # reboot
=== Install the bootloader on the alternate boot drives ===

Once you have successfully booted your new system for the first time, you will want to install the bootloader onto the other two disks (or onto the other disk if you have only two HDDs) so that, in the event of disk failure, the system can be booted from any of the remaining drives (e.g. by switching the boot order in the BIOS). The method depends on the bootloader you are using:
==== Syslinux ====

Log in to your new system as root and do:
 # /usr/sbin/syslinux-install_update -iam
 Installed MBR (/usr/lib/syslinux/gptmbr.bin) to /dev/sdb
==== GRUB Legacy ====

Log in to your new system as root and do:
 # grub
 grub> quit
=== Archive your filesystem partition scheme ===

Now that you are done, it is worth taking a moment to archive the partition state of each of your drives. This guarantees that it will be trivially easy to replace/rebuild a disk in the event that one fails. You can do this with the {{ic|sfdisk}} tool and the following steps:
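The archiving steps might look like the following sketch. The backup file locations are illustrative assumptions; for GPT disks, prefer the {{ic|sgdisk --backup}} command shown earlier.

```shell
# Dump each drive's partition table to a text file (sfdisk format).
sfdisk -d /dev/sda > /etc/partitiontable_sda.dump
sfdisk -d /dev/sdb > /etc/partitiontable_sdb.dump
sfdisk -d /dev/sdc > /etc/partitiontable_sdc.dump

# Later, restore onto a replacement drive, e.g.:
# sfdisk /dev/sda < /etc/partitiontable_sda.dump
```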
For further information on how to maintain your software RAID or LVM, review the [[RAID]] and [[LVM]] articles.
== See also ==

* [http://yannickloth.be/blog/2010/08/01/installing-archlinux-with-software-raid1-encrypted-filesystem-and-lvm2/ Setup Arch Linux on top of raid, LVM2 and encrypted partitions] by Yannick Loth
* [http://stackoverflow.com/questions/237434/raid-verses-lvm RAID vs. LVM] on [[Wikipedia:Stack Overflow|Stack Overflow]]
* [http://www.gagme.com/greg/linux/raid-lvm.php Managing RAID and LVM with Linux (v0.5)] by Gregory Gulik
* [http://www.gentoo.org/doc/en/gentoo-x86+raid+lvm2-quickinstall.xml Gentoo Linux x86 with Software Raid and LVM2 Quick Install Guide]
* 2011-09-08 - Arch Linux - [https://bbs.archlinux.org/viewtopic.php?id=126172 LVM & RAID (1.2 metadata) + SYSLINUX]
* 2011-04-20 - Arch Linux - [https://bbs.archlinux.org/viewtopic.php?pid=965357 Software RAID and LVM questions]
* 2011-03-12 - Arch Linux - [https://bbs.archlinux.org/viewtopic.php?id=114965 Some newbie questions about installation, LVM, grub, RAID]
Revision as of 06:43, 11 December 2013
This article will provide an example of how to install and configure Arch Linux with a software RAID or Logical Volume Manager (LVM). The combination of RAID and LVM provides numerous features with few caveats compared to just using RAID.
Contents
- 1 Introduction
- 2 Installation
- 3 Management
- 4 See also
Introduction
Although RAID and LVM may seem like analogous technologies they each present unique features. This article uses an example with three similar 1TB SATA hard drives. The article assumes that the drives are accessible as /dev/sda
, /dev/sdb
, and /dev/sdc
. If you are using IDE drives, for maximum performance make sure that each drive is a master on its own separate channel.
LVM Logical Volumes | /
|
/var
|
/swap
|
/home
|
LVM Volume Groups | /dev/VolGroupArray
|
RAID Arrays | /dev/md0
|
/dev/md1
|
Physical Partitions | /dev/sda1
|
/dev/sdb1
|
/dev/sdc1
|
/dev/sda2
|
/dev/sdb2
|
/dev/sdc2
|
Hard Drives | /dev/sda
|
/dev/sdb
|
/dev/sdc
|
Swap space
Many tutorials treat the swap space differently, either by creating a separate RAID1 array or a LVM logical volume. Creating the swap space on a separate array is not intended to provide additional redundancy, but instead, to prevent a corrupt swap space from rendering the system inoperable, which is more likely to happen when the swap space is located on the same partition as the root directory.
MBR vs. GPT
Template:Wikipedia The widespread Master Boot Record (MBR) partitioning scheme, dating from the early 1980s, imposed limitations which affected the use of modern hardware. GUID Partition Table (GPT) is a new standard for the layout of the partition table based on the UEFI specification derived from Intel. Although GPT provides a significant improvement over a MBR, it does require the additional step of creating an additional partition at the beginning of each disk for GRUB2 (see: GPT specific instructions).
Boot loader
This tutorial will use SYSLINUX instead of GRUB. GRUB when used in conjunction with GPT requires an additional BIOS Boot Partition. Additionally, the 2011.08.19 Arch Linux installer does not support GRUB.
GRUB supports the default style of metadata currently created by mdadm (i.e. 1.2) when combined with an initramfs, which has replaced in Arch Linux with mkinitcpio. SYSLINUX only supports version 1.0, and therefore requires the --metadata=1.0
option.
Some boot loaders (e.g. GRUB Legacy, LILO) will not support any 1.x metadata versions, and instead require the older version, 0.90. If you would like to use one of those boot loaders make sure to add the option --metadata=0.90
to the /boot
array during RAID installation.
Installation
Obtain the latest installation media and boot the Arch Linux installer as outlined in the Beginners' Guide, or alternatively, in the Official Arch Linux Install Guide. Follow the directions outlined there until you have configured your network.
Load kernel modules
Enter another TTY terminal by typing Alt
+F2
. Load the appropriate RAID (e.g. raid0
, raid1
, raid5
, raid6
, raid10
) and LVM (i.e. dm-mod
) modules. The following example makes use of RAID1 and RAID5.
# modprobe raid1 # modprobe raid5 # modprobe dm-mod
Prepare the hard drives
Each hard drive will have a 100MB /boot
partition, 2048MB /swap
partition, and a /
partition that takes up the remainder of the disk.
The boot partition must be RAID1, because GRUB does not have RAID drivers. Any other level will prevent your system from booting. Additionally, if there is a problem with one boot partition, the boot loader can boot normally from the other two partitions in the /boot
array. Finally, the partition you boot from must not be striped (i.e. RAID5, RAID0).
Install gdisk
Since most disk partitioning software (i.e. fdisk and sfdisk) does not support GPT you will need to install gptfdisk to set the partition type of the boot loader partitions.
Update the pacman database:
# pacman-db-upgrade
Refresh the package list:
# pacman -Syy
Install gptfdisk:
# pacman -S gptfdisk
Partition hard drives
We will use gdisk to create three partitions on each of the three hard drives (i.e. /dev/sda, /dev/sdb, /dev/sdc):
Name    Flags   Part Type   FS Type        [Label]   Size (MB)
-------------------------------------------------------------------------------
sda1    Boot    Primary     linux_raid_m                100.00   # /boot
sda2            Primary     linux_raid_m               2000.00   # /swap
sda3            Primary     linux_raid_m              97900.00   # /
Open gdisk with the first hard drive:
# gdisk /dev/sda
and type the following commands at the prompt:
- Add a new partition: n
- Select the default partition number: Enter
- Use the default for the first sector: Enter
- For sda1 and sda2 type the appropriate size (i.e. +100M and +2048M). For sda3 just hit Enter to select the remainder of the disk.
- Select Linux RAID as the partition type: fd00
- Write the table to disk and exit: w
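As a non-interactive alternative, the same layout can be scripted with sgdisk; a sketch, with the partition type code (fd00) and sizes taken from the interactive steps above:

```
# sgdisk -n 1:0:+100M -t 1:fd00 /dev/sda
# sgdisk -n 2:0:+2048M -t 2:fd00 /dev/sda
# sgdisk -n 3:0:0 -t 3:fd00 /dev/sda
```

Here -n partnum:start:end creates a partition (0 selects the default start, and an end of 0 uses the remainder of the disk), and -t partnum:code sets its type.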
Repeat this process for /dev/sdb and /dev/sdc, or use the alternate sgdisk method below. You may need to reboot to allow the kernel to recognize the new tables.
Clone partitions with sgdisk
If you are using GPT, then you can use sgdisk to clone the partition table from /dev/sda to the other two hard drives:
# sgdisk --backup=table /dev/sda
# sgdisk --load-backup=table /dev/sdb
# sgdisk --load-backup=table /dev/sdc
Note: After cloning, run sgdisk -G /dev/<newDrive> on each target disk to re-randomise the GUID of the disk and partitions to ensure they are unique.

RAID installation
After creating the physical partitions, you are ready to set up the /boot, /swap, and / arrays with mdadm. It is an advanced tool for RAID management that will be used to create a /etc/mdadm.conf within the installation environment.
Create the / array at /dev/md0:
# mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sd[abc]3
Create the /swap array at /dev/md1:
# mdadm --create /dev/md1 --level=1 --raid-devices=3 /dev/sd[abc]2
Note: If your boot loader requires the older 0.90 metadata version (see Boot loader above), add the --metadata=0.90 option to the following command in place of --metadata=1.0.

Create the /boot array at /dev/md2:
# mdadm --create /dev/md2 --level=1 --raid-devices=3 --metadata=1.0 /dev/sd[abc]1
Synchronization
Tip: If you want to avoid the initial resync with new hard drives, you can use the --assume-clean flag.

After you create a RAID volume, it will synchronize the contents of the physical partitions within the array. You can monitor the progress by refreshing the output of /proc/mdstat ten times per second with:
# watch -n .1 cat /proc/mdstat
Tip: If you want to continue with the installation while the arrays synchronize, switch to another TTY with Alt+F3 and then execute the above command.

Further information about the arrays is accessible with:
# mdadm --misc --detail /dev/md[012] | less
Once synchronization is complete, the State line should read clean. Each device in the table at the bottom of the output should read spare or active sync in the State column. active sync means each device is actively in the array.
Scrubbing
It is good practice to regularly run data scrubbing to check for and fix errors.
To initiate a data scrub:
# echo check > /sys/block/md0/md/sync_action
As with many tasks/items relating to mdadm, the status of the scrub can be queried:
# cat /proc/mdstat
Example:
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md0 : active raid1 sdb1[0] sdc1[1]
      3906778112 blocks super 1.2 [2/2] [UU]
      [>....................]  check =  4.0% (158288320/3906778112) finish=386.5min speed=161604K/sec
      bitmap: 0/30 pages [0KB], 65536KB chunk
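If you only want the progress figure, it can be pulled out of /proc/mdstat with standard tools; a small sketch, where the here-document stands in for /proc/mdstat so the pipeline can be tried anywhere:

```shell
# Extract the "check = N%" progress field from mdstat-style output.
# On a live system, use instead: grep -o 'check = *[0-9.]*%' /proc/mdstat
grep -o 'check = *[0-9.]*%' <<'EOF'
md0 : active raid1 sdb1[0] sdc1[1]
      [>....................]  check =  4.0% (158288320/3906778112) finish=386.5min
EOF
```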
To stop a currently running data scrub safely:
# echo idle > /sys/block/md0/md/sync_action
When the scrub is complete, admins may check how many blocks (if any) have been flagged as bad:
# cat /sys/block/md0/md/mismatch_cnt
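This check can be scripted; a minimal sketch that warns on a non-zero count (the count is simulated here, since /sys/block/md0/md/mismatch_cnt only exists on a system with an md0 array):

```shell
# In practice: count=$(cat /sys/block/md0/md/mismatch_cnt)
count=0  # simulated value for illustration
if [ "$count" -ne 0 ]; then
    echo "WARNING: md0 reports $count mismatched blocks"
else
    echo "md0: mismatch count is 0"
fi
```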
The check operation scans the drives for bad sectors and automatically repairs them. If it finds good sectors that contain bad data (i.e. the data in a sector does not agree with what the data from another disk indicates it should be; for example, the parity block plus the other data blocks would suggest that this data block is incorrect), then no action is taken, but the event is logged (see below). This "do nothing" behavior allows admins to inspect the data in the sector and the data that would be produced by rebuilding the sector from redundant information, and to pick the correct data to keep.
General Notes on Scrubbing
It is a good idea to set up a cron job as root to schedule a periodic scrub. The raid-check package from the AUR can assist with this.
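As a sketch, a root crontab entry that starts a check of /dev/md0 at 01:00 on the first day of each month might look like this (the schedule is only an example; the sysfs path matches the scrub command above):

```
0 1 1 * * echo check > /sys/block/md0/md/sync_action
```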
RAID1 and RAID10 Notes on Scrubbing
Because RAID1 and RAID10 writes in the kernel are unbuffered, an array can have non-zero mismatch counts even when it is healthy. These non-zero counts will only exist in transient data areas, where they do not pose a problem. However, there is no way to tell the difference between a non-zero count caused by transient data and one that signifies a real problem, so scrubbing is a source of false positives for RAID1 and RAID10 arrays. It is nevertheless recommended to scrub regularly in order to catch and correct any bad sectors that might be present on the devices.
LVM installation
This section will convert the two RAIDs into physical volumes (PVs), then combine those PVs into a volume group (VG). The VG will then be divided into logical volumes (LVs) that will act like physical partitions (e.g. /, /var, /home). If you did not understand that, make sure you read the LVM Introduction section.
Create physical volumes
Make the RAIDs accessible to LVM by converting them into physical volumes (PVs) using the following command. Repeat this action for each of the RAID arrays created above.
# pvcreate /dev/md0
Note: If pvcreate fails because the device already contains data (e.g. an old filesystem or volume group), you may need to use the -ff option.

Confirm that LVM has added the PVs with:
# pvdisplay
Create the volume group
The next step is to create a volume group (VG) on the PVs.
Create a volume group (VG) with the first PV:
# vgcreate VolGroupArray /dev/md0
Confirm that LVM has added the VG with:
# vgdisplay
Create logical volumes
Now we need to create logical volumes (LVs) on the VG, much like we would normally prepare a hard drive. In this example we will create separate /, /var, /swap, /home LVs. The LVs will be accessible as /dev/mapper/VolGroupArray-<lvname> or /dev/VolGroupArray/<lvname>.
Create a / LV:
# lvcreate -L 20G VolGroupArray -n lvroot
Create a /var LV:
# lvcreate -L 15G VolGroupArray -n lvvar
Create a /swap LV with the -C y option, which creates a contiguous partition, so that your swap space does not get spread over one or more disks nor over non-contiguous physical extents:
# lvcreate -C y -L 2G VolGroupArray -n lvswap
Create a /home LV that takes up the remainder of space in the VG:
# lvcreate -l +100%FREE VolGroupArray -n lvhome
Confirm that LVM has created the LVs with:
# lvdisplay
Update RAID configuration
Since the installer builds the initrd using /etc/mdadm.conf in the target system, you should update that file with your RAID configuration. The original file can simply be deleted, because it contains only comments on how to fill it in correctly, and that is something mdadm can do automatically for you. So let us delete the original and have mdadm create a new one with the current setup:
# mdadm --examine --scan > /etc/mdadm.conf
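The generated file contains one ARRAY line per array, along these lines (the UUIDs are placeholders; mdadm fills in the real values for your arrays):

```
ARRAY /dev/md0 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md1 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
ARRAY /dev/md2 metadata=1.0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
```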
Note: Remember to adjust the path to the mdadm.conf file if you are running this from within the installer.

Prepare hard drive
Follow the directions outlined in the Installation section until you reach the Prepare Hard Drive section. Skip the first two steps and navigate to the Manually Configure block devices, filesystems and mountpoints page. Remember to only configure the LVs (e.g. /dev/mapper/VolGroupArray-lvhome) and not the actual disks (e.g. /dev/sda1).
Note: mkfs.xfs will not align the chunk size and stripe size for optimum performance (see: Optimum RAID).

Configure system
mkinitcpio.conf
mkinitcpio can use a hook to assemble the arrays on boot. For more information see mkinitcpio Using RAID.
- Add the dm_mod module to the MODULES list in /etc/mkinitcpio.conf.
- Add the mdadm_udev and lvm2 hooks to the HOOKS list in /etc/mkinitcpio.conf after udev.
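After editing /etc/mkinitcpio.conf, regenerate the initramfs image so the new modules and hooks take effect (assuming the stock linux kernel preset):

```
# mkinitcpio -p linux
```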
Conclusion
Once the installation is complete, you can safely reboot your machine:
# reboot
Install the bootloader on the Alternate Boot Drives
Once you have successfully booted your new system for the first time, you will want to install the bootloader onto the other two disks (or on the other disk if you have only 2 HDDs) so that, in the event of disk failure, the system can be booted from any of the remaining drives (e.g. by switching the boot order in the BIOS). The method depends on the bootloader system you're using:
Syslinux
Log in to your new system as root and do:
# /usr/sbin/syslinux-install_update -iam
Syslinux will deal with installing the bootloader to the MBR on each of the members of the RAID array:
Detected RAID on /boot - installing Syslinux with --raid
Syslinux install successful

Attribute Legacy Bios Bootable Set - /dev/sda1
Attribute Legacy Bios Bootable Set - /dev/sdb1
Installed MBR (/usr/lib/syslinux/gptmbr.bin) to /dev/sda
Installed MBR (/usr/lib/syslinux/gptmbr.bin) to /dev/sdb
GRUB legacy
Log in to your new system as root and do:
# grub
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdc
grub> root (hd0,0)
grub> setup (hd0)
grub> quit
Archive your filesystem partition scheme
Now that you are done, it is worth taking a second to archive the partition state of each of your drives. This guarantees that it will be trivially easy to replace/rebuild a disk in the event that one fails. You do this with the sfdisk tool and the following steps:
# mkdir /etc/partitions
# sfdisk --dump /dev/sda >/etc/partitions/disc0.partitions
# sfdisk --dump /dev/sdb >/etc/partitions/disc1.partitions
# sfdisk --dump /dev/sdc >/etc/partitions/disc2.partitions
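Should a drive later fail, the saved dump can be written back to its replacement; a sketch, assuming the new disk takes the place of /dev/sda:

```
# sfdisk /dev/sda < /etc/partitions/disc0.partitions
```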
Management
For further information on how to maintain your software RAID or LVM, review the RAID and LVM articles.
See also
- Setup Arch Linux on top of raid, LVM2 and encrypted partitions by Yannick Loth
- RAID vs. LVM on Stack Overflow
- What is better LVM on RAID or RAID on LVM? on Server Fault
- Managing RAID and LVM with Linux (v0.5) by Gregory Gulik
- Gentoo Linux x86 with Software Raid and LVM2 Quick Install Guide
- 2011-09-08 - Arch Linux - LVM & RAID (1.2 metadata) + SYSLINUX
- 2011-04-20 - Arch Linux - Software RAID and LVM questions
- 2011-03-12 - Arch Linux - Some newbie questions about installation, LVM, grub, RAID