User:AskApache/raid-backup-log

From ArchWiki

RAID Recovery Planning

RAID is often used to keep critical data safe, providing redundancy so that data is not lost if a drive fails. Problems still come up, and when they do it is vital to have everything you need to get the array repaired and back online.

Throughout this page are commands that create logs of information about your system and RAID setup, and backups of important files.

The following settings are used for all the commands on this page.

DEVS="sdb sdc sdd sde sdf sdg"
PARTS="sdb1 sdc1 sdd1 sde1 sdf1 sdg1"
ARRAY="md0"
BKDIR="${HOME}/raid-backup-log-`date +%m-%d-%Y`"
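
PARTS can also be derived from DEVS so the two lists cannot drift apart. This is only a minimal sketch, and it assumes that every array member is the first partition on its disk, as in the settings above.

```shell
# Assumption: every array member is partition 1 of a disk listed in DEVS.
DEVS="sdb sdc sdd sde sdf sdg"
PARTS=""
for d in $DEVS; do
    # ${PARTS:+$PARTS } adds a separating space only once PARTS is non-empty
    PARTS="${PARTS:+$PARTS }${d}1"
done
echo "$PARTS"
```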

Create the backup directory and cd into it.

# mkdir -pv "$BKDIR" && cd "$BKDIR"
Warning: Make sure any/all filesystems on the raid array are NOT mounted. The array should be assembled and active.

Unmount array

# umount /dev/${ARRAY}
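
Before continuing, it may be worth confirming that the array really is unmounted. The sketch below looks for the device in the first column of /proc/mounts. The function name is_mounted is made up for this example, and a filesystem mounted through another device path (for example a device-mapper layer on top of the array) would not be detected.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE appears as a mount source in MOUNTS_FILE
# (defaults to /proc/mounts). Hypothetical helper for this page.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

ARRAY="md0"
if is_mounted "/dev/${ARRAY}"; then
    echo "/dev/${ARRAY} is still mounted"
else
    echo "/dev/${ARRAY} is not mounted"
fi
```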


Backup Partitions and MBR

Backup partition table/mbr for devices

# for d in $DEVS; do dd if=/dev/${d} of=mbr-${d}.bin bs=512 count=1; done
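
A quick sanity check of the saved images can catch a truncated copy. The sketch below assumes each image should be exactly 512 bytes and end with the 0x55 0xAA boot signature. The helper name check_mbr_image is hypothetical.

```shell
# check_mbr_image FILE
# Succeeds if FILE is exactly 512 bytes and ends with the 55 aa signature.
check_mbr_image() {
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -eq 512 ] || return 1
    sig=$(tail -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

for img in mbr-*.bin; do
    [ -e "$img" ] || continue   # no images in this directory
    if check_mbr_image "$img"; then
        echo "$img: ok"
    else
        echo "$img: unexpected size or signature"
    fi
done
```

On GPT-partitioned disks the first sector holds only the protective MBR, so this image alone would not be enough to restore the partition table.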

Backup partition tables in reusable sfdisk form

# for d in $DEVS; do sfdisk -d /dev/${d} > sfdisk-table-${d}.dump; done
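
A dump made with sfdisk -d can later be fed back to sfdisk on standard input to recreate the table. Since that overwrites the partition table, the sketch below only prints the commands for review rather than running them. The helper name print_restore_cmds is made up for this example.

```shell
# print_restore_cmds DISK...
# Print (do not run) the sfdisk command that would restore each dump.
print_restore_cmds() {
    for d in "$@"; do
        echo "sfdisk /dev/${d} < sfdisk-table-${d}.dump"
    done
}

print_restore_cmds sdb sdc
```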


Disk and Partition Info

Create record of disk and partitions. The commands below use /dev/sda as the example device; repeat them for each device in $DEVS.

# blkid -c /dev/null /dev/sda* 2>&1 | tee blkid.log

# lsblk -fmb /dev/sda 2>&1 | tee lsblk.log
# lsblk -fmpibo NAME,FSTYPE,MOUNTPOINT,LABEL,UUID,MODEL,SERIAL,SIZE,STATE,OWNER,GROUP,MODE,SCHED,TYPE,HCTL /dev/sda 2>&1 | tee lsblk-O.log

# fdisk -l -u=sectors /dev/sda* 2>&1 | tee fdisk.log
# fdisk -l -u=cylinders /dev/sda* 2>&1 | tee fdisk-C.log

# partprobe -s /dev/sda 2>&1 | tee partprobe.log

# sfdisk -xRluM /dev/sda 2>&1 | tee sfdisk.log
# sfdisk -xRluS /dev/sda 2>&1 | tee sfdisk-S.log
# sfdisk -xRluC /dev/sda 2>&1 | tee sfdisk-C.log
# sfdisk -xRluB /dev/sda 2>&1 | tee sfdisk-B.log

# cfdisk -Ps /dev/sda 2>&1 | tee cfdisk.log
# cfdisk -Pr /dev/sda 2>&1 | tee cfdisk-r.log
# cfdisk -Pt /dev/sda 2>&1 | tee cfdisk-t.log

# parted /dev/sda print 2>&1 | tee parted.log
# parted -l 2>&1 | tee parted-d.log
# parted -lm 2>&1 | tee parted-m.log

Backup Files

# cp -vf /etc/mdadm.conf etc-mdadm.conf
# cp -vf /etc/fstab etc-fstab
# cp -vf /etc/mkinitcpio.conf etc-mkinitcpio.conf
# dmesg > dmesg.log



RAID Logs

Log running arrays

# mdadm -Dv --scan > mdadm-detail-scan.log

Create record of each partition/device in array

# for p in $PARTS; do mdadm -E /dev/${p} > mdadm-examine-${p}.log; done
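
When an array later refuses to assemble, comparing the event counters of the members is a common first step, since a member with a lower count has probably fallen behind. The sketch below pulls that value out of the saved logs. It assumes the examine output contains a line of the form "Events : N", and the helper name events_of is hypothetical.

```shell
# events_of LOGFILE
# Print the value of the first "Events : N" line in a saved mdadm -E log.
events_of() {
    awk -F' : ' '/^[[:space:]]*Events :/ { print $2; exit }' "$1"
}

for f in mdadm-examine-*.log; do
    [ -e "$f" ] || continue   # no logs in this directory
    printf '%s\t%s\n' "$f" "$(events_of "$f")"
done
```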

Backup raid sysfs info

# for f in `find -H /sys/block/${ARRAY} -type f ! -empty`; do echo -e "[ $f ]"; cat $f; done > sys-block-${ARRAY}.log
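
The same dump can be written without word-splitting the output of find, which is a little more robust if a path ever contains whitespace. This is one possible variant, not the only way to do it, and the function name dump_tree is made up for the example.

```shell
# dump_tree DIR
# Print "[ path ]" followed by the contents of every non-empty file under DIR.
dump_tree() {
    find -H "$1" -type f ! -empty -exec sh -c '
        for f do
            printf "\n[ %s ]\n" "$f"
            cat "$f" 2>/dev/null   # some sysfs attributes are not readable
        done' sh {} +
}

ARRAY="md0"
if [ -d "/sys/block/${ARRAY}" ]; then
    dump_tree "/sys/block/${ARRAY}" > "sys-block-${ARRAY}.log"
fi
```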

Backup raid udev info

# for d in $DEVS; do udevadm info -a -p `udevadm info -q path -n /dev/${d}*` > udev-info-${d}.log; done



Hardware Logs

Create record of system

# command lspci -vvvqq > lspci.log
# command lsusb -vvvv > lsusb.log

Log of hard-drive info

# for d in $DEVS; do hdparm -I /dev/${d} > hdparm-${d}.log 2>&1; done

Log of smartctl info

# for d in $DEVS; do smartctl -a /dev/${d} > smartctl-${d}.log 2>&1; done
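
Redirections are processed left to right, so the position of 2>&1 decides whether error messages end up in the log. The small demonstration below uses a made-up function, noisy, that writes one line to each stream.

```shell
# noisy writes one line to stdout and one to stderr.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

tmp=$(mktemp -d)

# stderr is pointed at the current stdout first, then stdout moves to the file:
# only "to stdout" reaches the log.
noisy 2>&1 > "$tmp/stdout-only.log"

# stdout moves to the file first, then stderr follows it: both lines are logged.
noisy > "$tmp/both.log" 2>&1

wc -l < "$tmp/stdout-only.log"
wc -l < "$tmp/both.log"
```

To capture both streams in a log file, the file redirection has to come before 2>&1.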



File System Logs

Create record of filesystem built on array

# dumpe2fs /dev/${ARRAY} > dumpe2fs-${ARRAY}.log
# tune2fs -l /dev/${ARRAY} > tune2fs-${ARRAY}.log

Create record of mounts

# mount -l|sort|column -t > mounts.log

Create record of filesystem usage

# command df -aTh > df-usage.log



Shell Script

~/raid-backup.sh
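#!/bin/bash
# Expects DEVS, PARTS and ARRAY to be set as in the settings at the top of this page.
# Run as root from inside the backup directory.

## LOG DISK AND PARTITIONS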

# create record of disk and partitions
blkid -c /dev/null > blkid.log
command lsblk -a -f -b -m -t > lsblk.log
fdisk -l -u=sectors > fdisk-S.log
fdisk -l -u=cylinders > fdisk-C.log
sfdisk -R -l -u S > sfdisk-S.log
sfdisk -R -l -u C > sfdisk-C.log
sfdisk -R -l -u B > sfdisk-B.log




## BACKUP FILES

cp -vf /etc/mdadm.conf etc-mdadm.conf
cp -vf /etc/fstab etc-fstab
cp -vf /etc/mkinitcpio.conf etc-mkinitcpio.conf
dmesg > dmesg.log




## LOG MDADM RAID INFO

# log running arrays
mdadm -Dv --scan > mdadm-detail-scan.log

# create record of each partition/device in array
for p in $PARTS; do mdadm -E /dev/${p} > mdadm-examine-${p}.log; done

# backup raid sysfs info
for f in `find -H /sys/block/${ARRAY} -type f ! -empty`; do echo -e "\n[ $f ]"; cat $f; done > sys-block-${ARRAY}.log

# backup raid udev info
for d in $DEVS; do udevadm info -a -p `udevadm info -q path -n /dev/${d}*` > udev-info-${d}.log; done




## LOG HARDWARE INFO

# create record of system
command lspci -vvvqq > lspci.log
command lsusb -vvvv > lsusb.log

# log of hard-drive info
for d in $DEVS; do hdparm -I /dev/${d} > hdparm-${d}.log 2>&1; done

# log of smartctl info
for d in $DEVS; do smartctl -a /dev/${d} > smartctl-${d}.log 2>&1; done




## LOG FILESYSTEM INFO

# create record of filesystem built on array
dumpe2fs /dev/${ARRAY} > dumpe2fs-${ARRAY}.log
tune2fs -l /dev/${ARRAY} > tune2fs-${ARRAY}.log

# create record of mounts
mount -l|sort|column -t > mounts.log

# create record of filesystem usage
command df -aTh > df-usage.log








Speed Boosts

Monitor md:

# tmux split-window -l 8 "watch -t 'cat /proc/mdstat'"

View your current usage with:

# sar -bdpqu -P ALL 60 1

Sync Speed

drivers/md/md.c
/*
  * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit'
  * is 1000 KB/sec, so the extra system load does not show up that much.
  * Increase it if you want to have more _guaranteed_ speed. Note that
  * the RAID driver will use the maximum available bandwidth if the IO
  * subsystem is idle. There is also an 'absolute maximum' reconstruction
  * speed limit - in case reconstruction slows down your system despite
  * idle IO detection.
  *
  * you can change it via /proc/sys/dev/raid/speed_limit_min and _max.
  * or /sys/block/mdX/md/sync_speed_{min,max}
  */
 
 static int sysctl_speed_limit_min = 1000;
 static int sysctl_speed_limit_max = 200000;
/etc/sysctl.conf
dev.raid.speed_limit_min = 5000
dev.raid.speed_limit_max = 400000
# echo -n 5000 > /proc/sys/dev/raid/speed_limit_min;  echo -n 400000 > /proc/sys/dev/raid/speed_limit_max
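
To get a feel for what these limits mean, the sketch below estimates how long a resync would take if it ran at exactly the given rate. It is only rough arithmetic: it assumes the member size and the limit are both in units of 1024 bytes, and the function name resync_seconds is hypothetical. Real resync speed varies with other I/O and with position on the disk.

```shell
# resync_seconds SIZE_K SPEED_K_PER_SEC
# Integer estimate of the seconds needed to process SIZE_K at a fixed rate.
resync_seconds() {
    echo $(( $1 / $2 ))
}

# A member of 1953514584 K (roughly a 2 TB partition, an example value):
resync_seconds 1953514584 5000     # at the raised minimum
resync_seconds 1953514584 400000   # at the raised maximum
```

For this example size, that works out to roughly 108 hours at 5000 and about 1.4 hours at 400000.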

Increase Stripe Cache Size

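Set stripe_cache_size to 2048 entries for /dev/md0 (applies to RAID 4/5/6 arrays). The value counts cache entries rather than bytes; memory use is roughly page size × number of member devices × stripe_cache_size.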

# echo 2048 > /sys/block/md0/md/stripe_cache_size
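
The memory cost of a given stripe_cache_size is commonly estimated as page size × number of member devices × number of entries. The sketch below applies that formula. The function name stripe_cache_bytes is made up, and the six devices and 4096-byte page are example values.

```shell
# stripe_cache_bytes ENTRIES DISKS [PAGE_SIZE]
# Approximate memory used by the stripe cache, in bytes.
stripe_cache_bytes() {
    entries="$1"
    disks="$2"
    page="${3:-$(getconf PAGESIZE)}"   # default to the system page size
    echo $(( entries * disks * page ))
}

stripe_cache_bytes 2048 6 4096   # 50331648 bytes = 48 MiB
```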


Increase Sector readahead

Set read-ahead to 32 MiB for /dev/md0:

# blockdev --setra 65536 /dev/md0
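
blockdev --setra takes its value in 512-byte sectors, which is why 32 MiB becomes 65536. The conversion can be sketched as follows; the function name ra_sectors is hypothetical.

```shell
# ra_sectors MIB
# Convert a read-ahead size in MiB to 512-byte sectors.
ra_sectors() {
    echo $(( $1 * 1024 * 1024 / 512 ))
}

ra_sectors 32   # 65536
```

The result could then be passed along as blockdev --setra "$(ra_sectors 32)" /dev/md0.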

mdadm Options

Help

Any parameter that does not start with '-' is treated as a device name or, for --examine-bitmap, a file name. The first such name is often the name of an md device. Subsequent names are often names of component devices.

Some common options are:

  • --help -h  : General help message or, after above option, mode specific help message
  • --help-options  : This help message
  • --version -V  : Print version information for mdadm
  • --verbose -v  : Be more verbose about what is happening
  • --quiet -q  : Don't print un-necessary messages
  • --brief -b  : Be less verbose, more brief
  • --export -Y  : With --detail, --detail-platform or --examine use key=value format for easy import into environment
  • --force -f  : Override normal checks and be more forceful


  • --assemble -A  : Assemble an array
  • --build -B  : Build an array without metadata
  • --create -C  : Create a new array
  • --detail -D  : Display details of an array
  • --examine -E  : Examine superblock on an array component
  • --examine-bitmap -X: Display the detail of a bitmap file
  • --monitor -F  : monitor (follow) some arrays
  • --grow -G  : resize/reshape an array
  • --incremental -I  : add/remove a single device to/from an array as appropriate
  • --query -Q  : Display general information about how a device relates to the md driver
  • --auto-detect  : Start arrays auto-detected by the kernel
  • --offroot  : Set first character of argv[0] to @ to indicate the application was launched from initrd/initramfs and should not be shutdown by systemd as part of the regular shutdown process.


For create or build:

  • --bitmap= -b  : File to store bitmap in - may pre-exist for --build
  • --chunk= -c  : chunk size in kibibytes
  • --rounding=  : rounding factor for linear array (==chunk size)
  • --level= -l  : raid level: 0,1,4,5,6,10,linear, or mp for create. 0,1,10,mp,faulty or linear for build.
  • --parity= -p  : raid5/6 parity algorithm: {left,right}-{,a}symmetric
  • --layout=  : same as --parity, for RAID10: [fno]NN
  • --raid-devices= -n : number of active devices in array
  • --spare-devices= -x: number of spare (eXtra) devices in initial array
  • --size= -z  : Size (in K) of each drive in RAID1/4/5/6/10 - optional
  • --force -f  : Honour devices as listed on command line. Don't insert a missing drive for RAID5.
  • --assume-clean  : Assume the array is already in-sync. This is dangerous for RAID5.
  • --bitmap-chunk=  : chunksize of bitmap in bitmap file (Kilobytes)
  • --delay= -d  : seconds between bitmap updates
  • --write-behind=  : number of simultaneous write-behind requests to allow (requires bitmap)
  • --name= -N  : Textual name for array - max 32 characters

For assemble:

  • --bitmap= -b  : File to find bitmap information in
  • --uuid= -u  : uuid of array to assemble. Devices which don't have this uuid are excluded
  • --super-minor= -m  : minor number to look for in super-block when choosing devices to use.
  • --name= -N  : Array name to look for in super-block.
  • --config= -c  : config file
  • --scan -s  : scan config file for missing information
  • --force -f  : Assemble the array even if some superblocks appear out-of-date
  • --update= -U  : Update superblock: try '-A --update=?' for list of options.
  • --no-degraded  : Do not start any degraded arrays - default unless --scan.

For detail or examine:

  • --brief -b  : Just print device name and UUID

For follow/monitor:

  • --mail= -m  : Address to mail alerts of failure to
  • --program= -p  : Program to run when an event is detected
  • --alert=  : same as --program
  • --delay= -d  : seconds of delay between polling state. default=60

General management:

  • --add -a  : add, or hotadd subsequent devices
  • --re-add  : re-add a recently removed device
  • --remove -r  : remove subsequent devices
  • --fail -f  : mark subsequent devices as faulty
  • --set-faulty  : same as --fail
  • --run -R  : start a partially built array
  • --stop -S  : deactivate array, releasing all resources
  • --readonly -o  : mark array as readonly
  • --readwrite -w  : mark array as readwrite
  • --zero-superblock  : erase the MD superblock from a device.
  • --wait -W  : wait for recovery/resync/reshape to finish.

Create

# mdadm --create device --chunk=X --level=Y --raid-devices=Z devices

This usage will initialise a new md array, associate some devices with it, and activate the array. In order to create an array with some devices missing, use the special word 'missing' in place of the relevant device name.

Before devices are added, they are checked to see if they already contain raid superblocks or filesystems. They are also checked to see if the variance in device size exceeds 1%. If any discrepancy is found, the user will be prompted for confirmation before the array is created. The presence of a '--run' can override this caution.

If the --size option is given then only that many kilobytes of each device are used, no matter how big each device is. If no --size is given, the apparent size of the smallest drive given is used for raid level 1 and greater, and the full device is used for other levels.
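
As an illustration of the 'missing' keyword, the sketch below builds a create command line in which one slot is left empty. It only prints the command, because running --create on disks that hold data is destructive. The helper name create_cmd and the device names are made up for the example.

```shell
# create_cmd ARRAY LEVEL MEMBER...
# Print an mdadm --create command; 'missing' counts as one of the raid devices.
create_cmd() {
    array="$1"
    level="$2"
    shift 2
    echo "mdadm --create ${array} --level=${level} --raid-devices=$# $*"
}

create_cmd /dev/md0 5 /dev/sdb1 /dev/sdc1 missing
```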

Options that are valid with --create (-C) are:

  • --bitmap=  : Create a bitmap for the array with the given filename, or an internal bitmap if 'internal' is given
  • --chunk= -c  : chunk size in kibibytes
  • --rounding=  : rounding factor for linear array (==chunk size)
  • --level= -l  : raid level: 0,1,4,5,6,10,linear,multipath and synonyms
  • --parity= -p  : raid5/6 parity algorithm: {left,right}-{,a}symmetric
  • --layout=  : same as --parity, for RAID10: [fno]NN
  • --raid-devices= -n : number of active devices in array
  • --spare-devices= -x: number of spare (eXtra) devices in initial array
  • --size= -z  : Size (in K) of each drive in RAID1/4/5/6/10 - optional
  • --force -f  : Honour devices as listed on command line. Don't insert a missing drive for RAID5.
  • --run -R  : insist on running the array even if not all devices are present or some look odd.
  • --readonly -o  : start the array readonly - not supported yet.
  • --name= -N  : Textual name for array - max 32 characters
  • --bitmap-chunk=  : bitmap chunksize in Kilobytes.
  • --delay= -d  : bitmap update delay in seconds.


Build

# mdadm --build device --chunk=X --level=Y --raid-devices=Z devices

This usage is similar to --create. The difference is that it creates a legacy array without a superblock. With these arrays there is no difference between initially creating the array and subsequently assembling the array, except that hopefully there is useful data there in the second case.

The level may only be 0, 1, 10, linear, multipath, or faulty. All devices must be listed and the array will be started once complete. Options that are valid with --build (-B) are:

  • --bitmap=  : file to store/find bitmap information in.
  • --chunk= -c  : chunk size in kibibytes
  • --rounding=  : rounding factor for linear array (==chunk size)
  • --level= -l  : 0, 1, 10, linear, multipath, faulty
  • --raid-devices= -n : number of active devices in array
  • --bitmap-chunk=  : bitmap chunksize in Kilobytes.
  • --delay= -d  : bitmap update delay in seconds.

Assemble

# mdadm --assemble device options...
  mdadm --assemble --scan options...

This usage assembles one or more raid arrays from pre-existing components. For each array, mdadm needs to know the md device, the identity of the array, and a number of sub devices. These can be found in a number of ways.

The md device is given on the command line, is found listed in the config file, or can be deduced from the array identity. The array identity is determined either from the --uuid, --name, or --super-minor commandline arguments, from the config file, or from the first component device on the command line.

The different combinations of these are as follows: If the --scan option is not given, then only devices and identities listed on the command line are considered. The first device will be the array device, and the remainder will be examined when looking for components. If an explicit identity is given with --uuid or --super-minor, then only devices with a superblock which matches that identity are considered, otherwise every device listed is considered.

If the --scan option is given, and no devices are listed, then every array listed in the config file is considered for assembly. The identity of candidate devices is determined from the config file. After these arrays are assembled, mdadm will look for other devices that could form further arrays and try to assemble them. This can be disabled using the 'AUTO' option in the config file.

If the --scan option is given as well as one or more devices, then those devices are md devices that are to be assembled. Their identity and components are determined from the config file.

If mdadm cannot find all of the components for an array, it will assemble it but not activate it unless --run or --scan is given. To preserve this behaviour even with --scan, add --no-degraded. Note that "all of the components" means as many as were present the last time the array was running as recorded in the superblock. If the array was already degraded, and the missing device is not a new problem, it will still be assembled. It is only newly missing devices that cause the array not to be started.

Options that are valid with --assemble (-A) are:

  • --bitmap=  : bitmap file to use with the array
  • --uuid= -u  : uuid of array to assemble. Devices which don't have this uuid are excluded
  • --super-minor= -m  : minor number to look for in super-block when choosing devices to use.
  • --name= -N  : Array name to look for in super-block.
  • --config= -c  : config file
  • --scan -s  : scan config file for missing information
  • --run -R  : Try to start the array even if not enough devices for a full array are present
  • --force -f  : Assemble the array even if some superblocks appear out-of-date. This involves modifying the superblocks.
  • --update= -U  : Update superblock: try '-A --update=?' for option list.
  • --no-degraded  : Assemble but do not start degraded arrays.
  • --readonly -o  : Mark the array as read-only. No resync will start.

Manage

# mdadm arraydevice options component devices...

This usage is for managing the component devices within an array. The --manage option is not needed and is assumed if the first argument is a device name or a management option. The first device listed will be taken to be an md array device, any subsequent devices are (potential) components of that array.

Options that are valid with management mode are:

  • --add -a  : hotadd subsequent devices to the array
  • --re-add  : subsequent devices are re-added if they were recent members of the array
  • --remove -r  : remove subsequent devices, which must not be active
  • --fail -f  : mark subsequent devices as faulty
  • --set-faulty  : same as --fail
  • --run -R  : start a partially built array
  • --stop -S  : deactivate array, releasing all resources
  • --readonly -o  : mark array as readonly
  • --readwrite -w  : mark array as readwrite
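
A typical use of management mode is swapping out a member: mark it faulty, remove it, then add the replacement. The sketch below only prints the two commands so they can be reviewed first. The helper name replace_cmds and the device names are hypothetical.

```shell
# replace_cmds ARRAY OLD_MEMBER NEW_MEMBER
# Print the commands that fail and remove OLD_MEMBER, then add NEW_MEMBER.
replace_cmds() {
    array="$1"
    old="$2"
    new="$3"
    echo "mdadm ${array} --fail ${old} --remove ${old}"
    echo "mdadm ${array} --add ${new}"
}

replace_cmds /dev/md0 /dev/sdb1 /dev/sdh1
```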

Misc

# mdadm misc_option  devices...

This usage is for performing some task on one or more devices, which may be arrays or components, depending on the task. The --misc option is not needed (though it is allowed) and is assumed if the first argument is a misc option.

Options that are valid with the miscellaneous mode are:

  • --query -Q  : Display general information about how a device relates to the md driver
  • --detail -D  : Display details of an array
  • --detail-platform  : Display hardware/firmware details
  • --examine -E  : Examine superblock on an array component
  • --examine-bitmap -X: Display contents of a bitmap file
  • --zero-superblock  : erase the MD superblock from a device.
  • --run -R  : start a partially built array
  • --stop -S  : deactivate array, releasing all resources
  • --readonly -o  : mark array as readonly
  • --readwrite -w  : mark array as readwrite
  • --test -t  : exit status 0 if ok, 1 if degraded, 2 if dead, 4 if missing
  • --wait -W  : wait for resync/rebuild/recovery to finish
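
The exit status from --test lends itself to scripting. The sketch below maps the status values listed above to a word; the function name describe_status is made up for the example.

```shell
# describe_status CODE
# Translate the exit status of "mdadm --detail --test" into a word.
describe_status() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "degraded" ;;
        2) echo "dead" ;;
        4) echo "missing" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Possible use on a real system (run as root):
#   mdadm --detail --test /dev/md0 > /dev/null; describe_status $?
describe_status 1
```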

Monitor

# mdadm --monitor options devices

This usage causes mdadm to monitor a number of md arrays by periodically polling their status and acting on any changes. If any devices are listed then those devices are monitored, otherwise all devices listed in the config file are monitored. The address for mailing advisories to, and the program to handle each change, can be specified in the config file or on the command line. There must be at least one destination for advisories, whether an email address, a program, or --syslog.

Options that are valid with the monitor (-F --follow) mode are:

  • --mail= -m  : Address to mail alerts of failure to
  • --program= -p  : Program to run when an event is detected
  • --alert=  : same as --program
  • --syslog -y  : Report alerts via syslog
  • --increment= -r  : Report RebuildNN events in the given increment. default=20
  • --delay= -d  : seconds of delay between polling state. default=60
  • --config= -c  : specify a different config file
  • --scan -s  : find mail-address/program in config file
  • --daemonise -f  : Fork and continue in child, parent exits
  • --pid-file= -i  : In daemon mode write pid to specified file instead of stdout
  • --oneshot -1  : Check for degraded arrays, then exit
  • --test -t  : Generate a TestMessage event against each array at startup

Grow

# mdadm --grow device options

This usage causes mdadm to attempt to reconfigure a running array. This is only possible if the kernel being used supports a particular reconfiguration.

Options that are valid with the grow (-G --grow) mode are:

  • --level= -l  : Tell mdadm what level to convert the array to.
  • --layout= -p  : For a FAULTY array, set/change the error mode. for other arrays, update the layout
  • --size= -z  : Change the active size of devices in an array. This is useful if all devices have been replaced with larger devices. Value is in Kilobytes, or the special word 'max' meaning 'as large as possible'.
  • --assume-clean  : When increasing the --size, this flag will avoid a resync of the new space
  • --chunk= -c  : Change the chunksize of the array
  • --raid-devices= -n  : Change the number of active devices in an array.
  • --add= -a  : Add listed devices as part of reshape. This is needed for resizing a RAID0 which cannot have spares already present.
  • --bitmap= -b  : Add or remove a write-intent bitmap.
  • --backup-file= file : A file on a different device to store data for a short time while increasing raid-devices on a RAID4/5/6 array. Also needed throughout a reshape when changing parameters other than raid-devices
  • --array-size= -Z  : Change visible size of array. This does not change any data on the device, and is not stable across restarts.

Incremental

# mdadm --incremental [-Rqrsf] device

This usage allows for incremental assembly of md arrays. Devices can be added one at a time as they are discovered. Once an array has all expected devices, it will be started.

Optionally, the process can be reversed by using the fail option. When fail mode is invoked, mdadm will see if the device belongs to an array and then both fail (if needed) and remove the device from that array.

Options that are valid with incremental assembly (-I --incremental) are:

  • --run -R : Run arrays as soon as a minimal number of devices are present rather than waiting for all expected.
  • --quiet -q : Don't print any information messages, just errors.
  • --rebuild-map -r : Rebuild the 'map' file that mdadm uses for tracking partial arrays.
  • --scan -s : Use with -R to start any arrays that have the minimal required number of devices, but are not yet started.
  • --fail -f : First fail (if needed) and then remove device from any array that it is a member of.



More Info