Difference between revisions of "Kdump"

From ArchWiki
m (error corrected, fixed style)
m (add ja link)
 
(12 intermediate revisions by 8 users not shown)
Line 1: Line 1:
 
[[Category:Boot process]]
 
[[Category:Boot process]]
 
[[Category:Kernel]]
 
[[Category:Kernel]]
{{Article summary start}}
+
[[ja:Kdump]]
{{Article summary text|Covers how to setup kdump.}}
+
{{Related articles start}}
{{Article summary heading|Related}}
+
{{Related|Kexec}}
{{Article summary wiki|Kexec}}
+
{{Related articles end}}
{{Article summary end}}
+
  
 
[https://www.kernel.org/doc/Documentation/kdump/kdump.txt Kdump] is a standard Linux mechanism to dump machine memory content on kernel crash. Kdump is based on [[Kexec]]. Kdump utilizes two kernels: system kernel and dump capture kernel. System kernel is a normal kernel that is booted with special kdump-specific flags. We need to tell the system kernel to reserve some amount of physical memory where dump-capture kernel will be loaded. We need to load the dump capture kernel in advance because at the moment crash happens there is no way to read any data from disk because kernel is broken.
 
[https://www.kernel.org/doc/Documentation/kdump/kdump.txt Kdump] is a standard Linux mechanism to dump machine memory content on kernel crash. Kdump is based on [[Kexec]]. Kdump utilizes two kernels: system kernel and dump capture kernel. System kernel is a normal kernel that is booted with special kdump-specific flags. We need to tell the system kernel to reserve some amount of physical memory where dump-capture kernel will be loaded. We need to load the dump capture kernel in advance because at the moment crash happens there is no way to read any data from disk because kernel is broken.
  
Once kernel crash happens the kernel crash handler uses [[Kexec]] mechanism to boot dump capture kernel. Please note that memory with system kernel is untouched and accessible from dump capture kernel as seen at the moment of crash. Once dump capture kernel is booted user can use /dev/vmcore file to get access to memory of crashed system kernel. The dump can be saved to disk or copied over network to some other machine for further investigation.
+
Once kernel crash happens the kernel crash handler uses [[Kexec]] mechanism to boot dump capture kernel. Please note that memory with system kernel is untouched and accessible from dump capture kernel as seen at the moment of crash. Once dump capture kernel is booted, the user can use the file {{ic|/proc/vmcore}} to get access to memory of crashed system kernel. The dump can be saved to disk or copied over network to some other machine for further investigation.
  
In real production environments system and dump capture kernel will be different - system kernel needs a lot of features and compiled with a many kernel flags/drivers. While dump capture kernel goal is to be minimalistic and take as small amount of memory as possible, e.g. dump capture kernel can be compiled without network support if we store memory dump to disk only. But in this article we will simplify things and use the same kernel both as system and dump capture one. In means we will load the same kernel code twice - one as normal system kernel, another one to reserved memory area.
+
In real production environments system and dump capture kernel will be different - system kernel needs a lot of features and compiled with a many kernel flags/drivers. While dump capture kernel goal is to be minimalistic and take as small amount of memory as possible, e.g. dump capture kernel can be compiled without network support if we store memory dump to disk only. But in this article we will simplify things and use the same kernel both as system and dump capture one. It means we will load the same kernel code twice - one as normal system kernel, another one to reserved memory area.
  
 
== Compiling kernel ==
 
== Compiling kernel ==
  
System/dump capture kernel requires some configuration flags that are not set by default. Please consult [[Kernels#Compilation|Kernel Compilation]] article for more information about compiling custom kernel in Arch. Here we will emphasize on Kdump specific configuration.
+
System/dump capture kernel requires some configuration flags that are not set by default. Please consult [[Kernel Compilation]] article for more information about compiling custom kernel in Arch. Here we will emphasize on Kdump specific configuration.
  
 
To create a kernel you need to edit kernel config (or config.x86_64) file and enable following configuration options:
 
To create a kernel you need to edit kernel config (or config.x86_64) file and enable following configuration options:
Line 31: Line 30:
 
== Setup kdump kernel ==
 
== Setup kdump kernel ==
  
First you need to reserve memory for dump capture kernel. Edit you bootloader config file and add ''crashkernel=64M'' boot option to the system kernel you just installed. For example [[Syslinux]] boot entry would look like:
+
First you need to reserve memory for dump capture kernel. Edit you bootloader configuration and add {{ic|1=crashkernel=64M}} boot option to the system kernel you just installed. For example [[Syslinux]] boot entry would look like:
  
 
{{hc|/boot/syslinux/syslinux.cfg|<nowiki>
 
{{hc|/boot/syslinux/syslinux.cfg|<nowiki>
Line 43: Line 42:
 
64M of memory should be enough to hadle crash dumps on machines with up to 12G of RAM. Some systems require more reserved memory. In case if dump capture kernel unable not load try to increase the memory to ''256M'' or even to ''512M'', but note that this memory is unavailable to system kernel.
 
64M of memory should be enough to hadle crash dumps on machines with up to 12G of RAM. Some systems require more reserved memory. In case if dump capture kernel unable not load try to increase the memory to ''256M'' or even to ''512M'', but note that this memory is unavailable to system kernel.
  
Reboot into your system kernel.  To make sure that the kernel is booted with correct options please check ''/proc/cmdline'' file.
+
Reboot into your system kernel.  To make sure that the kernel is booted with correct options please check the {{ic|/proc/cmdline}} file.
  
Next you need to tell [[Kexec]] that you want to use your dump capture kernel. Specify your kernel, initramfs file, device for root fs and other parameters if needed.
+
Next you need to tell [[Kexec]] that you want to use your dump capture kernel. Specify your kernel, initramfs file, root device and other parameters if needed:
  
 
  # kexec -p [/boot/vmlinuz-linux-kdump] --initrd=[/boot/initramfs-linux-kdump.img] --append="root=[root-device] single irqpoll maxcpus=1 reset_devices"
 
  # kexec -p [/boot/vmlinuz-linux-kdump] --initrd=[/boot/initramfs-linux-kdump.img] --append="root=[root-device] single irqpoll maxcpus=1 reset_devices"
  
It loads the kernel into reserved area. Without '''-p''' flag ''kexec'' would boot the kernel right away, but in presence of the flag kernel will be loaded into reserved memory but boot postponed until crash.
+
It loads the kernel into the reserved area. Without the {{ic|-p}} flag ''kexec'' would boot the kernel right away, but in presence of the flag kernel will be loaded into reserved memory but boot postponed until a crash.  
  
Instead of runnig kexec manually you might want to setup [[Systemd]] service that will run kexec on boot:
+
{{Note|For a loaded kernel {{ic|cat /sys/devices/system/cpu/online}} shows the active CPU cores. The {{ic|1=maxcpus=1}} kernel parameter should [https://www.kernel.org/doc/Documentation/cpu-hotplug.txt limit] it to one. If it has no effect or your SMP-enabled kernel [https://bbs.archlinux.org/viewtopic.php?pid&#61;1424049#p1424049 does not boot], try using {{ic|1=nr_cpus=1}} instead.}}
 +
 
 +
Instead of running ''kexec'' manually you might want to setup [[Systemd]] service that will run kexec on boot:
  
 
{{hc|/etc/systemd/system/kdump.service|<nowiki>
 
{{hc|/etc/systemd/system/kdump.service|<nowiki>
Line 81: Line 82:
 
== Dump crashed kernel ==
 
== Dump crashed kernel ==
  
Once booted into dump capture kernel you can read {{ic|/dev/vmcore}} file. It is recommended to dump core to a file and analyze it later.
+
Once booted into dump capture kernel you can read {{ic|/proc/vmcore}} file. It is recommended to dump core to a file and analyze it later.
 
  # cp /proc/vmcore /root/crash.dump
 
  # cp /proc/vmcore /root/crash.dump
  
Line 88: Line 89:
 
== Analyzing core dump ==
 
== Analyzing core dump ==
  
You can use either ''gdb'' tool or special gdb extension called [https://aur.archlinux.org/packages/crash/ crash] that can be found in AUR. Run ''crash'' as
+
You can use either ''gdb'' tool or special gdb extension called {{Pkg|crash}}. Run ''crash'' as
 
  $ crash ''vmlinux'' ''path''/crash.dump
 
  $ crash ''vmlinux'' ''path''/crash.dump
 
Where ''vmlinux'' previously saved kernel binary with debug symbols.
 
Where ''vmlinux'' previously saved kernel binary with debug symbols.
Line 94: Line 95:
 
Follow ''man crash'' or http://people.redhat.com/~anderson/crash_whitepaper/ for more information about debugging practices.
 
Follow ''man crash'' or http://people.redhat.com/~anderson/crash_whitepaper/ for more information about debugging practices.
  
== Additional information ==  
+
== Additional information ==
  
 
* https://www.kernel.org/doc/Documentation/kdump/kdump.txt - Official kdump documentation
 
* https://www.kernel.org/doc/Documentation/kdump/kdump.txt - Official kdump documentation
 
* http://www.dedoimedo.com/computers/www.dedoimedo.com-crash-book.pdf - The crash book
 
* http://www.dedoimedo.com/computers/www.dedoimedo.com-crash-book.pdf - The crash book

Latest revision as of 16:30, 16 March 2016

Related articles: Kexec

Kdump is a standard Linux mechanism to dump machine memory content on kernel crash. Kdump is based on Kexec. Kdump utilizes two kernels: the system kernel and the dump capture kernel. The system kernel is a normal kernel that is booted with special kdump-specific flags. We need to tell the system kernel to reserve some amount of physical memory where the dump capture kernel will be loaded. The dump capture kernel must be loaded in advance because, at the moment a crash happens, the broken kernel can no longer be relied on to read any data from disk.

Once a kernel crash happens, the kernel crash handler uses the Kexec mechanism to boot the dump capture kernel. Note that the memory of the system kernel is left untouched and is accessible from the dump capture kernel as it was at the moment of the crash. Once the dump capture kernel is booted, the user can use the file /proc/vmcore to access the memory of the crashed system kernel. The dump can be saved to disk or copied over the network to another machine for further investigation.

In real production environments the system kernel and the dump capture kernel will be different: the system kernel needs a lot of features and is compiled with many kernel flags and drivers, while the goal of the dump capture kernel is to be minimalistic and take as little memory as possible. For example, the dump capture kernel can be compiled without network support if the memory dump is stored to disk only. In this article we simplify things and use the same kernel both as the system kernel and as the dump capture kernel. This means we load the same kernel code twice: once as the normal system kernel, and once into the reserved memory area.

Compiling kernel

The system/dump capture kernel requires some configuration flags that are not set by default. Please consult the Kernel Compilation article for more information about compiling a custom kernel in Arch. Here we focus on the Kdump-specific configuration.

To create the kernel, edit the kernel config (or config.x86_64) file and enable the following configuration options:

config{.x86_64} file
CONFIG_DEBUG_INFO=y
CONFIG_CRASH_DUMP=y
CONFIG_PROC_VMCORE=y
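
Before building, it can help to confirm that all three options are really set in the config file. The following is only a minimal sketch: the function name check_kdump_config is hypothetical, and it assumes the options appear in the usual OPTION=y form.

```shell
#!/bin/sh
# Report which kdump-related options are enabled in a kernel config file.
# check_kdump_config is a hypothetical helper, not part of any package.
# Usage: check_kdump_config PATH_TO_CONFIG
check_kdump_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_DEBUG_INFO CONFIG_CRASH_DUMP CONFIG_PROC_VMCORE; do
        # Match only lines of the exact form OPTION=y
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: MISSING"
            missing=1
        fi
    done
    # Non-zero exit status if at least one option is missing
    return "$missing"
}
```

For example, check_kdump_config config.x86_64 could be run in the directory that holds the kernel package sources.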

Also change the package base name to something like linux-kdump to distinguish the kernel from the default Arch one. Compile the kernel package and install it. Save the uncompressed system kernel binary ./src/linux-X.Y/vmlinux: it contains debug symbols, and you will need them later when analyzing the crash.

If you use separate kernels for the system and for dump capture, it is recommended to consult the Kdump documentation. It has several recommendations on how to make the dump capture kernel smaller.

Setup kdump kernel

First you need to reserve memory for the dump capture kernel. Edit your bootloader configuration and add the crashkernel=64M boot option to the system kernel you just installed. For example, a Syslinux boot entry would look like:

/boot/syslinux/syslinux.cfg
LABEL arch-kdump
        MENU LABEL Arch Linux Kdump
        LINUX ../vmlinuz-linux-kdump
        APPEND root=/dev/sda1 crashkernel=64M
        INITRD ../initramfs-linux-kdump.img

64M of memory should be enough to handle crash dumps on machines with up to 12G of RAM. Some systems require more reserved memory. If the dump capture kernel fails to load, try increasing the reservation to 256M or even 512M, but note that this memory is unavailable to the system kernel.

Reboot into your system kernel. To make sure that the kernel was booted with the correct options, check the /proc/cmdline file.
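
As an illustration of that check, the sketch below extracts the crashkernel= value from a kernel command line. The function name crashkernel_param is hypothetical. The file to read is passed as an optional argument so the function can also be tried on a saved copy of the command line.

```shell
#!/bin/sh
# Print the value of the crashkernel= parameter, or nothing if it is absent.
# crashkernel_param is a hypothetical helper name.
# Usage: crashkernel_param [CMDLINE_FILE]   (defaults to /proc/cmdline)
crashkernel_param() {
    cmdline_file=${1:-/proc/cmdline}
    # Put each parameter on its own line, then keep only the crashkernel value
    tr ' ' '\n' < "$cmdline_file" | sed -n 's/^crashkernel=//p'
}
```

An empty result after the reboot suggests that the boot entry was not updated or that a different entry was booted.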

Next you need to tell Kexec that you want to use your dump capture kernel. Specify your kernel, initramfs file, root device and other parameters if needed:

# kexec -p [/boot/vmlinuz-linux-kdump] --initrd=[/boot/initramfs-linux-kdump.img] --append="root=[root-device] single irqpoll maxcpus=1 reset_devices"

It loads the kernel into the reserved area. Without the -p flag kexec would boot the kernel right away; with the flag, the kernel is loaded into the reserved memory and its boot is postponed until a crash.

Note: For a loaded kernel cat /sys/devices/system/cpu/online shows the active CPU cores. The maxcpus=1 kernel parameter should limit it to one. If it has no effect or your SMP-enabled kernel does not boot, try using nr_cpus=1 instead.

Instead of running kexec manually you might want to set up a systemd service that runs kexec on boot:

/etc/systemd/system/kdump.service
[Unit]
Description=Load dump capture kernel
After=local-fs.target

[Service]
ExecStart=/usr/bin/kexec -p [/boot/vmlinuz-linux-kdump] --initrd=[/boot/initramfs-linux-kdump.img] --append="root=[root-device] single irqpoll maxcpus=1 reset_devices"
Type=oneshot

[Install]
WantedBy=multi-user.target

Then enable the service:

# systemctl enable kdump

To check whether the crash kernel is already loaded, run the following command:

$ cat /sys/kernel/kexec_crash_loaded
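
The file contains 1 when a crash kernel is loaded and 0 otherwise. A small wrapper around the check above could look like the following sketch, where the function name crash_kernel_status is hypothetical and the path is an optional argument:

```shell
#!/bin/sh
# Report whether a crash (dump capture) kernel is currently loaded.
# crash_kernel_status is a hypothetical helper name.
# Usage: crash_kernel_status [STATE_FILE]
crash_kernel_status() {
    state_file=${1:-/sys/kernel/kexec_crash_loaded}
    # The kernel exposes 1 when a crash kernel is loaded, 0 otherwise
    if [ "$(cat "$state_file" 2>/dev/null)" = "1" ]; then
        echo "crash kernel loaded"
    else
        echo "crash kernel NOT loaded"
        return 1
    fi
}
```

The non-zero exit status in the unloaded case makes the function usable in scripts, for example crash_kernel_status || echo "check the kdump service".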

Testing crash

If you want to test a crash, you can use sysrq for this.
Warning: A kernel crash may corrupt data on your disks, run it at your own risk!
# echo c > /proc/sysrq-trigger

Once the crash happens, the kernel boots your preloaded dump capture kernel.
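
Triggering a crash without a loaded dump capture kernel only hangs or reboots the machine without producing a dump. The sketch below guards the sysrq trigger with the kexec_crash_loaded check. The function name trigger_test_crash is hypothetical, both paths are optional arguments, and calling sync first is an added precaution rather than something kdump requires. The warning above still applies.

```shell
#!/bin/sh
# Trigger a test crash only if a crash kernel is loaded.
# trigger_test_crash is a hypothetical helper name; run it as root.
# Usage: trigger_test_crash [STATE_FILE] [TRIGGER_FILE]
trigger_test_crash() {
    state_file=${1:-/sys/kernel/kexec_crash_loaded}
    trigger_file=${2:-/proc/sysrq-trigger}
    if [ "$(cat "$state_file" 2>/dev/null)" != "1" ]; then
        echo "no crash kernel loaded, refusing to crash" >&2
        return 1
    fi
    # Flush pending writes to reduce the risk of data loss
    sync
    # Writing 'c' to the sysrq trigger crashes the running kernel
    echo c > "$trigger_file"
}
```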

Dump crashed kernel

Once booted into the dump capture kernel you can read the /proc/vmcore file. It is recommended to dump the core to a file and analyze it later.

# cp /proc/vmcore /root/crash.dump

Optionally, you can copy the crash dump to another machine. Once the dump is saved, reboot the machine into the normal system kernel.
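
If several crashes are captured over time, a fixed file name would be overwritten on each run. The following sketch saves the dump under a timestamped name instead. The function name save_vmcore and the crash-DATE-TIME.dump naming scheme are assumptions made for illustration.

```shell
#!/bin/sh
# Copy the crash dump to a timestamped file and print the resulting path.
# save_vmcore is a hypothetical helper name.
# Usage: save_vmcore [SOURCE] [DEST_DIR]
save_vmcore() {
    src=${1:-/proc/vmcore}
    dest_dir=${2:-/root}
    # Example result: /root/crash-20160316-163000.dump
    dest="$dest_dir/crash-$(date +%Y%m%d-%H%M%S).dump"
    cp "$src" "$dest" && echo "$dest"
}
```

The destination directory needs enough free space, since the dump can be as large as the RAM of the crashed system.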

Analyzing core dump

You can use either the gdb tool or a special gdb extension called crash. Run crash as:

$ crash vmlinux path/crash.dump

where vmlinux is the previously saved kernel binary with debug symbols.

See man crash or http://people.redhat.com/~anderson/crash_whitepaper/ for more information about debugging practices.

Additional information