Talk:Kdump

From ArchWiki
Latest comment: 25 December 2021 by Archiebunkerlinux in topic update Kdump services

update Kdump services

I recently got an up-to-date variant of the kdump systemd + GRUB setup working, and I propose editing the sections on the systemd services to account for the necessary changes. Most importantly:

makedumpfile does not appear to support kernels newer than about 5.11, and since Arch is at 5.15 the services as written are obsolete. I don't personally know of another tool that can compress crash dump files the way makedumpfile does, but a common strategy is to use vmcore-dmesg from kexec-tools to extract just the dmesg log if you don't want to store the entire memory dump (which can be huge, e.g. on servers).

I set mine up as a kind of replacement for the diskdump strategy, so that you can save logs to a removable USB drive. That, plus some way of replicating the old netdump strategies, would be a nice addition to the page. I don't mind contributing what I can here. Archiebunkerlinux (talk) 06:01, 25 December 2021 (UTC)
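
A minimal sketch of the vmcore-dmesg approach described above, not a tested replacement for the page's services. It assumes it runs inside the capture kernel, that vmcore-dmesg accepts the vmcore path as its argument and writes the log to standard output, and that the output directory (a hypothetical /var/crash by default, or a mounted USB drive) is already available. The function name is made up for illustration.

```shell
#!/bin/sh
# save_crash_dmesg [VMCORE] [OUTDIR]
# Writes only the kernel log of the crashed kernel to OUTDIR and prints the
# resulting file name. Both defaults are assumptions; adjust to your setup.
save_crash_dmesg() {
    vmcore="${1:-/proc/vmcore}"
    outdir="${2:-/var/crash}"   # hypothetical target, e.g. a mounted USB drive

    # /proc/vmcore only exists in the capture kernel after a crash.
    if [ ! -r "$vmcore" ]; then
        echo "no readable vmcore at $vmcore" >&2
        return 1
    fi

    mkdir -p "$outdir" || return 1
    out="$outdir/dmesg-$(date +%Y%m%d-%H%M%S).txt"

    # Extract the log, flush it to disk, then report where it went.
    vmcore-dmesg "$vmcore" > "$out" && sync && echo "$out"
}
```

Calling `save_crash_dmesg /proc/vmcore /mnt/usb` (with /mnt/usb standing in for wherever the drive is mounted) would keep the log and skip the full memory dump.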

Just to be clear, is the dependency on kexec only to install kexec-tools?

In other words, the systemd setup on the kexec page is there for things like quick reboots, but is not needed for kdump to work. If one just wants to do kdump, one may skip the kexec systemd stuff. Do I have that right? SanjeevKSharma (talk) 17:20, 27 October 2013 (UTC)

how are others setting up automatic crash dump (no user intervention) then reboot?

For cases where the video is messed up, it would be great to just dump the crashed kernel and hard reboot with no keyboard or terminal interaction. That way, if kexec, kdump and systemd are all working but video/console corruption is hiding the output, one still gets a dump file.

Here are two setups that do this, but they don't specify exactly how. Maybe by setting off a shell script from the boot parameters in GRUB or Syslinux?

https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes

http://doc.opensuse.org/products/draft/SLES/SLES-tuning_sd_draft/cha.tuning.kexec.html

How could we do this on Arch? SanjeevKSharma (talk) 03:06, 28 October 2013 (UTC)
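
One possible shape for an unattended dump-then-reboot step, as a sketch only. It assumes the script is started automatically in the capture kernel (for example from a oneshot service), that copying /proc/vmcore with cp is acceptable for the dump size, and that `systemctl reboot --force` is a suitable reboot command. The function name and the /var/crash directory are hypothetical.

```shell
#!/bin/sh
# capture_and_reboot VMCORE OUTDIR [REBOOT_CMD...]
# Copies the dump if one exists, then runs the reboot command.
capture_and_reboot() {
    vmcore="$1"
    outdir="$2"
    shift 2

    # No vmcore means this is a normal boot, not the capture kernel.
    [ -e "$vmcore" ] || return 0

    mkdir -p "$outdir" || return 1
    dest="$outdir/vmcore-$(date +%Y%m%d-%H%M%S)"
    cp "$vmcore" "$dest" || return 1
    sync

    # The default reboot command is an assumption; pass another one as
    # extra arguments if needed.
    [ "$#" -gt 0 ] || set -- systemctl reboot --force
    "$@"
}
```

It would be invoked as `capture_and_reboot /proc/vmcore /var/crash`. Because nothing waits for input, a corrupted console would not block the dump.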

Intel iommu can cause problems

When I manually crashed the machine, it just sat there for 5 minutes. The second time I tried, I took a shower, and when I came back the stack trace screen was still there. I found reports of Intel IOMMU [0] causing issues on one of Red Hat's kdump pages and on a kernel mailing list.

I added " intel_iommu=off" to "--append="root=/dev/s..." in kdump.service

The capture kernel now comes up after manually panicking. SanjeevKSharma (talk) 03:03, 28 October 2013 (UTC)

[0] The reports describe it as affecting only Intel IOMMU, not AMD IOMMU.
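
A small helper sketch for building the string passed to `--append` so that `intel_iommu=off` is added only once. The function name is made up for illustration, and `root=/dev/sdX1` in the usage note is a placeholder, not the value from the original service file.

```shell
#!/bin/sh
# add_kernel_opt CMDLINE OPT
# Prints CMDLINE with OPT appended, unless OPT is already one of its words.
add_kernel_opt() {
    case " $1 " in
        *" $2 "*) printf '%s\n' "$1" ;;      # already present, leave as is
        *)        printf '%s\n' "$1 $2" ;;   # append the option
    esac
}
```

For example, `add_kernel_opt "root=/dev/sdX1" intel_iommu=off` prints `root=/dev/sdX1 intel_iommu=off`, and running it again on that result leaves it unchanged.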

No crash dump showing up

I manually panic my machine: the stack trace screen shows up, then the new kernel comes up ... and there's no /proc/vmcore.

I grep'd the kernel source tree I compiled from

grep -r CONFIG_CRASH_ *

src/linux-3.11/config.x86_64:CONFIG_CRASH_DUMP=y


CONFIG_DEBUG_INFO is in there too, so I assume the kernel was built with those CONFIG_ options.

Is there a way to be completely sure those options made it into the actual kernel binary that got built? SanjeevKSharma (talk) 03:09, 28 October 2013 (UTC)

Update: I had placed the echoing of the 3 kernel config options close to the bottom of build() in the PKGBUILD, so the options didn't take effect. Place that echo at or near the top of build(). There's a tool in the kernel source, extract-ikconfig, that shows the options. Run it against the compiled image or against the installed, compressed image:

/usr/src/linux-3.11.5-1-ARCH/scripts/extract-ikconfig /home/sam/BAK/KERNEL/core/linux/src/linux-3.11/vmlinux

/usr/src/linux-3.11.5-1-ARCH/scripts/extract-ikconfig /boot/vmlinuz-linux-withKDUMP

SanjeevKSharma (talk) 00:07, 29 October 2013 (UTC)

And there's the file /proc/config.gz, which is just the config file by itself for the currently running kernel (it is only present when the kernel was built with CONFIG_IKCONFIG_PROC).
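
A sketch of checking the options mentioned on this page against a config file, either /proc/config.gz or the plain-text output of extract-ikconfig. The function name is made up, and the option list in the usage note is only the set named in these threads, not a complete list of what kdump needs.

```shell
#!/bin/sh
# check_kconfig CONFIG_FILE OPTION...
# Reports whether each OPTION is set to =y in CONFIG_FILE.
# Returns 1 if any option is not enabled.
check_kconfig() {
    cfg="$1"
    shift
    missing=0
    for opt in "$@"; do
        # gzip -dcf decompresses gzip input and passes plain text through.
        if gzip -dcf "$cfg" | grep -q "^${opt}=y\$"; then
            echo "$opt: enabled"
        else
            echo "$opt: NOT enabled"
            missing=1
        fi
    done
    return "$missing"
}
```

For the running kernel this would be `check_kconfig /proc/config.gz CONFIG_CRASH_DUMP CONFIG_DEBUG_INFO CONFIG_PROC_KCORE`.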

Safer way to test crashdumping?

At the moment, the section about using /proc/sys/kernel/sysrq to capture a crash dump suggests simply crashing the system without doing anything beforehand. It warns of potential data loss, but shouldn't we add a recommendation to at least do an emergency sync and a read-only remount of the filesystems first (echo "s" and then echo "u" to /proc/sysrq-trigger), so that we don't risk corrupting the entire filesystem? goose121 (talk) 19:54, 2 January 2018 (UTC)
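
A sketch of the sequence proposed above, assuming the standard SysRq keys (s for emergency sync, u for emergency read-only remount, c for crash) written to /proc/sysrq-trigger as root. The function name and the one-second pauses are assumptions. Only call it on a machine you intend to crash.

```shell
#!/bin/sh
# sysrq_safe_crash [TRIGGER]
# Syncs, remounts filesystems read-only, then triggers a crash.
# TRIGGER defaults to /proc/sysrq-trigger; another path can be given to
# rehearse the sequence without crashing anything.
sysrq_safe_crash() {
    trigger="${1:-/proc/sysrq-trigger}"
    printf 's\n' > "$trigger" || return 1   # emergency sync
    sleep 1                                 # give the sync a moment (assumed delay)
    printf 'u\n' > "$trigger" || return 1   # emergency remount read-only
    sleep 1
    printf 'c\n' > "$trigger"               # crash the kernel
}
```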

CONFIG_PROC_KCORE is also needed

if you get: "kexec[599]: Cannot read /proc/kcore: No such file or directory"

Howaboutsynergy (talk) 12:54, 11 May 2019 (UTC)

Is video mode not reset so you can't see what kexec'd kernel is doing? (tested i915)

I found two ways of making sure the kexec'd kernel resets the video mode, so that you're not left with the previous kernel's frozen screen and unable to see what the new kernel is doing:

1. Recompile the kernel used for kexec with the i915 and drm drivers built in (CONFIG_DRM=y and CONFIG_DRM_I915=y instead of =m).

2. Keep them as modules, but make sure they are included in the initramfs image by adding the `MODULES=(i915 drm fbcon)` line to /etc/mkinitcpio.conf.

The idea is that the video driver(s) have to be loaded early, and both of these methods ensure that. (For more info, search for "How to reset the video mode of the Intel driver i915 for the kexec-ed kernel so I can see what kexec kernel is doing?" on Unix Stack Exchange. I can't post the link because I'm using pacman from git, so its output won't match what the captcha asks for.) Howaboutsynergy (talk) 14:37, 10 September 2019 (UTC)
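
For the second method, a sketch of checking that the modules actually ended up in the image. It assumes the file listing comes from `lsinitcpio` and that module files appear with a `.ko` name, optionally followed by a compression suffix. The function name and the image path in the usage note are assumptions.

```shell
#!/bin/sh
# missing_modules MODULE...
# Reads an initramfs file listing on stdin and prints every MODULE that has
# no matching .ko file in it. Prints nothing if all of them are present.
missing_modules() {
    listing=$(cat)
    for mod in "$@"; do
        printf '%s\n' "$listing" | grep -q "/${mod}\.ko" || echo "$mod"
    done
}
```

Usage would look like `lsinitcpio /boot/initramfs-linux.img | missing_modules i915 drm fbcon`. A name that is printed is either missing from the image or built into the kernel rather than shipped as a module.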