Comment 11 for bug 1867916

Mauricio Faria de Oliveira (mfo) wrote :

Ryan,

Part 1)
------

First, please try to reproduce the problem later than early boot:
blacklist the bcache module via the kernel boot parameters, then
load it manually once the system has booted successfully.
(This should be possible, as you mentioned the boot disk isn't involved.)

1) Edit '/etc/fstab' and either comment out the mounts that depend on bcache
or add the 'noauto' option to them, so that systemd doesn't wait for them on boot.

For example,

$ sudo vim /etc/fstab
From: /dev/mapper/*whatadisk* /mountpoint ext4 defaults 0 0
To: /dev/mapper/*whatadisk* /mountpoint ext4 defaults,noauto 0 0
Esc, :x, Enter
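
If you'd rather not edit the file by hand, the same change can be sketched
non-interactively. This is only an illustration: 'add_noauto' is a made-up
helper name, not an existing tool. It prints the edited file to stdout and
does not modify the file it reads.

```shell
# Hypothetical helper: add 'noauto' to the options (4th field) of one fstab entry.
# Usage: add_noauto <fstab-file> <mountpoint>
add_noauto() {
    awk -v mp="$2" '
        # Leave comments and entries for other mount points untouched.
        /^[ \t]*#/ || $2 != mp { print; next }
        # Append noauto unless the option list already contains it.
        index("," $4 ",", ",noauto,") == 0 { $4 = $4 ",noauto" }
        { print }
    ' "$1"
}
```

To preview the change, compare with the real file: $ add_noauto /etc/fstab /mountpoint | diff /etc/fstab -
Note that awk rewrites the edited line with single spaces between fields.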

2) Edit '/etc/default/grub' and add the 'modprobe.blacklist=bcache' option
to GRUB_CMDLINE_LINUX_DEFAULT.

For example,

$ sudo vim /etc/default/grub
From: GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0"
To: GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0 modprobe.blacklist=bcache"
Esc, :x, Enter

Update and check grub config:

$ sudo update-grub

$ grep modprobe.blacklist=bcache /boot/grub/grub.cfg
                linux /boot/vmlinuz-4.15.0-91-generic ... modprobe.blacklist=bcache
                linux /boot/vmlinuz-4.15.0-88-generic ... modprobe.blacklist=bcache
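
The grep above shows the entries that have the option. A rough sketch of the
opposite check (which entries are still missing it) could look like this.
'entries_missing_opt' is a hypothetical helper; it assumes the kernel lines
start with 'linux' as in the output above, and it skips recovery-mode entries,
which normally don't receive GRUB_CMDLINE_LINUX_DEFAULT.

```shell
# Hypothetical helper: print kernels whose 'linux' line lacks the given option.
# Usage: entries_missing_opt <option> [grub.cfg]
# Returns 0 if every non-recovery entry has the option, 1 otherwise.
entries_missing_opt() {
    opt=$1
    cfg=${2:-/boot/grub/grub.cfg}
    awk -v opt="$opt" '
        $1 == "linux" {
            has = 0; rec = 0
            # Fields from 3 onwards are the kernel parameters.
            for (i = 3; i <= NF; i++) {
                if ($i == opt) has = 1
                if ($i == "recovery") rec = 1
            }
            if (!has && !rec) { print $2; missing = 1 }
        }
        END { exit (missing ? 1 : 0) }
    ' "$cfg"
}
```

For example: $ entries_missing_opt modprobe.blacklist=bcache && echo "all entries OK"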

3) Reboot the system into 4.15.0-91. It should not fail, as bcache is not loaded.
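
After that reboot, it may be worth confirming that the option took effect and
that bcache really isn't loaded. A minimal sketch, with made-up helper names,
reading /proc/cmdline and /proc/modules (both paths can be overridden):

```shell
# Hypothetical helper: return 0 if the option is present, as a whole word,
# on the kernel command line.
# Usage: cmdline_has_opt <option> [cmdline-file]
cmdline_has_opt() {
    opt=$1
    file=${2:-/proc/cmdline}
    # One parameter per line, then look for an exact, fixed-string match.
    tr ' ' '\n' < "$file" | grep -Fxq -- "$opt"
}

# Hypothetical helper: return 0 if the module is listed in /proc/modules.
# Usage: module_loaded <module> [modules-file]
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit (found ? 0 : 1) }' "${2:-/proc/modules}"
}
```

For example: $ cmdline_has_opt modprobe.blacklist=bcache && ! module_loaded bcache && echo "bcache is blacklisted and not loaded"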

4) Now load bcache, retrigger device events, and check if the problem reproduces.

$ sudo modprobe bcache
$ sudo udevadm trigger

This should register the bcache devices, e.g., /dev/bcache0.

If you can see /dev/bcache0 and the problem did NOT happen,
please stop here and let me know.
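
To check for registered bcache devices without eyeballing /dev, something
along these lines should do. 'list_bcache_devs' is just an illustrative name;
it looks for bcacheN nodes under /dev unless given another directory.

```shell
# Hypothetical helper: print bcacheN device nodes; return 1 if there are none.
# Usage: list_bcache_devs [directory]
list_bcache_devs() {
    dir=${1:-/dev}
    found=1
    for dev in "$dir"/bcache[0-9]*; do
        # With no match the glob stays literal, so test that the path exists.
        if [ -e "$dev" ]; then
            echo "$dev"
            found=0
        fi
    done
    return "$found"
}
```

For example: $ list_bcache_devs || echo "no bcache devices registered"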

If the problem reproduced, please proceed once your system has
rebooted (it should boot normally, as bcache is disabled).

...

Part 2)
------

1) Install linux-crashdump:

$ sudo apt install linux-crashdump

Answer these questions:

- Should kexec-tools handle reboots (sysvinit only)? No
- Should kdump-tools be enabled by default? Yes

2) Increase the reserved memory size for the crashdump kernel:

Edit '/etc/default/grub.d/kdump-tools.cfg' and change the reserved crashkernel size from 192M to 512M, or to 768M if possible:

For example,

$ sudo vim /etc/default/grub.d/kdump-tools.cfg
From: GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=512M-:192M"
To: GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=512M-:768M"
Esc, :x, Enter
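
The same edit can be sketched with sed, if that is more convenient.
'set_crashkernel_size' is a hypothetical helper; it assumes the
'crashkernel=<range>-:<size>' form shown above, prints the result to stdout,
and leaves the file itself untouched.

```shell
# Hypothetical helper: replace the reserved size in 'crashkernel=<range>-:<size>'.
# Usage: set_crashkernel_size <new-size> [config-file]
set_crashkernel_size() {
    size=$1
    file=${2:-/etc/default/grub.d/kdump-tools.cfg}
    # Keep the 'crashkernel=512M-:' part and swap only the size after the colon.
    sed -E "s/(crashkernel=[0-9]+[KMG]-:)[0-9]+[KMG]/\\1${size}/" "$file"
}
```

For example, to review the result before changing anything: $ set_crashkernel_size 768M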

4) Update grub and reboot

$ sudo update-grub
$ sudo reboot

5) Check that the kdump status is 'ready' and that panic_on_oops is enabled (1) by default:

$ sudo kdump-config status
current state: ready to kdump

$ cat /proc/sys/kernel/panic_on_oops
1
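
Both checks can be combined in a small sketch like the one below. The helper
names are made up, and the pattern only assumes the 'current state' line looks
like the one shown above.

```shell
# Hypothetical helper: read 'kdump-config status' output on stdin and
# return 0 if it reports 'ready to kdump'.
kdump_is_ready() {
    grep -q '^current state *: *ready to kdump'
}

# Hypothetical helper: return 0 if panic_on_oops is set to 1.
# Usage: panic_on_oops_enabled [sysctl-file]
panic_on_oops_enabled() {
    [ "$(cat "${1:-/proc/sys/kernel/panic_on_oops}")" = "1" ]
}
```

For example: $ sudo kdump-config status | kdump_is_ready && panic_on_oops_enabled && echo "kdump looks ready"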

6) Trigger a test crashdump

$ echo 1 | sudo tee /proc/sys/kernel/sysrq
$ echo c | sudo tee /proc/sysrq-trigger

This appears to 'reboot' the system, but it actually boots the crashdump kernel, which collects a memory dump:

[ 8.510809] kdump-tools[781]: Starting kdump-tools: * running makedumpfile -c -d 31 /proc/vmcore /var/crash/202004081540/dump-incomplet$
...
Copying data : [100.0 %] - eta: 0s
...
[ 15.964149] kdump-tools[781]: * kdump-tools: saved vmcore in /var/crash/202004081540
...
[ 16.176388] kdump-tools[781]: * kdump-tools: saved dmesg content in /var/crash/202004081540
...
[ 17.187848] kdump-tools[781]: Rebooting.
...

7) After the system boots again, check that the crashdump is stored in /var/crash/<timestamp>:

$ ls -1 /var/crash/202004081540
dmesg.202004081540
dump.202004081540

If this didn't happen, please stop and let me know, so we can fix the crashdump mechanism.

If you have /var/crash/<timestamp>, the crashdump mechanism is working, so let's move forward.
Feel free to remove that directory:

$ sudo rm -rf /var/crash/<timestamp>
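
For step 7, a quick way to tell whether a crash directory holds both files
from the listing above is sketched here. 'crashdump_complete' is a
hypothetical helper; it only assumes the 'dmesg.<timestamp>' and
'dump.<timestamp>' naming shown in that listing.

```shell
# Hypothetical helper: return 0 if the crash directory contains non-empty
# dmesg.<timestamp> and dump.<timestamp> files.
# Usage: crashdump_complete /var/crash/<timestamp>
crashdump_complete() {
    dir=${1%/}
    # The directory name is the timestamp used in the file names.
    ts=$(basename "$dir")
    [ -s "$dir/dmesg.$ts" ] && [ -s "$dir/dump.$ts" ]
}
```

For example: $ crashdump_complete /var/crash/202004081540 && echo "crashdump looks complete"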

...

8) Boot again and reproduce the problem.

Again, boot into 4.15.0-91 and reproduce the problem manually, as in step 4 of Part 1.

This should generate a crashdump in /var/crash, as with the test crashdump.
Please create a tarball of it and attach it to Launchpad.

$ sudo tar cvf lp1867916-crashdump.tar /var/crash/<timestamp>
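
If several crash directories have piled up, picking the newest one can be
sketched as below. 'latest_crash_dir' is a made-up helper; it assumes the
directories are named with fixed-width numeric timestamps as above, so that
name order matches time order.

```shell
# Hypothetical helper: print the newest timestamp-named directory.
# Usage: latest_crash_dir [base-directory]
# Returns non-zero if no such directory exists.
latest_crash_dir() {
    base=${1:-/var/crash}
    latest=
    # Globs expand in sorted order, so the last directory seen is the newest.
    for entry in "$base"/[0-9]*/; do
        [ -d "$entry" ] || continue
        latest=${entry%/}
    done
    [ -n "$latest" ] && echo "$latest"
}
```

For example: $ sudo tar cvf lp1867916-crashdump.tar "$(latest_crash_dir)"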

If the tarball runs into the attachment size limit, please let me know, or upload it to another hosting website if at all possible.

Thank you very much,
Mauricio