crash:crash-7-branch

Last commit made on 2022-04-26
Get this branch:
git clone -b crash-7-branch https://git.launchpad.net/crash

Branch information

Name:
crash-7-branch
Repository:
lp:crash

Recent commits

41da48e... by Kazuhito Hagio <email address hidden>

crash-7.3.1 -> crash-7.3.2

Signed-off-by: Kazuhito Hagio <email address hidden>

df1fd92... by Huang Shijie <email address hidden>

diskdump: Optimize the boot time

1.) The vmcore file may be very big.

    For example, I have a vmcore file which is over 23G,
    and the panic kernel had 767.6G memory,
    its max_sect_len is 4468736.

    Current code costs too much time to do the following loop:
    ..............................................
        for (i = 1; i < max_sect_len + 1; i++) {
                dd->valid_pages[i] = dd->valid_pages[i - 1];
                for (j = 0; j < BITMAP_SECT_LEN; j++, pfn++)
                        if (page_is_dumpable(pfn))
                                dd->valid_pages[i]++;
    ..............................................

    For my case, it costs about 56 seconds to finish the
    big loop.

    This patch moves the hweightXX macros to defs.h,
    and uses hweight64 to optimize the loop.

    For my vmcore, the loop only costs about one second now.

2.) Test results:
  # cat ./commands.txt
      quit

 Before:

  #echo 3 > /proc/sys/vm/drop_caches;
  #time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
   ............................
        real 1m54.259s
        user 1m12.494s
        sys 0m3.857s
   ............................

 After this patch:

  #echo 3 > /proc/sys/vm/drop_caches;
  #time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
   ............................
        real 0m55.217s
        user 0m15.114s
        sys 0m3.560s
   ............................

Signed-off-by: Huang Shijie <email address hidden>
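
The idea behind df1fd92 can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: it assumes that page_is_dumpable() tests one bit of a dumpable bitmap, so counting a section's dumpable pages amounts to counting the set bits in that section's 64-bit words. The names hweight64_sketch and count_valid_pages, and the words_per_sect parameter, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable population count of one 64-bit word (the job of hweight64). */
unsigned int hweight64_sketch(uint64_t w)
{
    w = w - ((w >> 1) & 0x5555555555555555ULL);
    w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
    w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    return (unsigned int)((w * 0x0101010101010101ULL) >> 56);
}

/*
 * Fill valid_pages[0..nsect] with running totals of set bits.
 * valid_pages[i] is the number of set bits in the first i sections,
 * where each section spans words_per_sect 64-bit words of the bitmap.
 * One popcount per word replaces 64 single-bit tests.
 */
void count_valid_pages(const uint64_t *bitmap, size_t words_per_sect,
                       size_t nsect, unsigned long *valid_pages)
{
    assert(bitmap != NULL && valid_pages != NULL);

    valid_pages[0] = 0;
    for (size_t i = 1; i <= nsect; i++) {
        const uint64_t *sect = bitmap + (i - 1) * words_per_sect;

        valid_pages[i] = valid_pages[i - 1];
        for (size_t j = 0; j < words_per_sect; j++)
            valid_pages[i] += hweight64_sketch(sect[j]);
    }
}
```

The caller has to provide a valid_pages array with nsect + 1 entries. Where available, a compiler builtin such as GCC's __builtin_popcountll could stand in for the hand-written popcount.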

bf24b45... by Huang Shijie <email address hidden>

diskdump: use mmap/madvise to improve the start-up

Sometimes the bitmap in a vmcore can be very large, for example over
256M. This patch uses mmap/madvise to improve the performance of reading
the bitmap in the non-FLAT_FORMAT code path.

Without the patch:
    #echo 3 > /proc/sys/vm/drop_caches;
    #time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
    ............................
        real 0m55.217s
        user 0m15.114s
        sys 0m3.560s
    ............................

With the patch:
    #echo 3 > /proc/sys/vm/drop_caches;
    #time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
    ............................
        real 0m44.097s
        user 0m19.031s
        sys 0m1.620s
    ............................

Note:
    Test files:
        vmlinux: 272M
        vmcore : 23G (bitmap_len: 4575985664)
    #cat ./commands.txt
        quit

Signed-off-by: Huang Shijie <email address hidden>
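
The mmap/madvise approach named in bf24b45 can be illustrated with a short sketch. It is not the patch itself: the function name read_region_mmap is hypothetical, and the choice of MADV_SEQUENTIAL is an assumption about which hint suits a single front-to-back read of the bitmap.

```c
#define _DEFAULT_SOURCE   /* for madvise() under strict -std= modes */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy len bytes starting at file offset off into buf, using a private
 * read-only mapping instead of read(). Returns 0 on success, -1 on error.
 * The caller must make sure [off, off + len) lies inside the file,
 * because touching mapped pages beyond end-of-file raises SIGBUS.
 */
int read_region_mmap(int fd, off_t off, size_t len, void *buf)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t aligned;
    size_t delta, maplen;
    char *map;

    assert(buf != NULL);
    if (len == 0)
        return 0;
    if (page <= 0 || off < 0)
        return -1;

    /* mmap() needs a page-aligned file offset. */
    aligned = off - (off % page);
    delta = (size_t)(off - aligned);
    maplen = len + delta;

    map = mmap(NULL, maplen, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (map == MAP_FAILED)
        return -1;

    /* Only a hint to the kernel; a failure here is not fatal. */
    (void)madvise(map, maplen, MADV_SEQUENTIAL);

    memcpy(buf, map + delta, len);
    munmap(map, maplen);
    return 0;
}
```

The hint lets the kernel read ahead aggressively, which is where a large sequential read from a cold page cache may gain time.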

bbd6f7b... by Rongwei Wang <email address hidden>

arm64: handle 1GB block for VM_L4_4K

When arm64 is configured with PAGE_SIZE=4k and 4-level
translation, the page tables may be created with block
mappings or contiguous mappings wherever possible, for example
when CONFIG_RODATA_FULL_DEFAULT_ENABLED is disabled. However, the vtop
command cannot handle a 1GB block (PUD mapping) and just
shows a seek error:

crash> vtop ffff00184a800000
VIRTUAL PHYSICAL
ffff00184a800000 188a800000

PAGE DIRECTORY: ffff8000110aa000
   PGD: ffff8000110aa000 => 203fff9003
   PUD: ffff001fffff9308 => 68001880000705
   PMD: ffff0018400002a0 => ffff8000103b4fd0
vtop: seek error: kernel virtual address: ffff7fffd03b4000 type: "page table"

This patch fixes it, and the output is now as follows:

crash> vtop ffff00184a800000
VIRTUAL PHYSICAL
ffff00184a800000 188a800000

PAGE DIRECTORY: ffff8000110aa000
   PGD: ffff8000110aa000 => 203fff9003
   PUD: ffff001fffff9308 => 68001880000705
  PAGE: 1880000000 (1GB)

     PTE PHYSICAL FLAGS
68001880000705 1880000000 (VALID|SHARED|AF|PXN|UXN)

      PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffe00610a0000 188a800000 0 0 0 77fffe0000000000

Acked-by: Kazuhito Hagio <email address hidden>
Signed-off-by: Rongwei Wang <email address hidden>
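
The address arithmetic behind the fixed vtop output of bbd6f7b can be checked with a small sketch. It assumes the standard ARMv8 descriptor encoding (bits[1:0] equal to 0b01 for a block, 0b11 for a table) and a 48-bit output address; the macro and function names are hypothetical and are not taken from crash.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_TYPE_MASK     0x3ULL
#define DESC_TYPE_BLOCK    0x1ULL              /* 0b01: block, 0b11: table */
#define PUD_BLOCK_SIZE     (1ULL << 30)        /* 1GB with 4K pages */
#define PHYS_ADDR_MASK_48  ((1ULL << 48) - 1)  /* assumed 48-bit PA */

/* Nonzero if the PUD entry maps a 1GB block rather than a PMD table. */
int pud_is_block(uint64_t pud)
{
    return (pud & DESC_TYPE_MASK) == DESC_TYPE_BLOCK;
}

/*
 * Physical address for vaddr under a 1GB block mapping: the block base
 * from bits [47:30] of the entry plus the low 30 bits of the address.
 */
uint64_t pud_block_to_phys(uint64_t pud, uint64_t vaddr)
{
    uint64_t base;

    assert(pud_is_block(pud));
    base = pud & PHYS_ADDR_MASK_48 & ~(PUD_BLOCK_SIZE - 1);
    return base | (vaddr & (PUD_BLOCK_SIZE - 1));
}
```

With the values from the output above, the PUD entry 68001880000705 has low bits 0b01, its block base is 1880000000, and adding the low 30 bits of ffff00184a800000 (a800000) gives 188a800000, matching the PHYSICAL column.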

b654750... by xiaer1921 <email address hidden>

Fix for "kmem -s|-S" on Linux 5.17+ with CONFIG_SLAB

Since the following kernel commits split slab info from struct page
into struct slab, crash cannot get several slab related offsets from
struct page.

  d122019bf061 ("mm: Split slab into its own type")
  401fb12c68c2 ("mm: Differentiate struct slab fields by sl*b implementations")
  07f910f9b729 ("mm: Remove slab from struct page")

Without the patch, the "kmem -s|-S" options do not work correctly on kernels
configured with CONFIG_SLAB and fail with the following error:

  crash> kmem -s
  kmem: invalid structure member offset: page_active
        FILE: memory.c LINE: 12225 FUNCTION: verify_slab_overload_page()

Resolves: https://github.com/crash-utility/crash/issues/115
Signed-off-by: xiaer1921 <email address hidden>
Signed-off-by: Kazuhito Hagio <email address hidden>

ef0bf92... by Lianbo Jiang <email address hidden>

Fix the failure of resolving ".rodata" on s390x

The commit <cd8954023bd4> broke crash-utility on s390x, which now fails
with the following error:

  crash: cannot resolve ".rodata"

The reason is that all symbols containing a "." may be filtered out
on s390x. To prevent the current failure, do not filter out the
symbol ".rodata" on s390x.

In addition, a simple way is to check whether the symbol ".rodata"
exists before calculating the value of a symbol, just to be on the
safe side.

Fixes: 7d34768896a6 ("kernel: fix start-up time degradation caused by strings command")
Reported-by: Alexander Egorenkov <email address hidden>
Signed-off-by: Lianbo Jiang <email address hidden>

7d34768... by HATAYAMA Daisuke <email address hidden>

kernel: fix start-up time degradation caused by strings command

verify_namelist() uses the strings command and scans the whole vmlinux
file to find the linux_banner string. However, vmlinux files are quite large
these days, reaching over 500MB. As a result, this degrades the start-up
time of the crash command by 10 or more seconds. (Of course, this depends on
the machine used for investigation, but I guess typically we cannot
use such powerful machines to investigate crash dumps...)

To resolve this issue, let's use the bfd library and read the linux_banner
string in the vmlinux file directly.

A simple benchmark shows the following result:

Without the fix:

    # cat ./commands.txt
    quit
    # time ./crash -i ./commands.txt \
        /usr/lib/debug/lib/modules/5.16.15-201.fc35.x86_64/vmlinux \
        /var/crash/*/vmcore >/dev/null 2>&1

    real 0m20.251s
    user 0m19.022s
    sys 0m1.054s

With the fix:

    # time ./crash -i ./commands.txt \
        /usr/lib/debug/lib/modules/5.16.15-201.fc35.x86_64/vmlinux \
        /var/crash/*/vmcore >/dev/null 2>&1

    real 0m6.528s
    user 0m6.143s
    sys 0m0.431s

Note that this commit keeps the original logic that uses the strings
command for backward compatibility, just in case.

Signed-off-by: HATAYAMA Daisuke <email address hidden>

502d605... by Huang Shijie <email address hidden>

arm64: fix the seek error of "pud page" for live debugging

Crash reported an error on kernel v5.7 when live debugging with the
command "crash vmlinux /proc/kcore":

  "crash: seek error: kernel virtual address: ffff75e9fffff000 type: "pud page""

The reason is that PTOV() and arm64_vtop_4level_4k() do not work
as expected due to an incorrect physvirt_offset.

To fix the above issue, read the virtual address of
"physvirt_offset" from "/proc/kallsyms", and update
ms->phys_offset, which is initialized with a wrong value in kernel
versions [5.4, 5.10).

Signed-off-by: Huang Shijie <email address hidden>
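
Looking up a symbol address in /proc/kallsyms, as 502d605 describes for "physvirt_offset", can be sketched as below. This is an illustration and not crash's implementation: the function name kallsyms_lookup_text is hypothetical, and it works on kallsyms-formatted text already held in memory, assuming the usual "<hex address> <type> <name>" layout per line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Search kallsyms-formatted text for the symbol `name`.
 * Returns 1 and stores its address in *addr when found, 0 otherwise.
 */
int kallsyms_lookup_text(const char *text, const char *name,
                         unsigned long long *addr)
{
    const char *p = text;

    assert(text != NULL && name != NULL && addr != NULL);

    while (*p) {
        unsigned long long a;
        char type;
        char sym[256];

        /* A trailing "[module]" field, if present, is simply ignored. */
        if (sscanf(p, "%llx %c %255s", &a, &type, sym) == 3 &&
            strcmp(sym, name) == 0) {
            *addr = a;
            return 1;
        }

        /* Advance to the start of the next line. */
        p = strchr(p, '\n');
        if (p == NULL)
            break;
        p++;
    }
    return 0;
}
```

On a live system the text would come from reading /proc/kallsyms. The addresses there read as zero for unprivileged users when kptr_restrict hides them, so a caller should treat a zero address as "not available".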

d497af9... by Huang Shijie <email address hidden>

arm64: fix the wrong vmemmap_end

VMEMMAP_END did not exist before kernel v5.7, but currently the
value of vmemmap_end may be set to -1 (0xffffffffffffffffUL).

According to the arch/arm64/mm/dump.c (before kernel v5.7):
    ..................................................
      { VMEMMAP_START + VMEMMAP_SIZE, "vmemmap end" }
    ..................................................

The vmemmap_end should always be:
vmemmap_end = vmemmap_vaddr + vmemmap_size;

This patch fixes the above issue.

Fixes: dc42750d5f2c ("arm64: update the modules/vmalloc/vmemmap ranges")
Signed-off-by: Huang Shijie <email address hidden>

386d4b5... by Huang Shijie <email address hidden>

arm64: use the vmcore info to get module/vmalloc/vmemmap ranges

Since the kernel commit <2369f171d5c5> ("arm64: crash_core: Export
MODULES, VMALLOC, and VMEMMAP ranges"), crash can obtain the
module/vmalloc/vmemmap ranges from the vmcore info and no longer needs
to calculate them manually.

This patch adds a new hook, arm64_get_range_v5_18, which parses
all the module/vmalloc/vmemmap ranges from the vmcore info.

Signed-off-by: Huang Shijie <email address hidden>
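
Reading a range boundary out of the vmcore info, as 386d4b5 describes, can be sketched like this. It is not the arm64_get_range_v5_18 hook itself: the function name vmcoreinfo_number is hypothetical, and the "NUMBER(KEY)=value" line format with keys such as MODULES_VADDR is an assumption based on the kernel commit title and the common vmcoreinfo layout.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find a line "NUMBER(<key>)=<value>" in vmcoreinfo text and parse the
 * value (decimal, or hex with a 0x prefix).
 * Returns 1 and stores the value in *val when found, 0 otherwise.
 */
int vmcoreinfo_number(const char *info, const char *key,
                      unsigned long long *val)
{
    char pat[128];
    const char *p = info;
    int n;

    assert(info != NULL && key != NULL && val != NULL);

    n = snprintf(pat, sizeof pat, "NUMBER(%s)=", key);
    if (n < 0 || (size_t)n >= sizeof pat)
        return 0;

    while ((p = strstr(p, pat)) != NULL) {
        /* Accept the match only at the start of a line. */
        if (p == info || p[-1] == '\n') {
            const char *v = p + n;
            char *end;
            unsigned long long x = strtoull(v, &end, 0);

            if (end == v)
                return 0;   /* no digits after '=' */
            *val = x;
            return 1;
        }
        p += n;
    }
    return 0;
}
```

A caller would query each boundary in turn and fall back to the manual calculation for older kernels when a key is missing.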