Kernel oops due to uninitialized list on kernfs (kernfs_kill_sb)

Bug #1934175 reported by Guilherme G. Piccoli
This bug affects 1 person
Affects          Status         Importance  Assigned to          Milestone
linux (Ubuntu)   In Progress    High        Unassigned
  Bionic         Fix Released   High        Krzysztof Kozlowski

Bug Description

[Impact]
* We recently received a report of a kernel crash caused by a NULL pointer dereference in a Bionic 4.15 derivative kernel; the following log was collected:

[...]
[537105.767348] SLUB: Unable to allocate memory on node -1, gfp=0x14000c0(GFP_KERNEL)
[...]
[537105.767368] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[537105.777711] IP: kernfs_kill_sb+0x31/0x70
[537105.783582] PGD 0 P4D 0
[537105.787844] Oops: 0002 [#1] SMP PTI
[...]
RIP: 0010:kernfs_kill_sb+0x31/0x70
RSP: 0018:ffffb90aec1afd00 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff9fdbd567d900 RCX: ffffa0143885ae01
RDX: 0000000000000000 RSI: ffffa0143885ae00 RDI: ffffffffa2937c40
RBP: ffffb90aec1afd10 R08: ffffa0150b581510 R09: 000000018100004d
R10: ffffb90aec1afcd8 R11: 0000000000000100 R12: ffffa01436e43000
R13: ffffa01436e43000 R14: 0000000000000000 R15: ffff9fdbd567d900
FS: 00007fe41a615b80(0000) GS:ffffa01afea40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000007dfe3cc003 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 sysfs_kill_sb+0x1f/0x40
 deactivate_locked_super+0x48/0x80
 kernfs_mount_ns+0x1eb/0x230
 sysfs_mount+0x66/0xc0
 mount_fs+0x37/0x160
 ? alloc_vfsmnt+0x1b3/0x230
 vfs_kern_mount.part.24+0x5d/0x110
 do_mount+0x5ed/0xce0
[...]

* The following detailed call stack, together with the disassembly, helps explain the cause of the issue:

mount_fs()
--sysfs_mount()
----kernfs_mount_ns() <inlined kernfs_fill_super() fails, very likely due to being unable to allocate memory>
------deactivate_locked_super() <given the callback .kill_sb = sysfs_kill_sb, next function is called>
--------sysfs_kill_sb()
----------kernfs_kill_sb() <OOPS due to the uninitialized list>

The disassembly of kernfs_kill_sb() below shows exactly where the issue occurs:

ffffffff812f46e0 <kernfs_kill_sb>:
[ ... prologue ...]
48 8b 9f 08 04 00 00 mov 0x408(%rdi),%rbx # %rbx = kernfs_super_info *info = sb->s_fs_info
49 89 fc mov %rdi,%r12 # %r12 = super_block *sb
48 c7 c7 40 7c 53 82 mov $0xffffffff82537c40,%rdi # %rdi = &kernfs_mutex (global)
ffffffff812f46f9: R_X86_64_32S kernfs_mutex
e8 ee da 67 00 callq ffffffff819721f0 <mutex_lock> # mutex_lock(&kernfs_mutex);
[...]
48 8b 53 18 mov 0x18(%rbx),%rdx # %rdx = info->node.next; based on splat, RDX == 0x0
48 8b 43 20 mov 0x20(%rbx),%rax # %rax = info->node.prev; based on splat, RAX == 0x0
48 89 42 08 mov %rax,0x8(%rdx) # <- OOPS [tried to assign next->prev = prev with next == NULL, see __list_del()]
48 89 10 mov %rdx,(%rax)
48 b8 00 01 00 00 00 movabs $0xdead000000000100,%rax # node->next = LIST_POISON1
[...]
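
The offsets in the disassembly can be cross-checked with a small userspace sketch. It assumes that struct kernfs_super_info in 4.15 consists of three pointers followed by the list node, which is what the 0x18/0x20 loads suggest; the mirror struct and the helper names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the kernel's two-pointer list node. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Assumed layout of struct kernfs_super_info in 4.15, with the real
 * field types replaced by void pointers (only the sizes matter here).
 */
struct kernfs_super_info_mirror {
    void *sb;
    void *root;
    const void *ns;
    struct list_head node;
};

/* Offset of info->node.next inside the mirror struct. */
static size_t node_next_offset(void)
{
    return offsetof(struct kernfs_super_info_mirror, node)
         + offsetof(struct list_head, next);
}

/* Offset of info->node.prev inside the mirror struct. */
static size_t node_prev_offset(void)
{
    return offsetof(struct kernfs_super_info_mirror, node)
         + offsetof(struct list_head, prev);
}

/* Address written by "next->prev = prev" for a given value of next. */
static uintptr_t faulting_address(uintptr_t next)
{
    return next + offsetof(struct list_head, prev);
}

/* Compare against the values seen in the splat; only checked on LP64. */
static void check_lp64_layout(void)
{
    if (sizeof(void *) == 8) {
        assert(node_next_offset() == 0x18);  /* mov 0x18(%rbx),%rdx */
        assert(node_prev_offset() == 0x20);  /* mov 0x20(%rbx),%rax */
        assert(faulting_address(0) == 0x8);  /* CR2 in the oops */
    }
}
```

Calling check_lp64_layout() on an x86-64 build passes silently, which is consistent with a zeroed node (next == NULL) producing a write to address 0x8.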

* The fix for this issue comes from upstream commit 82382acec0c9 ("kernfs: deal with kernfs_fill_super() failures"). It is a trivial fix that adds an INIT_LIST_HEAD(&info->node) call in kernfs_mount_ns(), so that the list's prev/next pointers are valid from the start. Unfortunately, the commit was not CCed to the stable mailing list when it was sent, so it was not automatically picked up by the Ubuntu kernel; it has now been properly submitted to the stable list [0].

* Along with this fix, we found another commit (7b745a4e4051), a small and simple fix to related code, that also should have been sent to the 4.14.y stable branch but for some reason wasn't. Since both commits were accepted in linux-stable, we are proposing their backport to the Ubuntu 4.15 kernel.

[0] https://<email address hidden>/
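
To illustrate why the INIT_LIST_HEAD(&info->node) change is sufficient, here is a minimal userspace sketch of the list semantics involved. It is a model, not the kernel code: init_list_head(), list_del_model(), list_del_would_fault(), struct info_model and info_setup() are hypothetical names, and the unlink step omits the pointer poisoning the kernel performs afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Same effect as INIT_LIST_HEAD(): the node points at itself. */
static void init_list_head(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

/* True if unlinking would dereference a NULL neighbour, as in the oops. */
static bool list_del_would_fault(const struct list_head *entry)
{
    return entry->next == NULL || entry->prev == NULL;
}

/* Unlink step modelled on __list_del(): next->prev = prev; prev->next = next. */
static void list_del_model(struct list_head *entry)
{
    assert(!list_del_would_fault(entry));
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
}

/* Hypothetical stand-in for the zero-allocated kernfs_super_info. */
struct info_model {
    struct list_head node;
};

/* Models the allocation step; with_fix selects whether the node is initialized. */
static void info_setup(struct info_model *info, bool with_fix)
{
    memset(info, 0, sizeof(*info));   /* zeroed memory, as a zeroing allocator returns */
    if (with_fix)
        init_list_head(&info->node);  /* the one-line change */
}
```

With with_fix set to false, list_del_would_fault() reports true for the freshly set up node; with the initialization in place, list_del_model() on the self-linked node only rewrites the node's own pointers, so tearing down a superblock that never joined the list becomes harmless.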

[Test Case]
* We don't have a real test case, although a low-memory condition or an artificial kprobe reproducer could easily trigger the issue.

* We booted a qemu virtual machine with a kernel containing both patches with no issues.
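
The failure condition can also be reasoned about with a userspace fault-injection model, sketched here under stated assumptions. This is not the kprobe reproducer mentioned above and it does not touch the kernel: the injector, model_mount() and the result enum are hypothetical, allocation 1 stands for the info structure and allocation 2 for whatever kernfs_fill_super() needs. Instead of crashing, the unfixed path reports that the teardown would have dereferenced NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct list_head {
    struct list_head *next, *prev;
};

struct info_model {
    struct list_head node;
};

/* Hypothetical fault injector: the Nth allocation (1-based) fails; 0 disables. */
struct fault_injector {
    unsigned int fail_at;
    unsigned int count;
};

static void *inject_calloc(struct fault_injector *fi, size_t size)
{
    assert(fi != NULL);
    fi->count++;
    if (fi->fail_at != 0 && fi->count == fi->fail_at)
        return NULL;                 /* simulated out-of-memory */
    return calloc(1, size);
}

enum mount_result { MOUNT_OK, MOUNT_FAILED_CLEAN, MOUNT_FAILED_WOULD_OOPS };

/* Models the mount path: allocate info, then let "fill_super" allocate. */
static enum mount_result model_mount(struct fault_injector *fi, bool with_fix)
{
    struct info_model *info = inject_calloc(fi, sizeof(*info));
    void *fill_super_mem;
    enum mount_result res;

    if (!info)
        return MOUNT_FAILED_CLEAN;   /* nothing to tear down yet */
    if (with_fix) {                  /* effect of INIT_LIST_HEAD() */
        info->node.next = &info->node;
        info->node.prev = &info->node;
    }

    fill_super_mem = inject_calloc(fi, 64);
    if (fill_super_mem) {
        res = MOUNT_OK;
    } else {
        /* Teardown path: the kill_sb callback would unlink info->node here. */
        bool null_neighbour = !info->node.next || !info->node.prev;
        res = null_neighbour ? MOUNT_FAILED_WOULD_OOPS : MOUNT_FAILED_CLEAN;
    }
    free(fill_super_mem);
    free(info);
    return res;
}
```

Setting fail_at to 2 makes the second allocation fail; the model then returns MOUNT_FAILED_WOULD_OOPS without the initialization and MOUNT_FAILED_CLEAN with it, mirroring the before and after behaviour described in this report.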

[Where problems could occur]
* The likelihood of issues is low, especially since both patches are very simple and have been in the upstream kernel for more than 3 years (and were quickly accepted in the 4.14.y stable branch last week).

* With that said, the second patch could potentially introduce issues with super_block references; I honestly cannot conceive of any issues potentially caused by patch 1.

Changed in linux (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
description: updated
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Changed in linux (Ubuntu):
assignee: Guilherme G. Piccoli (gpiccoli) → nobody
Changed in linux (Ubuntu Bionic):
assignee: Guilherme G. Piccoli (gpiccoli) → nobody
assignee: nobody → Krzysztof Kozlowski (krzk)
tags: added: bionic verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-154.161

---------------
linux (4.15.0-154.161) bionic; urgency=medium

  * bionic/linux: 4.15.0-154.161 -proposed tracker (LP: #1938411)

  * Potential reverts of 4.19.y stable changes in 18.04 (LP: #1938537)
    - SAUCE: Revert "locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to
      signal"
    - SAUCE: Revert "drm/amd/amdgpu: fix refcount leak"

  * Packaging resync (LP: #1786013)
    - [Packaging] resync getabis
    - [Packaging] update helper scripts
    - update dkms package versions

  * btrfs: Automatic balance returns -EUCLEAN and leads to forced readonly
    filesystem (LP: #1934709) // CVE-2019-19036
    - btrfs: Validate child tree block's level and first key
    - btrfs: Detect unbalanced tree with empty leaf before crashing btree
      operations

  * btrfs: Automatic balance returns -EUCLEAN and leads to forced readonly
    filesystem (LP: #1934709)
    - Revert "btrfs: Detect unbalanced tree with empty leaf before crashing btree
      operations"
    - Revert "btrfs: Validate child tree block's level and first key"
    - btrfs: Only check first key for committed tree blocks
    - btrfs: Fix wrong first_key parameter in replace_path

  * Enable fib-onlink-tests.sh and msg_zerocopy.sh in kselftests/net on Bionic
    (LP: #1934759)
    - selftests: Add fib-onlink-tests.sh to TEST_PROGS
    - selftests: net: use TEST_PROGS_EXTENDED
    - selftests/net: enable msg_zerocopy test
    - SAUCE: selftests: Make fib-onlink-tests.sh executable

  * Kernel oops due to uninitialized list on kernfs (kernfs_kill_sb)
    (LP: #1934175)
    - kernfs: deal with kernfs_fill_super() failures
    - unfuck sysfs_mount()

  * large_dir in ext4 broken (LP: #1933074)
    - SAUCE: ext4: fix directory index node split corruption

  * btrfs: Attempting to balance a nearly full filesystem with relocated root
    nodes fails (LP: #1933172) // CVE-2019-19036
    - btrfs: reloc: fix reloc root leak and NULL pointer dereference

  * btrfs: Attempting to balance a nearly full filesystem with relocated root
    nodes fails (LP: #1933172)
    - Revert "btrfs: reloc: fix reloc root leak and NULL pointer dereference"

  * Pixel format change broken for Elgato Cam Link 4K (LP: #1932367)
    - (upstream) media: uvcvideo: Fix pixel format change for Elgato Cam Link 4K

  * Bionic update: upstream stable patchset 2021-06-23 (LP: #1933375)
    - net: usb: cdc_ncm: don't spew notifications
    - efi: Allow EFI_MEMORY_XP and EFI_MEMORY_RO both to be cleared
    - efi: cper: fix snprintf() use in cper_dimm_err_location()
    - vfio/pci: Fix error return code in vfio_ecap_init()
    - vfio/pci: zap_vma_ptes() needs MMU
    - vfio/platform: fix module_put call in error flow
    - ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service
    - HID: pidff: fix error return code in hid_pidff_init()
    - HID: i2c-hid: fix format string mismatch
    - netfilter: nfnetlink_cthelper: hit EBUSY on updates if size mismatches
    - ieee802154: fix error return code in ieee802154_add_iface()
    - ieee802154: fix error return code in ieee802154_llsec_getparams()
    - Bluetooth: fix the erroneous flush_work() order
    - Blu...


Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released