Boot crash with Trusty 3.13
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Invalid | High | Unassigned |
Trusty | Fix Released | High | Juerg Haefliger |
Bug Description
== SRU Justification ==
Custom compilation of the Trusty 3.13 kernel codebase results in a (reproducible) QEMU boot crash (see below).
== Fix ==
Replace the UBUNTU: SAUCE patch with the proper upstream commit:
548acf19234d ("x86/mm: Expand the exception table logic to allow new handling options")
== Regression Potential ==
Medium. The patch is quite large but the backport was a simple context adjustment. Ran the x86 selftests and perf NMI tests for several hours to verify stability.
== Test Case ==
Compile the Trusty 3.13 kernel code using the default config (make defconfig) and run the resulting kernel in QEMU. It crashes on every boot.
Original bug description:
While doing kernel testing using the Trusty 3.13 code base, I get the following boot crash with QEMU:
[ 0.338393] BUG: unable to handle kernel paging request at ffffffff014142f0
[ 0.338987] IP: [<ffffffff01414
[ 0.339388] PGD 180f067 PUD 0
[ 0.339388] Oops: 0010 [#1] SMP
[ 0.339388] Modules linked in:
[ 0.339388] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.11-
[ 0.339388] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 0.339388] task: ffff88003f708000 ti: ffff88003f6fa000 task.ti: ffff88003f6fa000
[ 0.339388] RIP: 0010:[<
[ 0.339388] RSP: 0000:ffff88003f
[ 0.339388] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 0.339388] RDX: 0000000000000000 RSI: ffff88003deb9eb4 RDI: ffffffff818b8590
[ 0.339388] RBP: ffff88003f6fbe98 R08: 0000000000000000 R09: ffff88003fa14ae0
[ 0.339388] R10: ffffffff81264c68 R11: ffffea0000fdd000 R12: ffffffff818b8590
[ 0.339388] R13: 00000000000000ad R14: 0000000000000000 R15: 0000000000000000
[ 0.339388] FS: 000000000000000
[ 0.339388] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.339388] CR2: ffffffff014142f0 CR3: 000000000180c000 CR4: 0000000000360770
[ 0.339388] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.339388] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.339388] Stack:
[ 0.339388] ffff88003f6fbf08 ffffffff81000402 ffff88003f6fbf00 ffffffff81065f88
[ 0.339388] ffff88003f6fbef0 ffff88003ffd96a1 ffffffff817e9d28 000000ad00060006
[ 0.339388] ffffffff817b013d ffffffff8196cef0 ffffffff8196d018 0000000000000006
[ 0.339388] Call Trace:
[ 0.339388] [<ffffffff81000
[ 0.339388] [<ffffffff81065
[ 0.339388] [<ffffffff8189d
[ 0.339388] [<ffffffff8189d
[ 0.339388] [<ffffffff813fa
[ 0.339388] [<ffffffff813fa
[ 0.339388] [<ffffffff8140f
[ 0.339388] [<ffffffff813fa
[ 0.339388] Code: Bad RIP value.
[ 0.339388] RIP [<ffffffff01414
[ 0.339388] RSP <ffff88003f6fbe98>
[ 0.339388] CR2: ffffffff014142f0
[ 0.339388] ---[ end trace a71242bdac7e8632 ]---
[ 0.339388] note: swapper/0[1] exited with preempt_count 1
[ 0.357079] swapper/0 (1) used greatest stack depth: 5424 bytes left
[ 0.357539] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 0.357539]
[ 0.358073] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000
Git bisect identified the following commit as the culprit:
commit 56764fdc3a84737
Author: Andy Whitcroft <email address hidden>
Date: Thu Feb 22 11:24:00 2018 +0100
UBUNTU: SAUCE: x86, extable: fix uaccess fixup detection
BugLink: http://
The existing code intends to identify a subset of fixups which need
special handling, uaccess related faults need to record the failure.
This is done by adjusting the fixup code pointer by a (random) constant
0x7ffffff0. This is detected in fixup_exception by comparing the two
pointers. The intent of this code is to detect the the delta between
the original code and its fixup code being greater than the constant.
However, the code as written triggers undefined comparison behaviour.
In this kernel this prevents the condition triggering, leading to panics
when jumping to the corrupted fixup address.
Convert the code to better implement the intent. Convert both of the
offsets to final addresses and compare the delta between those. Also add
a massive comment to explain all of this including the implicit assumptions
on order of the segments that this comparison implies.
Fixes: 706276543b69 ("x86, extable: Switch to relative exception table entries")
Signed-off-by: Andy Whitcroft <email address hidden>
Acked-by: Colin Ian King <email address hidden>
Acked-by: Khalid Elmously <email address hidden>
Signed-off-by: Kleber Sacilotto de Souza <email address hidden>
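For readers who have not looked at the extable code, the detection scheme described in the commit message above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the kernel code or the SAUCE patch itself: the names (rel_entry, make_entry, resolve_fixup) are hypothetical, addresses are simulated as plain 64-bit integers, and the threshold is taken to be the 0x7ffffff0 constant quoted in the commit message. The point is that the comparison is done on final addresses in 64-bit arithmetic, so it cannot overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bias quoted in the commit message; added to the fixup offset of uaccess entries. */
#define UACCESS_BIAS INT64_C(0x7ffffff0)

/* Relative entry: each field is a 32-bit offset from that field's own address. */
struct rel_entry {
    int32_t insn;
    int32_t fixup;
};

/*
 * Hypothetical helper: build the entry that would live at simulated address
 * entry_addr. In the kernel the assembler emits these offsets.
 */
struct rel_entry make_entry(int64_t entry_addr, int64_t insn_addr,
                            int64_t fixup_addr, bool uaccess)
{
    struct rel_entry e;
    int64_t insn_off = insn_addr - entry_addr;
    int64_t fixup_off = fixup_addr - (entry_addr + 4) + (uaccess ? UACCESS_BIAS : 0);

    /* Both offsets must fit in the 32-bit fields. */
    assert(insn_off >= INT32_MIN && insn_off <= INT32_MAX);
    assert(fixup_off >= INT32_MIN && fixup_off <= INT32_MAX);
    e.insn = (int32_t)insn_off;
    e.fixup = (int32_t)fixup_off;
    return e;
}

/*
 * Convert both offsets to final addresses, then compare the delta. This
 * assumes the real fixup code sits after the faulting instruction.
 */
int64_t resolve_fixup(int64_t entry_addr, const struct rel_entry *e, bool *uaccess_err)
{
    assert(e != NULL && uaccess_err != NULL);

    int64_t insn = entry_addr + e->insn;        /* field at offset 0 */
    int64_t fixup = entry_addr + 4 + e->fixup;  /* field at offset 4 */

    *uaccess_err = (fixup - insn) >= UACCESS_BIAS;
    if (*uaccess_err)
        fixup -= UACCESS_BIAS;  /* strip the bias to get the real target */
    return fixup;
}
```

With a simulated instruction at 0x1000, fixup code at 0x5000 and the entry at 0x9000, both the plain and the biased entry resolve to 0x5000, and only the biased one sets the flag. In that same example the difference of the two raw 32-bit fields does not fit in a signed 32-bit integer, which as far as I understand is the kind of overflow the commit message refers to as undefined comparison behaviour.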
Changed in linux (Ubuntu Trusty):
status: New → Triaged
Changed in linux (Ubuntu):
status: Incomplete → Triaged
Changed in linux (Ubuntu Trusty):
importance: Undecided → High
Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Trusty):
assignee: nobody → Juerg Haefliger (juergh)
description: updated
Changed in linux (Ubuntu Trusty):
status: Triaged → Fix Committed
The problematic SAUCE commit tries to fix 706276543b69 ("x86, extable: Switch to relative exception table entries"). The upstream commit that fixes the same change is:
commit 548acf19234dbda5a52d5a8e7e205af46e9da840
Author: Tony Luck <email address hidden>
Date: Wed Feb 17 10:20:12 2016 -0800
x86/mm: Expand the exception table logic to allow new handling options
Huge amounts of help from Andy Lutomirski and Borislav Petkov to
produce this. Andy provided the inspiration to add classes to the
exception table with a clever bit-squeezing trick, Boris pointed
out how much cleaner it would all be if we just had a new field.
Linus Torvalds blessed the expansion with:
' I'd rather not be clever in order to save just a tiny amount of space
in the exception table, which isn't really criticial for anybody. '
The third field is another relative function pointer, this one to a
handler that executes the actions.
We start out with three handlers:
1: Legacy - just jumps the to fixup IP
2: Fault - provide the trap number in %ax to the fixup code
3: Cleaned up legacy for the uaccess error hack
Signed-off-by: Tony Luck <email address hidden> lkml.kernel.org/r/f6af78fcbd348cf4939875cfda9c19689b5e50b8<email address hidden>
Reviewed-by: Borislav Petkov <email address hidden>
Cc: Linus Torvalds <email address hidden>
Cc: Peter Zijlstra <email address hidden>
Cc: Thomas Gleixner <email address hidden>
Link: http://
Signed-off-by: Ingo Molnar <email address hidden>
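To make the three-handler design in the upstream commit message easier to picture, here is a small user-space sketch. It is not the upstream implementation and rests on several assumptions: the entry stores plain simulated addresses and an enum index into a handler array instead of relative pointers, and all names (ex_entry, sim_regs, sim_fixup_exception, the handle_* functions) are made up for illustration. It only shows the dispatch idea: each entry names the handler that decides what to do besides jumping to the fixup address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated register state; only the fields the handlers touch. */
struct sim_regs {
    int64_t ip;
    int64_t ax;
    bool uaccess_err;
};

/* Assumption: an index stands in for the relative handler pointer. */
enum handler_kind { HANDLER_DEFAULT, HANDLER_FAULT, HANDLER_EXT, HANDLER_COUNT };

/* Three fields per entry: faulting instruction, fixup target, handler. */
struct ex_entry {
    int64_t insn;
    int64_t fixup;
    enum handler_kind handler;
};

typedef bool (*ex_handler_fn)(const struct ex_entry *, struct sim_regs *, int);

/* 1: Legacy, just jump to the fixup address. */
static bool handle_default(const struct ex_entry *e, struct sim_regs *r, int trapnr)
{
    (void)trapnr;
    r->ip = e->fixup;
    return true;
}

/* 2: Fault, also hand the trap number to the fixup code in ax. */
static bool handle_fault(const struct ex_entry *e, struct sim_regs *r, int trapnr)
{
    r->ip = e->fixup;
    r->ax = trapnr;
    return true;
}

/* 3: uaccess variant, record the error explicitly instead of biasing the offset. */
static bool handle_ext(const struct ex_entry *e, struct sim_regs *r, int trapnr)
{
    (void)trapnr;
    r->uaccess_err = true;
    r->ip = e->fixup;
    return true;
}

static const ex_handler_fn handlers[HANDLER_COUNT] = {
    handle_default, handle_fault, handle_ext
};

/* Linear search for the faulting ip, then dispatch to the entry's handler. */
bool sim_fixup_exception(const struct ex_entry *table, size_t n,
                         struct sim_regs *regs, int trapnr)
{
    assert(table != NULL && regs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].insn == regs->ip)
            return handlers[table[i].handler](&table[i], regs, trapnr);
    }
    return false; /* no entry for this ip: the fault is not recoverable */
}
```

Because the uaccess case gets its own handler, nothing has to be inferred from the distance between the instruction and its fixup, so the fragile delta comparison from the SAUCE patch is not needed in this scheme.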