mono occasionally crashes since kernel 3.13.0-48 on multi-cpu vm
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Medium | Chris J Arges |
Trusty | Fix Released | Medium | Chris J Arges |
Utopic | Fix Released | Medium | Chris J Arges |
Vivid | Fix Released | Medium | Chris J Arges |
Bug Description
[Impact]
The addition of the commit:
http://
causes SIGSEGVs when running certain workloads on multi-CPU VMs.
[Test Case]
The following mono test case triggers the SIGSEGV:
https:/
[Fix]
These two commits are required for fixing this issue:
https:/
https:/
--
Since late March, a growing number of users have complained about frequent SIGSEGV crashes in our .net/mono application. In early April I started investigating it actively.
After ruling out native libraries as the cause and testing various mono versions, I discovered that the crashes occur more frequently on a VirtualBox VM with multiple CPUs configured, and that the mono bug-18026.cs test case crashes fairly consistently. At that point I reported it to the mono bug tracker.
I finally got a break when we found a correlation with the kernel version: 3.13.0-46 didn't crash, while 3.13.0-48 and -49 did.
As more users upgrade to these newer kernel versions, they start running into the issue, which explains the gradual increase in reports.
Early this week I performed a full git bisect on the kernel between 3.13.0-46 and -48 and isolated the commit that seems to trigger the crashes.
Namely http://
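For readers unfamiliar with bisecting, the search described above can be sketched as a binary search over an ordered commit range. This is a simplified illustration, not how git implements it: git bisect works on the commit graph rather than a flat list. The names `first_bad`, `commits`, and `is_bad` are hypothetical, and the sketch assumes the last commit in the range is known bad and that every commit after the first bad one is bad too.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of commits tested).

    Assumptions: `commits` is ordered oldest to newest, the last entry is
    known to be bad, and once a commit is bad all later ones are bad too.
    """
    lo, hi = 0, len(commits) - 1  # commits[hi] is always a bad commit
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[hi], tests
```

Each test halves the remaining range, so a range of 256 candidate commits needs 8 tests. For an intermittent crash like this one, `is_bad` would have to run the test case many times per kernel build before declaring a commit good.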
At this point I don't know whether the commit broke something or whether mono simply handles it incorrectly. However, a few commits for linux 4.x seem to fix it:
https:/
https:/
I applied these commits myself on top of commit 11f4e033, compiled, and ran the test case: it didn't crash in any of the 200 test runs I did.
I don't know, however, whether those two patches have unknown side effects.
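A repeated-run check like the one described above could be scripted roughly as follows. This is a minimal sketch, not the harness actually used for this report. The function name `tally_outcomes` is hypothetical, and it assumes a POSIX system, where Python reports a child killed by signal N as return code -N. Depending on how the runtime handles the fault, a crash may be reported as a signal other than SIGSEGV, so every signal is tallied.

```python
import signal
import subprocess
from collections import Counter
from typing import Sequence


def signal_name(number: int) -> str:
    """Map a signal number to its name, falling back to the raw number."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"signal {number}"


def tally_outcomes(cmd: Sequence[str], runs: int) -> Counter:
    """Run `cmd` `runs` times and count how each run ended."""
    outcomes: Counter = Counter()
    for _ in range(runs):
        proc = subprocess.run(cmd,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
        if proc.returncode < 0:
            # Negative return code: the child was killed by a signal.
            outcomes[signal_name(-proc.returncode)] += 1
        elif proc.returncode == 0:
            outcomes["ok"] += 1
        else:
            outcomes[f"exit {proc.returncode}"] += 1
    return outcomes
```

It might be called as `tally_outcomes(["mono", "bug-18026.exe"], 200)`, where the executable name is an assumption about how the bug-18026.cs test case was compiled.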
I'm not an expert on the kernel, not even remotely. But I thought it would be nice to be able to point at a possible solution.
My current test VM is a 64-bit VirtualBox VM installed from the 14.04.2 server ISO, running on an older quad-core i7 host with 64-bit Windows 7.
In the VM I've tested numerous mono and kernel combinations. The most recent tests used kernels 3.16.0-36 and 3.13.0-51 with mono 4.0.1, and the problem still occurs.
By now I've debugged the app using gdb several dozen times on various user setups, compiled mono half a dozen times, and then did the kernel bisect with 8 compiles of about 3 hours each :) Talk about going down the rabbit hole...
So I'm pretty desperate for some expert to help me out here. :D
Reference to mono bug report: https:/
ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-
ProcVersionSign
Uname: Linux 3.13.0-51-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Apr 30 18:53 seq
crw-rw---- 1 root audio 116, 33 Apr 30 18:53 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.10
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
CurrentDmesg: [ 9.379188] init: plymouth-
Date: Thu Apr 30 19:45:43 2015
HibernationDevice: RESUME=
InstallationDate: Installed on 2015-04-22 (7 days ago)
InstallationMedia: Ubuntu-Server 14.04.2 LTS "Trusty Tahr" - Release amd64 (20150218.1)
IwConfig:
eth0 no wireless extensions.
lo no wireless extensions.
Lsusb:
Bus 001 Device 002: ID 80ee:0021 VirtualBox USB Tablet
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: innotek GmbH VirtualBox
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware 1.127.11
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/01/2006
dmi.bios.vendor: innotek GmbH
dmi.bios.version: VirtualBox
dmi.board.name: VirtualBox
dmi.board.vendor: Oracle Corporation
dmi.board.version: 1.2
dmi.chassis.type: 1
dmi.chassis.vendor: Oracle Corporation
dmi.modalias: dmi:bvninnotekG
dmi.product.name: VirtualBox
dmi.product.
dmi.sys.vendor: innotek GmbH
summary: |
- mono occassionally crashes since kernel 3.13.0-46 on multi-cpu vm
+ mono occassionally crashes since kernel 3.13.0-48 on multi-cpu vm
tags: | added: bisect-done regression-update reverse-bisect-done |
tags: | added: cherry-pick |
Changed in linux (Ubuntu): | |
importance: | Undecided → Medium |
status: | Confirmed → Triaged |
Changed in linux (Ubuntu Utopic): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Trusty): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Utopic): | |
status: | New → In Progress |
Changed in linux (Ubuntu Vivid): | |
status: | Triaged → In Progress |
Changed in linux (Ubuntu Trusty): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in linux (Ubuntu Utopic): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in linux (Ubuntu Vivid): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in linux (Ubuntu Trusty): | |
status: | New → In Progress |
description: | updated |
Changed in linux (Ubuntu Vivid): | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Utopic): | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Trusty): | |
status: | In Progress → Fix Committed |
tags: |
added: verification-done-trusty verification-done-utopic removed: verification-needed-trusty verification-needed-utopic |
tags: |
added: verification-done-vivid removed: verification-needed-vivid |
This change was made by a bot.