[Needs 2.6.38] [sandybridge] GPU lockup (IPEHR: 0x680b0000)

Bug #718870 reported by Mario Lizier
This bug affects 14 people
Affects                            Status        Importance  Assigned to  Milestone
linux (Ubuntu)                     Fix Released  Undecided   Unassigned
xserver-xorg-video-intel (Ubuntu)  Fix Released  High        Unassigned

Bug Description

This bug appears to be specific to Sandybridge hardware. The IPEHR value varies from freeze to freeze but does not look random: 0x680b0000 has been seen in several reports from different people, along with 0x78170003 and one or two other codes.

Here is an example sequence from one reporter on a particular system:
#733206 (IPEHR: 0x78170003)
#733208 (IPEHR: 0x680b0000)
#733209 (IPEHR: 0x78170003)
#733324 (IPEHR: 0x680b0000)

His description: "The GPU lockups really soon. The time between the login and the lockup is between a few seconds and a few minutes. Opening Nautilus and trying to do something there is a guarantee for catching a lockup."
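
To make the "varies, but not randomly" observation concrete, the report list above can be grouped by IPEHR value. This is only an illustrative sketch in Python; the helper name `group_by_ipehr` is made up for this example and is not part of apport or any Launchpad tooling.

```python
import re
from collections import defaultdict

# Report lines copied from the example sequence above.
REPORTS = """\
#733206 (IPEHR: 0x78170003)
#733208 (IPEHR: 0x680b0000)
#733209 (IPEHR: 0x78170003)
#733324 (IPEHR: 0x680b0000)
"""

def group_by_ipehr(text):
    """Map each IPEHR value to the list of bug numbers that reported it."""
    groups = defaultdict(list)
    for bug, ipehr in re.findall(r"#(\d+) \(IPEHR: (0x[0-9a-fA-F]+)\)", text):
        groups[int(ipehr, 16)].append(int(bug))
    return dict(groups)

groups = group_by_ipehr(REPORTS)
for ipehr, bugs in sorted(groups.items()):
    print(f"IPEHR 0x{ipehr:08x}: bugs {bugs}")
```

On this system only two distinct values appear across the four reports, and they alternate.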

[Original Bug Report]
I believe this is the same bug as #715122, but after the last update (#142 from 2011-02-11)

[ 482.078037] do_trap: 6 callbacks suppressed
[ 482.078042] intel_gpu_top[1931] trap divide error ip:4016bf sp:7fff09c33610 error:0 in intel_gpu_top[400000+6000]
[ 750.170912] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 750.172237] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 183795 at 183794, next 183796)
[ 750.231061] divide error: 0000 [#1] SMP
[ 750.231079] last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/drm/card0/uevent
[ 750.231101] CPU 0
[ 750.231107] Modules linked in: binfmt_misc parport_pc ppdev dm_crypt snd_hda_codec_hdmi snd_hda_codec_realtek coretemp snd_hda_intel lp xhci_hcd snd_hda_codec shpchp snd_hwdep snd_pcm joydev snd_seq_midi parport snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device psmouse snd soundcore serio_raw snd_page_alloc btrfs zlib_deflate crc32c_intel libcrc32c usbhid hid i915 ahci libahci drm_kms_helper drm i2c_algo_bit video e1000e
[ 750.231243]
[ 750.231249] Pid: 5, comm: kworker/u:0 Not tainted 2.6.38-3-generic #30-Ubuntu DH67BL/
[ 750.231272] RIP: 0010:[<ffffffffa00dc758>] [<ffffffffa00dc758>] ironlake_compute_srwm+0x88/0x2b0 [i915]
[ 750.231308] RSP: 0018:ffff88020f4eb870 EFLAGS: 00010246
[ 750.231324] RAX: 0000000000000000 RBX: 00000000000005dc RCX: 0000000000000000
[ 750.231344] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88020b32a000
[ 750.231363] RBP: ffff88020f4eb8c0 R08: 0000000000000004 R09: 0000000000000000
[ 750.231383] R10: 00000000000005dc R11: ffff88020f4eb92c R12: ffff88020f4eb928
[ 750.231403] R13: 0000000010624dd3 R14: 0000000000000001 R15: ffff88020f4eb928
[ 750.231423] FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
[ 750.231446] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 750.231462] CR2: 00007fff4ea6c418 CR3: 0000000001a03000 CR4: 00000000000406f0
[ 750.231482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 750.231502] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 750.231522] Process kworker/u:0 (pid: 5, threadinfo ffff88020f4ea000, task ffff88020f4cdb00)
[ 750.231545] Stack:
[ 750.231551] 0000000000000282 ffff88020f4eb900 ffff88020f4eb8c0 ffffffff81075885
[ 750.231574] ffff88020f4ebfd8 ffff88020b388000 ffff88020f4eb92c ffff88020b388018
[ 750.231597] ffff88020f4eb924 ffff88020f4eb928 ffff88020f4eb960 ffffffffa00ddf55
[ 750.231619] Call Trace:
[ 750.231631] [<ffffffff81075885>] ? try_to_del_timer_sync+0x85/0xe0
[ 750.231653] [<ffffffffa00ddf55>] sandybridge_update_wm+0x1c5/0x3f0 [i915]
[ 750.231677] [<ffffffffa00db160>] intel_update_watermarks+0x140/0x160 [i915]
[ 750.231701] [<ffffffffa00e4b85>] ironlake_crtc_disable+0x3d5/0x900 [i915]
[ 750.231725] [<ffffffffa00e515e>] ironlake_crtc_prepare+0xe/0x10 [i915]
[ 750.231746] [<ffffffffa009d577>] drm_crtc_helper_set_mode+0x317/0x4e0 [drm_kms_helper]
[ 750.231772] [<ffffffffa009d7b2>] drm_helper_resume_force_mode+0x72/0x150 [drm_kms_helper]
[ 750.231801] [<ffffffffa0053727>] ? drm_irq_install+0x167/0x230 [drm]
[ 750.231822] [<ffffffffa00b5bf6>] i915_reset+0x266/0x4d0 [i915]
[ 750.231842] [<ffffffffa00b9f30>] ? i915_error_work_func+0x0/0x130 [i915]
[ 750.231865] [<ffffffffa00ba007>] i915_error_work_func+0xd7/0x130 [i915]
[ 750.231885] [<ffffffff8108145d>] process_one_work+0x11d/0x420
[ 750.231903] [<ffffffff81082083>] worker_thread+0x163/0x360
[ 750.231920] [<ffffffff81081f20>] ? worker_thread+0x0/0x360
[ 750.231936] [<ffffffff81086d66>] kthread+0x96/0xa0
[ 750.231951] [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10
[ 750.231969] [<ffffffff81086cd0>] ? kthread+0x0/0xa0
[ 750.231984] [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10
[ 750.232000] Code: e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 0f 1f 84 00 00 00 00 00 69 c1 e8 03 00 00 49 63 da 41 bd d3 4d 62 10 41 0f af f0 89 c2 c1 fa 1f <41> f7 f9 31 d2 48 63 c8 48 89 d8 48 f7 f1 48 b9 cf f7 53 e3 a5
[ 750.232092] RIP [<ffffffffa00dc758>] ironlake_compute_srwm+0x88/0x2b0 [i915]
[ 750.232116] RSP <ffff88020f4eb870>
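
One way to read the trap above: in the "Code:" line, the byte in angle brackets marks the instruction at which the CPU faulted. The Python sketch below (not part of the original report) pulls that instruction and the register values out of the oops text and decodes exactly one pattern, a REX-prefixed `idiv r/m32` with a register operand (opcode `F7 /7`). The helper names are hypothetical, and this is a deliberately narrow decoder under that assumption, not a disassembler.

```python
import re

# Register and Code: lines copied from the oops above.
OOPS = """\
[ 750.231324] RAX: 0000000000000000 RBX: 00000000000005dc RCX: 0000000000000000
[ 750.231344] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88020b32a000
[ 750.231363] RBP: ffff88020f4eb8c0 R08: 0000000000000004 R09: 0000000000000000
[ 750.232000] Code: e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 0f 1f 84 00 00 00 00 00 69 c1 e8 03 00 00 49 63 da 41 bd d3 4d 62 10 41 0f af f0 89 c2 c1 fa 1f <41> f7 f9 31 d2 48 63 c8 48 89 d8 48 f7 f1 48 b9 cf f7 53 e3 a5
"""

def parse_registers(text):
    """Return {register name: value} for the 64-bit registers printed in the oops."""
    pairs = re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}

def faulting_bytes(text):
    """Return the code bytes starting at the one marked <..> (the faulting RIP)."""
    tokens = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])

def idiv_divisor_register(insn):
    """Decode only 'REX F7 /7' with a register operand; return the divisor's name or None."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex & 0xF0 != 0x40 or opcode != 0xF7:
        return None
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 3 or reg != 7:  # mod 3 = register operand, /7 = idiv
        return None
    index = rm | ((rex & 1) << 3)  # REX.B extends the register number
    legacy = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"]
    return legacy[index] if index < 8 else f"R{index:02d}"

regs = parse_registers(OOPS)
divisor = idiv_divisor_register(faulting_bytes(OOPS))
print(divisor, hex(regs[divisor] & 0xFFFFFFFF))
```

For this oops the faulting bytes `41 f7 f9` decode to a signed divide by the low 32 bits of R09, and the dump shows R09 as 0, which is consistent with the "divide error" message.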

ACTHD: 0xffffffff
EIR: 0x00000000
EMR: 0xffffffff
ESR: 0x00000000
PGTBL_ER: 0x00000000
IPEHR: 0x680b0000
IPEIR: 0x00000000
INSTDONE1: 0xfffffffe
INSTDONE2: 0xffffffff
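
For comparing dumps across reports, the register block above can be loaded into a dictionary. The following is a minimal Python sketch; `parse_error_state` is a hypothetical helper name, and the code only parses the values without interpreting what any register means.

```python
import re

# Register dump copied from the report above.
DUMP = """\
ACTHD: 0xffffffff
EIR: 0x00000000
EMR: 0xffffffff
ESR: 0x00000000
PGTBL_ER: 0x00000000
IPEHR: 0x680b0000
IPEIR: 0x00000000
INSTDONE1: 0xfffffffe
INSTDONE2: 0xffffffff
"""

def parse_error_state(text):
    """Return {register name: integer value} for each 'NAME: 0x...' line."""
    pairs = re.findall(r"^(\w+): (0x[0-9a-fA-F]+)$", text, re.MULTILINE)
    return {name: int(value, 16) for name, value in pairs}

state = parse_error_state(DUMP)
# Registers that read back as all ones in this dump.
all_ones = sorted(name for name, value in state.items() if value == 0xFFFFFFFF)
print(f"IPEHR = 0x{state['IPEHR']:08x}; all ones: {all_ones}")
```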

ProblemType: Crash
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-intel 2:2.14.0-1ubuntu7
ProcVersionSignature: Ubuntu 2.6.38-3.30-generic 2.6.38-rc4
Uname: Linux 2.6.38-3-generic x86_64
Architecture: amd64
Chipset: sandybridge
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: None
DRM.card0.DP.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.DP.2:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.HDMI.A.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.HDMI.A.2:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.VGA.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1920x1080 1680x1050 1280x1024 1280x1024 1152x864 1024x768 1024x768 800x600 800x600 640x480 640x480 720x400
 edid-base64: AP///////wAebaNXAQEBAQoUAQNsMBt46pU1oVlXnycOUFSlSwCzAIGAgY9xTwEBAQEBAQEBAjqAGHE4LUBYLEUA3QwRAAAaAAAA/QA4Sx5TDwAKICAgICAgAAAA/ABFMjI0MAogICAgICAgAAAA/wBTZXJpYWxfTnVtYmVyAIg=
Date: Mon Feb 14 14:51:37 2011
DistUpgraded: Fresh install
DistroCodename: natty
DistroVariant: ubuntu
DkmsStatus:

DumpSignature: 803644b9 (IPEHR: 0x680b0000)
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
GraphicsCard: Subsystem: Intel Corporation Device [8086:2002]
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Alpha amd64 (20110121)
InterpreterPath: /usr/bin/python2.7
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:

ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-2.6.38-3-generic root=UUID=9aca5a97-fe4f-45d8-8b52-fac0552fdf14 ro
ProcKernelCmdLine_: BOOT_IMAGE=/vmlinuz-2.6.38-3-generic root=UUID=9aca5a97-fe4f-45d8-8b52-fac0552fdf14 ro
RelatedPackageVersions:
 xserver-xorg 1:7.6~3ubuntu4
 libdrm2 2.4.23-1ubuntu3
 xserver-xorg-video-intel 2:2.14.0-1ubuntu7
Renderer: Hardware acceleration
SourcePackage: xserver-xorg-video-intel
Title: [sandybridge] GPU lockup 803644b9 (IPEHR: 0x680b0000)
UserGroups:

dmi.bios.date: 10/25/2010
dmi.bios.vendor: Intel Corp.
dmi.bios.version: BLH6710H.86A.0062.2010.1025.1539
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: DH67BL
dmi.board.vendor: Intel Corporation
dmi.board.version: AAG10189-202
dmi.chassis.type: 3
dmi.modalias: dmi:bvnIntelCorp.:bvrBLH6710H.86A.0062.2010.1025.1539:bd10/25/2010:svn:pn:pvr:rvnIntelCorporation:rnDH67BL:rvrAAG10189-202:cvn:ct3:cvr:
version.compiz: compiz 1:0.9.2.1+glibmainloop4-0ubuntu11
version.libdrm2: libdrm2 2.4.23-1ubuntu3
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10-1ubuntu1
version.xserver-xorg: xserver-xorg 1:7.6~3ubuntu4
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.13.2+git20110124.fadee040-0ubuntu4
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-1ubuntu7
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110107+b795ca6e-0ubuntu4

Mario Lizier (lizier) wrote :
Bryce Harrington (bryce) wrote :

Thank you for following up with a new bug report.

This apport capture is a lot more detailed than the one in bug 715122; with that bug it's not really evident what went wrong (the error codes are all 0's), whereas this one shows that there has been a divide error. I see no reason to think it is a different lockup, so I'm marking the previous report as a duplicate of this one.

Since it sounds like you experience this crash repeatedly, can you speculate about which actions or activities seem to trigger it most often? A divide-by-zero error seems like something that would have a reproducible trigger.

description: updated
Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete
Bryce Harrington (bryce) wrote :

Hmm, it's interesting that IPEHR varies from bug to bug, but otherwise the error codes look the same.

Thanks for filing multiple reports; it gives us extra data points and demonstrates it as an ongoing issue. I think you've given us enough data though, so I think you can hold off filing more bug reports about this freeze issue until bug 718870 is resolved or until we need more testing.

It's likely this lockup is due to a bug in the kernel, and given how new sandybridge is, a kernel update may well bring a fix from upstream. So please watch whether the bug goes away after an update, and follow up here.

Meanwhile, I think the next step is to forward this bug report upstream for further attention.

Changed in xserver-xorg-video-intel (Ubuntu):
importance: Undecided → High
status: Incomplete → Confirmed
Bryce Harrington (bryce)
summary: - [sandybridge] GPU lockup 803644b9 (IPEHR: 0x680b0000)
+ [sandybridge] GPU lockup (IPEHR: 0x680b0000)
Bryce Harrington (bryce)
description: updated
description: updated
Benjamin Drung (bdrung) wrote : Re: [sandybridge] GPU lockup (IPEHR: 0x680b0000)

This bug is probably a kernel issue. I don't get this crash with the drm-intel-next kernel (version 2.6.38-997.201103081433).

Benjamin Drung (bdrung) wrote :

This bug is fixed in the vanilla kernel 2.6.38 (taken from the kernel PPA). Can the others test the vanilla kernel [1] too and report back if this kernel fixes the issue too?

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.38-natty/

Lenar (lenar) wrote :

Maybe I have another bug, but the kernels mentioned in #4 and #5 do not behave any better. The whole machine hangs within the first minute after logging in to KDE. With those two kernels it hangs so hard that I can't switch terminals, so I can't see the error message. Even SysRq doesn't work.

The latest kernel I have that does not crash or hang is 2.6.37-02063701-generic. 2.6.38-1-generic (based on rc2, I think) is the next one I have, and it is the first to show the problem.

Bryce Harrington (bryce) wrote :

Thanks for confirming this bug is resolved by an updated kernel. That kernel version should be released to Ubuntu shortly, so we can close out this bug at that time.

summary: - [sandybridge] GPU lockup (IPEHR: 0x680b0000)
+ [Needs 2.6.38] [sandybridge] GPU lockup (IPEHR: 0x680b0000)
Changed in xserver-xorg-video-intel (Ubuntu):
status: Confirmed → In Progress
Bryce Harrington (bryce) wrote :

bdrung, I gather that the kernel team has now baselined the 2.6.38 kernel, which corresponds to the one you believe is fixed from comment #5 - can you confirm that the issue is indeed gone for you? If so, this bug report can be closed out now.

Benjamin Drung (bdrung) wrote :

Yes, this issue is gone with the 2.6.38-7 kernel.

Mario Lizier (lizier) wrote :

My machine had the same behavior described in comment #5, but now I am running 2.6.38-7 without any crash in days :)

Lenar (lenar) wrote :

I wish I could say the issue is gone for me too, but I can't. It just doesn't work. The whole computer hangs, so I cannot get any messages either: I cannot log in over the network, and SysRq doesn't work.

Anyway, it is most probably another bug, so we can close this report. I won't be reporting a new one though, because I really can't get any useful information on this one, except that linux-image-2.6.37-02063701-generic is the latest one which doesn't hang.

tags: added: kj-triage
Bryce Harrington (bryce)
Changed in xserver-xorg-video-intel (Ubuntu):
status: In Progress → Fix Released
Leann Ogasawara (leannogasawara) wrote :

Marking this Fix Released for the linux task per comments 9, 10, and 11. Thanks.

Changed in linux (Ubuntu):
status: New → Fix Released
Christian Reis (kiko) wrote :

For the record, this still affects Maverick installs; to get around this, you'll need an updated 2.6.38 kernel. And, if you want working GPU acceleration, see the PPAs suggested at http://ubuntuforums.org/showpost.php?p=10358709&postcount=3
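
A quick way to check whether a given kernel release string is at least the 2.6.38-7 build that commenters in this thread reported as working is sketched below in Python. The helper names and the choice of 2.6.38-7 as the threshold are assumptions drawn from the comments above, not an official statement of where the fix landed. The check also says nothing about older kernels such as 2.6.37, which one commenter reported as unaffected.

```python
import re

def kernel_tuple(release):
    """Turn '2.6.38-7-generic' into (2, 6, 38, 7); a missing build number counts as 0."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:-(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(group or 0) for group in match.groups())

# Assumption: first version reported as working in this thread.
REPORTED_GOOD = kernel_tuple("2.6.38-7")

def at_least_reported_good(release):
    """True if the release sorts at or after 2.6.38-7."""
    return kernel_tuple(release) >= REPORTED_GOOD

for release in ("2.6.38-3-generic", "2.6.38-7-generic"):
    print(release, at_least_reported_good(release))
```

On a live system the release string could be taken from `platform.release()` in the Python standard library.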
