10de:0429 [Lenovo ThinkPad T61] frequent Xorg freezes with nouveau

Bug #711434 reported by Marc Deslauriers
This bug affects 12 people
Affects                              Status        Importance  Assigned to  Milestone
Nouveau Xorg driver                  Fix Released  High
xserver-xorg-video-nouveau (Ubuntu)  Fix Released  Low         Unassigned

Bug Description

Binary package hint: xorg

Since switching to nouveau from -nvidia, I get frequent lockups. Mouse still moves, but screen doesn't update anymore.

WORKAROUND: nouveau.noaccel=1
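
A quick way to check whether the workaround is present on a given kernel command line is sketched below. This is only an illustration: the helper name `noaccel_enabled` is made up, and the sample string is the ProcKernelCmdLine value from this report.

```python
def noaccel_enabled(cmdline: str) -> bool:
    """Return True if nouveau.noaccel=1 is one of the kernel parameters."""
    return "nouveau.noaccel=1" in cmdline.split()


# Command line taken from this report (ProcKernelCmdLine): workaround not set.
reported = ("BOOT_IMAGE=/vmlinuz-2.6.38-1-generic "
            "root=/dev/mapper/defaultvg-root ro quiet splash vt.handoff=7")
print(noaccel_enabled(reported))
# Same command line with the workaround appended.
print(noaccel_enabled(reported + " nouveau.noaccel=1"))
```

On a running system the string to test would normally come from /proc/cmdline.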

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: xorg 1:7.6~3ubuntu1
ProcVersionSignature: Ubuntu 2.6.38-1.28-generic 2.6.38-rc2
Uname: Linux 2.6.38-1-generic x86_64
Architecture: amd64
DRM.card0.DVI.D.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.LVDS.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1680x1050 1680x1050 1680x1050 1400x1050 1280x1024 1280x960 1152x864 1024x768 800x600 640x480 720x400 640x400 640x350
 edid-base64: AP///////wAwrlNAAAAAAAERAQOAIRV46of1lFdPjCcnUFQAAAABAQEBAQEBAQEBAQEBAQEBqC+Q4GAaEEAgQBMAS88QAAAZtyeQ4GAaEEAgQBMAS88QAAAZAAAADwCzCjKzCigUAQAGr3QkAAAA/gBCMTU0U1cwMSBWOSAKAIs=
DRM.card0.VGA.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
Date: Tue Feb 1 14:18:29 2011
DistUpgraded: Yes, recently upgraded Log time: 2010-12-12 14:49:15.473692
DistroCodename: natty
DistroVariant: ubuntu
DkmsStatus:

EcryptfsInUse: Yes
GraphicsCard: Subsystem: Lenovo ThinkPad T61 [17aa:20d8]
MachineType: LENOVO 6459CTO
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcEnviron:
 LANGUAGE=en_CA:en
 PATH=(custom, user)
 LANG=en_CA.UTF-8
 LC_MESSAGES=en_CA.utf8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-2.6.38-1-generic root=/dev/mapper/defaultvg-root ro quiet splash vt.handoff=7
SourcePackage: xorg
Symptom: display
Title: Xorg freeze
dmi.bios.date: 11/14/2008
dmi.bios.vendor: LENOVO
dmi.bios.version: 7LETC5WW (2.25 )
dmi.board.name: 6459CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7LETC5WW(2.25):bd11/14/2008:svnLENOVO:pn6459CTO:pvrThinkPadT61:rvnLENOVO:rn6459CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 6459CTO
dmi.product.version: ThinkPad T61
dmi.sys.vendor: LENOVO
version.libdrm2: libdrm2 2.4.23-1ubuntu3
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10-1ubuntu1
version.xserver-xorg: xserver-xorg 1:7.6~3ubuntu1
version.xserver-xorg-video-ati: xserver-xorg-video-ati N/A
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-1ubuntu4
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110107+b795ca6e-0ubuntu4

Revision history for this message
In , Vvv-oktetlabs (vvv-oktetlabs) wrote :

Created attachment 33897
Xorg log

I have observed the X server hang a few times (not too often, usually once every 1-2 days). The system continues to work properly and is still accessible over the network. Sending kill -9 to X restores the system, at the cost of losing the X session. Running strace on the X process shows it hanging on an ioctl to /dev/dri/card0:

ioctl(11, 0x40086485, 0x7fff61c01800) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(11, 0x40086485, 0x7fff61c01800) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(11, 0x40086485, 0x7fff61c01800) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
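
The request number in the strace output can be split into its fields to see what kind of call is being retried. The sketch below assumes the generic Linux ioctl layout used on x86 (8-bit number, 8-bit type, 14-bit size, 2-bit direction); `decode_ioctl` is a made-up helper name, and `DRM_COMMAND_BASE` (0x40) is the first driver-private DRM command number. It does not try to name the specific nouveau ioctl.

```python
def decode_ioctl(request: int) -> dict:
    """Split a Linux ioctl request number into its _IOC fields (x86 layout)."""
    directions = {0: "none", 1: "write", 2: "read", 3: "read/write"}
    return {
        "nr": request & 0xFF,                 # command number, bits 0-7
        "type": chr((request >> 8) & 0xFF),   # magic character, bits 8-15
        "size": (request >> 16) & 0x3FFF,     # argument size in bytes, bits 16-29
        "dir": directions[(request >> 30) & 0x3],
    }


DRM_COMMAND_BASE = 0x40  # driver-private DRM commands start here

fields = decode_ioctl(0x40086485)
print(fields)
# Index of the driver-private command, relative to DRM_COMMAND_BASE.
print(hex(fields["nr"] - DRM_COMMAND_BASE))
```

The type character 'd' marks a DRM ioctl, and the write direction with a driver-private number is consistent with the drmCommandWrite frame in the backtrace.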

The X log has a message about an event queue overflow, followed by a backtrace:
[mi] EQ overflowing. The server is probably stuck in an infinite loop.

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x28) [0x49ea18]
1: /usr/bin/X (mieqEnqueue+0x1f4) [0x49e3e4]
2: /usr/bin/X (xf86PostMotionEventP+0xce) [0x478f9e]
3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7feb9c7cd000+0x516f) [0x7feb9c7d216f]
4: /usr/bin/X (0x400000+0x6be87) [0x46be87]
5: /usr/bin/X (0x400000+0x1171c3) [0x5171c3]
6: /lib64/libpthread.so.0 (0x3e9e200000+0xf0f0) [0x3e9e20f0f0]
7: /lib64/libc.so.6 (ioctl+0x7) [0x3e9d6d6937]
8: /usr/lib64/libdrm.so.2 (drmIoctl+0x23) [0x3eb3e03383]
9: /usr/lib64/libdrm.so.2 (drmCommandWrite+0x1b) [0x3eb3e0360b]
10: /usr/lib64/libdrm_nouveau.so.1 (0x7feb9fe68000+0x2f1d) [0x7feb9fe6af1d]
11: /usr/lib64/libdrm_nouveau.so.1 (nouveau_bo_map_range+0xfc) [0x7feb9fe6b11c]
12: /usr/lib64/libdrm_nouveau.so.1 (0x7feb9fe68000+0x2106) [0x7feb9fe6a106]
13: /usr/lib64/libdrm_nouveau.so.1 (nouveau_pushbuf_flush+0x29c) [0x7feb9fe6a49c]
14: /usr/lib64/xorg/modules/libexa.so (0x7feb9dc89000+0x90e1) [0x7feb9dc920e1]
15: /usr/lib64/xorg/modules/libexa.so (0x7feb9dc89000+0x939d) [0x7feb9dc9239d]
16: /usr/bin/X (miCopyRegion+0x28d) [0x545e5d]
17: /usr/bin/X (miDoCopy+0x44a) [0x54636a]
18: /usr/lib64/xorg/modules/libexa.so (0x7feb9dc89000+0x8667) [0x7feb9dc91667]
19: /usr/bin/X (0x400000+0xd44b8) [0x4d44b8]
20: /usr/bin/X (0x400000+0x2b3bc) [0x42b3bc]
21: /usr/bin/X (0x400000+0x2c86c) [0x42c86c]
22: /usr/bin/X (0x400000+0x21e3a) [0x421e3a]
23: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x3e9d61eb1d]
24: /usr/bin/X (0x400000+0x219f9) [0x4219f9]

I'm using Fedora 12 for x86-64, uname -a output is
Linux lumos 2.6.32.9-67.fc12.x86_64 #1 SMP Sat Feb 27 09:26:40 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
xorg-x11-drv-nouveau version 0.0.15 release 20.20091105gite1c2efd.fc12

Let me know if I can provide more information or run more experiments.
Thanks, Victor

In , Vvv-oktetlabs (vvv-oktetlabs) wrote :

Created attachment 33898
dmesg output

In , Vvv-oktetlabs (vvv-oktetlabs) wrote :

Created attachment 33899
lspci output

In , Xavier (chantry-xavier) wrote :

This seems related to two existing Fedora bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=567412
https://bugzilla.redhat.com/show_bug.cgi?id=566987

When using Fedora's nouveau code, it's usually advised to report bugs in the Fedora bug tracker.
People following this bug tracker might expect you to run the latest git of all nouveau components:
http://nouveau.freedesktop.org/wiki/InstallNouveau
(two main reasons for that: 1) git is usually better and has more fixes 2) we do not know exactly what nouveau code Fedora is shipping)

As said in https://bugzilla.redhat.com/show_bug.cgi?id=567412#c6 , using the latest git of xf86-video-nouveau would at least fix the DATA ERROR you see in the kernel log and rule that out.

However, according to mwk, the data error is unlikely to be related to the hangs. Also he believes he might have the same problem using recent nouveau code. To confirm that, he would like to know the value of the 400700 register next time the machine hangs :
$ wget http://0x04.net/~mwk/pgtest/{peek.c,libio.{c,h}}
$ gcc peek.c libio.c -lpciaccess -o peek
# ./peek 0x400700

In , Vvv-oktetlabs (vvv-oktetlabs) wrote :

Sorry for misplaced report.

It happens again - ./peek 0x400700 output is
00400700: 00100001

So, is there chance this problem resolved in git driver?

In , Xavier (chantry-xavier) wrote :

(In reply to comment #4)
> Sorry for misplaced report.
>
> It happens again - ./peek 0x400700 output is
> 00400700: 00100001
>
> So, is there chance this problem resolved in git driver?
>

17:53 < mwk> shining^: heh, ok, so the bug report looks like it could be related...
17:53 < mwk> his failing status is 00100001, mine is 01b00001....
17:54 < shining^> mwk: well thats different :)
17:54 < mwk> not so different
17:54 < mwk> mine includes all of his bits
17:54 < mwk> too bad I don't know what most of these bits mean...
17:55 < mwk> 00800000 is CUDA MP execution, 00000001 is the whole PGRAPH... and that's about it
17:57 < shining^> mwk: ok. and it also hangs regularly for you ?
17:57 < shining^> "usually once per 1-2 days"
17:57 < mwk> shining^: much more often when I'm playing some 3d on it
17:58 < shining^> ok
17:58 < mwk> possibly I could get more lockups out of it, and of the 00100001 kind, if I left X running for long
17:58 < shining^> I suppose the answer to that question is no then : "So, is there chance this problem resolved in git driver?"
17:59 < mwk> with current git? /me doubts that.
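
mwk's remark that his status value "includes all of his bits" can be checked with a bitwise AND. The sketch below uses only the two register values and the two bit meanings quoted in the log; the helper names are made up for illustration, and every other bit is treated as unknown.

```python
REPORTER_STATUS = 0x00100001  # 0x400700 value reported in this thread
MWK_STATUS = 0x01B00001       # mwk's value from the IRC log

# The only bit meanings mentioned in the log; everything else is unknown.
KNOWN_BITS = {
    0x00800000: "CUDA MP execution",
    0x00000001: "whole PGRAPH",
}


def includes_all_bits(superset: int, subset: int) -> bool:
    """True if every bit set in `subset` is also set in `superset`."""
    return superset & subset == subset


def set_bits(value: int) -> list:
    """List the individual bits set in a 32-bit value, lowest first."""
    return [1 << i for i in range(32) if value & (1 << i)]


for bit in set_bits(MWK_STATUS):
    print(f"{bit:08x}", KNOWN_BITS.get(bit, "unknown"))
print(includes_all_bits(MWK_STATUS, REPORTER_STATUS))
```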

In , Vvv-oktetlabs (vvv-oktetlabs) wrote :

Thanks. I ran ./peek 0x400700 when X hung another time; its output was:
00400700: 001e0001

Just in case it helps:
- when X hangs, the mouse cursor still moves
- since the update to 2.6.32.9, X appears to hang more often than before, and usually the system cannot be restored by killing X (X restarts, but the screen remains as it was at the moment the hang occurred)

In , Marcin Kościelnicki (koriakin) wrote :

This bug is a huge mystery and I'm starting to think it's a hw bug. One possibility is that it's triggered by a particular insn sequence that DDX uses. Could you please try http://0x04.net/~mwk/nva5-hack.patch and see if the lockups still happen?

If anyone hits this bug, with the above patch or without, please compile pgtest from http://0x04.net/cgit/index.cgi/pgtest and give the output of ./peek 0x400000 0x10000 after the lockup.

In , Axel Beckert (xtaran) wrote :

I have observed this bug also on Ubuntu 10.04 (Details below):

[…]
(II) NOUVEAU(0): Modeline "1152x864"x0.0 108.00 1152 1216 1344 1600 864 865 868 900 +hsync +vsync (67.5 kHz)
(II) NOUVEAU(0): Modeline "1400x1050"x0.0 156.00 1400 1504 1648 1896 1050 1053 1057 1099 -hsync +vsync (82.3 kHz)
(II) NOUVEAU(0): Modeline "1440x900"x0.0 106.50 1440 1520 1672 1904 900 903 909 934 -hsync +vsync (55.9 kHz)
(II) NOUVEAU(0): Modeline "1440x900"x0.0 136.75 1440 1536 1688 1936 900 903 909 942 -hsync +vsync (70.6 kHz)
(II) NOUVEAU(0): Modeline "1680x1050"x0.0 146.25 1680 1784 1960 2240 1050 1053 1059 1089 -hsync +vsync (65.3 kHz)
[mi] EQ overflowing. The server is probably stuck in an infinite loop.

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x28) [0x4a3248]
1: /usr/bin/X (mieqEnqueue+0x1f4) [0x4a2ac4]
2: /usr/bin/X (xf86PostMotionEventP+0xc4) [0x47cea4]
3: /usr/lib/xorg/modules/input/evdev_drv.so (0x7f5182f10000+0x53cf) [0x7f5182f153cf]
4: /usr/bin/X (0x400000+0x6fca7) [0x46fca7]
5: /usr/bin/X (0x400000+0x11d1f3) [0x51d1f3]
6: /lib/libpthread.so.0 (0x7f518823e000+0xf8f0) [0x7f518824d8f0]
7: /lib/libc.so.6 (ioctl+0x7) [0x7f5186ff6157]
8: /lib/libdrm.so.2 (drmIoctl+0x28) [0x7f51855a75b8]
9: /lib/libdrm.so.2 (drmCommandWrite+0x1b) [0x7f51855a783b]
10: /lib/libdrm_nouveau.so.1 (0x7f5184f69000+0x2f7d) [0x7f5184f6bf7d]
11: /lib/libdrm_nouveau.so.1 (nouveau_bo_map_range+0xfc) [0x7f5184f6c1bc]
12: /lib/libdrm_nouveau.so.1 (0x7f5184f69000+0x2166) [0x7f5184f6b166]
13: /lib/libdrm_nouveau.so.1 (nouveau_pushbuf_flush+0x29c) [0x7f5184f6b4fc]
14: /usr/lib/xorg/modules/libexa.so (0x7f51840c4000+0x8876) [0x7f51840cc876]
15: /usr/lib/xorg/modules/libexa.so (0x7f51840c4000+0x9415) [0x7f51840cd415]
16: /usr/bin/X (0x400000+0xd8a7b) [0x4d8a7b]
17: /usr/bin/X (0x400000+0x2ebb4) [0x42ebb4]
18: /usr/bin/X (0x400000+0x30c3c) [0x430c3c]
19: /usr/bin/X (0x400000+0x261aa) [0x4261aa]
20: /lib/libc.so.6 (__libc_start_main+0xfd) [0x7f5186f36c4d]
21: /usr/bin/X (0x400000+0x25d59) [0x425d59]

strace shows tons of this:

[…]
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(8, 0x40086485, 0x7fff98e3a6d0) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(8, 0x40086485, 0x7fff98e3a6d0) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(8, 0x40086485, 0x7fff98e3a6d0) = ? ERESTARTSYS (To be restarted)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
ioctl(8, 0x40086485, 0x7fff98e3a6d0) = ? ERESTARTSYS (To be restarted)
[…]

neper:~# uname -a
Linux neper 2.6.32-22-generic #33-Ubuntu SMP Wed Apr 28 13:28:05 UTC 2010 x86_64 GNU/Linux
neper:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 10.04 LTS
Release: 10.04
Codename: lucid
neper:~# dpkg -l | fgrep nouveau
ii libdrm-nouveau1 2.4.18-1ubuntu3 […]
ii xserver-xorg-video-nouveau 1:0.0.15...


In , Marcin Kościelnicki (koriakin) wrote :

*** Bug 28320 has been marked as a duplicate of this bug. ***

In , Mark-hagger (mark-hagger) wrote :

(In reply to comment #7)
> If anyone hits this bug, with the above patch or without, please compile pgtest
> from http://0x04.net/cgit/index.cgi/pgtest and give the output of ./peek
> 0x400000 0x10000 after the lockup.

I can get this to hang almost at will on a fresh Fedora 13 install. I've attached the output from the peek run; hope it helps.

In , Mark-hagger (mark-hagger) wrote :

Created attachment 36471
Xorg and peek output after hang

In , Greg Wilkins (gregw-wiltel) wrote :

(In reply to comment #7)
> This bug is a huge mystery and I'm starting to think it's a hw bug.

I'm getting the same symptoms on a fresh install of Ubuntu 10.04 on a brand new Lenovo W510. X locks up about 2 or 3 times per day; the screen stays frozen even if X is restarted.

  2.6.32-22-generic-pae #36-Ubuntu SMP Thu Jun 3 23:14:23 UTC 2010 i686

Next time it happens, I'll ssh into the machine and gather more information.

In , Greg Wilkins (gregw-wiltel) wrote :

Created attachment 36485
dmesg taken during X lockup

It happened again....
Here is the dmesg output that I captured by ssh-ing in while X was at 100% CPU.

I also thought I had captured an strace... but I made a mistake. I'll capture that next time. Is there anything else you need, this is happening every few hours for me?

In , Mark-hagger (mark-hagger) wrote :

(In reply to comment #13)
> that next time. Is there anything else you need, this is happening every few
> hours for me?
See comment #7: Marcin says he wants the output from "peek" after a hang has occurred.

In , Marcin Kościelnicki (koriakin) wrote :

Status update.

First, we don't need any additional dumps - Ben found a reliable way to reproduce this bug semi-instantly, so we can make any dumps we need.

But, we still have no idea what's causing this bug. And right now all developers hunting for this bug are either too busy with other things, or on vacation.

So, please no more dumps. You have four options:

 - use other driver
 - disable acceleration [nouveau.noaccel=1 on kernel command line]
 - decide that rebooting every few hours isn't that bad after all
 - debug and fix that bug on your own

Yes, we know the situation sucks. Sorry for that.

In , bedahr (grasch-simon-listens) wrote :

Would you mind posting the method to reproduce the bug?

Thanks!

In , Marcin Kościelnicki (koriakin) wrote :

06:16:48 <darktama> I can reproduce the issue *very* quickly with http://www.nvnews.net/vbulletin/showthread.php?t=150598
06:16:57 <darktama> the first post has an image, and a smaller thumbnail image
06:17:42 <darktama> if I scroll up and down for a bit (<1min) with the the thumbnail going off/on the screen, it'll hang

In , bedahr (grasch-simon-listens) wrote :

Hm I don't have any problems with this site (have been scrolling for almost 2 minutes now without any issues).

I am using an 8600M GT, though, but I thought I'd test it anyway because my X has hung twice in the last three days with the same symptoms...

In , Picogeyer (picogeyer) wrote :

(In reply to comment #17)
> 06:16:48 <darktama> I can reproduce the issue *very* quickly with
> http://www.nvnews.net/vbulletin/showthread.php?t=150598
> 06:16:57 <darktama> the first post has an image, and a smaller thumbnail image
> 06:17:42 <darktama> if I scroll up and down for a bit (<1min) with the the
> thumbnail going off/on the screen, it'll hang

I can't reproduce the bug with that site either using a GeForce 210, though my X hangs about twice a day.

In , Marcin Kościelnicki (koriakin) wrote :

Yeah, that's a funny thing about this testcase. Apparently it works for NVA3 and NVA5 chipsets, but not NVA8 which gt210 is based on. Maybe it's related to different TP configuration or something...

As for the 8600M problem - this is almost certainly some other bug. Not all random hangs are instances of this particular bug. Does that hang give something suspicious in dmesg?

In , bedahr (grasch-simon-listens) wrote :

I'll try to debug it the next time it happens...

In , Greg Wilkins (gregw-wiltel) wrote :

Created attachment 36508
strace taken during 100% X lockup

Here is an strace of the X process during the 100% CPU lockup. Looks like a loop to me.

I'm going to switch to the proprietary driver for a while and see if I get lockups with that. But I really want to run the open driver (it's so much more flexible), so I'm happy to switch back if you want me to try anything.

In , Greg Wilkins (gregw-wiltel) wrote :

As per comment #15, I can confirm that setting [nouveau.noaccel=1 on kernel command line] does prevent lockups (at least for 3 days). It does result in occasional strange display artefacts (e.g. black boxes left after tool tips close), but it is entirely usable and causes fewer issues than using the proprietary driver.

In , Greg Wilkins (gregw-wiltel) wrote :

squeak?

In , Marcin Kościelnicki (koriakin) wrote :

Well... the status is still the same. We [me and darktama / Ben Skeggs] both
have no fucking idea what's causing this problem.

It turns out, stuff is complicated. NVA3+ cards introduced some sort of
microcontroller on-board that runs all the time and controls stuff like power
management. In all NVA3+ traces, we see a lot of places where blob is talking
to that controller and does stuff to PGRAPH. Atm we can only assume that
lockups are caused by us not doing this.

We have no idea what exactly that microcontroller does and how to talk to it.
Worse, this microcontroller needs microcode that is uploaded by the driver.
And judging by what I've seen already, ctxprogs were a walk in the park
compared to this new uc. The progs are huge, and there are shitloads of
opcodes to RE.

So - we're blocked again on REing some magical code. And we're talking real
code this time, as opposed to ctxprogs which were mostly just a list of regs
to copy...

Until we understand what's going on, status for NVA3+ cards stays the same -
we have absolutely no idea how to fix that bug and ETA is on the order of at
least months.

And for the record: I'm going to kill anyone who calls these new progs
"voodoo", "magic", or anything like that. We already decided on the name of
"fuc progs". And the PM microcontroller is not the only place where they are
used. Right now we know of the following:

 - NV98+ cryptographic engine setup
 - NV98+ cryptographic engine ctxprogs
 - NV98+ video decoding ctxprogs
 - NVA3+ unknown engine 104xxx [DMA copier?] ctxprogs
 - NVA3+ PM microcontroller
 - NVC0 PGRAPH ctxprogs

With such a huge list of users, REing fuc progs is a high priority for me.
Keep your fingers crossed. And find lots of RE-capable people to help, this is
going to be a long ride...

PS. Note that chipset ordering is a bit funny and it does NOT follow NVxx
numerical values. The saner ordering I've come up with is at
http://github.com/pathscale/envytools/blob/master/nvchipsets.xml - it best
matches order of adding new functionality. Also, for video decoding units,
NVA0 and NV98 should be swapped, ie. NV98+ video decoding does not include
NVA0. So NVA3+ cards are NVA3, NVA5, NVA8, NVAF, NVC0. But not NVAA/NVAC.
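
To illustrate why "NVA3+" cannot be read as a numeric comparison, here is a small sketch. The set below contains only the chipsets listed in this comment and would not cover chips added later; the function names are made up for illustration.

```python
# Chipsets this comment lists as "NVA3+" (ordered by added functionality,
# not by numeric value).
NVA3_PLUS = {0xA3, 0xA5, 0xA8, 0xAF, 0xC0}


def is_nva3_plus(chipset: int) -> bool:
    """Membership test against the explicit list from the comment."""
    return chipset in NVA3_PLUS


def naive_is_nva3_plus(chipset: int) -> bool:
    """Numeric comparison: gives the wrong answer for NVAA and NVAC."""
    return chipset >= 0xA3


for chipset in (0xA3, 0xA8, 0xAA, 0xAC):
    print(f"NV{chipset:02X}", is_nva3_plus(chipset), naive_is_nva3_plus(chipset))
```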

PPS. In other words, if you want your card to work with nouveau, don't buy NVA3+ cards yet. Or, following nvidia codenames, don't buy stuff with GT21x chipset in it. Nor MCP89, but that one is too damn rare anyway.

PPPS. If you think you're able to RE microcodes, contact me. Srsly. We need more people. Badly.

In , Greg Wilkins (gregw-wiltel) wrote :

Marcin,

thanks for the update.
Sorry I can't help with the RE side of things (got my own open source projects keeping me 200% busy). But I am happy to help debug/test on my hardware.

Note that I still find using the nouveau driver without acceleration much better than using the nvidia driver with acceleration.

In , Dmytro-poplavskiy (dmytro-poplavskiy) wrote :

I also have a similar problem with a GT240 on openSUSE 11.3.
I hope the backtrace helps.

#0 0x00007fddfac33e87 in ioctl () from /lib64/libc.so.6
#1 0x00007fddf93ebc38 in drmIoctl (fd=10, request=1074291842, arg=0x7fffca3b0330) at xf86drm.c:184
#2 0x00007fddf93edf3b in drmCommandWrite (fd=<value optimized out>, drmCommandIndex=<value optimized out>, data=<value optimized out>, size=<value optimized out>)
    at xf86drm.c:2398
#3 0x00007fddf8dad07d in nouveau_bo_wait (bo=0x829b00, cpu_write=<value optimized out>, no_wait=<value optimized out>, no_block=<value optimized out>)
    at nouveau_bo.c:385
#4 0x00007fddf8dad68e in nouveau_bo_map_range (bo=0x829b00, delta=0, size=<value optimized out>, flags=8) at nouveau_bo.c:428
#5 0x00007fddf8dac21a in nouveau_pushbuf_space (chan=0x8366c0, min=<value optimized out>) at nouveau_pushbuf.c:53
#6 0x00007fddf8dac760 in nouveau_pushbuf_flush (chan=0x8366c0, min=0) at nouveau_pushbuf.c:273
#7 0x00007fddf7f0e7a5 in ?? () from /usr/lib64/xorg/modules/libexa.so
#8 0x00007fddf7f10272 in ?? () from /usr/lib64/xorg/modules/libexa.so

In , Mark Carey (careym) wrote :

Adding a me too.

NVA8, GT218

[drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2

Fedora 13. Seems to happen reproducibly after playing freeciv for 45 - 60 minutes.

[ 3407.261] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
[ 3407.262]
Backtrace:
[ 3407.272] 0: /usr/bin/Xorg (xorg_backtrace+0x3c) [0x80ad11c]
[ 3407.272] 1: /usr/bin/Xorg (mieqEnqueue+0x1b7) [0x809b8f7]
[ 3407.272] 2: /usr/bin/Xorg (xf86PostMotionEventP+0xd2) [0x80c8232]
[ 3407.272] 3: /usr/lib/xorg/modules/input/evdev_drv.so (0xe7c000+0x30a2) [0xe7f0a2]
[ 3407.272] 4: /usr/lib/xorg/modules/input/evdev_drv.so (0xe7c000+0x3349) [0xe7f349]
[ 3407.272] 5: /usr/bin/Xorg (0x8047000+0x76c30) [0x80bdc30]
[ 3407.272] 6: /usr/bin/Xorg (0x8047000+0x12db64) [0x8174b64]
[ 3407.272] 7: (vdso) (__kernel_sigreturn+0x0) [0xdba400]
[ 3407.272] 8: (vdso) (__kernel_vsyscall+0x2) [0xdba416]
[ 3407.272] 9: /lib/libc.so.6 (ioctl+0x19) [0x74c0b9]
[ 3407.272] 10: /usr/lib/libdrm.so.2 (drmIoctl+0x2e) [0x7fd6a7e]
[ 3407.272] 11: /usr/lib/libdrm.so.2 (drmCommandWrite+0x3c) [0x7fd6e0c]
[ 3407.273] 12: /usr/lib/libdrm_nouveau.so.1 (0xfb5000+0x2a9a) [0xfb7a9a]
[ 3407.273] 13: /usr/lib/libdrm_nouveau.so.1 (nouveau_bo_map_range+0xf1) [0xfb7c91]
[ 3407.273] 14: /usr/lib/libdrm_nouveau.so.1 (nouveau_bo_map+0x34) [0xfb7d64]
[ 3407.273] 15: /usr/lib/libdrm_nouveau.so.1 (0xfb5000+0x1c2e) [0xfb6c2e]
[ 3407.273] 16: /usr/lib/libdrm_nouveau.so.1 (nouveau_pushbuf_flush+0x1ca) [0xfb700a]
[ 3407.273] 17: /usr/lib/xorg/modules/drivers/nouveau_drv.so (0x131000+0x1b76f) [0x14c76f]
[ 3407.273] 18: /usr/lib/xorg/modules/libexa.so (0x163000+0x8ac3) [0x16bac3]
[ 3407.273] 19: /usr/lib/xorg/modules/libexa.so (0x163000+0x9673) [0x16c673]
[ 3407.273] 20: /usr/bin/Xorg (0x8047000+0xe6f26) [0x812df26]
[ 3407.273] 21: /usr/bin/Xorg (0x8047000+0x17967d) [0x81c067d]
[ 3407.273] 22: /usr/bin/Xorg (miCompositeRects+0x7f) [0x81c079f]
[ 3407.273] 23: /usr/bin/Xorg (CompositeRects+0x74) [0x811da34]
[ 3407.273] 24: /usr/bin/Xorg (0x8047000+0xe036d) [0x812736d]
[ 3407.273] 25: /usr/bin/Xorg (0x8047000+0xdc494) [0x8123494]
[ 3407.273] 26: /usr/bin/Xorg (0x8047000+0x50a37) [0x8097a37]
[ 3407.273] 27: /usr/bin/Xorg (0x8047000+0x1b595) [0x8062595]
[ 3407.273] 28: /lib/libc.so.6 (__libc_start_main+0xe6) [0x68dcc6]
[ 3407.273] 29: /usr/bin/Xorg (0x8047000+0x1b181) [0x8062181]

In , Marcin Kościelnicki (koriakin) wrote :

Can we please stop with the "me too"s and useless backtraces already? We already know the bug affects anyone with NVA3+ cards and the backtraces only show the fallout of the fallout of the fallout of the original problem...

Also: if you get a DMA_PUSHER, that's another bug. This bug happens with total silence from the kernel - the card just hangs without telling us why.

Now, a status update. So far, I REd the instruction set and disassembled a good chunk of that microcode. Sure enough, it pokes PGRAPH and watches for stuff on it. But the code doesn't yet fully make sense to me, and I'm temporarily away from my NVA5 card, so... could someone dump some registers for me?

I need a dump of the registers both before and after the hang. The interesting stuff is in 10axxx range, and ideally I'd want output of "./peek 10a000 1000". However, if the card doesn't like it and hangs or something upon that command, dump registers individually instead by doing "./peek X" for X being 10a008, 10a6fc, 10a4fc, 10a4f4, 10a714, 10a700, 10a704, 10a4dc, 10a690, 10a688.

Repeating that 2 or 3 times may be useful, esp. across different chipsets. But not TOO much, please.
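
For the register-by-register fallback, the individual commands can be generated rather than typed by hand. This is only a sketch: it assumes the peek binary sits in the current directory and accepts a bare hex address as shown above, and `peek_commands` is a made-up helper name. It builds the command lines without running them.

```python
# Registers requested in this comment, in the order given.
REGISTERS = ["10a008", "10a6fc", "10a4fc", "10a4f4", "10a714",
             "10a700", "10a704", "10a4dc", "10a690", "10a688"]


def peek_commands(registers, peek="./peek"):
    """Build one argv list per register; nothing is executed here."""
    return [[peek, reg] for reg in registers]


for argv in peek_commands(REGISTERS):
    print(" ".join(argv))
```

Each argv list could then be handed to subprocess.run as root on the affected machine, once before and once after the hang.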

In , Mark Carey (careym) wrote :

Created attachment 37625
peek results from before lockup

In , Mark Carey (careym) wrote :

Created attachment 37626
peek results after lockup

In , Mark Carey (careym) wrote :

Created attachment 37627
Xorg.0.log from lockup

In , Richard-coe (richard-coe) wrote :

I found this bug while researching my own X lockup issue.
This solution does not address the hardware issue documented here, but
I want to help anyone in a similar situation who does *NOT* have
the hardware lockup and for whom noaccel=1 does not work.

In my case all you have to do is move a window and then immediately
click in another window. There may be other ways to reproduce this.

The only workaround is to restart the window manager or Xorg.

Investigating the issue I found this patch:
http://cgit.freedesktop.org/xorg/xserver/patch/?id=1884db430a5680e37e94726dff46686e2218d525
Subject: Revert "dix: use the event mask of the grab for TryClientEvents."

I am using xorg-server-1.8.0 and found the issue was introduced in
xorg-server-1.6.3. The fix appears in xorg-server-1.8.2 and later.

In , Bugs-sthias (bugs-sthias) wrote :

./peek'ing my QUADRO NVS 140M results in all zeroes ("...") before and after the lockup, regardless of whether I peek the range "10a000 1000" or the single addresses. Am I doing anything wrong?

Things that cause lockups with nouveau appear to increasingly slow down the blob driver, which is therefore not an option for me either. After some time using KDE4/Skype/Firefox/OpenOffice the system becomes terribly slow with the blob.

If I can be of any help with testing, please let me know.

In , Marcin Kościelnicki (koriakin) wrote :

(In reply to comment #34)
> ./peek'ing my QUADRO NVS 140M results in all zeroes ("...") before and after
> the lockup, regardless of whether I peek the range "10a000 1000" or the single
> addresses. Am I doing anything wrong?

Yes. you didn't listen when I said this bug is NVA3+ only. Your card is NV86.

Do you get any interesting messages in kernel around the lockup? This is almost certainly some other bug.

In , Bugs-sthias (bugs-sthias) wrote :

(In reply to comment #35)
> Yes. you didn't listen when I said this bug is NVA3+ only. Your card is NV86.
>
> Do you get any interesting messages in kernel around the lockup? This is almost
> certainly some other bug.

OK, sorry for that. Yes, my kernel gives me

[29470.176297] [drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
[29470.176975] [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x1560 Data 0x00000000:0xcccccccc
[29470.176979] [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE
[29470.176990] [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x1564 Data 0x00000000:0xcccccccc
[29470.176993] [drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_BITFIELD

So, I think I'll try my luck with bug 26733 or open a new one, if they don't like my problem there either :)

In , Kavol (kavol) wrote :

Created attachment 39004
./peek 10a000 1000 after and before the crash

07:00.0 VGA compatible controller: nVidia Corporation Device 0a65 (rev a2)

here are my dumps

note that "before" is actually after reboot - don't know how to capture right before the freeze

In , Marcin Slusarz (marcin-slusarz) wrote :

*** Bug 30817 has been marked as a duplicate of this bug. ***

In , Jeffm-9 (jeffm-9) wrote :

Created attachment 39589
Before and after peek 10a000 on ION2

Here are 3 more peeks, before and after, using a different chipset as you said that might be interesting to you.

01:00.0 VGA compatible controller: nVidia Corporation GT218 [ION] (rev a2) (prog-if 00 [VGA controller])
 Flags: bus master, fast devsel, latency 0, IRQ 16
 Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
 Memory at d0000000 (64-bit, prefetchable) [size=256M]
 Memory at ce000000 (64-bit, prefetchable) [size=32M]
 I/O ports at dc00 [size=128]
 Expansion ROM at fe980000 [disabled] [size=512K]
 Capabilities: [60] Power Management version 3
 Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
 Capabilities: [78] Express Endpoint, MSI 00
 Capabilities: [b4] Vendor Specific Information: Len=14 <?>
 Capabilities: [100] Virtual Channel
 Capabilities: [128] Power Budgeting <?>
 Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
 Kernel driver in use: nouveau

I do see messages in the kernel, but as you mentioned it's a separate bug. They appear inconsistently during this failure.
[drm] nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 2
and sometimes also:
[drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - Ch 2/2 Class 0x502d Mthd 0x0240 Data 0x00000000:0x00086184
[drm] nouveau 0000:01:00.0: PGRAPH_DATA_ERROR - INVALID_VALUE

This is on openSUSE Factory, kernel 2.6.36-rc4
xorg-x11-server-7.5_1.9.0.902-69.3.x86_64
xorg-x11-driver-video-nouveau-0.0.16_20101010_8c8f15c-14.1.x86_64
libdrm-2.4.22-27.2.x86_64

In , Johannes Obermayr (jobermayr) wrote :

Jeff, please try whether it is valid with packages from home:jobermayr.

In , Jeffm-9 (jeffm-9) wrote :

(In reply to comment #40)
> Jeff, please try whether it is valid with packages from home:jobermayr.

I'm afraid so. Same symptoms. Is another dump needed or are the ones I've provided already sufficient?

In , Alex Mayorga (alex-mayorga) wrote :

(In reply to comment #3)
> However, according to mwk, the data error is unlikely to be related to the
> hangs. Also he believes he might have the same problem using recent nouveau
> code. To confirm that, he would like to know the value of the 400700 register
> next time the machine hangs :
> $ wget http://0x04.net/~mwk/pgtest/{peek.c,libio.{c,h}}
> $ gcc peek.c libio.c -lpciaccess -o peek
> # ./peek 0x400700

Landed here from bug 33357 that might be a duplicate of this one.
For n00bs like me the commands have changed a bit, I managed to figure out the first one to be:
$ wget http://0x04.net/cgit/index.cgi/pgtest/{peek.c,libio.{c,h}}

The second one gives me the following:

$ gcc peek.c libio.c -lpciaccess -o peek
peek.c:1:1: error: expected identifier or ‘(’ before ‘<’ token
peek.c:3:13: warning: character constant too long for its type
peek.c:3:53: warning: multi-character character constant
peek.c:3:63: warning: multi-character character constant
peek.c:6:12: warning: character constant too long for its type
peek.c:6:32: warning: character constant too long for its type
peek.c:7:12: warning: character constant too long for its type
peek.c:7:29: warning: character constant too long for its type
peek.c:8:11: warning: character constant too long for its type
peek.c:8:29: warning: character constant too long for its type
peek.c:8:45: warning: character constant too long for its type
peek.c:9:11: warning: character constant too long for its type
peek.c:9:29: warning: character constant too long for its type
peek.c:9:46: warning: character constant too long for its type
peek.c:9:106: warning: character constant too long for its type
peek.c:12:9: warning: multi-character character constant
peek.c:12:26: warning: character constant too long for its type
peek.c:14:11: warning: multi-character character constant
peek.c:14:38: warning: character constant too long for its type
peek.c:14:66: warning: character constant too long for its type
peek.c:14:87: warning: character constant too long for its type
peek.c:15:11: warning: multi-character character constant
peek.c:15:26: warning: character constant too long for its type
peek.c:15:66: warning: character constant too long for its type
peek.c:15:80: warning: character constant too long for its type
peek.c:15:131: warning: multi-character character constant
peek.c:15:151: warning: multi-character character constant
peek.c:15:164: error: empty character constant
peek.c:16:27: warning: character constant too long for its type
peek.c:17:23: warning: character constant too long for its type
peek.c:17:37: error: empty character constant
peek.c:17:46: warning: character constant too long for its type
peek.c:18:15: warning: multi-character character constant
peek.c:18:46: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘this’
peek.c:18:56: warning: character constant too long for its type
peek.c:19:16: warning: character constant too long for its type
peek.c:19:1: error: stray ‘\305’ in program
peek.c:19:1: error: stray ‘\233’ in program
peek.c:20:14: warning: multi-character character constant
peek.c:21:10: warning: character constant too long for its type
peek.c:21:24: warning: character cons...


Revision history for this message
Marc Deslauriers (mdeslaur) wrote :
Revision history for this message
Id2ndR (id2ndr) wrote :

I can confirm this problem. It has been happening to me since 31/01.
I tried the package from the stock repository and from the xorg-edgers PPA, but I get the same problem.

It seems to happen sooner when launching virt-manager and Firefox and minimizing/restoring these windows.

I attach the backtrace of the latest crash.

bugbot (bugbot)
affects: xorg (Ubuntu) → xserver-xorg-video-nouveau (Ubuntu)
Revision history for this message
Thomas (t-hartwig) wrote :

Same for me on a ThinkPad W510.
The system is still working, so you can reach it over the network, but the display is locked and no interaction is possible any more.

Revision history for this message
Thomas (t-hartwig) wrote :

Added my Xorg.0.log. BTW, I tried an earlier kernel, 2.6.37, but it froze as well.

Revision history for this message
Thomas (t-hartwig) wrote :

First one was the current session, now the one from the last freeze.

Revision history for this message
Thomas (t-hartwig) wrote :

Probably related:
https://bugzilla.redhat.com/show_bug.cgi?id=618143

Adding "nouveau.noaccel=1" as a boot option is suggested there. Will test and report...

Revision history for this message
Thomas (t-hartwig) wrote :

nouveau.noaccel=1 seems to work around this; I have had no freeze since yesterday. I can see no performance and 3D is not possible either.

Revision history for this message
Thomas (t-hartwig) wrote :

Wanted to write: I can see no performance degradation in my daily work, and 3D is not possible either.

Revision history for this message
In , =?ISO-8859-15?Q?Tiziano_M=FCller?= (tm-dev-zero) wrote :

I'd say the current instructions to build the peek (and other utilities) are:

git clone git://0x04.net/pgtest
cd pgtest
make

Revision history for this message
Id2ndR (id2ndr) wrote :

My system is more stable with the latest updates. Unfortunately, I don't know exactly which one helped.

Revision history for this message
Marc Deslauriers (mdeslaur) wrote :

I am still getting this 3-4 times a day.

Revision history for this message
Mariano Chavero (marianochavero) wrote :

Same problem here... It always happens when I click the Ubuntu logo.

Revision history for this message
Bryce Harrington (bryce) wrote :

Typically gpu lockup bugs all share the same symptoms (more or less) but can have completely different root causes. "Mouse moving but screen not updating" is nigh-universal to all GPU lockup bugs. Unfortunately we don't have the debugging tools available in the distro to distinguish freezes on -nouveau like we do for -intel.

So, it would *really* help if each person could file their bug report as a unique, separate bug report, rather than joining this bug report assuming it's the same issue. If you don't want to, that's ok, but we'll be focusing only on Marc's issue and once that's solved will close this bug report out.

Marc, did you attempt the nouveau.noaccel=1 workaround that Thomas mentioned? Also, the dmesg in your original bug report appears to be from a working session. What we need for analysis purposes is dmesg (and why not, /var/log/Xorg.0.log too) from when it is locked up; you will want to ssh into the box from another machine to do this.

@everyone else, again, please file a new bug report. In each case we're also going to need dmesg and /var/log/Xorg.0.log from while it's frozen. There are various help documents at http://wiki.ubuntu.com/X for troubleshooting and analyzing these kinds of issues. If your lockup happens during boot, one thing to try is to disable vesafb in grub on the kernel command line by passing vesafb.anything=1; see https://wiki.ubuntu.com/Kernel/KernelBootParameters for guidance on setting kernel parameters.
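
As a rough illustration of the log collection requested above, the sketch below is meant to be run from a second machine while the display is frozen. The function name `collect_freeze_logs`, the output directory and the host argument are assumptions for illustration only; on some systems reading the kernel log may also require root on the remote side.

```shell
# Hypothetical helper: run from a second machine while the display is frozen.
# Saves the remote kernel log and X server log into a local directory.
collect_freeze_logs() {
    host=$1                       # e.g. user@frozen-machine, assumed reachable over ssh
    outdir=${2:-freeze-logs}      # local directory for the captured logs
    mkdir -p "$outdir" || return 1
    # Kernel messages from the still-running (but frozen) session.
    ssh "$host" dmesg > "$outdir/dmesg.txt" || return 1
    # X server log of the frozen session.
    ssh "$host" cat /var/log/Xorg.0.log > "$outdir/Xorg.0.log" || return 1
}

# Example: collect_freeze_logs user@192.168.1.20 freeze-logs
```

The two resulting files can then be attached to the new bug report.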

Changed in xserver-xorg-video-nouveau (Ubuntu):
status: New → Incomplete
Revision history for this message
Marc Deslauriers (mdeslaur) wrote :

OK, I will try the nouveau.noaccel=1 workaround and will report back.
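
For readers unsure how to apply the workaround persistently, here is a minimal sketch. It assumes a GRUB 2 setup where kernel options live on a `GRUB_CMDLINE_LINUX_DEFAULT="..."` line in `/etc/default/grub`, as on Ubuntu; the helper name `add_kernel_param` is hypothetical, and other distributions may use a different file or tool.

```shell
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given grub defaults file, unless it is already present.
add_kernel_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote (GNU sed -i).
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file"
}

# Typical use (as root), followed by regenerating the GRUB configuration:
#   add_kernel_param /etc/default/grub nouveau.noaccel=1
#   update-grub
```

After a reboot, `cat /proc/cmdline` should show the option if it took effect.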

Bryce Harrington (bryce)
Changed in xserver-xorg-video-nouveau (Ubuntu):
importance: Undecided → High
Revision history for this message
In , Rtguille (rtguille) wrote :

I have a GT220 and sometimes freezes randomly:

[ 3675.146]
Backtrace:
[ 3675.153] 0: /usr/bin/X (xorg_backtrace+0x28) [0x460d18]
[ 3675.153] 1: /usr/bin/X (0x400000+0x63509) [0x463509]
[ 3675.153] 2: /lib64/libc.so.6 (0x34eca00000+0x32a20) [0x34eca32a20]
[ 3675.153] 3: /usr/lib64/xorg/modules/extensions/libdri2.so
(DRI2CloseScreen+0x24) [0x7fb95ba39a14]
[ 3675.153] 4: /usr/lib64/xorg/modules/drivers/nouveau_drv.so
(0x7fb95b809000+0xd6ab) [0x7fb95b8166ab]
[ 3675.154] 5: /usr/bin/X (0x400000+0xa3b49) [0x4a3b49]
[ 3675.154] 6: /usr/bin/X (0x400000+0x15daec) [0x55daec]
[ 3675.154] 7: /usr/bin/X (0x400000+0x2193c) [0x42193c]
[ 3675.154] 8: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x34eca1ec5d]
[ 3675.154] 9: /usr/bin/X (0x400000+0x21449) [0x421449]
[ 3675.154] Segmentation fault at address 0x10
[ 3675.154]
Fatal server error:
[ 3675.154] Caught signal 11 (Segmentation fault). Server aborting
[ 3675.154]
[ 3675.154]
Please consult the Fedora Project support
         at http://bodhi.fedoraproject.org/
 for help.
[ 3675.154] Please also check the log file at "/var/log/Xorg.0.log" for
additional information.
[ 3675.154]
[ 3675.157] (II) NOUVEAU(0): NVLeaveVT is called.

That error is a bit old.
OK, I read about the random lockups some time ago.

The interesting thing is that at some point I started to use (and still do) the
nVIDIA (P) drivers, and to my surprise they also have random GPU lockups (granted, it is another piece of software).
But when I put in my new ATI HD5670, I found that it also freezes randomly...

I run F13.
I never had issues with Slackware 13.1 and nvidia(P).
The OtherOS never froze with any card.

For F13 I must use pcie_aspm=off because of issues with the SATA controller.
But pcie_aspm=off also seems to put the PCIe bus into Gen1 mode (it halves the
link speed, and the de-emphasis also changes). I do not know whether it is normal for pcie_aspm=off to do that, or what other things do that.

For example, the nvidia(P) driver reports that the card is at Gen1 speed (the motherboard, an M4N72-E, is Gen2 and the card is also Gen2).

Is the nouveau driver expected to work with any PCIe configuration (different link speed, ASPM or no ASPM)?

I know that it may not be related, but I do not know whether subtly different PCIe configurations (or PCIe driver bugs) may lead the graphics card to behave in that way.

Thanks in advance.

Revision history for this message
Marc Deslauriers (mdeslaur) wrote :

The nouveau.noaccel=1 workaround seems to have worked for me. With that, I have not had any crashes in the past couple of days.

Bryce, do you still need a dmesg from a crash?

Revision history for this message
In , Frédéric Crozat (fcrozat) wrote :

Created attachment 43519
before and after peek on nv40 from Lenovo T410

Here is peek before and after the freeze, on kernel 2.6.38rc5, on a Lenovo T410 laptop with integrated nvidia GPU:

01:00.0 VGA compatible controller: nVidia Corporation GT218 [NVS 3100M] (rev a2) (prog-if 00 [VGA controller])
 Subsystem: Lenovo Device 2142
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0, Cache Line Size: 64 bytes
 Interrupt: pin A routed to IRQ 16
 Region 0: Memory at cc000000 (32-bit, non-prefetchable) [size=16M]
 Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
 Region 3: Memory at ce000000 (64-bit, prefetchable) [size=32M]
 Region 5: I/O ports at 2000 [size=128]
 [virtual] Expansion ROM at cd000000 [disabled] [size=512K]
 Capabilities: <access denied>
 Kernel driver in use: nouveau

Revision history for this message
In , Frepdesktop (frepdesktop) wrote :

My Xorg was also freezing: only the mouse pointer keeps moving, but nothing else happens.

I solved the problem (for now) by uncommenting the NoAccel option and setting its value to true. I'm using the xorg.conf generated with Xorg -configure with just that change, and everything seems to be working. I'm even using the compositor from xfce4 (for the real transparent xfce4-terminal), and it's working OK, though maybe not as fast as with acceleration.

For me it was very simple to reproduce the error. As soon as gdm was active (the default theme on Debian, the one with the stars and the star rocket), a click on any menu would freeze everything but the mouse movement.

Here is the log for my X when it was blocking.

Anything else I can do to help solve this?

X.Org X Server 1.9.4
Release Date: 2011-02-04
[ 147.356] X Protocol Version 11, Revision 0
[ 147.356] Build Operating System: Linux 2.6.32.28-dsa-ia32 i686 Debian
[ 147.356] Current Operating System: Linux voyager 2.6.37-1-686 #1 SMP Tue Feb 15 18:21:50 UTC 2011 i686
[ 147.356] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.37-1-686 root=UUID=b69dfdff-c215-473f-9f91-1680b2773ef1 ro single
[ 147.356] Build Date: 17 February 2011 01:25:01AM
[ 147.356] xorg-server 2:1.9.4-2 (Cyril Brulebois <email address hidden>)
[ 147.356] Current version of pixman: 0.21.4
[ 147.356] Before reporting problems, check http://wiki.x.org
 to make sure that you have the latest version.
[ 147.356] Markers: (--) probed, (**) from config file, (==) default setting,
 (++) from command line, (!!) notice, (II) informational,
 (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[ 147.356] (==) Log file: "/var/log/Xorg.0.log", Time: Fri Feb 18 03:17:25 2011
[ 147.356] (==) Using config file: "/etc/X11/xorg.conf"
[ 147.356] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[ 147.373] (==) ServerLayout "X.org Configured"
[ 147.373] (**) |-->Screen "Screen0" (0)
[ 147.373] (**) | |-->Monitor "Monitor0"
[ 147.374] (**) | |-->Device "Card0"
[ 147.374] (**) |-->Input Device "Mouse0"
[ 147.374] (**) |-->Input Device "Keyboard0"
[ 147.374] (==) Automatically adding devices
[ 147.374] (==) Automatically enabling devices
[ 147.452] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[ 147.452] Entry deleted from font path.
[ 147.541] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[ 147.541] Entry deleted from font path.
[ 147.541] (**) FontPath set to:
 /usr/share/fonts/X11/misc,
 /usr/share/fonts/X11/100dpi/:unscaled,
 /usr/share/fonts/X11/75dpi/:unscaled,
 /usr/share/fonts/X11/Type1,
 /usr/share/fonts/X11/100dpi,
 /usr/share/fonts/X11/75dpi,
 /var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType,
 built-ins,
 /usr/share/fonts/X11/misc,
 /usr/share/fonts/X11/100dpi/:unscaled,
 /usr/share/fonts/X11/75dpi/:unscaled,
 /usr/share/fonts/X11/Type1,
 /usr/share/fonts/X11/100dpi,
 /usr/share/fonts/X11/75dpi,
 /var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType,
 built-ins
[ 147.541] (**) ModulePath set to "/usr/lib/xorg/modules"
[ 147.541] (WW) AllowEmptyInput is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' wi...

Revision history for this message
In , Timo Aaltonen (tjaalton) wrote :

*** Bug 33357 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Jordan Bradley (jordan-w-bradley) wrote :

Is there anything the non-programmer can do t

Revision history for this message
In , Alex Mayorga (alex-mayorga) wrote :

FWIW you can see a peek of my frozen nVidia Corporation GT216 [GeForce GT 230M] (rev a2) at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/553789/+attachment/1863113/+files/peek-696104.txt

Revision history for this message
In , Thomas Schwinge (tschwinge) wrote :

Created attachment 43864
peek_10a000_1000-tschwinge.tar.bz2

Marcin, in case you need further data, see the
peek_10a000_1000-tschwinge.tar.bz2 which I'm just attaching. The files'
names should be explanatory. I ran each of the invocations three times.

I will now switch to using nouveau.noaccel=1, but please tell if you need
further data or need something tested.

This system is a DELL PRECISION M4500, the graphics card's lspci output:

01:00.0 VGA compatible controller: nVidia Corporation GT216 [Quadro FX 880M] (rev a2)

I hit this while setting up the system, roughly one hour after finishing
a fresh Ubuntu 10.10 maverick installation, while browsing in Firefox some
web pages about how to get suspend / hibernate / resume working reliably
-- oh the joy of installing GNU/Linux systems on new hardware. :-)

Revision history for this message
In , Pav-s (pav-s) wrote :

I have been using the git sources for two weeks, and updated to 2.6.39 yesterday:

since somewhere after 25c68aef4e6abcc3c10f593fc565c342ebe2ded8 the lockups disappeared - and I hope they won't return.

great work, thank you

Revision history for this message
In , Pav-s (pav-s) wrote :

I am sorry, the last comment was probably unclear:

Hardware: Thinkpad T410, NVS 3100M (nva8?)
Kernel: vanilla 2.6.39+ from kernel.org + nouveau tree merged

Revision history for this message
In , Emil-l-velikov (emil-l-velikov) wrote :

For anyone still experiencing this bug, here is a possible workaround (thanks to ColdFeetBob for pointing it out)

Modify* /xserver/mi/mieq.c and rebuild xserver
* Increase the "#define QUEUE_SIZE 512" to 1024, 2048 or 4096 [1]

Note that this is *not* a proper fix, as you can see in the discussion [2], and it isn't recommended for inexperienced users.

[1] http://www.nvnews.net/vbulletin/showpost.php?p=2398370&postcount=2
[2] https://bugs.freedesktop.org/show_bug.cgi?id=15473

Revision history for this message
In , Frédéric Crozat (fcrozat) wrote :

I can confirm I haven't had any freeze using the 2.6.39 kernel on a Lenovo T410 for 3 days (I had several freezes per hour before), using GNOME Shell.

Revision history for this message
In , Ildar (ildar-users) wrote :

(In reply to comment #54)
> I can confirm I haven't had any freeze using the 2.6.39 kernel on a Lenovo T410 for 3
> days (I had several freezes per hour before), using GNOME Shell.

Frederic, you forgot to mention that it's now terribly unstable. I have regular crashes like below, especially on: VT switching, Suspend-to-RAM, etc.

Details:
(II) NOUVEAU(0): NVLeaveVT is called.

Backtrace:
0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x45de88]
1: /usr/bin/Xorg (0x400000+0x62449) [0x462449]
2: /lib64/libpthread.so.0 (0x7fc7e6fb2000+0xefc0) [0x7fc7e6fc0fc0]
3: /lib64/libc.so.6 (0x7fc7e5f00000+0x7427c) [0x7fc7e5f7427c]
4: /lib64/libc.so.6 (__libc_malloc+0x70) [0x7fc7e5f76600]
5: /usr/lib64/X11/modules/libexa.so (0x7fc7e3297000+0x8d44) [0x7fc7e329fd44]
6: /usr/lib64/X11/modules/libexa.so (0x7fc7e3297000+0x53c2) [0x7fc7e329c3c2]
7: /usr/bin/Xorg (0x400000+0xa4a20) [0x4a4a20]
8: /usr/bin/Xorg (ChangeWindowAttributes+0x2e3) [0x455e23]
9: /usr/bin/Xorg (0x400000+0x27c38) [0x427c38]
10: /usr/bin/Xorg (0x400000+0x2db61) [0x42db61]
11: /usr/bin/Xorg (0x400000+0x215ce) [0x4215ce]
12: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x7fc7e5f1ec5d]
13: /usr/bin/Xorg (0x400000+0x21179) [0x421179]
Segmentation fault at address (nil)

Fatal server error:
Caught signal 11 (Segmentation fault). Server aborting

Revision history for this message
In , Skeggsb (skeggsb) wrote :

(In reply to comment #55)
> (In reply to comment #54)
> > I can confirm I haven't had any freeze using the 2.6.39 kernel on a Lenovo T410 for 3
> > days (I had several freezes per hour before), using GNOME Shell.
>
> Frederic, you forgot to mention that it's now terribly unstable. I have regular
> crashes like below, especially on: VT switching, Suspend-to-RAM, etc.
>
> Details:
> (II) NOUVEAU(0): NVLeaveVT is called.
Not even remotely related. And from the backtrace, probably not nouveau's fault either.

>
> Backtrace:
> 0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x45de88]
> 1: /usr/bin/Xorg (0x400000+0x62449) [0x462449]
> 2: /lib64/libpthread.so.0 (0x7fc7e6fb2000+0xefc0) [0x7fc7e6fc0fc0]
> 3: /lib64/libc.so.6 (0x7fc7e5f00000+0x7427c) [0x7fc7e5f7427c]
> 4: /lib64/libc.so.6 (__libc_malloc+0x70) [0x7fc7e5f76600]
> 5: /usr/lib64/X11/modules/libexa.so (0x7fc7e3297000+0x8d44) [0x7fc7e329fd44]
> 6: /usr/lib64/X11/modules/libexa.so (0x7fc7e3297000+0x53c2) [0x7fc7e329c3c2]
> 7: /usr/bin/Xorg (0x400000+0xa4a20) [0x4a4a20]
> 8: /usr/bin/Xorg (ChangeWindowAttributes+0x2e3) [0x455e23]
> 9: /usr/bin/Xorg (0x400000+0x27c38) [0x427c38]
> 10: /usr/bin/Xorg (0x400000+0x2db61) [0x42db61]
> 11: /usr/bin/Xorg (0x400000+0x215ce) [0x4215ce]
> 12: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x7fc7e5f1ec5d]
> 13: /usr/bin/Xorg (0x400000+0x21179) [0x421179]
> Segmentation fault at address (nil)
>
> Fatal server error:
> Caught signal 11 (Segmentation fault). Server aborting

Revision history for this message
In , Frédéric Crozat (fcrozat) wrote :

I didn't forget anything: nouveau is rock stable with the 2.6.39.x kernel on my T410 (no issue with VT switching; I didn't test suspend-to-RAM). Your issues are a different bug, I think (X crash, no GPU lockup).

Revision history for this message
In , List0570 (list0570) wrote :

Uuggh, this seems to be a serious problem, even when using nvidia.ko.
Why? Because xorg seems to probe drivers on startup, and somehow managing to load the nouveau module even when I want to use the nvidia module. The nouveau module then fails to unload properly (use count is always at least 1, and stays 1 even after changing to runlevel 3 and killing xorg).
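
To check the use count described above, a small sketch follows. It assumes the usual `/proc/modules` layout, where the first field is the module name and the third is its reference count; the function name `module_use_count` is hypothetical.

```shell
# Hypothetical helper: print the reference count of a kernel module, or
# "not loaded" if it is absent. Reads /proc/modules by default.
module_use_count() {
    module=$1
    modules_file=${2:-/proc/modules}
    # Field 1 is the module name, field 3 the number of current users.
    awk -v m="$module" '
        $1 == m { print $3; found = 1 }
        END     { if (!found) print "not loaded" }
    ' "$modules_file"
}

# Example: module_use_count nouveau
```

A non-zero count after X has been stopped would match the behaviour reported here.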

I could reliably hang the system just by logging out from kdm. Syslog is full of nouveau problems - why, given that I am using nvidia?
When renaming all instances of nouveau.ko under /lib/modules the hangs disappeared.

This has always been there and seems to have zilch effect:

> ls -l /etc/modprobe.d/nvidia.conf
-rw-r--r-- 1 root root 18 2011-04-29 22:55 /etc/modprobe.d/nvidia.conf
> cat /etc/modprobe.d/nvidia.conf
blacklist nouveau

Symptoms are random freezes requiring a hardware reset, but always involving some graphics operation. I can detect no other hardware malfunction, and lockup behaviour is not characteristic of general hardware faults (e.g. ram, cpu, disk).

My reset button is really polished now ;-((

System details:
openSUSE 11.4 with all updates installed,
currently that is kernel kernel-desktop-2.6.37.6-0.5.1.x86_64
xorg-x11-server-7.6_1.9.3-15.24.2.x86_64
The problem seems worse with
kernel-desktop-2.6.39.2-36.1.x86_64

01:00.0 VGA compatible controller: nVidia Corporation GT216 [GeForce GT 220] (rev a2)

Mobo is Gigabyte GA-880GA-UD3H with
NB/SB: AMD 880G / SB850
USB 2.0 + 3.0, SATA 3Gb + 6Gb (prob not relevant)
The chipset also contains an integrated ATI Radeon HD 4250 (which works fine on the OSS radeon driver, with the limits of the driver).
Phenom II X6 1090T CPU

Revision history for this message
In , Ppaalanen (ppaalanen) wrote :

(In reply to comment #58)
> Uuggh, this seems to be a serious problem, even when using nvidia.ko.
> Why? Because xorg seems to probe drivers on startup, and somehow managing to
> load the nouveau module even when I want to use the nvidia module. The nouveau
> module then fails to unload properly (use count is always at least 1, and stays
> 1 even after changing to runlevel 3 and killing xorg).

Your problem has nothing to do with this bug report.

You are (or the X server is) trying to use nouveau and nvidia drivers at the same time, which is known to cause havoc and should not be attempted. All the fallout you described fits perfectly.

The simple fix for you is to define Driver "nvidia" in your xorg.conf.
That prevents probing of any drivers and uses only what you want.
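
A minimal sketch of such a Device section, written out by a small helper, is shown below. The identifier "Card0", the helper name `write_device_section` and the example output path are assumptions for illustration; if an xorg.conf with a Device section already exists, editing its Driver line is the more direct route.

```shell
# Hypothetical helper: write a minimal Device section that pins the X driver.
# The identifier "Card0" and the output path are assumptions for illustration.
write_device_section() {
    driver=$1
    outfile=$2
    cat > "$outfile" <<EOF
Section "Device"
    Identifier "Card0"
    Driver     "$driver"
EndSection
EOF
}

# Example (as root):
#   write_device_section nvidia /etc/X11/xorg.conf.d/20-device.conf
```

The X server has to be restarted for the change to take effect.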

If you feel there is bad behavior in the X server, when it probes different drivers, you should file a bug against the X server. But first, make sure your distribution has not patched the X server driver probing code, e.g. by adding drivers not present in the original driver list. If they have, you should complain to your distribution.

Revision history for this message
In , Younes Manton (younes-m) wrote :

On , <email address hidden> wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=26980
>
> --- Comment #58 from Volker Kuhlmann <email address hidden> 2011-07-13 04:03:19 PDT ---
> Uuggh, this seems to be a serious problem, even when using nvidia.ko.
> Why? Because xorg seems to probe drivers on startup, and somehow managing to load the nouveau module even when I want to use the nvidia module. The nouveau module then fails to unload properly (use count is always at least 1, and stays 1 even after changing to runlevel 3 and killing xorg).
>
> I could reliably hang the system just by logging out from kdm. Syslog is full of nouveau problems - why, given that I am using nvidia?
> When renaming all instances of nouveau.ko under /lib/modules the hangs disappeared.
>
> This has always been there and seems to have zilch effect:
>
> > ls -l /etc/modprobe.d/nvidia.conf
> -rw-r--r-- 1 root root 18 2011-04-29 22:55 /etc/modprobe.d/nvidia.conf
> > cat /etc/modprobe.d/nvidia.conf
> blacklist nouveau
>
> Symptoms are random freezes requiring a hardware reset, but always involving some graphics operation. I can detect no other hardware malfunction, and lockup behaviour is not characteristic of general hardware faults (eg ram, cpu, disk).
>
> My reset button is really polished now ;-((
>
> System details:
> openSUSE 11.4 with all updates installed,
> currently that is kernel kernel-desktop-2.6.37.6-0.5.1.x86_64
> xorg-x11-server-7.6_1.9.3-15.24.2.x86_64
> The problem seems worse with
> kernel-desktop-2.6.39.2-36.1.x86_64
>
> 01:00.0 VGA compatible controller: nVidia Corporation GT216 [GeForce GT 220] (rev a2)
>
> Mobo is Gigabyte GA-880GA-UD3H with
> NB/SB: AMD 880G / SB850
> USB 2.0 + 3.0, SATA 3Gb + 6Gb (prob not relevant)
> The chipset also contains an integrated ATI Radeon HD 4250 (which works fine on the OSS radeon driver, with the limits of the driver).
> Phenom II X6 1090T CPU

Delete nouveau.ko.

Revision history for this message
In , Arun Raghavan (arunraghavan) wrote :

Using 3.0.4, this bug is very much still there (or at least the backtrace looks very similar). Happy to help provide more information or debug if pointed in the right direction.

(gdb) bt
#0 0x00007f5af04f3007 in ioctl ()
    at ../sysdeps/unix/syscall-template.S:82
#1 0x00007f5aeea8f878 in drmIoctl (fd=9,
    request=1074291842, arg=0x7fff50cac510)
    at /usr/src/debug/x11-libs/libdrm-2.4.26/libdrm-2.4.26/xf86drm.c:167
#2 0x00007f5aeea91c2b in drmCommandWrite (
    fd=<optimized out>, drmCommandIndex=<optimized out>,
    data=<optimized out>, size=<optimized out>)
    at /usr/src/debug/x11-libs/libdrm-2.4.26/libdrm-2.4.26/xf86drm.c:2422
#3 0x00007f5aee44311d in nouveau_bo_wait (bo=0x1b75420,
    cpu_write=<optimized out>, no_wait=<optimized out>,
    no_block=<optimized out>)
    at /usr/src/debug/x11-libs/libdrm-2.4.26/libdrm-2.4.26/nouveau/nouveau_bo.c:390
#4 0x00007f5aee443703 in nouveau_bo_map_range (
    bo=0x1b75420, delta=0, size=<optimized out>, flags=4)
    at /usr/src/debug/x11-libs/libdrm-2.4.26/libdrm-2.4.26/nouveau/nouveau_bo.c:433
#5 0x00007f5aee64d2b5 in NVAccelDownloadM2MF (
    dst_pitch=3740, dst=0x2c96644 "", h=20, w=1,
    y=<optimized out>, x=237, pspix=0x29275f0)
    at /usr/src/debug/x11-drivers/xf86-video-nouveau-0.0.16_pre20110801/xf86-video-nouveau-0.0.16_pre20110801/src/nouveau_exa.c:132
#6 nouveau_exa_download_from_screen (pspix=0x29275f0,
    x=237, y=348, w=1, h=20, dst=0x2c96644 "",
    dst_pitch=3740)
    at /usr/src/debug/x11-drivers/xf86-video-nouveau-0.0.16_pre20110801/xf86-video-nouveau-0.0.16_pre20110801/src/nouveau_exa.c:386
#7 0x00007f5aed9f81ee in exaCopyDirty (
    migrate=<optimized out>, pValidDst=0x2927680,
    pValidSrc=0x2927690,
    transfer=0x7f5aee64cd80 <nouveau_exa_download_from_screen>, fallback_index=1, sync=0x7f5aed9f69b0 <exaWaitSync>)
    at /usr/src/debug/x11-base/xorg-server-1.10.4/xorg-server-1.10.4/exa/exa_migration_classic.c:220
#8 0x00007f5aed9fac41 in exaPrepareAccessReg_mixed (
    pPixmap=0x29275f0, index=0, pReg=0x0)
    at /usr/src/debug/x11-base/xorg-server-1.10.4/xorg-server-1.10.4/exa/exa_migration_mixed.c:254
#9 0x00007f5aeda061d0 in ExaPrepareCompositeReg (
    height=20, width=1, yDst=31, xDst=234, yMask=0,
    xMask=0, ySrc=0, xSrc=0, pDst=0x28a94e0, pMask=0x0,
    pSrc=0x28a9e20, op=57 '9', pScreen=<optimized out>)
    at /usr/src/debug/x11-base/xorg-server-1.10.4/xorg-server-1.10.4/exa/exa_unaccel.c:600
#10 ExaCheckComposite (op=57 '9', pSrc=0x28a9e20,
    pMask=0x0, pDst=0x28a94e0, xSrc=0, ySrc=0, xMask=0,
    yMask=0, xDst=234, yDst=31, width=1, height=20)
    at /usr/src/debug/x11-base/xorg-server-1.10.4/xorg-server-1.10.4/exa/exa_unaccel.c:625
#11 0x00007f5aeda02189 in exaComposite (op=57 '9',
    pSrc=0x28a9e20, pMask=0x0, pDst=0x28a94e0,
    xSrc=<optimized out>, ySrc=<optimized out>, xMask=0,
    yMask=0, xDst=<optimized out>, yDst=<optimized out>,
    width=1, height=20)
    at /usr/src/debug/x11-base/xorg-server-1.10.4/xorg-server-1.10.4/exa/exa_render.c:1066
#12 0x00000000004db50a in damageComposite (op=57 '9',
    pSrc=0x28a9e20, pMask=0x0, pDst=0x28a94e0, xSrc=0,
    ySrc=0, xMask=0, yMask=0, xDst=234, yDst=31, width=1,
    height=20)
    at /u...


Revision history for this message
In , Skeggsb (skeggsb) wrote :

(In reply to comment #61)
> Using 3.0.4, this bug is very much still there (or at least the backtrace looks
> very similar). Happy to help provide more information or debug if pointed in
> the right direction.
Any GPU hang etc will have similar backtraces from X's point of view, and it's not usually useful in itself to see what's happening.

Did you have any output in your kernel log from when this occurred?

Revision history for this message
In , Arun Raghavan (arunraghavan) wrote :

(In reply to comment #62)
[...]
> Did you have any output in your kernel log from when this occurred?

There was no output after the initial module load messages.

Revision history for this message
In , Skeggsb (skeggsb) wrote :

(In reply to comment #63)
> (In reply to comment #62)
> [...]
> > Did you have any output in your kernel log from when this occurred?
>
> There was no output after the initial module load messages.

Are you able to install envytools (http://nouveau.git.sourceforge.net/git/gitweb.cgi?p=nouveau/envytools;a=summary) and run "nvapeek 0x400700 4" while the GPU is hung?

Also, how new are all your userspace components (xf86-video-nouveau, libdrm, mesa etc)?

Revision history for this message
In , Lithium-flower (lithium-flower) wrote :

I experienced this bug not long ago. Right now I'm not sure it is exactly the same, since it's not really random: it happens every time I run SuperTux.
I recovered the following kernel log lines by rebooting from another partition after the freeze.

kernel 3.0.4, Arch Linux, x86_64, fully updated
nouveau-dri 7.11-2

lspci:

01:00.0 VGA compatible controller: nVidia Corporation GT215 [GeForce GT 240] (rev a2) (prog-if 00 [VGA controller])

kernel.log:

Sep 11 08:44:25 faye kernel: [ 1297.836589] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 0
Sep 11 08:44:25 faye kernel: [ 1298.301418] [drm] nouveau 0000:01:00.0: PFIFO_INTR 0x00400000 - Ch 2
Sep 11 08:44:27 faye kernel: [ 1299.629024] [drm] nouveau 0000:01:00.0: PFIFO_INTR 0x04400000 - Ch 2
Sep 11 08:44:27 faye kernel: [ 1300.956673] [drm] nouveau 0000:01:00.0: PFIFO_INTR 0x04400000 - Ch 2
Sep 11 08:44:27 faye kernel: [ 1300.956719] psmouse.c: Wheel Mouse at isa0060/serio1/input0 lost synchronization, throwing 1 bytes away.
Sep 11 08:44:29 faye kernel: [ 1302.284342] [drm] nouveau 0000:01:00.0: PFIFO_INTR 0x00400000 - Ch 2
Sep 11 08:44:29 faye kernel: [ 1299.632344] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 5
Sep 11 08:44:31 faye kernel: [ 1302.960046] [drm] nouveau 0000:01:00.0: PGRAPH TLB flush idle timeout fail: 0x00c00f01 0x00000209 0x00001600 0x00000000
Sep 11 08:44:32 faye kernel: [ 1303.611794] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 5
Sep 11 08:44:34 faye kernel: [ 1303.615050] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 5
Sep 11 08:44:53 faye kernel: [ 1302.960046] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 0
Sep 11 08:44:53 faye kernel: [ 1305.614198] [drm] nouveau 0000:01:00.0: PFIFO_INTR 0x00400000 - Ch 2
Sep 11 08:44:53 faye kernel: [ 1307.613341] [drm] nouveau 0000:01:00.0: PGRAPH TLB flush idle timeout fail: 0x00c00f01 0x00000209 0x00001600 0x00000000
Sep 11 08:44:53 faye kernel: [ 1310.944310] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 5
Sep 11 08:44:53 faye kernel: [ 1307.613341] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 0
Sep 11 08:44:53 faye kernel: [ 1314.940910] [drm] nouveau 0000:01:00.0: PRAMIN flush timeout
Sep 11 08:44:53 faye kernel: [ 1310.941070] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 0
Sep 11 08:44:53 faye kernel: [ 1316.266913] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 5
Sep 11 08:44:53 faye kernel: [ 1312.943555] [drm] nouveau 0000:01:00.0: PGRAPH TLB flush idle timeout fail: 0x00c00f01 0x00000209 0x00001600 0x00000000
Sep 11 08:44:53 faye kernel: [ 1312.943555] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 0
Sep 11 08:44:55 faye kernel: [ 1318.264271] [drm] nouveau 0000:01:00.0: PFIFO_INTR 0x04c00000 - Ch 2
Sep 11 08:44:55 faye kernel: [ 1322.259717] psmouse.c: resync failed, issuing reconnect request
Sep 11 08:44:55 faye kernel: [ 1322.256898] [drm] nouveau 0000:01:00.0: PGRAPH TLB flush idle timeout fail: 0x00c00f01 0x00000209 0x00001600 0x00000000
Sep 11 08:44:55 faye kernel: [ 1322.256898] [drm] nouveau 0000:01:00.0: vm flush timeout: engine 0
Sep 11 08:44:55 faye kernel: [ 1323.596565] [drm] nouveau 0000:01:00.0: PFIFO_INTR 0x00400000 - Ch 2
Sep 11 08:44...
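
For anyone triaging a log like the one above, here is a minimal sketch (not part of the original report) that tallies the nouveau messages by kind. The regular expression assumes the syslog layout shown in this comment, and the `classify` grouping and both function names are hypothetical choices made for illustration, not anything defined by the driver.

```python
import re
from collections import Counter

# Assumed layout, taken from the log excerpt in this comment:
#   Sep 11 08:44:25 faye kernel: [ 1298.301418] [drm] nouveau 0000:01:00.0: PFIFO_INTR 0x00400000 - Ch 2
LINE_RE = re.compile(
    r"kernel: \[\s*(?P<ts>\d+\.\d+)\] \[drm\] nouveau (?P<dev>\S+): (?P<msg>.*)$"
)


def classify(msg):
    """Reduce a nouveau message to a coarse kind (hypothetical grouping)."""
    if msg.startswith("PFIFO_INTR"):
        return "PFIFO_INTR"
    if msg.startswith("PGRAPH TLB flush idle timeout"):
        return "PGRAPH TLB flush timeout"
    if msg.startswith("vm flush timeout"):
        # Keep the engine number, e.g. "vm flush timeout: engine 5".
        return msg.strip()
    if msg.startswith("PRAMIN flush timeout"):
        return "PRAMIN flush timeout"
    return "other"


def tally(lines):
    """Count nouveau [drm] messages per kind; non-nouveau lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[classify(match.group("msg"))] += 1
    return counts
```

Passing an open log file to `tally` (for example `tally(open("kern.log"))`, where the file name is only an example) returns a `Counter` whose `most_common()` shows which timeout dominates.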

Arun Raghavan (arunraghavan) wrote :

(In reply to comment #64)
> (In reply to comment #63)
> > (In reply to comment #62)
> > [...]
> > > Did you have any output in your kernel log from when this occurred?
> >
> > There was no output after the initial module load messages.
>
> Are you able to install envytools
> (http://nouveau.git.sourceforge.net/git/gitweb.cgi?p=nouveau/envytools;a=summary)
> and run "nvapeek 0x400700 4" while the GPU is hung?

All I get is a '...'.

> Also, how new are all your userspace components (xf86-video-nouveau, libdrm,
> mesa etc)?

mesa - 7.11
libdrm - 2.4.26
xf86-video-nouveau - 0.0.16_pre20110801 (that's the Gentoo package, which is presumably a snapshot from that date)

For what it's worth, kernel's 3.0.4, and the GPU is a 9400M (PCI id 10de:0863).

Marcin Slusarz (marcin-slusarz) wrote :

This bug was fixed in 2.6.39 ( http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=2b4cebe4e165b0ef30a138e4cf602538dea15583 ),
so I'm closing this bug report.

ColdFeetBob, Arun Raghavan: you are experiencing different bugs.
If you still can reproduce them please read http://nouveau.freedesktop.org/wiki/Bugs and open *new* bug reports. Thanks.

penalvch (penalvch) wrote :

Marc Deslauriers, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please test for this with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect 711434

tags: added: bios-outdated-2.30
description: updated
summary: - [natty] frequent Xorg freezes with nouveau
+ [Lenovo ThinkPad T61] frequent Xorg freezes with nouveau
summary: - [Lenovo ThinkPad T61] frequent Xorg freezes with nouveau
+ 10de:0429 [Lenovo ThinkPad T61] frequent Xorg freezes with nouveau
Changed in xserver-xorg-video-nouveau (Ubuntu):
importance: High → Low
Changed in nouveau:
importance: Unknown → High
status: Unknown → Fix Released
Changed in xserver-xorg-video-nouveau (Ubuntu):
status: Incomplete → Fix Released