Lucid Lynx upgrade results in nvidia driver failure

Asked by Castalia

After upgrading from Ubuntu 9.10 to 10.04, the nVidia graphics driver no longer works correctly. The system is currently
running in fall-back failsafe X server mode on a single GPU. The system uses a Core i7 (x86-64) with
two nVidia GeForce 9800 GT graphics cards and two flat panel displays. The system was working fine with
Ubuntu 9.10, including displaying the nVidia splash screen during the boot sequence. After the upgrade to
Ubuntu 10.04 through the Update Manager, the graphics driver fails to initialize properly.

It seems that there is a driver version mismatch problem. Can anyone tell me how to extricate the system from this
condition and get the nVidia driver working correctly again?

* From "nvidia-settings" (which used to work fine before the Lucid upgrade):

A dialog box immediately appears saying:

"You do not appear to be using the NVIDIA X driver.
Please edit your X configuration file (just run `nvidia-xconfig` as root),
and restart the X server."

It does not report any other information or offer any other controls.

* From "nvidia-xconfig --query-gpu-info":

Error: API mismatch: the NVIDIA kernel module has version 190.53,
but this NVIDIA driver component has version 195.36.15. Please make
sure that the kernel module and all NVIDIA driver components
have the same version.

ERROR: Unable to query GPU information

* From "nvidia-xconfig --enable-all-gpus --multigpu=On --no-twinview --xinerama":

Error: API mismatch: the NVIDIA kernel module has version 190.53,
but this NVIDIA driver component has version 195.36.15. Please make
sure that the kernel module and all NVIDIA driver components
have the same version.

ERROR: Unable to determine number of GPUs in system; cannot honor '--enable-all-gpus' option.
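
Both errors above report the same pair of versions, so the mismatch can be confirmed by pulling the two numbers out of the message. The following is only a minimal sketch, assuming the message wording shown above; the function name `nvidia_mismatch_versions` is made up for illustration and is not part of any NVIDIA tool.

```shell
#!/bin/sh
# Hypothetical helper: read an "API mismatch" message on stdin and print
# the kernel module version and the user-space component version.
nvidia_mismatch_versions() {
    # Join the wrapped lines into one so the patterns can span line breaks.
    msg=$(tr '\n' ' ')
    kernel=$(printf '%s\n' "$msg" |
        sed -n 's/.*kernel module has version \([0-9][0-9.]*[0-9]\).*/\1/p')
    user=$(printf '%s\n' "$msg" |
        sed -n 's/.*driver component has version \([0-9][0-9.]*[0-9]\).*/\1/p')
    printf 'kernel=%s userspace=%s\n' "$kernel" "$user"
}
```

Fed the error text above, this prints `kernel=190.53 userspace=195.36.15`. On a running system, the version of the loaded kernel module can also be read directly with `cat /proc/driver/nvidia/version`.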

* Hardware Drivers lists:

- NVIDIA accelerated graphics driver (version 173)
- nvidia
- NVIDIA accelerated graphics driver (version current) [Recommended]

The recommended "current" version is shown as being "activated and currently
in use."

* From /var/log/jockey.log:

2010-04-30 17:20:11,101 DEBUG: loading custom handler /usr/share/jockey/handlers/nvidia.py
2010-04-30 17:20:11,110 WARNING: modinfo for module nvidia_96 failed: ERROR: modinfo: could not find module nvidia_96

2010-04-30 17:20:11,114 DEBUG: Instantiated Handler subclass __builtin__.NvidiaDriver96 from name NvidiaDriver96
2010-04-30 17:20:11,120 DEBUG: NVIDIA accelerated graphics driver availability undetermined, adding to pool
2010-04-30 17:20:11,127 DEBUG: Instantiated Handler subclass __builtin__.NvidiaDriverCurrent from name NvidiaDriverCurrent
2010-04-30 17:20:11,133 DEBUG: NVIDIA accelerated graphics driver availability undetermined, adding to pool
2010-04-30 17:20:11,133 DEBUG: Could not instantiate Handler subclass __builtin__.NvidiaDriverBase from name NvidiaDriverBase
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/jockey/detection.py", line 914, in get_handlers
    inst = obj(backend)
TypeError: __init__() takes exactly 3 arguments (2 given)
2010-04-30 17:20:11,145 WARNING: modinfo for module nvidia_173 failed: ERROR: modinfo: could not find module nvidia_173

2010-04-30 17:20:11,149 DEBUG: Instantiated Handler subclass __builtin__.NvidiaDriver173 from name NvidiaDriver173
2010-04-30 17:20:11,157 DEBUG: NVIDIA accelerated graphics driver availability undetermined, adding to pool
[...]
2010-04-30 17:20:12,028 WARNING: modinfo for module nvidia_173 failed: ERROR: modinfo: could not find module nvidia_173

2010-04-30 17:20:13,059 DEBUG: nvidia_173 is not the alternative in use
2010-04-30 17:20:12,030 DEBUG: got handler xorg:nvidia_173([NvidiaDriver173, nonfree, disabled] NVIDIA accelerated graphics driver)
2010-04-30 17:20:13,065 DEBUG: got handler kmod:nvidia([KernelModuleHandler, nonfree, disabled] nvidia)
2010-04-30 17:20:13,270 DEBUG: got handler xorg:nvidia_current([NvidiaDriverCurrent, nonfree, enabled] NVIDIA accelerated graphics driver)
2010-04-30 17:20:13,369 DEBUG: no corresponding handler available for {'driver_type': 'kernel_module', 'kernel_module': 'nouveau', 'jockey_handler': 'KernelModuleHandler'}
2010-04-30 17:20:13,369 DEBUG: no corresponding handler available for {'driver_type': 'kernel_module', 'kernel_module': 'nvidiafb', 'jockey_handler': 'KernelModuleHandler'}
2010-04-30 17:20:13,372 DEBUG: got handler xorg:nvidia_current([NvidiaDriverCurrent, nonfree, enabled] NVIDIA accelerated graphics driver)
2010-04-30 17:20:13,471 DEBUG: no corresponding handler available for {'driver_type': 'kernel_module', 'kernel_module': 'vga16fb', 'jockey_handler': 'KernelModuleHandler'}
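
The "got handler" lines are the part of this log that shows which driver jockey believes is enabled. As a rough sketch, assuming the log line format shown above, they can be reduced to one line per handler; the function name `jockey_handlers` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list each distinct handler jockey reported together
# with its enabled/disabled state. Reads the named files, or stdin if none.
jockey_handlers() {
    sed -n 's/.*got handler \([^(]*\)(\[[^,]*, [^,]*, \([a-z]*\)].*/\1 \2/p' "$@" |
        LC_ALL=C sort -u
}
```

Running `jockey_handlers /var/log/jockey.log` on the excerpt above would list `kmod:nvidia disabled`, `xorg:nvidia_173 disabled` and `xorg:nvidia_current enabled`, which matches what Hardware Drivers displays even though the loaded kernel module is still 190.53.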

* From "lspci -vnvn" listing:

02:00.0 VGA compatible controller [0300]: nVidia Corporation G92 [GeForce 9800 GT] [10de:0605] (rev a2)
 Subsystem: eVga.com. Corp. Device [3842:c973]
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0, Cache Line Size: 64 bytes
 Interrupt: pin A routed to IRQ 16
 Region 0: Memory at f8000000 (32-bit, non-prefetchable) [size=16M]
 Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
 Region 3: Memory at f6000000 (64-bit, non-prefetchable) [size=32M]
 Region 5: I/O ports at cf00 [size=128]
 [virtual] Expansion ROM at f9000000 [disabled] [size=128K]
 Capabilities: [60] Power Management version 3
  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 PME-Enable- DSel=0 DScale=0 PME-
 Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
  Address: 0000000000000000 Data: 0000
 Capabilities: [78] Express (v2) Endpoint, MSI 00
  DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <4us
   ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
   RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
   MaxPayload 128 bytes, MaxReadReq 512 bytes
  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
  LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <256ns, L1 <1us
   ClockPM- Suprise- LLActRep- BwNot-
  LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk-
   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 Capabilities: [100] Virtual Channel <?>
 Capabilities: [128] Power Budgeting <?>
 Capabilities: [600] Vendor Specific Information <?>
 Kernel driver in use: nvidia
 Kernel modules: nvidia, nvidia-current, nvidiafb, nouveau

04:00.0 VGA compatible controller [0300]: nVidia Corporation G92 [GeForce 9800 GT] [10de:0605] (rev a2)
 Subsystem: eVga.com. Corp. Device [3842:c973]
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0, Cache Line Size: 64 bytes
 Interrupt: pin A routed to IRQ 16
 Region 0: Memory at f4000000 (32-bit, non-prefetchable) [size=16M]
 Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
 Region 3: Memory at f2000000 (64-bit, non-prefetchable) [size=32M]
 Region 5: I/O ports at df00 [size=128]
 [virtual] Expansion ROM at f5000000 [disabled] [size=128K]
 Capabilities: [60] Power Management version 3
  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 PME-Enable- DSel=0 DScale=0 PME-
 Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
  Address: 0000000000000000 Data: 0000
 Capabilities: [78] Express (v2) Endpoint, MSI 00
  DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <4us
   ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
   RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
   MaxPayload 128 bytes, MaxReadReq 512 bytes
  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
  LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <256ns, L1 <1us
   ClockPM- Suprise- LLActRep- BwNot-
  LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk-
   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 Capabilities: [100] Virtual Channel <?>
 Capabilities: [128] Power Budgeting <?>
 Capabilities: [600] Vendor Specific Information <?>
 Kernel driver in use: nvidia
 Kernel modules: nvidia, nvidia-current, nvidiafb, nouveau
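
Only the "Kernel driver in use" line of each listing matters for this problem. A small sketch, assuming the lspci output layout shown above, that prints just the slot and bound driver for each VGA controller; the function name `vga_drivers` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read verbose lspci output on stdin and print
# "<slot> <driver>" for every VGA controller that has a kernel driver bound.
vga_drivers() {
    awk '
        # A device header starts in column one with a slot such as "02:00.0".
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-9a-f] / {
            slot = $1
            vga = ($0 ~ /VGA compatible controller/)
        }
        # Detail lines are indented; keep the driver name of VGA devices only.
        vga && /Kernel driver in use:/ { print slot, $NF }
    '
}
```

With the listing above, `lspci -vnvn | vga_drivers` would print `02:00.0 nvidia` and `04:00.0 nvidia`.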

Question information

Language:
English
Status:
Solved
For:
Ubuntu
Assignee:
No assignee
Solved by:
wojox
Best wojox (wojox) said :
#1
Castalia (castalia) said :
#2

Bottom line: After following the procedure described in the Ubuntu Forum posting (http://ubuntuforums.org/showthread.php?t=1467074) the nvidia driver appears to be working again.

Caveats:

The NVIDIA-Linux-x86_64-195.36.24-pkg2.run installer built the kernel module (it reported Done/100%) but then did nothing else, and there was no way to exit other than rebooting the system (!). On reboot the problem had not been resolved: no nvidia driver could be found. On a second try the installer hung again and the system had to be rebooted, but this time the nvidia driver was found and both displays were properly initialized and became accessible to the X server.

The "lspci -vnvn" listing, for both graphics devices, shows -

    Kernel driver in use: nvidia
    Kernel modules: nvidia, nvidiafb, nouveau

- but nvidia-current is no longer present.

Synaptic does not see the nvidia-current package as installed. How will driver upgrades be noticed by the Update Manager
without nvidia-current installed?

The Hardware Drivers utility no longer lists any drivers at all!

The "nvidia-xconfig --query-gpu-info" utility now correctly lists both GPUs with the expected descriptions.

The nvidia-settings tool now correctly displays the graphics driver and related X server information, including identifying the NVIDIA Driver Version as 195.36.24 which is what the NVIDIA-Linux-x86_64-195.36.24-pkg2.run was expected to install.

The solution to the nVidia driver problem, while ultimately successful, was daunting (I was on the verge of reverting to Ubuntu 9.10 to retreat from the calamity). As a fundamental component of the system, the nVidia graphics driver should be cleanly integrated into the software management mechanisms, including the Update Manager. Perhaps the Ubuntu folks are already working on a better solution.