Kernel oops with the 460.80 and 465.27 drivers when using DP, but not with HDMI

Bug #1930733 reported by Sønke Lorenzen
This bug affects 8 people
Affects | Status | Importance | Assigned to | Milestone
nvidia-graphics-drivers-460 (Ubuntu) | Triaged | High | Unassigned |
nvidia-graphics-drivers-465 (Ubuntu) | Triaged | High | Unassigned |

Bug Description

I get a kernel oops with the 460.80 and 465.27 drivers on Hirsute

Jun 03 16:26:57 willow kernel: Oops: 0000 [#1] SMP PTI
Jun 03 16:26:57 willow kernel: CPU: 7 PID: 2004 Comm: Xorg Tainted: P OE 5.11.0-18-generic #19-Ubuntu
Jun 03 16:26:57 willow kernel: Hardware name: System manufacturer System Product Name/PRIME H270M-PLUS, BIOS 1605 12/13/2019
Jun 03 16:26:57 willow kernel: RIP: 0010:_nv015534rm+0x1b6/0x330 [nvidia]
Jun 03 16:26:57 willow kernel: Code: 8b 87 68 05 00 00 ba 01 00 00 00 be 02 00 00 00 e8 bf eb 55 c8 41 83 c5 01 41 83 fd 1f 0f 84 0b 01 00 00 48 8b 45 10 44 89 ee <48> 8b b8 70 01 00 00 48 8b 87 d8 04 00 00 e8>
Jun 03 16:26:57 willow kernel: RSP: 0000:ffffaf4201893958 EFLAGS: 00010297
Jun 03 16:26:57 willow kernel: RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000003
Jun 03 16:26:57 willow kernel: RDX: 0000000000000004 RSI: 0000000000000003 RDI: 0000000000000000
Jun 03 16:26:57 willow kernel: RBP: ffff8e318220add0 R08: 0000000000000001 R09: ffff8e318220acb8
Jun 03 16:26:57 willow kernel: R10: ffff8e3182204008 R11: 0000000010100000 R12: 0000000000000400
Jun 03 16:26:57 willow kernel: R13: 0000000000000003 R14: ffff8e3186ca8010 R15: 0000000000000800
Jun 03 16:26:57 willow kernel: FS: 00007f5807f38a40(0000) GS:ffff8e3466dc0000(0000) knlGS:0000000000000000
Jun 03 16:26:57 willow kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 03 16:26:57 willow kernel: CR2: 0000000000000170 CR3: 0000000140710005 CR4: 00000000003706e0
Jun 03 16:26:57 willow kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 03 16:26:57 willow kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 03 16:26:57 willow kernel: Call Trace:
Jun 03 16:26:57 willow kernel: ? _nv015556rm+0x7fd/0x1020 [nvidia]
Jun 03 16:26:57 willow kernel: ? _nv027155rm+0x22c/0x4f0 [nvidia]
Jun 03 16:26:57 willow kernel: ? _nv017787rm+0x303/0x5e0 [nvidia]
Jun 03 16:26:57 willow kernel: ? _nv017789rm+0xe1/0x220 [nvidia]
Jun 03 16:26:57 willow kernel: ? _nv022829rm+0xed/0x220 [nvidia]
Jun 03 16:26:57 willow kernel: ? _nv023065rm+0x30/0x60 [nvidia]
Jun 03 16:26:57 willow kernel: ? _nv000704rm+0x16da/0x22b0 [nvidia]
Jun 03 16:26:57 willow kernel: ? rm_init_adapter+0xc5/0xe0 [nvidia]
Jun 03 16:26:57 willow kernel: ? nv_open_device+0x122/0x8e0 [nvidia]
Jun 03 16:26:57 willow kernel: ? nvidia_open+0x2b7/0x560 [nvidia]
Jun 03 16:26:57 willow kernel: ? nvidia_frontend_open+0x58/0xa0 [nvidia]
Jun 03 16:26:57 willow kernel: ? chrdev_open+0xf7/0x220
Jun 03 16:26:57 willow kernel: ? cdev_device_add+0x90/0x90
Jun 03 16:26:57 willow kernel: ? do_dentry_open+0x156/0x370
Jun 03 16:26:57 willow kernel: ? vfs_open+0x2d/0x30
Jun 03 16:26:57 willow kernel: ? do_open+0x1c3/0x340
Jun 03 16:26:57 willow kernel: ? path_openat+0x10a/0x1d0
Jun 03 16:26:57 willow kernel: ? do_filp_open+0x8c/0x130
Jun 03 16:26:57 willow kernel: ? __check_object_size+0x1c/0x20
Jun 03 16:26:57 willow kernel: ? do_sys_openat2+0x9b/0x150
Jun 03 16:26:57 willow kernel: ? __x64_sys_openat+0x56/0x90
Jun 03 16:26:57 willow kernel: ? do_syscall_64+0x38/0x90
Jun 03 16:26:57 willow kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jun 03 16:26:57 willow kernel: Modules linked in: snd_seq_dummy snd_hrtimer vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) binfmt_misc zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) >
Jun 03 16:26:57 willow kernel: sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd c>
Jun 03 16:26:57 willow kernel: CR2: 0000000000000170
Jun 03 16:26:57 willow kernel: ---[ end trace 0013b6989b267f32 ]---
Jun 03 16:26:57 willow kernel: RIP: 0010:_nv015534rm+0x1b6/0x330 [nvidia]
Jun 03 16:26:57 willow kernel: Code: 8b 87 68 05 00 00 ba 01 00 00 00 be 02 00 00 00 e8 bf eb 55 c8 41 83 c5 01 41 83 fd 1f 0f 84 0b 01 00 00 48 8b 45 10 44 89 ee <48> 8b b8 70 01 00 00 48 8b 87 d8 04 00 00 e8>
Jun 03 16:26:57 willow kernel: RSP: 0000:ffffaf4201893958 EFLAGS: 00010297
Jun 03 16:26:57 willow kernel: RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000003
Jun 03 16:26:57 willow kernel: RDX: 0000000000000004 RSI: 0000000000000003 RDI: 0000000000000000
Jun 03 16:26:57 willow kernel: RBP: ffff8e318220add0 R08: 0000000000000001 R09: ffff8e318220acb8
Jun 03 16:26:57 willow kernel: R10: ffff8e3182204008 R11: 0000000010100000 R12: 0000000000000400
Jun 03 16:26:57 willow kernel: R13: 0000000000000003 R14: ffff8e3186ca8010 R15: 0000000000000800
Jun 03 16:26:57 willow kernel: FS: 00007f5807f38a40(0000) GS:ffff8e3466dc0000(0000) knlGS:0000000000000000
Jun 03 16:26:57 willow kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 03 16:26:57 willow kernel: CR2: 0000000000000170 CR3: 0000000140710005 CR4: 00000000003706e0
Jun 03 16:26:57 willow kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 03 16:26:57 willow kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 03 16:26:57 willow kernel: general protection fault, probably for non-canonical address 0xc483480f75000000: 0000 [#2] SMP PTI
Jun 03 16:26:57 willow kernel: CPU: 3 PID: 2004 Comm: Xorg Tainted: P D OE 5.11.0-18-generic #19-Ubuntu
Jun 03 16:26:57 willow kernel: Hardware name: System manufacturer System Product Name/PRIME H270M-PLUS, BIOS 1605 12/13/2019
Jun 03 16:26:57 willow kernel: RIP: 0010:_nv009368rm+0x3c/0x340 [nvidia]
Jun 03 16:26:57 willow kernel: Code: 07 0f 1f 44 00 00 31 d2 48 8b 07 48 85 c0 75 1a e9 a1 02 00 00 66 0f 1f 84 00 00 00 00 00 48 8b 48 10 48 85 c9 74 17 48 89 c8 <48> 39 30 77 ef 0f 83 29 02 00 00 48 8b 48 18>
Jun 03 16:26:57 willow kernel: RSP: 0018:ffffaf4201893d50 EFLAGS: 00010086
Jun 03 16:26:57 willow kernel: RAX: c483480f75000000 RBX: ffffaf4201893d98 RCX: c483480f75000000
Jun 03 16:26:57 willow kernel: RDX: ffffaf4201893de8 RSI: 00000000000007d4 RDI: ffffffffc2b9f6d8
Jun 03 16:26:57 willow kernel: RBP: ffff8e3143782ff0 R08: 0000000000000001 R09: 0000000000000000
Jun 03 16:26:57 willow kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Jun 03 16:26:57 willow kernel: R13: ffffffffc2b9fec0 R14: ffffaf4201893e80 R15: ffffffffc2b9cb00
Jun 03 16:26:57 willow kernel: FS: 0000000000000000(0000) GS:ffff8e3466cc0000(0000) knlGS:0000000000000000
Jun 03 16:26:57 willow kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 03 16:26:57 willow kernel: CR2: 00007fa1b0004038 CR3: 00000001105e6006 CR4: 00000000003706e0
Jun 03 16:26:57 willow kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 03 16:26:57 willow kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 03 16:26:57 willow kernel: Call Trace:
Jun 03 16:26:57 willow kernel: ? _nv039616rm+0xdf/0x1e0 [nvidia]
Jun 03 16:26:57 willow kernel: ? rm_cleanup_file_private+0x42/0x140 [nvidia]
Jun 03 16:26:57 willow kernel: ? os_free_mem+0x22/0x30 [nvidia]
Jun 03 16:26:57 willow kernel: ? nvidia_close+0x156/0x320 [nvidia]
Jun 03 16:26:57 willow kernel: ? nvidia_frontend_close+0x2f/0x50 [nvidia]
Jun 03 16:26:57 willow kernel: ? __fput+0x9f/0x250
Jun 03 16:26:57 willow kernel: ? ____fput+0xe/0x10
Jun 03 16:26:57 willow kernel: ? task_work_run+0x6d/0xa0
Jun 03 16:26:57 willow kernel: ? do_exit+0x233/0x3e0
Jun 03 16:26:57 willow kernel: ? rewind_stack_do_exit+0x17/0x20
Jun 03 16:26:57 willow kernel: Modules linked in: snd_seq_dummy snd_hrtimer vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) binfmt_misc zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) >
Jun 03 16:26:57 willow kernel: sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd c>
Jun 03 16:26:57 willow kernel: ---[ end trace 0013b6989b267f33 ]---
Jun 03 16:26:57 willow kernel: RIP: 0010:_nv015534rm+0x1b6/0x330 [nvidia]
Jun 03 16:26:57 willow kernel: Code: 8b 87 68 05 00 00 ba 01 00 00 00 be 02 00 00 00 e8 bf eb 55 c8 41 83 c5 01 41 83 fd 1f 0f 84 0b 01 00 00 48 8b 45 10 44 89 ee <48> 8b b8 70 01 00 00 48 8b 87 d8 04 00 00 e8>
Jun 03 16:26:57 willow kernel: RSP: 0000:ffffaf4201893958 EFLAGS: 00010297
Jun 03 16:26:57 willow kernel: RAX: 0000000000000000 RBX: 0000000000000400 RCX: 0000000000000003
Jun 03 16:26:57 willow kernel: RDX: 0000000000000004 RSI: 0000000000000003 RDI: 0000000000000000
Jun 03 16:26:57 willow kernel: RBP: ffff8e318220add0 R08: 0000000000000001 R09: ffff8e318220acb8
Jun 03 16:26:57 willow kernel: R10: ffff8e3182204008 R11: 0000000010100000 R12: 0000000000000400
Jun 03 16:26:57 willow kernel: R13: 0000000000000003 R14: ffff8e3186ca8010 R15: 0000000000000800
Jun 03 16:26:57 willow kernel: FS: 0000000000000000(0000) GS:ffff8e3466cc0000(0000) knlGS:0000000000000000
Jun 03 16:26:57 willow kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 03 16:26:57 willow kernel: CR2: 00007fa1b0004038 CR3: 00000001105e6006 CR4: 00000000003706e0
Jun 03 16:26:57 willow kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 03 16:26:57 willow kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 03 16:26:57 willow kernel: Fixing recursive fault but reboot is needed!

Switching to a TTY or logging in over SSH does not work at that point.
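
For anyone triaging a similar log, the key fields of the first oops can be pulled out with a short script. This is a minimal sketch, not an official tool: the function name `summarize_oops` is made up for illustration, and it only assumes the journal line format shown above. In this log the instruction marked `<48> 8b b8 70 01 00 00` is a load from RAX+0x170, RAX is 0 and CR2 is 0x170, which is what a NULL pointer dereference inside the nvidia module would look like.

```python
import re


def summarize_oops(log_text):
    """Return module, symbol, offset, CR2 and RAX of the first oops in a kernel log.

    Returns None when the expected fields are not all present.
    """
    rip = re.search(
        r"RIP: [0-9a-f]{4}:(\w+)\+(0x[0-9a-f]+)/0x[0-9a-f]+ \[(\w+)\]", log_text
    )
    cr2 = re.search(r"CR2: ([0-9a-f]{16})", log_text)
    rax = re.search(r"RAX: ([0-9a-f]{16})", log_text)
    if not (rip and cr2 and rax):
        return None
    fault_address = int(cr2.group(1), 16)
    return {
        "module": rip.group(3),
        "symbol": rip.group(1),
        "offset": int(rip.group(2), 16),
        "fault_address": fault_address,
        "rax": int(rax.group(1), 16),
        # Heuristic (an assumption): a fault inside the first page is
        # usually a NULL pointer plus a small structure offset.
        "looks_like_null_deref": fault_address < 4096,
    }
```

Fed the log above, it reports symbol `_nv015534rm` at offset 0x1b6 in the `nvidia` module with fault address 0x170.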

Installing the nvidia-driver-450-server from advanced mode got me a desktop back.

I found this upstream bug report: https://forums.developer.nvidia.com/t/465-24-02-page-fault/175782/62

There, people on different distributions report the same problem, and it might be related to DP. My monitor is connected with DP.

From lspci:

01:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] (rev a1)

I also have a laptop with a 1050 Ti where I can use these drivers on 21.04 without problems while only the built-in screen is connected. I will try the laptop with DP, and my desktop with HDMI, later.

ProblemType: Bug
DistroRelease: Ubuntu 21.04
Package: nvidia-driver-465 (not installed)
ProcVersionSignature: Ubuntu 5.11.0-18.19-generic 5.11.17
Uname: Linux 5.11.0-18-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu65.1
Architecture: amd64
CasperMD5CheckResult: unknown
CurrentDesktop: ubuntu:GNOME
Date: Thu Jun 3 18:14:16 2021
EcryptfsInUse: Yes
InstallationDate: Installed on 2013-02-22 (3022 days ago)
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Release amd64 (20121017.5)
SourcePackage: nvidia-graphics-drivers-465
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

I do not get a kernel oops with 465.27 when I attach my monitor with HDMI instead of DisplayPort

summary: - Kernel oops with the 460.80 and 465.27 drivers
+ Kernel oops with the 460.80 and 465.27 drivers when using DP, but not
+ with HDMI
Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

460.80 does also work with HDMI

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-465 (Ubuntu):
status: New → Confirmed
Changed in nvidia-graphics-drivers-460 (Ubuntu):
status: New → Confirmed
Revision history for this message
Jonas Gamao (yamiyukisenpai) wrote :

Ubuntu should revert it to older 460 for now. It's causing issues right now.

Changed in nvidia-graphics-drivers-460 (Ubuntu):
status: Confirmed → Triaged
Changed in nvidia-graphics-drivers-465 (Ubuntu):
status: Confirmed → Triaged
Changed in nvidia-graphics-drivers-460 (Ubuntu):
importance: Undecided → High
Changed in nvidia-graphics-drivers-465 (Ubuntu):
importance: Undecided → High
Revision history for this message
Giles Coochey (gilesco) wrote :

Works with DVI-D, but not if DP is connected to a device.

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

The nvidia-driver-460-server packages in the Ubuntu repositories are on 460.73.01.
I installed those, and disconnected my monitor from HDMI and connected with DisplayPort instead.

This seems to work without problems.

What is the purpose of the server packages?

Revision history for this message
Alberto Milone (albertomilone) wrote :

Thank you for taking the time to file this bug report.

NVIDIA is aware of the problem, and will deliver a fix. In the meantime, we recommend that you use either the nvidia-driver-460-server package or the nvidia-driver-450-server.

While -server drivers are usually recommended for datacentres, they also work fine on the desktop. You can read more about this on the following website:

https://docs.nvidia.com/datacenter/tesla/index.html

Revision history for this message
Jonas Gamao (yamiyukisenpai) wrote :

Why not downgrade 460 & 465 in the repo? It's causing issues with new installs. I have a friend who hit the same problem, and I had to walk her through GRUB in order to install the server driver.

Revision history for this message
Giles Coochey (gilesco) wrote :

You can choose not to install third-party drivers at install time. Then just choose 460-server in the driver manager post-install.

I think upstream have released 460.84 now... so this should be resolved soon.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-460 460.84 is available in impish-proposed, but as dkms only at the moment - not yet available as signed lrm package.

It would be nice to know if 460.84 works or not.

Revision history for this message
Gabe L (glopez4) wrote :

I posted this on the NVIDIA forums, but thought it might be helpful to cross-post here.

---

I just tried the 460 server driver solution. This did not work for me and I still have the DisplayPort bug.

I have an AMD 3800X, 64 GB RAM, an RTX 2080 Super, and a fresh install of Ubuntu 20.04 with kernel 5.8.0-55, with 3 identical 4K monitors (1 on HDMI, 2 on DP). The open-source driver continues to work just fine, but we need CUDA for our science to continue.

Here are the outputs from nvidia-smi, uname -r, and xrandr. Note that all monitors should have 4K options.

'''
(base) x@x:~$ nvidia-smi
Wed Jun 9 17:18:16 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01 Driver Version: 460.73.01 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce RTX 208... Off | 00000000:0E:00.0 On | N/A |
| 0% 45C P5 17W / 250W | 702MiB / 7981MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 918 G /usr/lib/xorg/Xorg 198MiB |
| 0 N/A N/A 1466 G /usr/lib/xorg/Xorg 349MiB |
| 0 N/A N/A 1603 G /usr/bin/gnome-shell 137MiB |
| 0 N/A N/A 2014 G gnome-control-center 2MiB |
+-----------------------------------------------------------------------------+
(base) x@x:~$ uname -r
5.8.0-55-generic
(base) x@x:~$
(base) x@x:~$ xrandr
Screen 0: minimum 8 x 8, current 7680 x 2160, maximum 32767 x 32767
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 connected 1920x1080+3840+0 (normal left inverted right x axis y axis) 1110mm x 620mm
   1920x1080 60.00 + 59.94 29.97* 23.98
   1680x1050 59.95
   1440x900 59.89
   1360x768 60.02
   1280x1024 75.02 60.02
   1280x960 60.00
   1280x800 59.81
   1280x720 60.00 59.94 29.97 23.98
   1152x864 75.00
   1024x768 75.03 70.07 60.00
   800x600 75.00 72.19 60.32
   720x480 59.94
   640x480 75.00 72.81 59.94 59.93
HDMI-0 connected primary 3840x2160+0+0 (normal left inverted right x axis ...
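
A quick way to check the "all monitors should have 4K options" observation is to parse the xrandr output and list the connected outputs that do not advertise 3840x2160. This is a minimal Python sketch: the function names are made up for illustration, and the parser assumes only the plain-text layout shown above (output lines unindented, mode lines indented).

```python
import re

# A mode line is indented and starts with WIDTHxHEIGHT.
MODE_RE = re.compile(r"^\s+(\d+)x(\d+)")


def modes_by_output(xrandr_text):
    """Map each connected output name to the list of (width, height) modes it offers."""
    outputs = {}
    current = None
    for line in xrandr_text.splitlines():
        if line and not line[0].isspace():
            # Unindented line: "Screen ...", "<name> connected ..." or
            # "<name> disconnected ...". Only connected outputs are tracked.
            parts = line.split()
            if len(parts) > 1 and parts[1] == "connected":
                current = parts[0]
                outputs[current] = []
            else:
                current = None
            continue
        match = MODE_RE.match(line)
        if match and current is not None:
            outputs[current].append((int(match.group(1)), int(match.group(2))))
    return outputs


def outputs_missing_mode(xrandr_text, width=3840, height=2160):
    """Return the sorted names of connected outputs that lack the given mode."""
    return sorted(
        name
        for name, modes in modes_by_output(xrandr_text).items()
        if (width, height) not in modes
    )
```

Calling `outputs_missing_mode()` on the text printed by `xrandr` gives the outputs that are stuck without a 4K mode; in the output above, DP-1 tops out at 1920x1080.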


Revision history for this message
Bas Schilperoord (bschilperoord) wrote :

460.84 doesn't work with DisplayPort cables, still. I have tried to install it, but it freezes at the end of the installation.

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

I have tried 460.84 and 465.31 on a fresh hirsute install upgraded to impish with proposed enabled.
Both result in a kernel oops when using DisplayPort

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

Gabe, this bug results in a completely locked-up computer, not just a wrong resolution.
I suggest you report a new bug with your problem.

Revision history for this message
Bas Schilperoord (bschilperoord) wrote :

When is this going to be fixed? It's really annoying that we have to use old drivers in order to get it to work. 460.56 works, but all the others don't.

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

This is fixed with the new 470 beta drivers from https://launchpad.net/~albertomilone/+archive/ubuntu/nvidia-testing

At least for me on my affected system.

There are no 32bit packages there yet though.

Revision history for this message
Karim BAKKAL (kabak-85) wrote :

On my computer with NVIDIA 465.27:
- When I connect a DisplayPort monitor while on the desktop: freeze, forced reboot needed.
- When the DisplayPort monitor is connected before boot: freeze during boot, forced reboot needed.

Seems OK with 470.42.01.

Revision history for this message
badasve (sve-daproject) wrote :

It has been more than a month now. I do not understand why Canonical won't just roll back the drivers, or at least allow installing an older driver from their PPA. Simply put, some people like us are left having to cope with a non-working LTS, and that's not very reassuring.

Revision history for this message
David Tantono (dtantono) wrote :

I rolled back to Xubuntu and it looks fine. I think stability is the main concern here.

Revision history for this message
Giles Coochey (gilesco) wrote :

@badasve

Is there a reason why you can't select the nvidia-driver-460-server drivers as a workaround?

I agree it would be nice to see some traction on this one, but the -server drivers have taken the urgency off of this issue for the time being.

I think they believe this is an upstream issue, so with the -server driver workaround working then they're hoping to fix-forward, rather than backward.

Revision history for this message
badasve (sve-daproject) wrote :

@gilesco The server drivers do not work properly with my multi-monitor setup: I get only the main monitor (DP); the other one (HDMI) is not detected. Everything works fine when I install 460.73 manually from the NVIDIA website.

Revision history for this message
Keith Jackson (temporal-net) wrote :

Do we have an ETA for this? I've had to essentially block any system upgrades on my desktop since this began after having to roll back a kernel update that forced this bug on me.

I noticed a new kernel update has just been posted and doesn't seem to have rolled out this issue yet again.

This isn't a minor annoyance, this is a complete system fail type bug.

This NVidia forum thread has been really useful here: https://forums.developer.nvidia.com/t/465-24-

It's looking like the bug on NVidia's side seems to have been rectified in 470.42 - Manjaro have put that to their stable branch, yet I'm not seeing anything in Ubuntu even in pre-release beyond 460.84.

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

@Keith
This does not seem to be kernel related. When I upgraded to 460.80 and 465.27 and got hit by this bug, I first tried booting the older kernels I had installed, with the same result. Also see https://forums.developer.nvidia.com/t/465-24-02-page-fault/175782 where users have tried many different kernels without it helping.

You can either install the nvidia-driver-460-server which is at 460.73.01, or install it manually if it gives other problems, see above. Or use the 470 beta from https://launchpad.net/~albertomilone/+archive/ubuntu/nvidia-testing

And then just update all other packages including kernels.

Revision history for this message
Keith Jackson (temporal-net) wrote : Re: [Bug 1930733] Re: Kernel oops with the 460.80 and 465.27 drivers when using DP, but not with HDMI

It was only kernel related in that the update from .53 to .55 also updated my NVIDIA drivers at the same time, I assume because they are a dependency. I was unable to boot my PC with that kernel at all in order to fix it or downgrade any driver packages, so I've effectively been locked on .53 ever since, with 460.73. I noticed today that a new kernel is out (.58?), but if that also updates my graphics drivers then I'm still up the creek without a paddle.

Just after some info as to when I can safely unblock my updates and upgrade safely again.

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

You have to install another driver than nvidia-driver-460 in order to update all your packages again. This is because nvidia-driver-460 does have the version 460.80 in the repositories now.

Easiest is to change to nvidia-driver-460-server if that works for you, next easiest is to enable the PPA I mentioned above to install the 470 beta driver.

Revision history for this message
Keith Jackson (temporal-net) wrote :

The server driver doesn't work for my setup. I could try the beta package but at the moment it's working as it is, updates aside. I'm just after an idea of timescale for a proper fix with 470, not in a mad rush.

I'm not inclined to experiment at the moment due to other priorities and I can't afford to be without my PC for an extended period.

As long as we get a post on here keeping us up to date with when the fixed packages are in the Ubuntu upgrade repositories that's good enough for me. Then I can just unlock everything, upgrade, cross all my fingers and hope all is well!

Revision history for this message
erio (eri0) wrote :

Apparently 470 fixes this, is there anywhere to track when those are available?

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

470 just got out of beta today: https://www.nvidia.com/Download/driverResults.aspx/177145/en-us

https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-470 shows in which repositories including PPAs it is available in. Currently only the beta is listed there.

And a bug to track the testing and releases should be listed on https://people.canonical.com/~ubuntu-archive/pending-sru.html soon.

Revision history for this message
Peabody (peabodygerald) wrote :

I am still having the DisplayPort issue caused by the .55 kernel update. I manually installed 470.42.01 yesterday and it did not resolve the issue. The 460 server drivers also do not resolve the issue.

I am forced to use the open source Nouveau drivers which give me inconsistent frame rate around 15-20fps. I've also had a handful of crashes even with the nouveau drivers where the system freezes, blackscreens with a cursor, prints a few logs about Xorg (sorry for lack of detail, I don't know where to find them), and then reinitializes to the login screen with my session logged out.

I'm using a GTX 760 Rev A with a 4K monitor over DisplayPort.

I do not see any mention of these issues in the 470.71.01 release notes NOR the release notes associated with today's kernel .63 update (aka 5.8.0.63.71~20.04.45)

**Is there any clear statement from Nvidia that these drivers fix the problem, or anything from the kernel release that would remedy the DisplayPort bug?**

Nvidia release notes (mentioned by @solorenzen) https://www.nvidia.com/Download/driverResults.aspx/177145/en-us

Revision history for this message
Peabody (peabodygerald) wrote :

Update: I installed the latest kernel update 5.8.0.63.71 and tested the NVIDIA drivers 470, 460, and 460-server that ship with this new kernel, but none fixed the problem. I even rolled back to kernel 5.8.0.55 and tried the 460-server drivers to no avail.

For those doubting, the bug presented alongside the kernel update to .59 and is mostly alleviated with X.org drivers as opposed to nvidia drivers.

All that to say I don't think the 470 driver + .63 kernel solves the DisplayPort issue for everyone.

Revision history for this message
Sønke Lorenzen (solorenzen) wrote :

Thanks. This is now fixed for me with the 470.57.02 and 460.91.03 drivers from the normal repositories in 21.04.

Revision history for this message
Daniel Dadap (ddadap) wrote :

Hello, NVIDIA employee here.

Can those who are still experiencing problems please provide the following information?

* NVIDIA driver version installed
* NVIDIA GPU model
* Exact model of affected monitor

An nvidia-bug-report.log will capture the driver version and GPU model, along with some other useful information, though possibly not the monitor model since the system may be crashing before the EDID can be read. You can generate an nvidia-bug-report.log file by running `nvidia-bug-report.sh` as root. This will create a new file named nvidia-bug-report.log.gz in your current directory.

As for the fix being absent from the release notes, that was an accidental omission. The fix should be present in 460.91.03 and later, as well as all publicly released 470 drivers. Unfortunately, the fix did not make it into 465 before support for the 465 series was discontinued. We will want to collect the relevant information for anybody still experiencing problems with DP monitors so that any remaining issues can be addressed as well.
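
The version rule in the paragraph above can be written down as a small check. This is a minimal sketch in Python: the function names `parse_version` and `has_dp_fix` are made up for illustration, and the logic encodes only what is stated above (fixed in 460.91.03 and later on the 460 branch, in all publicly released 470 drivers, and never in 465). Anything on another branch is returned as unknown rather than guessed.

```python
def parse_version(version):
    """Turn a driver version string such as '460.91.03' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_dp_fix(version):
    """Return True/False if the statement above covers this version, else None.

    False means only that the release does not contain the fix; it does not
    by itself mean the release is affected by the DisplayPort crash.
    """
    parsed = parse_version(version)
    branch = parsed[0]
    if branch == 460:
        return parsed >= (460, 91, 3)
    if branch == 465:
        return False  # the fix did not make it into 465 before it was discontinued
    if branch == 470:
        return True
    return None  # other branches are not covered by the statement above
```

For example, `has_dp_fix("460.84")` is False and `has_dp_fix("470.57.02")` is True, which matches the test results reported earlier in this thread.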

Revision history for this message
Keith Jackson (temporal-net) wrote :

I'll be attempting an upgrade next week if the drivers are now in the Canonical repositories. But should it fail, how can I get the log when my system completely locks up and dies?

In that eventuality I've only been able to recover by booting an older kernel with the older drivers. Will that not invalidate the log content on restart?

If there's a way I can stop at a terminal and dump the files to a stick before I restore, can you provide instructions on how to do that? Then, if all is in vain, at least I can get you useful info.

I'm a fairly noobish user but happy to help as much as I can.

Revision history for this message
Daniel Dadap (ddadap) wrote :

> should it fail how can I get the log when my system completely locks up and dies?

My understanding of the bug is that it occurs when the NVIDIA driver tries to bring up the DisplayPort link on certain DisplayPort monitors. You should be able to drive the system in text mode with a failing driver, to capture the log. One way to do that would be to add "systemd.unit=multi-user.target" to the kernel command line. (When the bootloader menu comes up, edit the boot options and add this argument to the line that starts with 'linux'.) If you happen to have another system on the same network you can even try SSHing into the system that reproduces the problem, then start the graphical session manually with `sudo systemctl isolate graphical`, and if the crash doesn't take down the SSH session, you ought to be able to generate the log file while the problem is occurring. (Log files taken while a bug is actively being reproduced tend to be the most useful.) Depending on the exact nature of the crash, you may even be able to SSH into a system which has already crashed, without having to go to the effort of starting it in text mode first.
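The text-mode and SSH steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested procedure: the function names and the 30-second default wait are invented here, and if the crash takes the whole machine down the last command will never run.

```shell
#!/bin/sh

# Succeed if the kernel command line in the given file asks systemd to
# boot to text mode (systemd.unit=multi-user.target).
booted_in_text_mode() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -qF 'systemd.unit=multi-user.target' "$cmdline_file"
}

# Run over SSH on the affected machine: start the graphical session,
# wait a little for the problem to show up, then capture the report.
# The wait time in seconds is a hypothetical parameter (default 30).
reproduce_and_capture() {
    if ! booted_in_text_mode; then
        echo "warning: not booted with systemd.unit=multi-user.target" >&2
    fi
    sudo systemctl isolate graphical
    sleep "${1:-30}"
    sudo nvidia-bug-report.sh
}
```

From the second machine, one would SSH into the affected system, load these functions, and run `reproduce_and_capture`. The resulting nvidia-bug-report.log.gz lands in the current directory of the SSH session.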

> In that eventuality I've only been able to recover by booting an older kernel with the older drivers. Will that not invalidate the log content on restart?

It may, or may not. nvidia-bug-report.sh attempts to capture logs from the previous boot, if they have been retained. If you do find that you are unable to capture a log file using a driver version which reproduces the problem, just make sure to note that you captured the log file after rebooting to a known good driver.
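To see for yourself whether anything from the previous boot was retained, a sketch along these lines may help. It assumes the system keeps a persistent systemd journal, which is not guaranteed, and the function names and the filter pattern are only illustrative.

```shell
#!/bin/sh

# Keep only kernel log lines that look related to the oops in this report.
filter_oops_lines() {
    grep -E 'Oops:|RIP: |\[nvidia\]'
}

# Show matching kernel messages from the previous boot, if the journal
# still has them. Prints nothing if that boot was not retained.
previous_boot_oops() {
    journalctl -k -b -1 --no-pager | filter_oops_lines
}
```

Running `journalctl --list-boots` first shows which boots the journal still holds. If `previous_boot_oops` prints the oops lines, the earlier crash was recorded on disk.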

Revision history for this message
Keith Jackson (temporal-net) wrote :

Excellent. I'll use the SSH option, as I've got a working Ubuntu laptop, if that's more useful. Thanks for the info.

Revision history for this message
Eugene Cormier (eugene-cormier) wrote :

I can confirm that NVIDIA 470.57.02 with Linux 5.11.0-25-generic works with DisplayPort on my system. All monitors (one on HDMI, two on DisplayPort, and the laptop's internal LCD) come up correctly without a crash.
