TGL-H system NV GPU fallen off the bus after resuming from s2idle with the external display connected via docking station

Bug #1929166 reported by Chris Chiu
This bug affects 1 person
Affects                    Status        Importance  Assigned to  Milestone
HWE Next                   Fix Released  Undecided   Unassigned
linux (Ubuntu)             Invalid       Undecided   Unassigned
  Focal                    Invalid       Undecided   Unassigned
linux-oem-5.10 (Ubuntu)    Invalid       Undecided   Unassigned
  Focal                    Fix Released  Undecided   Chris Chiu

Bug Description

[SRU Justification]

[Impact]
On TGL-H systems, the NVIDIA GPU falls off the bus after exiting s2idle if the docking station with the external display connected is unplugged while the system is still in s2idle. The system gets stuck in an infinite loop in the ACPI method IPCS, and the PCIe root port of the NVIDIA GPU then fails the power transition from D3cold to D0. The display managed by the NVIDIA GPU shows nothing until the system is rebooted. Note that this only happens when the NVIDIA GPU is in either Performance mode or On-Demand mode.
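
One way to confirm the failure after resume is to scan the kernel log for the message named in the bug title. The following is a minimal sketch, assuming the driver's log line contains the phrase "fallen off the bus"; the helper name `find_gpu_bus_errors` is hypothetical and is not part of this report or of any driver tooling.

```python
import re

# Phrase taken from the bug title; the exact driver wording may differ
# between driver versions, so only this fragment is matched.
FALLEN_OFF_BUS = re.compile(r"fallen off the bus", re.IGNORECASE)


def find_gpu_bus_errors(log_lines):
    """Return the log lines that mention the GPU falling off the bus."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if FALLEN_OFF_BUS.search(line)
    ]
```

The function takes any iterable of lines, for example the output of `dmesg` split with `splitlines()`. Reading the kernel log may require root privileges on some configurations.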

[Fix]
A BIOS workaround, enabled with the _OSI string "Linux-Dell-USB4-NVWakeup", skips the ACPI method PGSC, which invokes IPCS to do source clock control of the PCIe root port. The workaround should be removed once we get a generic fix for it.

[Test Case]
1. On any TigerLake-H or later platform with an NVIDIA GPU, make sure the NVIDIA GPU is running in either On-Demand mode or Performance mode.
2. Connect the docking station with the external display connected.
3. Suspend the system.
4. Remove the docking station when the system is suspended.
5. Press the power button to wake up the system and wait more than 1 minute to check whether the display comes back.
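
Before running the steps above, it may help to confirm that the machine actually suspends through s2idle rather than another sleep state. The following is a minimal sketch that reads the kernel's `/sys/power/mem_sleep` file, where the active mode is shown in square brackets; the function names are hypothetical helpers and not part of the original test case.

```python
from pathlib import Path


def active_sleep_mode(mem_sleep_text):
    """Return the bracketed (active) mode from /sys/power/mem_sleep content."""
    for token in mem_sleep_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def uses_s2idle(path="/sys/power/mem_sleep"):
    """Return True if suspend is currently configured to use s2idle."""
    return active_sleep_mode(Path(path).read_text()) == "s2idle"
```

Calling `uses_s2idle()` on the system under test should return True before step 3; if it does not, the reproduction would exercise a different suspend path than the one described in this bug.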

[Regression Potential]
Low. The workaround only takes effect on platforms whose BIOS supports "Linux-Dell-USB4-NVWakeup".
No other platforms will be affected.


Chris Chiu (mschiu77)
Changed in linux (Ubuntu Focal):
status: New → Invalid
Changed in linux-oem-5.10 (Ubuntu):
status: New → Invalid
Changed in linux-oem-5.10 (Ubuntu Focal):
status: New → In Progress
assignee: nobody → Chris Chiu (mschiu77)
Chris Chiu (mschiu77)
tags: added: oem-priority originate-from-1925291 somerville
tags: added: originate-from-1923729
tags: added: originate-from-1923722
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1929166

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Chris Chiu (mschiu77)
description: updated
description: updated
description: updated
Chris Chiu (mschiu77)
description: updated
summary: - TGL-H system NV GPU fallen off the bus after resumes from s2idle with
+ TGL-H system NV GPU fallen off the bus after resuming from s2idle with
the external display connected via docking station
Timo Aaltonen (tjaalton)
Changed in linux-oem-5.10 (Ubuntu Focal):
status: In Progress → Fix Committed
Timo Aaltonen (tjaalton)
tags: added: verification-needed-focal
Timo Aaltonen (tjaalton) wrote :

can't wait for the bios, kernel needs to release

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-oem-5.10 - 5.10.0-1029.30

---------------
linux-oem-5.10 (5.10.0-1029.30) focal; urgency=medium

  * focal/linux-oem-5.10: 5.10.0-1029.30 -proposed tracker (LP: #1930076)

  * CVE-2021-33200
    - bpf: Wrap aux data inside bpf_sanitize_info container
    - bpf: Fix mask direction swap upon off reg sign change
    - bpf: No need to simulate speculative domain for immediates

linux-oem-5.10 (5.10.0-1028.29) focal; urgency=medium

  * focal/linux-oem-5.10: 5.10.0-1028.29 -proposed tracker (LP: #1929167)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * TGL-H system NV GPU fallen off the bus after resuming from s2idle with the
    external display connected via docking station (LP: #1929166)
    - SAUCE: ACPI: avoid NVIDIA GPU fallen with an _OSI string

  * AX201 BT will cause system could not enter S0i3 (LP: #1928047)
    - drm/i915: Tweaked Wa_14010685332 for all PCHs

  * Realtek USB hubs in Dell WD19SC/DC/TB fail to work after exiting s2idle
    (LP: #1928242)
    - USB: Verify the port status when timeout happens during port suspend
    - Revert "USB: Add reset-resume quirk for WD19's Realtek Hub"

  * Support mic-mute on Dell's platform (LP: #1928750)
    - ASoC: rt715: add main capture switch and main capture volume
    - ASoC: rt715: remove kcontrols which no longer be used
    - ASoC: rt715: modification for code simplicity
    - platform/x86: Move all dell drivers to their own subdirectory
    - SAUCE: platform/x86: dell-privacy: Add support for Dell hardware privacy
    - SAUCE: ASoC: rt715:add micmute led state control supports
    - [Config] Update configs for Dell's E-Privacy

linux-oem-5.10 (5.10.0-1027.28) focal; urgency=medium

  * focal/linux-oem-5.10: 5.10.0-1027.28 -proposed tracker (LP: #1927620)

  * Introduce the 465 driver series, fabric-manager, and libnvidia-nscq
    (LP: #1925522)
    - debian/dkms-versions -- add NVIDIA 465 and migrate 450 to 460

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * On TGL platforms screen shows garbage when browsing website by scrolling
    mouse (LP: #1926579)
    - SAUCE: drm/i915/display: Disable PSR2 if TGL Display stepping is B1 from A0

  * Add s2idle support on AMD Renoir and Cezanne (LP: #1927067)
    - drm/amd/display: setup system context in dm_init
    - drm/amd/display: add S/G support for Renoir
    - drm/amdgpu: drop extra drm_kms_helper_poll_enable/disable calls
    - drm/amdgpu: use runpm flag rather than fbcon for kfd runtime suspend (v2)
    - drm/amdgpu: reset runpm flag if device suspend fails
    - drm/amdgpu: add s0i3 capacity check for s0i3 routine (v2)
    - drm/amdgpu: add amdgpu_gfx_state_change_set() set gfx power change entry
      (v2)
    - drm/amdgpu: update amdgpu device suspend/resume sequence for s0i3 support
    - drm/amd/pm: add gfx_state_change_set() for rn gfx power switch (v2)
    - drm/amdgpu: add judgement for suspend/resume sequence
    - drm/amdgpu/pm: no need GPU status set since
      mmnbif_gpu_BIF_DOORBELL_FENCE_CNTL added in FSDL
    - drm/amdgpu: fix shutdown and poweroff process failed with s0ix
    - drm/amdgpu: Only check for S0ix if AMD_PMC ...

Changed in linux-oem-5.10 (Ubuntu Focal):
status: Fix Committed → Fix Released
Anthony Wong (anthonywong) wrote :

Proper fix in LP:1931072

Changed in linux (Ubuntu):
status: Incomplete → Invalid
Changed in hwe-next:
status: New → Fix Released