Freeze if Thunderbolt 3 dock is unplugged while suspended - Ubuntu 22.04

Asked by mushi

I have Ubuntu 22.04 running on a laptop with a Thunderbolt 3 dock connected.
Next I suspend Ubuntu.
Then I remove the dock while Ubuntu is suspended.
Later I resume Ubuntu without reconnecting the dock.

The screen turns on and it looks like Ubuntu is starting to resume, but it freezes with a black screen and never starts up.
Hard reboot is the only way out.
It happens every time.

In the syslog there is a call trace that looks like something crashed:

pcieport 0000:03:02.0: can't change power state from D3hot to D0 (config space inaccessible)
thunderbolt 0000:04:00.0: can't change power state from D3hot to D0 (config space inaccessible)

Call Trace:
 <TASK>
 usb_hcd_pci_remove+0xac/0x110
 xhci_pci_remove+0x72/0xb0 [xhci_pci]
 pci_device_remove+0x3b/0xb0
 __device_release_driver+0x1a8/0x2a0
 device_release_driver+0x29/0x40
 pci_stop_bus_device+0x74/0xa0
 pci_stop_bus_device+0x30/0xa0
 pci_stop_bus_device+0x30/0xa0
 pci_stop_and_remove_bus_device+0x13/0x30
 pciehp_unconfigure_device+0x7e/0x140
 pciehp_disable_slot+0x6c/0x100
 pciehp_handle_presence_or_link_change+0xb7/0x110
 pciehp_ist+0x19a/0x1b0
 ? irq_forced_thread_fn+0x90/0x90
 irq_thread_fn+0x25/0x70
 irq_thread+0xdc/0x1b0
 ? irq_thread_fn+0x70/0x70
 ? irq_thread_check_affinity+0x100/0x100
 kthread+0x127/0x150
 ? set_kthread_struct+0x50/0x50
 ret_from_fork+0x1f/0x30
 </TASK>

$ uname -r
5.15.0-52-generic

$ lsb_release -crid
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy

Any suggestions on what I could try?

My workaround has been to remember to unplug the dock each time before suspending. This works but is not ideal. For reference, if I suspend and resume without removing the dock, it all works fine.

Question information

Language: English
Status: Solved
For: Ubuntu
Assignee: No assignee
Solved by: Manfred Hampl
Revision history for this message
Bernard Stafford (bernard010) said :
#1

See Solution #5 in this tutorial (update the BIOS settings and check the Thunderbolt config):
https://bytexd.com/using-thunderbolt-3-or-4-on-ubuntu/

mushi (michael-colefax) said :
#2

Thanks

Unfortunately it didn't fix it.

The instructions in the tutorial said:
    Enable Thunderbolt Technology Support
    Enable Thunderbolt Adapter Boot Support
    Enable Thunderbolt Adapter Pre-boot Modules

I already had the first enabled. I enabled the other two and retested, but the result is the same: it still freezes on resume.

The errors on the black screen are:

xhci_hcd PCI post-resume error -19!
xhci_hcd HC died; cleaning up
PM: dpm_run_callback(): pci_pm_resume+0x0/0x100 returns -19
xhci_hcd PM: failed to resume async: error -19

I just noticed that Dell has released a new BIOS version recently, so I'll try upgrading.

mushi (michael-colefax) said :
#3

Update: after upgrading the BIOS and all other firmware to latest versions, the problem persists.

mushi (michael-colefax) said :
#4

Searching some more, it seems similar in some ways to this issue:
https://bugzilla.kernel.org/show_bug.cgi?id=216728

Bernard Stafford (bernard010) said :
#5

Going by the info in the bug report, which kernel are you using on Ubuntu?
Please run this in a terminal:
lsb_release -a; uname -a; python3 --version

mushi (michael-colefax) said :
#6

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy

$ uname -a
Linux computer 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

$ python3 --version
Python 3.10.6

Manfred Hampl (m-hampl) said :
#7

What are the lines immediately preceding the crash call trace in the system log?

A potential solution might be to create a script that runs automatically just before the system enters sleep mode and unloads the driver that is crashing, plus another script that reloads this driver at wake-up.

(Another way to identify the driver(s) in question is to check which drivers are unloaded when undocking and which drivers are loaded when docking.)
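
One way to do that comparison is to snapshot the list of loaded modules in both states and diff the two lists. This is only a minimal sketch, assuming `lsmod`, `awk`, `sort` and `comm` are available; the function names and file names are illustrative, not part of any existing tool.

```shell
#!/bin/bash
# Hypothetical helpers; names are illustrative only.

# Print the names of the currently loaded kernel modules, one per line, sorted.
snapshot_modules() {
    lsmod | awk 'NR > 1 { print $1 }' | LC_ALL=C sort
}

# Print the modules listed in file $1 but not in file $2.
# Both files must be sorted the way snapshot_modules sorts them.
modules_only_in_first() {
    LC_ALL=C comm -23 "$1" "$2"
}

# Example session (run by hand):
#   snapshot_modules > docked.txt      # with the dock attached
#   snapshot_modules > undocked.txt    # after unplugging the dock
#   modules_only_in_first docked.txt undocked.txt
```

Modules that appear only in the docked snapshot are candidates for the driver to unload before suspend. A driver can also stay loaded while losing its devices, so an empty result does not rule anything out.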

mushi (michael-colefax) said :
#8

Thanks

I did try disabling xhci before suspend. I found this script, which I thought did what I needed:
https://gist.github.com/ioggstream/8f380d398aef989ac455b93b92d42048

It didn't seem to work, but I wasn't sure whether it was targeting the right device. I will look into it further, and into your other suggestion too.
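
To check whether such a script targets the right device, it can help to list which PCI addresses are currently bound to the driver. This is a minimal sketch, assuming the usual sysfs layout in which each bound device appears as an entry named after its PCI address under `/sys/bus/pci/drivers/<driver>/`; `list_bound_devices` is a hypothetical helper name.

```shell
#!/bin/bash
# Print the PCI addresses (DDDD:BB:DD.F) bound to a PCI driver, one per line.
# $1: driver directory; defaults to the xhci_hcd driver directory in sysfs.
list_bound_devices() {
    local driver_dir=${1:-/sys/bus/pci/drivers/xhci_hcd}
    local entry
    for entry in "$driver_dir"/*:*:*.*; do
        [ -e "$entry" ] || continue   # the glob matched nothing
        basename "$entry"
    done
}

# Example (run by hand):
#   list_bound_devices
#   lspci -s 0000:3a:00.0    # describe one of the listed addresses
```

The addresses it prints can then be compared with the ones that show up in the kernel log around the failed resume.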

Here are the lines from the syslog, the call trace is toward the bottom:

 pcieport 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
 pcieport 0000:03:00.0: can't change power state from D3hot to D0 (config space inaccessible)
 pcieport 0000:03:01.0: can't change power state from D3hot to D0 (config space inaccessible)
 pcieport 0000:03:02.0: can't change power state from D3hot to D0 (config space inaccessible)
 thunderbolt 0000:04:00.0: can't change power state from D3hot to D0 (config space inaccessible)
 pcieport 0000:05:00.0: can't change power state from D3hot to D0 (config space inaccessible)
 xhci_hcd 0000:3a:00.0: can't change power state from D3hot to D0 (config space inaccessible)
 pcieport 0000:06:02.0: can't change power state from D3hot to D0 (config space inaccessible)
 xhci_hcd 0000:07:00.0: can't change power state from D3hot to D0 (config space inaccessible)
 nvidia 0000:01:00.0: Enabling HDA controller
 ACPI: EC: event unblocked
 pcieport 0000:00:1b.0: pciehp: Slot(20): Card not present
 xhci_hcd 0000:3a:00.0: remove, state 4
 xhci_hcd 0000:07:00.0: can't change power state from D3cold to D0 (config space inaccessible)
 xhci_hcd 0000:07:00.0: Controller not ready at resume -19
 xhci_hcd 0000:07:00.0: PCI post-resume error -19!
 xhci_hcd 0000:07:00.0: HC died; cleaning up
 PM: dpm_run_callback(): pci_pm_resume+0x0/0x100 returns -19
 xhci_hcd 0000:07:00.0: PM: failed to resume async: error -19
 usb usb6: USB disconnect, device number 1
 xhci_hcd 0000:3a:00.0: USB bus 6 deregistered
 xhci_hcd 0000:3a:00.0: remove, state 4
 usb usb5: USB disconnect, device number 1
 xhci_hcd 0000:3a:00.0: Host halt failed, -19
 xhci_hcd 0000:3a:00.0: Host not accessible, reset failed.
 xhci_hcd 0000:3a:00.0: USB bus 5 deregistered
 ------------[ cut here ]------------
 xhci_hcd 0000:3a:00.0: disabling already-disabled device
 PM: dpm_run_callback(): pci_pm_resume+0x0/0x100 returns -19
 xhci_hcd 0000:07:00.0: PM: failed to resume async: error -19
 WARNING: CPU: 0 PID: 161 at drivers/pci/pci.c:2228 pci_disable_device+0xe4/0x140
 Modules linked in: rfcomm ccm vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) r8153_ecm cdc_ether usbnet r8152 mii snd_usb_audio snd_usbmidi_lib usbhid cmac algif_hash algif_skcipher af_alg bnep nvidia_uvm(PO) snd_hda_codec_hdmi uvcvideo btusb snd_ctl_led btrtl btbcm videobuf2_vmalloc snd_hda_codec_realtek btintel videobuf2_memops videobuf2_v4l2 snd_hda_codec_generic bluetooth videobuf2_common videodev ecdh_generic
 i915 0000:00:02.0: [drm] [ENCODER:94:DDI A/PHY A] is disabled/in DSI mode with an ungated DDI clock, gate it
 i915 0000:00:02.0: [drm] [ENCODER:102:DDI B/PHY B] is disabled/in DSI mode with an ungated DDI clock, gate it
 i915 0000:00:02.0: [drm] [ENCODER:113:DDI C/PHY C] is disabled/in DSI mode with an ungated DDI clock, gate it
 i915 0000:00:02.0: [drm] [ENCODER:120:DDI D/PHY D] is disabled/in DSI mode with an ungated DDI clock, gate it
  mc ecc cdc_acm intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp nvidia_drm(PO) coretemp nvidia_modeset(PO) snd_sof_pci_intel_cnl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_intel snd_hda_core snd_hwdep snd_pcm binfmt_misc joydev snd_seq_midi nvidia(PO) mei_hdcp intel_rapl_msr nls_iso8859_1 i915 kvm snd_seq_midi_event crct10dif_pclmul dell_laptop snd_rawmidi ghash_clmulni_intel ath10k_pci dell_smm_hwmon aesni_intel ath10k_core snd_seq crypto_simd cryptd dell_wmi ttm ath ledtrig_audio snd_seq_device dell_smbios rapl processor_thermal_device_pci_legacy intel_cstate mac80211 input_leds snd_timer dcdbas drm_kms_helper dell_wmi_sysman processor_thermal_device
  firmware_attributes_class cec processor_thermal_rfim intel_wmi_thunderbolt wmi_bmof serio_raw dell_wmi_descriptor processor_thermal_mbox mxm_wmi cfg80211 snd rc_core i2c_algo_bit mei_me ucsi_acpi processor_thermal_rapl fb_sys_fops typec_ucsi intel_rapl_common syscopyarea sysfillrect ee1004 soundcore libarc4 hid_multitouch sysimgblt mei typec intel_soc_dts_iosf intel_pch_thermal int3403_thermal int340x_thermal_zone dell_smo8800 intel_hid acpi_pad mac_hid int3400_thermal sparse_keymap acpi_thermal_rel sch_fq_codel ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp drm parport ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic rtsx_pci_sdmmc crc32_pclmul psmouse nvme i2c_i801 nvme_core thunderbolt i2c_smbus rtsx_pci ahci intel_lpss_pci intel_lpss libahci xhci_pci idma64 xhci_pci_renesas i2c_hid_acpi i2c_hid hid wmi video pinctrl_cannonlake
 CPU: 0 PID: 161 Comm: irq/123-pciehp Tainted: P OE 5.15.0-52-generic #58-Ubuntu
 RIP: 0010:pci_disable_device+0xe4/0x140
 Code: 4d 85 e4 75 07 4c 8b a3 d0 00 00 00 48 8d bb d0 00 00 00 e8 8e b2 19 00 4c 89 e2 48 c7 c7 30 08 e4 b7 48 89 c6 e8 ed 49 62 00 <0f> 0b e9 4a ff ff ff 48 8d 55 e6 be 04 00 00 00 48 89 df e8 d4 3c
 RSP: 0018:ffffb70f00893c18 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff895681f49000 RCX: 0000000000000027
 RDX: ffff8959ec420588 RSI: 0000000000000001 RDI: ffff8959ec420580
 RBP: ffffb70f00893c38 R08: 0000000000000038 R09: 00000000b8e34e28
 R10: ffffffffffffffff R11: 0000000000000001 R12: ffff895681f61940
 R13: ffff89568129e000 R14: ffffffffc03c4078 R15: 0000000000000008
 FS: 0000000000000000(0000) GS:ffff8959ec400000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000564f0e064c08 CR3: 00000002c3a10005 CR4: 00000000003706f0
 Call Trace:
  <TASK>
  usb_hcd_pci_remove+0xac/0x110
  xhci_pci_remove+0x72/0xb0 [xhci_pci]
  pci_device_remove+0x3b/0xb0
  __device_release_driver+0x1a8/0x2a0
  device_release_driver+0x29/0x40
  pci_stop_bus_device+0x74/0xa0
  pci_stop_bus_device+0x30/0xa0
  pci_stop_bus_device+0x30/0xa0
  pci_stop_and_remove_bus_device+0x13/0x30
  pciehp_unconfigure_device+0x7e/0x140
  pciehp_disable_slot+0x6c/0x100
  pciehp_handle_presence_or_link_change+0xb7/0x110
  pciehp_ist+0x19a/0x1b0
  ? irq_forced_thread_fn+0x90/0x90
  irq_thread_fn+0x25/0x70
  irq_thread+0xdc/0x1b0
  ? irq_thread_fn+0x70/0x70
  ? irq_thread_check_affinity+0x100/0x100
  kthread+0x127/0x150
  ? set_kthread_struct+0x50/0x50
  ret_from_fork+0x1f/0x30
  </TASK>
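
To see at a glance which devices dropped off the bus, the driver and address pairs can be pulled out of a log like the one above. This is a minimal sketch, assuming the lines keep the `<driver> <address>: ... (config space inaccessible)` shape shown here and that `grep` supports `-o`; the function name is illustrative.

```shell
#!/bin/bash
# Read kernel log text on stdin and print each "driver address" pair that
# reported an inaccessible config space, sorted and without duplicates.
inaccessible_devices() {
    grep 'config space inaccessible' \
        | grep -oE '[[:alnum:]_]+ [0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' \
        | LC_ALL=C sort -u
}

# Example (run by hand):
#   inaccessible_devices < /var/log/syslog
```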

Best Manfred Hampl (m-hampl) said :
#9
Revision history for this message
mushi (michael-colefax) said :
#10

Thanks Manfred Hampl, that solved my question.

mushi (michael-colefax) said :
#11

For reference, this is the script that solved it, from https://askubuntu.com/a/1089078/1551114 :

#!/bin/bash

# Original script was using /bin/sh but shellcheck reporting warnings.

# NAME: custom-xhci_hcd
# PATH: /lib/systemd/system-sleep
# CALL: Called from SystemD automatically
# DESC: Suspend broken for USB3.0 as of Oct 25/2018 various kernels all at once

# DATE: Oct 28 2018.

# NOTE: From comment #61 at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/522998

DRIVER=/sys/bus/pci/drivers/xhci_hcd
TMPLIST=/tmp/xhci-dev-list

# Original script was: case "${1}" in hibernate|suspend)

case $1/$2 in
  pre/*)
    echo "$0: custom-xhci_hcd - going to $2..."
    : > "$TMPLIST"
    # Unbind xhci_hcd from every bound device (entries named XXXX:XX:XX.X)
    # and remember each address so it can be rebound on wake-up.
    for path in "$DRIVER"/*:*:*; do
      [ -e "$path" ] || continue   # glob matched nothing
      dev=${path##*/}
      echo -n "$dev" | tee "$DRIVER/unbind"
      echo "$dev" >> "$TMPLIST"
    done
    ;;
  post/*)
    echo "$0: custom-xhci_hcd - waking up from $2..."
    [ -f "$TMPLIST" ] || exit 0
    # Rebind xhci_hcd to each device recorded before suspend.
    while read -r dev; do
      echo -n "$dev" | tee "$DRIVER/bind"
    done < "$TMPLIST"
    rm -f "$TMPLIST"
    ;;
esac
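
For anyone reproducing this: the hook has to be executable and placed in the system-sleep directory named in the script header. systemd then calls it with `pre` or `post` as the first argument and the sleep type (for example `suspend`) as the second. Below is a minimal sketch of installing it, assuming the script was saved as `custom-xhci_hcd` in the current directory; `install_sleep_hook` is a hypothetical helper, not an existing command.

```shell
#!/bin/bash
# Copy a sleep hook into place with mode 0755.
# $1: script file
# $2: destination directory (optional; the default needs root privileges)
install_sleep_hook() {
    local src=$1
    local dest_dir=${2:-/lib/systemd/system-sleep}
    install -m 0755 "$src" "$dest_dir/"
}

# Example (run by hand from a root shell):
#   install_sleep_hook ./custom-xhci_hcd
```

Once installed, the hook can be exercised without suspending by running it as root with the arguments `pre suspend` and then `post suspend`. Expect USB devices on the affected controllers to disconnect briefly while the driver is unbound.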