Comment 32 for bug 1861395

Oded Arbel (oded-geek) wrote :

I just triggered this again on 5.4.0-15-generic #18-Ubuntu from focal-proposed.

Logs below.

If it would help, I can run a different kernel or test patched kernels.

Kernel log:

----8<----
Feb 25 10:08:08 vesho kernel: i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
Feb 25 10:08:08 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 25 10:08:08 vesho kernel: [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Feb 25 10:08:08 vesho kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
Feb 25 10:08:08 vesho kernel: [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Feb 25 10:08:08 vesho kernel: [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Feb 25 10:08:16 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 25 10:08:24 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 25 10:08:26 vesho kernel: GpuWatchdog[15034]: segfault at 0 ip 000055af78399e32 sp 00007f8b359414c0 error 6 in chrome[55af74453000+7287000]
Feb 25 10:08:26 vesho kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 61 4b fb be 01 00 00 00 ba 03 00 00 00 e8 fe 17 a6 fe <c7> 04 25 00>
Feb 25 10:08:28 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
...
Feb 25 10:09:18 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 25 10:09:20 vesho kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
Feb 25 10:09:20 vesho kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
Feb 25 10:09:22 vesho kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
Feb 25 10:09:22 vesho kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
Feb 25 10:09:22 vesho kernel: fbcon: Taking over console
Feb 25 10:09:23 vesho kernel: Console: switching to colour frame buffer device 240x67
Feb 25 10:09:30 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 25 10:09:38 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
...
Feb 25 10:10:32 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 25 10:10:34 vesho kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
Feb 25 10:10:34 vesho kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
Feb 25 10:10:36 vesho kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
Feb 25 10:10:36 vesho kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
Feb 25 10:10:46 vesho kernel: GpuWatchdog[53827]: segfault at 0 ip 000055f286b35e32 sp 00007f396a9a24c0 error 6 in chrome[55f282bef000+7287000]
Feb 25 10:10:46 vesho kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 61 4b fb be 01 00 00 00 ba 03 00 00 00 e8 fe 17 a6 fe <c7> 04 25 00>
Feb 25 10:10:56 vesho kernel: GpuWatchdog[53920]: segfault at 0 ip 0000555c042dee32 sp 00007f446a7b24c0 error 6 in chrome[555c00398000+7287000]
Feb 25 10:10:56 vesho kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 61 4b fb be 01 00 00 00 ba 03 00 00 00 e8 fe 17 a6 fe <c7> 04 25 00>
----8<----

/sys/class/drm/card0/error:

----8<----
GPU HANG: ecode 9:1:0x00000000, hang on rcs0
Kernel: 5.4.0-15-generic x86_64
Driver: 20190822
Time: 1582618088 s 331762 us
Boottime: 57363 s 47980 us
Uptime: 834 s 211397 us
Epoch: 4297439752 jiffies (250 HZ)
Capture: 4297439752 jiffies; 8224 ms ago, 0 ms after epoch
Reset count: 0
Suspend count: 1
Platform: KABYLAKE
Subplatform: 0x0
PCI ID: 0x591b
PCI Revision: 0x04
PCI Subsystem: 1028:07bf
IOMMU enabled?: 1
DMC loaded: yes
DMC fw version: 1.4
GT awake: yes
RPM wakelock: yes
PM suspended: no
EIR: 0x00000000
IER: 0x08080000
GTIER[0]: 0x01010101
GTIER[1]: 0x01010101
GTIER[2]: 0x00000070
GTIER[3]: 0x00000101
PGTBL_ER: 0x00000000
FORCEWAKE: 0x00010001
DERRMR: 0x2077efef
CCID: 0x00000000
  fence[0] = 00000000
  fence[1] = 00000000
  fence[2] = 00000000
  fence[3] = 00000000
  fence[4] = 00000000
  fence[5] = a43f097074c0001
  fence[6] = 00000000
  fence[7] = 00000000
  fence[8] = 00000000
  fence[9] = 00000000
  fence[10] = 00000000
  fence[11] = 00000000
  fence[12] = 00000000
  fence[13] = 00000000
  fence[14] = 00000000
  fence[15] = 00000000
  fence[16] = 00000000
  fence[17] = 00000000
  fence[18] = 00000000
  fence[19] = 00000000
  fence[20] = 00000000
  fence[21] = 00000000
  fence[22] = 00000000
  fence[23] = 00000000
  fence[24] = 00000000
  fence[25] = 00000000
  fence[26] = 00000000
  fence[27] = 00000000
  fence[28] = 00000000
  fence[29] = 00000000
  fence[30] = 00000000
  fence[31] = 00000000
ERROR: 0x00000000
DONE_REG: 0xffffffff
FAULT_TLB_DATA: 0x00000019 0xf4bbf313
Num Pipes: 3
Pipe [0]:
  Power: on
  SRC: 077f0437
  STAT: 00000000
Plane [0]:
  CNTR: c4042400
  STRIDE: 00000026
  SURF: 08500000
  TILEOFF: 000d0240
Cursor [0]:
  CNTR: 00000000
  POS: 00000000
  BASE: 00000000
Pipe [1]:
  Power: on
  SRC: 09ff059f
  STAT: 00000000
Plane [1]:
  CNTR: c4043000
  STRIDE: 00000050
  SURF: 01700000
  TILEOFF: 00000000
Cursor [1]:
  CNTR: 00000000
  POS: 00000000
  BASE: 00000000
Pipe [2]:
  Power: on
  SRC: 09ff059f
  STAT: 00000000
Plane [2]:
  CNTR: c4043000
  STRIDE: 00000050
  SURF: 03700000
  TILEOFF: 00000000
Cursor [2]:
  CNTR: 00000000
  POS: 00000000
  BASE: 00000000
CPU transcoder: A
  Power: on
  CONF: 00000000
  HTOTAL: 00000000
  HBLANK: 00000000
  HSYNC: 00000000
  VTOTAL: 00000000
  VBLANK: 00000000
  VSYNC: 00000000
CPU transcoder: B
  Power: on
  CONF: c0000000
  HTOTAL: 0a9f09ff
  HBLANK: 0a9f09ff
  HSYNC: 0a4f0a2f
  VTOTAL: 05c8059f
  VBLANK: 05c8059f
  VSYNC: 05a705a2
CPU transcoder: C
  Power: on
  CONF: c0000000
  HTOTAL: 0a9f09ff
  HBLANK: 0a9f09ff
  HSYNC: 0a4f0a2f
  VTOTAL: 05c8059f
  VBLANK: 05c8059f
  VSYNC: 05a705a2
CPU transcoder: EDP
  Power: on
  CONF: c0000000
  HTOTAL: 081f077f
  HBLANK: 081f077f
  HSYNC: 07cf07af
  VTOTAL: 04560437
  VBLANK: 04560437
  VSYNC: 043f043a
is_mobile: no
is_lp: no
require_force_probe: no
has_64bit_reloc: yes
gpu_reset_clobbers_display: no
has_reset_engine: yes
has_fpga_dbg: yes
has_global_mocs: no
has_gt_uc: yes
has_l3_dpf: no
has_llc: yes
has_logical_ring_contexts: yes
has_logical_ring_elsq: no
has_logical_ring_preemption: yes
has_pooled_eu: no
has_rc6: yes
has_rc6p: no
has_rps: yes
has_runtime_pm: yes
has_snoop: no
has_coherent_ggtt: yes
unfenced_needs_alignment: no
hws_needs_physical: no
cursor_needs_physical: no
has_csr: yes
has_ddi: yes
has_dp_mst: yes
has_fbc: yes
has_gmch: no
has_hotplug: yes
has_ipc: yes
has_modular_fia: no
has_overlay: no
has_psr: yes
overlay_needs_physical: no
supports_tv: no
Has logical contexts? yes
scheduler: 1f
slice0: 3 subslice(s) (0x7):
        subslice0: 8 EUs (0xff)
        subslice1: 8 EUs (0xff)
        subslice2: 8 EUs (0xff)
        subslice3: 0 EUs (0x0)
slice1: 0 subslice(s) (0x0):
        subslice0: 0 EUs (0x0)
        subslice1: 0 EUs (0x0)
        subslice2: 0 EUs (0x0)
        subslice3: 0 EUs (0x0)
slice2: 0 subslice(s) (0x0):
        subslice0: 0 EUs (0x0)
        subslice1: 0 EUs (0x0)
        subslice2: 0 EUs (0x0)
        subslice3: 0 EUs (0x0)
i915.vbt_firmware=(null)
i915.modeset=-1
i915.lvds_channel_mode=0
i915.panel_use_ssc=-1
i915.vbt_sdvo_panel_type=-1
i915.enable_dc=-1
i915.enable_fbc=1
i915.enable_psr=0
i915.disable_power_well=1
i915.enable_ips=1
i915.invert_brightness=0
i915.enable_guc=0
i915.guc_log_level=-1
i915.guc_firmware_path=(null)
i915.huc_firmware_path=(null)
i915.dmc_firmware_path=(null)
i915.mmio_debug=0
i915.edp_vswing=0
i915.reset=2
i915.inject_load_failure=0
i915.fastboot=-1
i915.enable_dpcd_backlight=-1
i915.force_probe=
i915.alpha_support=no
i915.enable_hangcheck=yes
i915.prefault_disable=no
i915.load_detect_test=no
i915.force_reset_modeset_test=no
i915.error_capture=yes
i915.disable_display=no
i915.verbose_state_checks=yes
i915.nuclear_pageflip=no
i915.enable_dp_mst=yes
i915.enable_gvt=no
GuC firmware:
        status: DISABLED
        version: wanted 33.0, found 0.0
        uCode: 0 bytes
        RSA: 0 bytes
HuC firmware: (null)
        status: N/A
        version: wanted 0.0, found 0.0
        uCode: 0 bytes
        RSA: 0 bytes
----8<----
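
For anyone cross-checking the two dumps: the "Time:" field in the error state is a Unix timestamp (seconds plus microseconds), so it can be compared with the first "GPU HANG" line in the kernel log. The snippet below is only a minimal sketch of that conversion; the helper name `capture_time_utc` is made up for illustration, and the two numbers are copied from the "Time:" line above.

```python
from datetime import datetime, timedelta, timezone


def capture_time_utc(seconds: int, microseconds: int) -> datetime:
    """Convert the 'Time: <s> s <us> us' pair from the error state to UTC.

    Hypothetical helper, not part of any i915 tooling.
    """
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(microseconds=microseconds)


# Values taken from "Time: 1582618088 s 331762 us" in the dump above.
stamp = capture_time_utc(1582618088, 331762)
print(stamp.isoformat())  # 2020-02-25T08:08:08.331762+00:00
```

That is 08:08:08 UTC, which lines up with the 10:08:08 "GPU HANG" entry in the kernel log if the machine's local time is UTC+2 (I am assuming that offset here; it is not stated in the logs).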