Fix enabling bridge MMIO windows

Bug #1771344 reported by bugproxy
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
The Ubuntu-power-systems project
Fix Released
High
Canonical Kernel Team
linux (Ubuntu)
Fix Released
High
Joseph Salisbury
Artful
Fix Released
High
Joseph Salisbury
Bionic
Fix Released
High
Joseph Salisbury
Cosmic
Fix Released
High
Joseph Salisbury

Bug Description

== SRU Justification ==
IBM is requesting this patch in Bionic and Artful to fix a regression. The
regression was introduced in v3.11-rc1. The patch fixes enabling bridge
MMIO windows. Commit 13a83eac373c was also cc'd to upstream stable, and
has already landed in Xenial via upstream stable updates.

== Fix ==
13a83eac373c ("powerpc/eeh: Fix enabling bridge MMIO windows")

== Regression Potential ==
Low. Limited to powerpc and fixes a current regression.

== Test Case ==
A test kernel was built with this patch and tested by the original bug reporter.
The bug reporter states the test kernel resolved the bug.

== Comment: #0 - Breno Leitao <email address hidden>
On boot we save the configuration space of PCIe bridges. We do this so
when we get an EEH event and everything gets reset that we can restore
them.

Unfortunately we save this state before we've enabled the MMIO space
on the bridges. Hence if we have to reset the bridge when we come back
MMIO is not enabled and we end up taking an PE freeze when the driver
starts accessing again.

This patch forces the memory/MMIO and bus mastering on when restoring
bridges on EEH. Ideally we'd do this correctly by saving the
configuration space writes later, but that will have to come later in
a larger EEH rewrite. For now we have this simple fix.

The original bug can be triggered on a boston machine by doing:
 echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_outbound
On boston, this PHB has a PCIe switch on it. Without this patch,
you'll see two EEH events, 1 expected and 1 the failure we are fixing
here. The second EEH event causes the anything under the PHB to
disappear (i.e. the i40e eth).

With this patch, only 1 EEH event occurs and devices properly recover.

This is commit id 13a83eac373c49c0a081cbcd137e79210fe78acd and should be part of Ubuntu 18.04 kernel.

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-167852 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
tags: added: triage-g
Changed in ubuntu-power-systems:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Manoj Iyer (manjo)
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
importance: Critical → High
status: New → Triaged
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu):
assignee: Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury)
status: Triaged → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with commit 13a83eac373c49c0a081cbcd137e79210fe78acd. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1771344

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is prior to 4.15(Bionic) you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15(Bionic) or newer, you need to install the linux-image-unsigned, linux-modules and linux-modules-extra .deb packages.

Thanks in advance!

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-05-22 14:42 EDT-------
(In reply to comment #4)
> I built a test kernel with commit 13a83eac373c49c0a081cbcd137e79210fe78acd.
> The test kernel can be downloaded from:
> http://kernel.ubuntu.com/~jsalisbury/lp1771344
>
> Can you test this kernel and see if it resolves this bug?
>
> Note about installing test kernels:
> ? If the test kernel is prior to 4.15(Bionic) you need to install the
> linux-image and linux-image-extra .deb packages.
> ? If the test kernel is 4.15(Bionic) or newer, you need to install the
> linux-image-unsigned, linux-modules and linux-modules-extra .deb packages.
>
> Thanks in advance!

Thanks for the custom kernel. I tested it and it is working as expected:

? ~ uname -a
Linux ubuntu 4.15.0-20-generic #22~lp1771344 SMP Wed May 16 18:34:49 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Changed in ubuntu-power-systems:
status: Triaged → In Progress
description: updated
Changed in linux (Ubuntu Artful):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Joseph Salisbury (jsalisbury)
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Changed in linux (Ubuntu Artful):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Bionic):
status: In Progress → Invalid
Revision history for this message
Manoj Iyer (manjo) wrote :

I do not see this patch in bionic kernel, moving the status back to in progress.

Changed in linux (Ubuntu Bionic):
status: Invalid → In Progress
Revision history for this message
Khaled El Mously (kmously) wrote :

@Manoj, this patch looks to be in Bionic as 16735b38aeae1ce2ae21983a7a7440922d6941e4 - can you please double check?

Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-artful' to 'verification-done-artful'. If the problem still exists, change the tag 'verification-needed-artful' to 'verification-failed-artful'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-artful
Manoj Iyer (manjo)
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Released
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Manoj Iyer (manjo)
Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Committed
Revision history for this message
Kleber Sacilotto de Souza (kleber-souza) wrote :

Hello IBM,

Could you please verify the fix(es) with the Artful kernel currently in -proposed?

Thank you.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.13.0-46.51

---------------
linux (4.13.0-46.51) artful; urgency=medium

  * linux: 4.13.0-46.51 -proposed tracker (LP: #1776333)

  * register on binfmt_misc may overflow and crash the system (LP: #1775856)
    - fs/binfmt_misc.c: do not allow offset overflow

  * CVE-2018-11508
    - compat: fix 4-byte infoleak via uninitialized struct field

  * rfi-flush: Switch to new linear fallback flush (LP: #1744173)
    - SAUCE: rfi-flush: Factor out init_fallback_flush()
    - SAUCE: rfi-flush: Move rfi_flush_fallback_area to end of paca
    - powerpc/64s: Improve RFI L1-D cache flush fallback
    - powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again
    - powerpc/rfi-flush: Differentiate enabled and patched flush types
    - powerpc/rfi-flush: Call setup_rfi_flush() after LPM migration

  * Fix enabling bridge MMIO windows (LP: #1771344)
    - powerpc/eeh: Fix enabling bridge MMIO windows

  * CVE-2018-1130
    - dccp: check sk for closed state in dccp_sendmsg()

  * CVE-2018-7757
    - scsi: libsas: fix memory leak in sas_smp_get_phy_events()

  * cpum_sf: ensure sample freq is non-zero (LP: #1772593)
    - s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero

  * wlp3s0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-22)
    (LP: #1720930)
    - iwlwifi: mvm: fix "failed to remove key" message

  * CVE-2018-6927
    - futex: Prevent overflow by strengthen input validation

  * After update to 4.13-43 Intel Graphics are Laggy (LP: #1773520)
    - SAUCE: Revert "drm/i915/edp: Allow alternate fixed mode for eDP if
      available."

  * ELANPAD ELAN0612 does not work, patch available (LP: #1773509)
    - SAUCE: Input: elan_i2c - add ELAN0612 to the ACPI table

  * kernel backtrace when receiving large UDP packages (LP: #1772031)
    - iov_iter: fix page_copy_sane for compound pages

  * FS-Cache: Assertion failed: FS-Cache: 6 == 5 is false (LP: #1774336)
    - SAUCE: CacheFiles: fix a read_waiter/read_copier race

  * CVE-2018-5803
    - sctp: verify size of a new chunk in _sctp_make_chunk()

  * enable mic-mute hotkey and led on Lenovo M820z and M920z (LP: #1774306)
    - ALSA: hda/realtek - Enable mic-mute hotkey for several Lenovo AIOs

  * CVE-2018-7755
    - SAUCE: floppy: Do not copy a kernel pointer to user memory in FDGETPRM ioctl

  * CVE-2018-5750
    - ACPI: sbshc: remove raw pointer from printk() message

 -- Khalid Elmously <email address hidden> Mon, 11 Jun 2018 23:25:30 +0000

Changed in linux (Ubuntu Artful):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin1804
removed: targetmilestone-inin---
Brad Figg (brad-figg)
tags: added: cscc
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.