ipmmu is always registered

Bug #1783746 reported by Marc Dietrich
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Invalid
Medium
Paolo Pisati
Bionic
Fix Released
Undecided
Unassigned

Bug Description

[Impact]

The ipmmu-vmsa driver registers itself to the system via an initcall() to ipmmu_init(), and in case it's the first (or the only) iommu driver, it registers its iommu ops to the platform bus - in the tegra2 case, there's no iommu hardware, so all drivers bail out, except for the ipmmu-vmsa.

Later on, during boot, when the Tegra host1x is probing (drivers/gpu/host1x/dev.c::host1x_probe()), it checks if an iommu device is present (drivers/iommu/iommu.c::iommu_present() that does so by identifying if any iommu ops were registered) and attach to the supposedly present device, incurring in a null pointer dereference.

Upstream quickly acknowledged the problem, and rolled a patch to restrict the ipmmu-vmsa driver to register if and only if a compatible device is present.

The fix appared initially in 4.19, and was later backported via stable to 4.18.x, and this is a clean cherry pick of that commit.

[Fix]

Apply the attached patch and recompile.

[How to test]

Try to boot a patched kernel on a Tegra2 board.

[Regession potential]

None, the fix is trivial.

--

Original bug:

Hi,

booting the bionic kernel (4.15.0-29-generic) on my Tegra20 device (no iommu), I found it crashes during display driver setup. The bootlog (and crash) is attached. Asking on Tegra IRC channel, digetx found that this is caused by the IPMMU-VMSA driver which is always registered via initcall. Adding "initcall_blacklist=ipmmu_init" to the kernel parameters makes it boot fine.

Maybe this buggy driver should be disabled in the config (or fixed somehow)?

Revision history for this message
Marc Dietrich (marvin24) wrote :
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1783746

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Marc Dietrich (marvin24)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18-rc6

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Revision history for this message
Marc Dietrich (marvin24) wrote :

It started after upgrade from X to B. This may only show up on Tegra2, which does not have an IOMMU, and which creates problems if one is registered. Sorry, I don't understand the technical details. AFAICT, the whole driver is still broken in mainline kernel (since 3.20). An upstream fixed is currently proposed. See:
https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=next&id=5c5c87411488af3cd082221e567498d813d0fe83

Revision history for this message
Marc Dietrich (marvin24) wrote :

It is possible that it didn't show up in older kernels (4.4 for X) because Tegra didn't used the IOMMU at that time.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Paolo Pisati (p-pisati)
Changed in linux (Ubuntu):
assignee: nobody → Paolo Pisati (p-pisati)
Revision history for this message
Paolo Pisati (p-pisati) wrote :

The patch looks good, but i'll have to wait until it's pulled into Linus tree.

Paolo Pisati (p-pisati)
description: updated
Stefan Bader (smb)
Changed in linux (Ubuntu Bionic):
status: New → Fix Committed
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Marc Dietrich (marvin24)
tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (20.0 KiB)

This bug was fixed in the package linux - 4.15.0-38.41

---------------
linux (4.15.0-38.41) bionic; urgency=medium

  * linux: 4.15.0-38.41 -proposed tracker (LP: #1797061)

  * Silent data corruption in Linux kernel 4.15 (LP: #1796542)
    - block: add a lower-level bio_add_page interface
    - block: bio_iov_iter_get_pages: fix size of last iovec
    - blkdev: __blkdev_direct_IO_simple: fix leak in error case
    - block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs

linux (4.15.0-37.40) bionic; urgency=medium

  * linux: 4.15.0-37.40 -proposed tracker (LP: #1795564)

  * hns3: enable ethtool rx-vlan-filter on supported hw (LP: #1793394)
    - net: hns3: Add vlan filter setting by ethtool command -K

  * hns3: Modifying channel parameters will reset ring parameters back to
    defaults (LP: #1793404)
    - net: hns3: Fix desc num set to default when setting channel

  * hisi_sas: Add SATA FIX check for v3 hw (LP: #1794151)
    - scsi: hisi_sas: Add SATA FIS check for v3 hw

  * Fix potential corruption using SAS controller on HiSilicon arm64 boards
    (LP: #1794156)
    - scsi: hisi_sas: add memory barrier in task delivery function

  * hisi_sas: Reduce unnecessary spin lock contention (LP: #1794165)
    - scsi: hisi_sas: Tidy hisi_sas_task_prep()

  * Add functional level reset support for the SAS controller on HiSilicon D06
    systems (LP: #1794166)
    - scsi: hisi_sas: tidy host controller reset function a bit
    - scsi: hisi_sas: relocate some common code for v3 hw
    - scsi: hisi_sas: Implement handlers of PCIe FLR for v3 hw

  * HiSilicon SAS controller doesn't recover from PHY STP link timeout
    (LP: #1794172)
    - scsi: hisi_sas: tidy channel interrupt handler for v3 hw
    - scsi: hisi_sas: Fix the failure of recovering PHY from STP link timeout

  * getxattr: always handle namespaced attributes (LP: #1789746)
    - getxattr: use correct xattr length

  * Fix unusable NVIDIA GPU after S3 (LP: #1793338)
    - PCI: Reprogram bridge prefetch registers on resume

  * Fails to boot under Xen PV: BUG: unable to handle kernel paging request at
    edc21fd9 (LP: #1789118)
    - x86/EISA: Don't probe EISA bus for Xen PV guests

  * qeth: use vzalloc for QUERY OAT buffer (LP: #1793086)
    - s390/qeth: use vzalloc for QUERY OAT buffer

  * SRU: Enable middle button of touchpad on ThinkPad P72 (LP: #1793463)
    - Input: elantech - enable middle button of touchpad on ThinkPad P72

  * Dell new AIO requires a new uart backlight driver (LP: #1727235)
    - SAUCE: platform/x86: dell-uart-backlight: new backlight driver for DELL AIO
    - updateconfigs for Dell UART backlight driver

  * [Ubuntu] s390/crypto: Fix return code checking in cbc_paes_crypt.
    (LP: #1794294)
    - s390/crypto: Fix return code checking in cbc_paes_crypt()

  * hns3: Retrieve RoCE MSI-X config from firmware (LP: #1793221)
    - net: hns3: Fix MSIX allocation issue for VF
    - net: hns3: Refine the MSIX allocation for PF

  * net: hns: Avoid hang when link is changed while handling packets
    (LP: #1792209)
    - net: hns: add the code for cleaning pkt in chip
    - net: hns: add netif_carrier_off before change speed and duplex

  * Page leaki...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Paolo Pisati (p-pisati)
Changed in linux (Ubuntu):
status: Confirmed → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.