Fix MCE handling for user access of poisoned device-dax mapping

Bug #1774366 reported by quanxian
Affects          Status         Importance   Assigned to                     Milestone
intel            Fix Released   Medium       Unassigned
linux (Ubuntu)   Fix Released   High         Thadeu Lima de Souza Cascardo
  Cosmic         Fix Released   High         Thadeu Lima de Souza Cascardo

Bug Description

Description:

Customer reports:

We've tried the ndctl error injection, and the injection now succeeds. However, we have a couple of questions related to the poisoned block.

Here are some tests/steps that I did:

1. I mmap the first 10GB of /dev/dax13.0 into the virtual address space and inject an error at offset 1GB (the block offset should be 1GB / 512 bytes = 2097152), which seems fine:

[root@scaqae09celadm13 wning]# ndctl inject-error --uninject --block=2097152 --count=1 namespace13.0
Warning: Un-injecting previously injected errors here will
not cause the kernel to 'forget' its badblock entries. Those
have to be cleared through the normal process of writing
the affected blocks

{
"dev":"namespace13.0",
"mode":"dax",
"size":518967525376,
"uuid":"0738c8bd-3b3f-4989-9d0e-0e9c6006c810",
"chardev":"dax13.0",
"numa_node":0,
"badblock_count":1,
"badblocks":[

{ "offset":2097152, "length":1, "dimms":[ "nmem1" ] }
]
}
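
The block offset quoted in step 1 can be checked with a small helper. This is only a sketch: the function name `byte_offset_to_block` is hypothetical, and the 512-byte block size is taken from the arithmetic in the report rather than queried from the device.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a byte offset within the namespace into a block index.
 * block_size is assumed to be 512 bytes, as in the report above.
 */
uint64_t byte_offset_to_block(uint64_t byte_offset, uint64_t block_size)
{
    assert(block_size != 0);
    return byte_offset / block_size;
}
```

Calling `byte_offset_to_block(1ULL << 30, 512)` gives 2097152, which matches the `--block` argument and the `"offset"` value in the badblocks listing.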

2. In my test program, I just try to read every address of the first 10GB. The first time I read offset 1GB, I got a SIGBUS, but in the siginfo struct passed to the signal handler the failing address is NULL and the signal code is 128, which seems incorrect. If we then run the test again, it gets stuck here:

rt_sigaction(SIGBUS, {0x400dd2, [], SA_RESTORER|SA_SIGINFO, 0x7fb5cf839270}, NULL, 8) = 0

And here is the output of log messages:

Apr 3 14:39:23 scaqae09celadm13 kernel: mce: Uncorrected hardware memory error in user-access at 952b200000
Apr 3 14:51:11 scaqae09celadm13 kernel: mce: Uncorrected hardware memory error in user-access at 94eb300000
Apr 3 14:51:11 scaqae09celadm13 kernel: mce_notify_irq: 48 callbacks suppressed
Apr 3 14:51:11 scaqae09celadm13 kernel: mce: [Hardware Error]: Machine check events logged
Apr 3 14:51:11 scaqae09celadm13 kernel: Memory failure: 0x94eb300: reserved kernel page still referenced by 1 users
Apr 3 14:51:11 scaqae09celadm13 kernel: Memory failure: 0x94eb300: recovery action for reserved kernel page: Failed
Apr 3 14:51:11 scaqae09celadm13 kernel: mce: Memory error not recovered

The program I use is simple: it does a memcpy and then reads directly from the target address:

for (i = 0; i < DAX_MAPPING_SIZE; i++) /* DAX_MAPPING_SIZE is 10G */
    total += peek(buf + i);

char peek(void *addr)
{
    char temp[128];

    memcpy(temp, addr, 1);  /* copy one byte out of the mapping */
    return *(char *)addr;   /* then read the same byte directly */
}

May I ask whether we missed any steps in triggering the SIGBUS error?
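
To make the siginfo contents in step 2 easier to interpret, a test program could decode `si_code` and record `si_addr` in its SIGBUS handler. The sketch below is not the reporter's program; all function and variable names are hypothetical. On Linux a signal code of 128 is `SI_KERNEL` (0x80), and the assumption here is that a correctly handled poison access would instead report `BUS_MCEERR_AR` with a non-NULL `si_addr`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Fallbacks for older headers; values match the Linux UAPI. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

static sigjmp_buf sigbus_env;
static volatile sig_atomic_t sigbus_code; /* si_code of the last SIGBUS */
static void *volatile sigbus_addr;        /* si_addr of the last SIGBUS */

/* Map a SIGBUS si_code to a readable name. */
const char *sigbus_code_name(int code)
{
    switch (code) {
    case BUS_ADRALN:    return "BUS_ADRALN";
    case BUS_ADRERR:    return "BUS_ADRERR";
    case BUS_OBJERR:    return "BUS_OBJERR";
    case BUS_MCEERR_AR: return "BUS_MCEERR_AR";
    case BUS_MCEERR_AO: return "BUS_MCEERR_AO";
    case SI_KERNEL:     return "SI_KERNEL";
    default:            return "unknown";
    }
}

/* Record the fault details, then jump back to the guarded read. */
static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    sigbus_code = info->si_code;
    sigbus_addr = info->si_addr;
    siglongjmp(sigbus_env, 1);
}

/* Install the handler with SA_SIGINFO; returns 0 on success. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}

/* Read one byte; returns 0 on success, -1 if the read raised SIGBUS. */
int peek_guarded(const volatile char *addr, char *out)
{
    assert(out != NULL);
    if (sigsetjmp(sigbus_env, 1) != 0)
        return -1;
    *out = *addr;
    return 0;
}
```

After `install_sigbus_handler()`, a loop over the mapping can call `peek_guarded(buf + i, &c)` and, when it returns -1, print `sigbus_code_name(sigbus_code)` and `sigbus_addr` instead of terminating on the first poisoned page.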

Target Kernel: 4.19
Target Release: 18.10


quanxian (quanxian-wang)
tags: added: intel-kernel-18.10
quanxian (quanxian-wang)
description: updated
Joseph Salisbury (jsalisbury) wrote :

Can we move this bug to the "Linux" package and make it public?

Changed in intel:
importance: Undecided → Medium
status: New → Triaged
quanxian (quanxian-wang) wrote :

Yes, it is a bug. We can make it public. Thanks.

quanxian (quanxian-wang)
information type: Proprietary → Public
quanxian (quanxian-wang) wrote :

4.19-rc1

226ab561075 device-dax: Convert to vmf_insert_mixed and vm_fault_t
2232c6382a4 device-dax: Enable page_mapping()
35de299547d device-dax: Set page->index
73449daf8f0d filesystem-dax: Set page->index
86a66810baa mm, madvise_inject_error: Disable MADV_SOFT_OFFLINE for ZONE_DEVICE pages
2fa147bdbf67 mm, dev_pagemap: Do not clear ->mapping on final put
23e7b5c2e27 mm, madvise_inject_error: Let memory_failure() optionally take a page reference
ae1139ece12 mm, memory_failure: Collect mapping size in collect_procs()
c2a7d2a1155 filesystem-dax: Introduce dax_lock_mapping_entry()
6100e34b252 mm, memory_failure: Teach memory_failure() about dev_pagemap pages
510ee090abc x86/mm/pat: Prepare
284ce4011ba x86/memory_failure: Introduce
c953cc987ab libnvdimm, pmem: Restore page attributes when clearing errors

quanxian (quanxian-wang) wrote :

These need backporting for 18.10.

Changed in linux (Ubuntu Cosmic):
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
status: Confirmed → In Progress
Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Committed
quanxian (quanxian-wang) wrote :

Sorry, there are more commits available:

c7486104a5ce x86/mce: Fix set_mce_nospec() to avoid #GP fault

Thadeu Lima de Souza Cascardo (cascardo) wrote :

c7486104a5ce applied to cosmic master-next.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.18.0-8.9

---------------
linux (4.18.0-8.9) cosmic; urgency=medium

  * linux: 4.18.0-8.9 -proposed tracker (LP: #1791663)

  * Cosmic update to v4.18.7 stable release (LP: #1791660)
    - rcu: Make expedited GPs handle CPU 0 being offline
    - net: 6lowpan: fix reserved space for single frames
    - net: mac802154: tx: expand tailroom if necessary
    - 9p/net: Fix zero-copy path in the 9p virtio transport
    - spi: davinci: fix a NULL pointer dereference
    - spi: pxa2xx: Add support for Intel Ice Lake
    - spi: spi-fsl-dspi: Fix imprecise abort on VF500 during probe
    - spi: cadence: Change usleep_range() to udelay(), for atomic context
    - mmc: block: Fix unsupported parallel dispatch of requests
    - mmc: renesas_sdhi_internal_dmac: mask DMAC interrupts
    - mmc: renesas_sdhi_internal_dmac: fix #define RST_RESERVED_BITS
    - readahead: stricter check for bdi io_pages
    - block: fix infinite loop if the device loses discard capability
    - block: blk_init_allocated_queue() set q->fq as NULL in the fail case
    - block: really disable runtime-pm for blk-mq
    - blkcg: Introduce blkg_root_lookup()
    - block: Introduce blk_exit_queue()
    - block: Ensure that a request queue is dissociated from the cgroup controller
    - apparmor: fix bad debug check in apparmor_secid_to_secctx()
    - dma-buf: Move BUG_ON from _add_shared_fence to _add_shared_inplace
    - libertas: fix suspend and resume for SDIO connected cards
    - media: Revert "[media] tvp5150: fix pad format frame height"
    - mailbox: xgene-slimpro: Fix potential NULL pointer dereference
    - Replace magic for trusting the secondary keyring with #define
    - Fix kexec forbidding kernels signed with keys in the secondary keyring to
      boot
    - powerpc/fadump: handle crash memory ranges array index overflow
    - powerpc/64s: Fix page table fragment refcount race vs speculative references
    - powerpc/pseries: Fix endianness while restoring of r3 in MCE handler.
    - powerpc/pkeys: Give all threads control of their key permissions
    - powerpc/pkeys: Deny read/write/execute by default
    - powerpc/pkeys: key allocation/deallocation must not change pkey registers
    - powerpc/pkeys: Save the pkey registers before fork
    - powerpc/pkeys: Fix calculation of total pkeys.
    - powerpc/pkeys: Preallocate execute-only key
    - powerpc/nohash: fix pte_access_permitted()
    - powerpc64/ftrace: Include ftrace.h needed for enable/disable calls
    - powerpc/powernv/pci: Work around races in PCI bridge enabling
    - cxl: Fix wrong comparison in cxl_adapter_context_get()
    - IB/mlx5: Honor cnt_set_id_valid flag instead of set_id
    - IB/mlx5: Fix leaking stack memory to userspace
    - IB/srpt: Fix srpt_cm_req_recv() error path (1/2)
    - IB/srpt: Fix srpt_cm_req_recv() error path (2/2)
    - IB/srpt: Support HCAs with more than two ports
    - overflow.h: Add arithmetic shift helper
    - RDMA/mlx5: Fix shift overflow in mlx5_ib_create_wq
    - ib_srpt: Fix a use-after-free in srpt_close_ch()
    - ib_srpt: Fix a use-after-free in __srpt_close_all_ch()
    - RDMA/rxe: Set wqe->status correctly if an unexpected...

Changed in linux (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
quanxian (quanxian-wang)
Changed in intel:
status: Triaged → Fix Released
Andy Whitcroft (apw)
tags: added: kernel-fixup-verification-needed-bionic
removed: verification-needed-bionic
Brad Figg (brad-figg)
tags: added: verification-needed-bionic
Andy Whitcroft (apw)
tags: removed: verification-needed-bionic
Andy Whitcroft (apw) wrote :

This bug was erroneously marked for verification in bionic; verification is not required and verification-needed-bionic is being removed.

tags: added: verification-done-bionic
Brad Figg (brad-figg)
tags: added: cscc