seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE

Bug #1847744 reported by Christian Brauner
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Undecided
Christian Brauner
Disco
Fix Released
Medium
Unassigned
Eoan
Fix Released
Medium
Unassigned

Bug Description

SRU Justification

Impact: Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4]) which enables a process (watchee) to retrieve an fd for its seccomp filter. This fd can then be handed to another (usually more privileged) process (watcher). The watcher will then be able to receive seccomp messages about the syscalls having been performed by the watchee.

This feature is heavily used in some userspace workloads. For example, it is currently used to intercept mknod() syscalls in user namespaces aka in containers. The mknod() syscall can be easily filtered based on dev_t. This allows us to only intercept a very specific subset of mknod() syscalls. Furthermore, mknod() is not possible in user namespaces toto coelo and so intercepting and denying syscalls that are not in the whitelist on accident is not a big deal. The watchee won't notice a difference.

In contrast to mknod(), a lot of other syscall we intercept (e.g. setxattr()) cannot be easily filtered like mknod() because they have pointer arguments. Additionally, some of them might actually succeed in user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we currently cannot tell seccomp to continue from a user notifier we are stuck with performing all of the syscalls in lieu of the container. This is a huge security liability since it is extremely difficult to correctly assume all of the necessary privileges of the calling task
such that the syscall can be successfully emulated without escaping other additional security restrictions (think missing CAP_MKNOD for mknod(), or MS_NODEV on a filesystem etc.). This can be solved by telling seccomp to resume the syscall.

Fix: Allow the seccomp notifier to continue a syscall. A positive discussion about this feature was triggered by a post to the ksummit-discuss mailing list (cf. [3]) and took place during KSummit (cf. [1]) and again at the containers/checkpoint-restore micro-conference at Linux Plumbers.

Regression Potential: Limited to seccomp. The patchset also comes with proper selftests in addition to the large set of seccomp selftests that are already there. This further reduces regression potential.

Test Case:
Compile a kernel with the patch applied and run the selftests or trap a syscall via the notifier fd and set the newly introduced flag. The syscall should then have continued.

Target Kernels: All current LTS kernels.

Patches:
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=for-next/seccomp&id=fb3c5386b382d4097476ce9647260fc89b34afdb
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=for-next/seccomp&id=0eebfed2954f152259cae0ad57b91d3ea92968e8

/* References */
[1]: https://linuxplumbersconf.org/event/4/contributions/560
[2]: https://linuxplumbersconf.org/event/4/contributions/477
[3]: https://<email address hidden>
[4]: commit 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")

Changed in linux (Ubuntu):
assignee: nobody → Christian Brauner (cbrauner)
status: New → In Progress
Stefan Bader (smb)
Changed in linux (Ubuntu Disco):
importance: Undecided → Medium
Changed in linux (Ubuntu Eoan):
importance: Undecided → Medium
Changed in linux (Ubuntu Disco):
status: New → Fix Committed
Changed in linux (Ubuntu Eoan):
status: New → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-disco' to 'verification-done-disco'. If the problem still exists, change the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-disco
Seth Forshee (sforshee)
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-eoan' to 'verification-done-eoan'. If the problem still exists, change the tag 'verification-needed-eoan' to 'verification-failed-eoan'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-eoan
tags: added: verification-done-disco verification-done-eoan
removed: verification-needed-disco verification-needed-eoan
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (linux-gcp-5.3/5.3.0-1008.9~18.04.1)

All autopkgtests for the newly accepted linux-gcp-5.3 (5.3.0-1008.9~18.04.1) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

linux-gcp-5.3/unknown (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#linux-gcp-5.3

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (53.1 KiB)

This bug was fixed in the package linux - 5.3.0-22.24

---------------
linux (5.3.0-22.24) eoan; urgency=medium

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with default_layout
    setting (LP: #1849682)
    - Revert "md/raid0: avoid RAID0 data corruption due to layout confusion."

  * refcount underflow and type confusion in shiftfs (LP: #1850867) // CVE-2019-15793
    - SAUCE: shiftfs: Correct id translation for lower fs operations
    - SAUCE: shiftfs: prevent type confusion
    - SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

  * CVE-2018-12207
    - kvm: x86, powerpc: do not allow clearing largepages debugfs entry
    - SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
      active
    - SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
    - SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
    - SAUCE: kvm: Add helper function for creating VM worker threads
    - SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
    - SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
    - SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
    - x86/msr: Add the IA32_TSX_CTRL MSR
    - x86/cpu: Add a helper function x86_read_arch_cap_msr()
    - x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
    - x86/speculation/taa: Add mitigation for TSX Async Abort
    - x86/speculation/taa: Add sysfs reporting for TSX Async Abort
    - kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
    - x86/tsx: Add "auto" option to the tsx= cmdline parameter
    - x86/speculation/taa: Add documentation for TSX Async Abort
    - x86/tsx: Add config options to set tsx=on|off|auto
    - [Config] Disable TSX by default when possible

  * CVE-2019-0154
    - SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
    - SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
    - SAUCE: drm/i915: Rename gen7 cmdparser tables
    - SAUCE: drm/i915: Disable Secure Batches for gen6+
    - SAUCE: drm/i915: Remove Master tables from cmdparser
    - SAUCE: drm/i915: Add support for mandatory cmdparsing
    - SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
    - SAUCE: drm/i915: Allow parsing of unsized batches
    - SAUCE: drm/i915: Add gen9 BCS cmdparsing
    - SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
    - SAUCE: drm/i915/cmdparser: Add support for backward jumps
    - SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (5.3.0-21.22) eoan; urgency=medium

  * eoan/linux: 5.3.0-21.22 -proposed tracker (LP: #1850486)

  * Fix signing of staging modules in eoan (LP: #1850234)
    - [Packaging] Leave unsigned modules unsigned after adding .gnu_debuglink

linux (5.3.0-20.21) eoan; urgency=medium

  * eoan/linux: 5.3.0-20.21 -proposed tracker (LP: #1849064)

  * eoan: alsa/sof: Enable SOF_HDA link and codec (LP: #1848490)
    - [Config] Enable SOF_HDA link and codec

  * Eoan update: 5.3.7 upstream stable release (LP: #1848750)
    - panic: ensure preemption is disabled during panic()
    - [Config] updateconfigs for USB_RIO500
    - USB: rio500: Remove Rio 500 kernel driver
   ...

Changed in linux (Ubuntu Eoan):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (38.6 KiB)

This bug was fixed in the package linux - 5.0.0-35.38

---------------
linux (5.0.0-35.38) disco; urgency=medium

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with default_layout
    setting (LP: #1849682)
    - SAUCE: Fix revert "md/raid0: avoid RAID0 data corruption due to layout
      confusion."

  * refcount underflow and type confusion in shiftfs (LP: #1850867) // CVE-2019-15793
    - SAUCE: shiftfs: Correct id translation for lower fs operations
    - SAUCE: shiftfs: prevent type confusion
    - SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

  * CVE-2018-12207
    - kvm: Convert kvm_lock to a mutex
    - kvm: x86: Do not release the page inside mmu_set_spte()
    - KVM: x86: make FNAME(fetch) and __direct_map more similar
    - KVM: x86: remove now unneeded hugepage gfn adjustment
    - KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
    - KVM: x86: add tracepoints around __direct_map and FNAME(fetch)
    - kvm: x86, powerpc: do not allow clearing largepages debugfs entry
    - SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
      active
    - SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
    - SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
    - SAUCE: kvm: Add helper function for creating VM worker threads
    - SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
    - SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
    - SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
    - KVM: x86: use Intel speculation bugs and features as derived in generic x86
      code
    - x86/msr: Add the IA32_TSX_CTRL MSR
    - x86/cpu: Add a helper function x86_read_arch_cap_msr()
    - x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
    - x86/speculation/taa: Add mitigation for TSX Async Abort
    - x86/speculation/taa: Add sysfs reporting for TSX Async Abort
    - kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
    - x86/tsx: Add "auto" option to the tsx= cmdline parameter
    - x86/speculation/taa: Add documentation for TSX Async Abort
    - x86/tsx: Add config options to set tsx=on|off|auto
    - SAUCE: x86/speculation/taa: Call tsx_init()
    - [Config] Disable TSX by default when possible

  * CVE-2019-0154
    - SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
    - SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
    - SAUCE: drm/i915: Rename gen7 cmdparser tables
    - SAUCE: drm/i915: Disable Secure Batches for gen6+
    - SAUCE: drm/i915: Remove Master tables from cmdparser
    - SAUCE: drm/i915: Add support for mandatory cmdparsing
    - SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
    - SAUCE: drm/i915: Allow parsing of unsized batches
    - SAUCE: drm/i915: Add gen9 BCS cmdparsing
    - SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
    - SAUCE: drm/i915/cmdparser: Add support for backward jumps
    - SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (5.0.0-34.36) disco; urgency=medium

  * disco/linux: <version to be filled> -proposed tracker (LP: #1850574)

  * [REGRESSION] md/raid0: cannot as...

Changed in linux (Ubuntu Disco):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (33.2 KiB)

This bug was fixed in the package linux - 5.3.0-24.26

---------------
linux (5.3.0-24.26) eoan; urgency=medium

  * eoan/linux: 5.3.0-24.26 -proposed tracker (LP: #1852232)

  * Eoan update: 5.3.9 upstream stable release (LP: #1851550)
    - io_uring: fix up O_NONBLOCK handling for sockets
    - dm snapshot: introduce account_start_copy() and account_end_copy()
    - dm snapshot: rework COW throttling to fix deadlock
    - Btrfs: fix inode cache block reserve leak on failure to allocate data space
    - btrfs: qgroup: Always free PREALLOC META reserve in
      btrfs_delalloc_release_extents()
    - iio: adc: meson_saradc: Fix memory allocation order
    - iio: fix center temperature of bmc150-accel-core
    - libsubcmd: Make _FORTIFY_SOURCE defines dependent on the feature
    - perf tests: Avoid raising SEGV using an obvious NULL dereference
    - perf map: Fix overlapped map handling
    - perf script brstackinsn: Fix recovery from LBR/binary mismatch
    - perf jevents: Fix period for Intel fixed counters
    - perf tools: Propagate get_cpuid() error
    - perf annotate: Propagate perf_env__arch() error
    - perf annotate: Fix the signedness of failure returns
    - perf annotate: Propagate the symbol__annotate() error return
    - perf annotate: Fix arch specific ->init() failure errors
    - perf annotate: Return appropriate error code for allocation failures
    - perf annotate: Don't return -1 for error when doing BPF disassembly
    - staging: rtl8188eu: fix null dereference when kzalloc fails
    - RDMA/siw: Fix serialization issue in write_space()
    - RDMA/hfi1: Prevent memory leak in sdma_init
    - RDMA/iw_cxgb4: fix SRQ access from dump_qp()
    - RDMA/iwcm: Fix a lock inversion issue
    - HID: hyperv: Use in-place iterator API in the channel callback
    - kselftest: exclude failed TARGETS from runlist
    - selftests/kselftest/runner.sh: Add 45 second timeout per test
    - nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request
    - arm64: cpufeature: Effectively expose FRINT capability to userspace
    - arm64: Fix incorrect irqflag restore for priority masking for compat
    - arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419
    - tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
    - tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
    - serial/sifive: select SERIAL_EARLYCON
    - tty: n_hdlc: fix build on SPARC
    - misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach
    - RDMA/core: Fix an error handling path in 'res_get_common_doit()'
    - RDMA/cm: Fix memory leak in cm_add/remove_one
    - RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path
    - RDMA/mlx5: Do not allow rereg of a ODP MR
    - RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcu
    - RDMA/mlx5: Add missing synchronize_srcu() for MW cases
    - gpio: max77620: Use correct unit for debounce times
    - fs: cifs: mute -Wunused-const-variable message
    - arm64: vdso32: Fix broken compat vDSO build warnings
    - arm64: vdso32: Detect binutils support for dmb ishld
    - serial: mctrl_gpio: Check for NULL pointer
    - serial: 8250_...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.