DHCP Server regularly killed code=killed, status=6/ABRT

Bug #1870729 reported by Andrew Welham
32
This bug affects 5 people
Affects Status Importance Assigned to Milestone
isc-dhcp (Ubuntu)
Fix Released
High
Jamie Strandboge
Focal
Fix Released
High
Jamie Strandboge

Bug Description

On Ubuntu 20.04 The idc-dhcp- server version is
isc-dhcp-server:
  Installed: (none)
  Candidate: 4.4.1-2.1ubuntu3
  Version table:
     4.4.1-2.1ubuntu3 500
        500 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages

The DHCP server is being regularly killed, when searching google the bug looks very similar to an older bug regarding accessing a lease file, that was fixed year ago. As a temporary fix in systemd I have enabled a restart every time the main process fails. The error got worse when i start to run a pair of dhcp servers syncing state between each other

I regularly see the following in the syslog

Apr 4 00:04:55 gw sh[1500]: ../../../../lib/isc/unix/socket.c:3361: INSIST(!sock->pending_send) failed, back trace
Apr 4 00:04:55 gw sh[1500]: #0 0x7f36befeda4a in ??
Apr 4 00:04:55 gw sh[1500]: #1 0x7f36befed980 in ??
Apr 4 00:04:55 gw sh[1500]: #2 0x7f36bf0297e1 in ??
Apr 4 00:04:55 gw sh[1500]: #3 0x7f36bedd0609 in ??
Apr 4 00:04:55 gw sh[1500]: #4 0x7f36bef0c153 in ??
Apr 4 00:04:55 gw systemd[1]: isc-dhcp-server.service: Main process exited, code=killed, status=6/ABRT
Apr 4 00:04:55 gw systemd[1]: isc-dhcp-server.service: Failed with result 'signal'.
Apr 4 00:05:00 gw systemd[1]: isc-dhcp-server.service: Scheduled restart job, restart counter is at 3.
Apr 4 00:05:00 gw systemd[1]: Stopped ISC DHCP IPv4 server.
Apr 4 00:05:00 gw systemd[1]: Started ISC DHCP IPv4 server.
Apr 4 00:05:00 gw kernel: [ 3508.161248] audit: type=1400 audit(1585958700.678:46): apparmor="DENIED" operation="open" profile="/usr/sbin/dhcpd" name="/proc/2049/task/2068/comm" pid=2049 comm="dhcpd" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
Apr 4 00:05:00 gw dhcpd[2049]: Internet Systems Consortium DHCP Server 4.4.1
Apr 4 00:05:00 gw sh[2049]: Internet Systems Consortium DHCP Server 4.4.1
Apr 4 00:05:00 gw sh[2049]: Copyright 2004-2018 Internet Systems Consortium.
Apr 4 00:05:00 gw sh[2049]: All rights reserved.
Apr 4 00:05:00 gw sh[2049]: For info, please visit https://www.isc.org/software/dhcp/
Apr 4 00:05:00 gw dhcpd[2049]: Copyright 2004-2018 Internet Systems Consortium.
Apr 4 00:05:00 gw dhcpd[2049]: All rights reserved.
Apr 4 00:05:00 gw dhcpd[2049]: For info, please visit https://www.isc.org/software/dhcp/
Apr 4 00:05:00 gw kernel: [ 3508.161561] audit: type=1400 audit(1585958700.678:47): apparmor="DENIED" operation="open" profile="/usr/sbin/dhcpd" name="/proc/2049/task/2069/comm" pid=2049 comm="dhcpd" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
Apr 4 00:05:00 gw kernel: [ 3508.161563] audit: type=1400 audit(1585958700.682:48): apparmor="DENIED" operation="open" profile="/usr/sbin/dhcpd" name="/proc/2049/task/2070/comm" pid=2049 comm="dhcpd" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
Apr 4 00:05:00 gw dhcpd[2049]: Config file: /etc/dhcp/dhcpd.conf
Apr 4 00:05:00 gw sh[2049]: Config file: /etc/dhcp/dhcpd.conf
Apr 4 00:05:00 gw sh[2049]: Database file: /var/lib/dhcp/dhcpd.leases
Apr 4 00:05:00 gw sh[2049]: PID file: /run/dhcp-server/dhcpd.pid
Apr 4 00:05:00 gw dhcpd[2049]: Database file: /var/lib/dhcp/dhcpd.leases
Apr 4 00:05:00 gw dhcpd[2049]: PID file: /run/dhcp-server/dhcpd.pid
Apr 4 00:05:00 gw dhcpd[2049]: Wrote 0 deleted host decls to leases file.

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

Does adjusting /etc/apparmor.d/usr.sbin.dhcpd to have:

  owner @{PROC}/[0-9]*/comm r,
  owner @{PROC}/[0-9]*/task/[0-9]*/comm r,

resolve the issue for you? Be sure to load the policy into the kernel with: sudo apparmor_parser -r /etc/apparmor.d/usr.sbin.dhcpd before performing your testing.

Changed in isc-dhcp (Ubuntu):
assignee: nobody → Jamie Strandboge (jdstrand)
importance: Undecided → High
milestone: none → ubuntu-20.04
status: New → In Progress
Revision history for this message
Jamie Strandboge (jdstrand) wrote :

4.4.1-2.1ubuntu4 was uploaded for the above. Please let us know if it doesn't fix the issue for you.

Changed in isc-dhcp (Ubuntu):
status: In Progress → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package isc-dhcp - 4.4.1-2.1ubuntu4

---------------
isc-dhcp (4.4.1-2.1ubuntu4) focal; urgency=medium

  * debian/apparmor/usr.sbin.dhcpd: allow owner read on /proc/*/comm and
    /proc/*/task/*/comm (LP: #1870729)

 -- Jamie Strandboge <email address hidden> Mon, 06 Apr 2020 21:58:35 +0000

Changed in isc-dhcp (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
mm (mtl-0) wrote :

The bug is still present for me with
ii isc-dhcp-server 4.4.1-2.1ubuntu4 amd64 ISC DHCP server for automatic IP address assignment.

apparmor rules are up to date with:
owner @{PROC}/[0-9]*/comm r,
owner @{PROC}/[0-9]*/task/[0-9]*/comm r,

Error in syslog:
Apr 7 19:23:25 cdsv2 systemd[1]: isc-dhcp-server.service: Main process exited, code=killed, status=6/ABRT
Apr 7 19:23:25 cdsv2 systemd[1]: isc-dhcp-server.service: Failed with result 'signal'.

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

Now that there are no apparmor denials, this sounds like something for the server team to take a look at. Can you file a new bug since this one was used to address the apparmor denials? Thanks!

Revision history for this message
Andrew Welham (andreww-ubuntu) wrote :

The primary DHCP server is now fine, but the secondary won't start any more (although it did immediately after patching)
I now get the error
Apr 8 06:14:11 dhcp-backup dhcpd[1181]:
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: Sending on Socket/fallback/fallback-net
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: failover peer failover-partner: I move from normal to startup
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: Server starting service.
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: failover peer failover-partner: peer moves from normal to communications-interrupted
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: failover peer failover-partner: I move from startup to normal
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: balancing pool 5645d4679930 Internal total 250 free 125 backup 125 lts 0 max-own (+/-)25
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: balanced pool 5645d4679930 Internal total 250 free 125 backup 125 lts 0 max-misbal 38
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: balancing pool 5645d46e2cf0 Internal total 250 free 111 backup 139 lts 14 max-own (+/-)25
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: balanced pool 5645d46e2cf0 Internal total 250 free 111 backup 139 lts 14 max-misbal 38
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: failover peer failover-partner: peer moves from communications-interrupted to normal
Apr 8 06:14:11 dhcp-backup dhcpd[1181]: failover peer failover-partner: Both servers normal
Apr 8 06:14:11 dhcp-backup sh[1181]: ../../../../lib/isc/unix/socket.c:3361: INSIST(!sock->pending_send) failed, back trace
Apr 8 06:14:11 dhcp-backup sh[1181]: #0 0x7f0a52b25a4a in ??
Apr 8 06:14:11 dhcp-backup sh[1181]: #1 0x7f0a52b25980 in ??
Apr 8 06:14:11 dhcp-backup sh[1181]: #2 0x7f0a52b617e1 in ??
Apr 8 06:14:11 dhcp-backup sh[1181]: #3 0x7f0a52908609 in ??
Apr 8 06:14:11 dhcp-backup sh[1181]: #4 0x7f0a52a44103 in ??
Apr 8 06:14:12 dhcp-backup systemd[1]: isc-dhcp-server.service: Main process exited, code=killed, status=6/ABRT
Apr 8 06:14:12 dhcp-backup systemd[1]: isc-dhcp-server.service: Failed with result 'signal'.

Revision history for this message
Andrew Welham (andreww-ubuntu) wrote :

I tried this.
If you stop the primary dhcp server
Start the backup (This time it works)
Restart the primary dhcp server
Secondary dhcp server fails with the error

Apr 8 06:26:00 dhcp-backupl dhcpd[1360]: failover peer failover-partner: peer moves from communications-interrupted to normal
Apr 8 06:26:00 dhcp-backupl dhcpd[1360]: failover peer failover-partner: Both servers normal
Apr 8 06:26:00 dhcp-backupl sh[1360]: ../../../../lib/isc/unix/socket.c:3361: INSIST(!sock->pending_send) failed, back trace
Apr 8 06:26:00 dhcp-backupl sh[1360]: #0 0x7fbe199fda4a in ??
Apr 8 06:26:00 dhcp-backupl sh[1360]: #1 0x7fbe199fd980 in ??
Apr 8 06:26:00 dhcp-backupl sh[1360]: #2 0x7fbe19a397e1 in ??
Apr 8 06:26:00 dhcp-backupl sh[1360]: #3 0x7fbe197e0609 in ??
Apr 8 06:26:00 dhcp-backupl sh[1360]: #4 0x7fbe1991c103 in ??
Apr 8 06:26:01 dhcp-backupl systemd[1]: isc-dhcp-server.service: Main process exited, code=killed, status=6/ABRT
Apr 8 06:26:01 dhcp-backupl systemd[1]: isc-dhcp-server.service: Failed with result 'signal'

Revision history for this message
Andrew Welham (andreww-ubuntu) wrote :

incorrect statement above, since i have been bouncing the dhcp servers to try to find a pattern, the primary dhcp server logged a single error but continues to work

Apr 8 06:32:36 dhcp-primary systemd[1]: isc-dhcp-server.service: Main process exited, code=killed, status=6/ABRT

Revision history for this message
Andrew Welham (andreww-ubuntu) wrote :

Just checked a day later, only one kill on the primary dhcp server, 2nd server will not start

Changed in isc-dhcp (Ubuntu):
status: Fix Released → In Progress
Revision history for this message
Andrew Welham (andreww-ubuntu) wrote :

FIXED

on the backup DHCP server I kept seeing

Apr 10 11:10:08 dhcp-backup kernel: [ 86.045487] audit: type=1400 audit(1586513408.708:44): apparmor="DENIED" operation="open" profile="/usr/sbin/dhcpd" name="/proc/1404/task/1417/comm" pid=1404 comm="dhcpd" requested_mask="w" denied_mask="w" fsuid=0 ouid=0
Apr 10 11:10:08 dhcp-backup kernel: [ 86.047925] audit: type=1400 audit(1586513408.708:45): apparmor="DENIED" operation="open" profile="/usr/sbin/dhcpd" name="/proc/1404/task/1418/comm" pid=1404 comm="dhcpd" requested_mask="w" denied_mask="w" fsuid=0 ouid=0
Apr 10 11:10:08 dhcp-backup kernel: [ 86.048075] audit: type=1400 audit(1586513408.708:46): apparmor="DENIED" operation="open" profile="/usr/sbin/dhcpd" name="/proc/1404/task/1419/comm" pid=1404 comm="dhcpd" requested_mask="w" denied_mask="w" fsuid=0 ouid=0

Changing
/etc/apparmor.d/usr.sbin.dhcpd

 # owner @{PROC}/[0-9]*/task/[0-9]*/comm r,
  owner @{PROC}/[0-9]*/task/[0-9]*/comm rw,

then reloading the policy into the kernel with: sudo apparmor_parser -r /etc/apparmor.d/usr.sbin.dhcpd

resolved the issue

Revision history for this message
Andrew Welham (andreww-ubuntu) wrote :

dhcp-primary started to log the same errors so applied the same fix

Revision history for this message
Andrew Welham (andreww-ubuntu) wrote :

ran for a few hours then crashed with

ERROR: apport (pid 3789) Fri Apr 10 15:34:35 2020: called for pid 1900, signal 6, core limit 0, dump mode 2
ERROR: apport (pid 3789) Fri Apr 10 15:34:35 2020: not creating core for pid with dump mode of 2
ERROR: apport (pid 3789) Fri Apr 10 15:34:35 2020: executable: /usr/sbin/dhcpd (command line "dhcpd -user dhcpd -group dhcpd -f -4 -pf /run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf")
ERROR: apport (pid 3789) Fri Apr 10 15:34:35 2020: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 3789) Fri Apr 10 15:34:35 2020: apport: report /var/crash/_usr_sbin_dhcpd.0.crash already exists and unseen, doing nothing to avoid disk usage DoS

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

I will update the policy for the write access. I suggest removing the crash file in /var/crash, then if you see the crash again, file a new bug with the crash information (eg, apport-cli if on a server) so it can be analyzed.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package isc-dhcp - 4.4.1-2.1ubuntu5

---------------
isc-dhcp (4.4.1-2.1ubuntu5) focal; urgency=medium

  * debian/apparmor/usr.sbin.dhcpd: also allow write on /proc/*/comm and
    /proc/*/task/*/comm (LP: #1870729)

 -- Jamie Strandboge <email address hidden> Fri, 10 Apr 2020 17:21:12 +0000

Changed in isc-dhcp (Ubuntu Focal):
status: In Progress → Fix Released
Revision history for this message
Robert Stroetgen (m-robert) wrote :

Problem still exists even after disabling apparmor completely:

May 14 09:46:18 dc sh[4241]: ../../../../lib/isc/unix/socket.c:3361: INSIST(!sock->pending_send) failed, back trace
May 14 09:46:18 dc sh[4241]: #0 0x7fdeeb0dca4a in ??
May 14 09:46:18 dc sh[4241]: #1 0x7fdeeb0dc980 in ??
May 14 09:46:18 dc sh[4241]: #2 0x7fdeeb1187e1 in ??
May 14 09:46:18 dc sh[4241]: #3 0x7fdeeaebf609 in ??
May 14 09:46:18 dc sh[4241]: #4 0x7fdeeaffb103 in ??
May 14 09:46:18 dc systemd[1]: isc-dhcp-server.service: Main process exited, code=killed, status=6/ABRT
May 14 09:46:18 dc systemd[1]: isc-dhcp-server.service: Failed with result 'signal'.

isc-dhcp-server 4.4.1-2.1ubuntu5 amd64

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

This bug is marked fixed release. As I suggested in comment #13, please file a new bug. This will allow you to use apport to upload any crash information/etc that will assist developers in fixing this.

Revision history for this message
Andrew Welham (andreww-ubuntu) wrote :
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.