autopkgtest identified fighting time services

Bug #1870144 reported by Christian Ehrhardt 
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
chrony (Ubuntu)
Fix Released
Critical
Unassigned

Bug Description

This SRU Template is meant to help the release team accepting this into focal before release.

[Impact]

Important:
The fix to avoid the services running concurrently moved
to bug 1849156 - this upload will only fix the test issues now.
I'll keep the test steps below as I'll want to test them with
the new systemd-timesyncd then.

 * Over the years there were multiple approaches to adapt different time
   service like ntp/chrony/systemd-timesyncd to work together. The recent
   Debian changes (3.5-6) fixed a certain type of issues of the former
   solution (conflicts) but opened another issue after install until
   reboot where now multiple service are active.

 Overall this:
 * Fix autopkgtests to not have chrony installed (fighting with the chrony
   service started by the tests)
 * Fix autopkgtests to disable systemd-timesyncd to not fight over the
   clock
 * Reload systemd-timesyncd after chrony install to pick up the drop in
   config placed by chrony
 * Adding a new test that ensures this is detected earlier in the future
   in case it reoccurs (again)

[Test Case]

 * Install chrony, the state should be:
   - chrony running
   - systemd-timesyncd inactive
   => That should persist over reboots
   => In Ubuntu that even applies to containers

 * Check new (and old) autopkgtest results

[Regression Potential]

 * I don't think it is an issue , but I could see potential regressions in
   the restart of systemctl-timesyncd if people made customizations to it.
   Therefore we ignore the RC to not break chrony install - but it would
   be "please fix your config" as usual. Also important is that this would
   not be different to the situation these users would have had after a
   reboot.

[Other Info]

 * n/a

----

FYI Separated from bug 1867036

It seems to be racy/flaky.
It uses a comparison on "seconds" granularity.
I was adding debug to check the values it fails on.

It was somewhat reproducible in the bileto ticket (again autopkgtest infrastructure).
Therefore I inserted some debug code and re-ran it there.

Our test environment is too fast (sometimes) and hence the flakyness:

  checking chronyc output OK
    DEBUG before 1585747433
    DEBUG before 1585747433
  checking system clock BAD

That is checked with "lt" and here the results are ==.
I'll need to discuss if there is a strict reason for these to be non-equal.
Otherwise the fix might be as easy as using "-le" instead

Related branches

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Download full text (3.3 KiB)

  starting chronyd OK
  running chronyc dfreq 1e-3 OK
  checking chronyc output OK
    DATE 1 1585758870
  running chronyc doffset -30.0 OK
    DATE 2 1585758870
  checking chronyc output OK
    DATE 3 1585758870
  running chronyc makestep OK
    DATE 4 1585758870
  checking chronyc output OK
    DATE 5 1585758870
    DATE S1 1585758870
    DATE SET Wed Apr 1 16:34:35 UTC 2020
    DATE S2 1585758870
    DEBUG before 1585758870
    DEBUG after 1585758870
    DEBUG S1: * chrony.service - chrony, an NTP client/server
     Loaded: loaded (/lib/systemd/system/chrony.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2020-04-01 16:34:25 UTC; 4s ago
       Docs: man:chronyd(8)
             man:chronyc(1)
             man:chrony.conf(5)
   Main PID: 5499 (chronyd)
      Tasks: 2 (limit: 1660)
     Memory: 5.9M
     CGroup: /system.slice/chrony.service
             |-5499 /usr/sbin/chronyd -F -1
             `-5500 /usr/sbin/chronyd -F -1

Apr 01 16:34:25 autopkgtest systemd[1]: Starting chrony, an NTP client/server...
Apr 01 16:34:25 autopkgtest chronyd[5499]: chronyd version 3.5 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 -DEBUG)
Apr 01 16:34:25 autopkgtest chronyd[5499]: Loaded seccomp filter
Apr 01 16:34:25 autopkgtest systemd[1]: Started chrony, an NTP client/server.
    DEBUG S2: * systemd-timesyncd.service - Network Time Synchronization
     Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/systemd-timesyncd.service.d
             `-disable-with-time-daemon.conf
     Active: active (running) since Wed 2020-04-01 16:33:44 UTC; 45s ago
       Docs: man:systemd-timesyncd.service(8)
   Main PID: 602 (systemd-timesyn)
     Status: "Initial synchronization to time server 91.189.91.157:123 (ntp.ubuntu.com)."
      Tasks: 2 (limit: 1660)
     Memory: 8.0M
     CGroup: /system.slice/systemd-timesyncd.service
             `-602 /lib/systemd/systemd-timesyncd

Apr 01 16:33:43 autopkgtest systemd[1]: Starting Network Time Synchronization...
Apr 01 16:33:44 autopkgtest systemd[1]: Started Network Time Synchronization.
Apr 01 16:33:45 autopkgtest systemd-timesyncd[602]: Initial synchronization to time server 91.189.91.157:123 (ntp.ubuntu.com).
    DEBUG O1:
    DEBUG O2:
    DEBUG P1: 0 0 6057 6055 20 0 5056 1792 pipe_w S ? 0:00 \_ grep chronyd
5 111 5499 1 20 0 15104 4160 do_sel S ? 0:00 /usr/sbin/chronyd -F -1
1 111 5500 5499 20 0 7104 4224 skb_wa S ? 0:00 \_ /usr/sbin/chronyd -F -1
5 0 6028 1 20 0 6656 4160 do_sel S ? 0:00 ../../chronyd -l /tmp/autopkgtest.ChF59G/build.rot/src/test/system/tmp/chronyd.log -f /tmp/autopkgtest.ChF59G/build.rot/src/test/system/tmp/chronyd.conf -u root
  checking system clock ...

Read more...

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Installing chrony with timesyncd running (the default) is what triggers the "both running" situation.
I have fixed this in the past, something must have changed to open this issue again.

A restart on systemd-timesyncd would detect and stop it.

Must be due to:
  45 chrony (3.5-6) unstable; urgency=medium
  46
  47 * debian/chrony.service:
  48 - Don’t conflict with systemd-timesyncd.service.
  49 A few users complain that chronyd does not start at boot. The way the
  50 Conflict= directive works internally might cause both systemd-timesyncd
  51 and chronyd to be inactive at boot. So by relying solely on the
  52 disable-with-time-daemon.conf drop-in file provided by systemd, we should
  53 get rid of this malfunction while still preventing these two time daemons
  54 from being active at the same time. Kudos notably go to Santiago Vila for
  55 the report and providing SSH access to a GCE instance where the issue was
  56 reproducible and Michael Biebl for debugging. (Closes: #947936)

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Chrony would need in postinst to "systemctl restart systemd-timesyncd.service" (ignoring RC in any case) to make sure timesyncd picks up the newly deployed drop-in config.
Otherwise (see above) both will be running.

And for the test I think we want to drop:
  Depends: @,
from d/t/control to avoid the systems daemon fighting with the tests daemon

I uploaded some preliminary fixes for that [1] to the PPA [2] to retest those.

[1]: https://code.launchpad.net/~paelzer/ubuntu/+source/chrony/+git/chrony/+ref/test-issues-focal
[2]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4000

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thi sstill is a bit crude due to all the debug code, but after ensuring all background time services are off the test now works.
I'll give this a cleanup and prep a proper MP and upload.

Also I'll make the description a SRU template to help the release-team accepting this into Focal before release (the test issue is annoying but not too bad - but the multiple services run issue is bad enough and needs a fix asap).

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Changed in chrony (Ubuntu):
status: New → In Progress
summary: - chrony test 100-clockupdate flaky in autopkgtest
+ autopkgtest identified fighting time services
description: updated
description: updated
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Without chrony installed some tests get skipped:
  104-systemdirs SKIP (missing /var/lib/chrony)
Thanks Vincent for making me aware of that.

So lets install, but stop it before the tests.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Tests good - MP and MR are now ready for review

Changed in chrony (Ubuntu):
importance: Undecided → High
importance: High → Critical
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

There are systemd changes coming which will split systemd-timesyncd and make it a conflicts.
That will fix part of this, but we need to make sure it arrives in Focal in time.

It just landed in Debian, so there we only took the test improvements to protect the tests against chrony itself running in the background. Time to check the systemd uploads in 20.04 if things are there or if we still need this upload as planned.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The "other bug" is bug 1849156 will fix the concurrent running services.

I have subscribed on that other bug and asked rbalint to speak up if he expect it to not land in time for 20.04. On our side I'll upload the chrony fix, but reduced to just the test fixes.

Those fixes are also already merged in Debian, so we can drop that Delta early in 20.10.

description: updated
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I uploaded the chrony fix for hte tests and I'll re-verify the behavior once I know of a new systemd version (they might need to test together to work smoothly).

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package chrony - 3.5-6ubuntu3

---------------
chrony (3.5-6ubuntu3) focal; urgency=medium

  * avoid multiple time services running concurrently (LP: #1870144).
    This fixes the autopkgtests vs chrond itself, the issue of concurrent
    systemd-timesyncd will be fixed in systemd by (LP 1849156)
    - d/t/upstream-system-tests: stop chrony/systemd-timesynd before tests
    - d/t/upstream-system-tests: fix stderr in case services do not exist

 -- Christian Ehrhardt <email address hidden> Wed, 01 Apr 2020 09:25:45 +0200

Changed in chrony (Ubuntu):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.