glibc tst-getrandom test needs more entropy causing test failures

Bug #1891403 reported by Balint Reczey
This bug affects 1 person
Affects                Status          Importance   Assigned to   Milestone
Auto Package Testing   Fix Committed   Undecided    Unassigned
launchpad-buildd       Triaged         Critical     Unassigned
glibc (Ubuntu)         Fix Released    Undecided    Unassigned
  Bionic               Fix Released    Undecided    Unassigned
  Focal                Fix Released    Undecided    Unassigned

Bug Description

[Impact]

 * Builds and autopkgtests frequently fail because tst-getrandom fails, probably due to an insufficient source of entropy on Launchpad. This causes extra work, since the failing builds and autopkgtests have to be retriggered.

[Test Case]

* Observe tst-getrandom tests being marked XFAIL in build logs.

[Regression Potential]

* The fix does not change the run-time behaviour of glibc; it only ignores a test. As a result, potential breakage in getrandom()'s implementation or in the kernel's random source could go undetected, but the test results can still be observed in the logs if needed, and the random implementation is most likely exercised by other tests, too.

[Other Info]

The issue can be reproduced locally in a built glibc package tree without haveged or any other good entropy source installed:

rbalint@gaia:~/projects/deb/build-area/glibc-2.31$ for i in $(seq 1000); do ./build-tree/amd64-libc/stdlib/tst-getrandom; done
Timed out: killed the child process
Termination time: 2020-08-12T21:05:28.234517252
Last write to standard output: 2020-08-12T21:05:04.069616637
Timed out: killed the child process
Termination time: 2020-08-12T21:05:48.337084141
Last write to standard output: 2020-08-12T21:05:28.069616637
Timed out: killed the child process
Termination time: 2020-08-12T21:06:08.439695563
Last write to standard output: 2020-08-12T21:05:48.069616637
Timed out: killed the child process
Termination time: 2020-08-12T21:06:28.542962616
Last write to standard output: 2020-08-12T21:06:08.069616637
Timed out: killed the child process
...

With haveged installed the tests pass quickly and without error:

rbalint@gaia:~/projects/deb/build-area/glibc-2.31$ for i in $(seq 1000); do ./build-tree/amd64-libc/stdlib/tst-getrandom; done
rbalint@gaia:~/projects/deb/build-area/glibc-2.31$
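
The loops above show the timeouts but do not record how many runs failed. A minimal sketch that tallies failures could look like the following; it assumes the test binary exits non-zero when it times out, and the helper name count_failures is hypothetical, not part of glibc or its packaging:

```shell
# Run a command N times and print how many runs exited non-zero.
# Usage: count_failures N command [args...]
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's own output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, `count_failures 1000 ./build-tree/amd64-libc/stdlib/tst-getrandom` would print the number of failed runs out of 1000, which makes it easier to compare a machine with and without haveged.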

Another way of fixing the issue would be to install better entropy sources in the build and autopkgtest infrastructure.
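
To judge whether a machine is short on entropy before running the test, the kernel's own estimate can be read from /proc/sys/kernel/random/entropy_avail. The sketch below is only an illustration: the helper name entropy_below and any threshold passed to it are assumptions, and the estimate is mainly meaningful on kernels of the era discussed here, where the random pool could still run low.

```shell
# Succeed (exit 0) when the kernel's entropy estimate is below a threshold.
# Usage: entropy_below THRESHOLD [FILE]
# FILE defaults to the kernel's estimate; the parameter exists so the check
# can be pointed at another file for testing.
entropy_below() {
    threshold=$1
    file=${2:-/proc/sys/kernel/random/entropy_avail}
    [ "$(cat "$file")" -lt "$threshold" ]
}
```

A possible use is `entropy_below 256 && echo "entropy estimate is low"`, where 256 is an arbitrary example value rather than a recommended limit.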


Balint Reczey (rbalint)
description: updated
Balint Reczey (rbalint)
summary: - glibc tests needs more entropy causing tst-getrandom failures
+ glibc tst-getrandom test needs more entropy causing test failures
Revision history for this message
Julian Andres Klode (juliank) wrote :

I know qemu has some entropy forwarding support, and I wonder if that could be used in the cloud so that instances have enough entropy to fix this.

e.g.

              -object rng-random,id=id,filename=/dev/random
                     Creates a random number generator backend which obtains entropy from a device on the host. The id parameter is a unique ID that will be used to reference this entropy backend from the virtio-rng device. The filename parameter specifies which file to obtain entropy from and if omitted defaults to /dev/urandom.

I think openstack has support for that, and this maybe should be toggled on? Might be worth checking with IS and security.
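
As a rough illustration of the option quoted above, the backend object is normally paired with a virtio-rng device that references it by id. The sketch below only prints such an argument string; the id "rng0" and the helper name virtio_rng_args are illustrative choices, and nothing here reflects how the Launchpad or scalingstack hosts are actually configured.

```shell
# Print QEMU arguments that attach a virtio-rng device backed by a host file.
# Usage: virtio_rng_args [HOST_FILE]
# HOST_FILE defaults to /dev/urandom, matching the documented QEMU default.
virtio_rng_args() {
    backend=${1:-/dev/urandom}
    printf '%s\n' "-object rng-random,id=rng0,filename=$backend -device virtio-rng-pci,rng=rng0"
}
```

The printed arguments would be appended to an existing qemu-system-* command line, for instance `qemu-system-x86_64 ... $(virtio_rng_args /dev/random)`.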

Revision history for this message
Julian Andres Klode (juliank) wrote :

So we pass RNG through to our scalingstack instances, but we need to install rng-tools in the VMs to feed that to /dev/random.

We're a bit confused about the haveged comment, as haveged is installed in autopkgtest images, so if that fixes it, why do you still see an issue?
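
Whether a guest actually sees a passed-through RNG can be checked from inside the VM, since the kernel reports the active hardware RNG under /sys/class/misc/hw_random. A small sketch, with the hypothetical helper name hwrng_current and a directory parameter added only so it can be tested against a fake tree:

```shell
# Print the name of the kernel's current hardware RNG, or "none" if the
# sysfs entry cannot be read.
# Usage: hwrng_current [SYSFS_DIR]
hwrng_current() {
    dir=${1:-/sys/class/misc/hw_random}
    if [ -r "$dir/rng_current" ]; then
        cat "$dir/rng_current"
    else
        echo "none"
    fi
}
```

If this prints a device name, rng-tools would be expected to have a source to feed into /dev/random; if it prints "none", installing rng-tools alone is unlikely to help.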

Revision history for this message
Julian Andres Klode (juliank) wrote :

We have now enabled installing rng-tools for autopkgtest (and removing haveged), by rebasing on latest git master, so this is fixed from our side. If this turns out to not produce enough entropy, I guess the hosts need more entropy.

Changed in auto-package-testing:
status: New → Fix Released
status: Fix Released → Fix Committed
Revision history for this message
Balint Reczey (rbalint) wrote :

@juliank As I read the autopkgtest package's changelog, haveged used to be installed but it was dropped in 5.11.
I used haveged in local tests and it seems to fix the test as expected, so I think the scalingstack install needs adjusting. I suspect there is enough entropy for most things; the glibc test just times out when it can't get enough randomness quickly.

Revision history for this message
Iain Lane (laney) wrote :

haveged was installed all along for autopkgtests. For package builds, I don't think that's the case.

A link to a failing autopkgtest would be useful I think.

Julian added a launchpad-buildd task - the question there is if you want to install rng-tools too, to use the host's /dev/hwrng to feed /dev/random for package builds. I guess that's not strictly for launchpad-buildd but for the cloud images you're using?

Revision history for this message
Balint Reczey (rbalint) wrote :

I have this PPA build log: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4017/+build/19810464

I had other failures, but I just retried them.

I _think_ I have seen it in autopkgtest as well, but I may be wrong and only builders are affected.
I started looking this up in autopkgtest logs but downloads are very slow.
Should I reassign it to launchpad to get the builders fixed?

Revision history for this message
Balint Reczey (rbalint) wrote :

Ah, launchpad-buildd is already affected.

Ioana Lasc (ilasc)
Changed in launchpad-buildd:
importance: Undecided → Critical
status: New → Triaged
Revision history for this message
Julian Andres Klode (juliank) wrote :

This "needs" https://portal.admin.canonical.com/C127295 fixed and then installing rng-tools in the images should do the right thing.

Revision history for this message
Robie Basak (racb) wrote : Please test proposed package

Hello Balint, or anyone else affected,

Accepted glibc into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/glibc/2.31-0ubuntu9.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in glibc (Ubuntu Focal):
status: New → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.31-0ubuntu11

---------------
glibc (2.31-0ubuntu11) groovy; urgency=medium

  [ Michael Hudson-Doyle ]
  * Mark tst-getpw as XFAIL on arm64. (LP: #1869364)

  [ Balint Reczey ]
  * debian/control: Add Vcs-* pointing to Ubuntu packaging repository
  * debian/gbp.conf: Add initial configuration
  * debian/debhelper.in/libc.preinst: Fix setting LDCONFIG_NOTRIGGER
  * debian/control.in/main: Add Vcs-* pointing to Ubuntu packaging repository
  * Fall back to calling nanosleep syscall when __clock_nanosleep returns EINVAL
    (LP: #1871240)
  * debian/testsuite-xfail-debian.mk: XFAIL stdlib/tst-strtod-round on riscv64
  * debian/testsuite-xfail-debian.mk: XFAIL tst-getpw on armhf, too
  * debian/watch: Use HTTPS and download xz-compressed tarball
  * debian/watch: Use upstream's signing key to verify the tarball
  * XFAIL stdlib/tst-getrandom (LP: #1891403)

  [ Dimitri John Ledkov ]
  * debian/patches/powerpc: Cherrypick upstream patches to support POWER10
    optimized library loading. (LP: #1887989)

  [ Aurelien Jarno ]
  * debian/patches/any/submitted-selinux-deprecations.diff: proposed patch to
    ignore the selinux deprecations introduced in libselinux (>= 3.1), fixing
    an FTBFS. (Closes: #965941)

 -- Balint Reczey <email address hidden> Thu, 27 Aug 2020 23:36:39 +0200

Changed in glibc (Ubuntu):
status: New → Fix Released
Revision history for this message
Balint Reczey (rbalint) wrote :

Verified 2.31-0ubuntu9.1 on Focal:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-focal/focal/amd64/g/glibc/20200903_000448_09fd7@/log.gz

zgrep tst-getrandom log.gz | grep X
XPASS: stdlib/tst-getrandom
XPASS: stdlib/tst-getrandom
XPASS: stdlib/tst-getrandom
XPASS: stdlib/tst-getrandom
XPASS: stdlib/tst-getrandom
XPASS: stdlib/tst-getrandom
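
XPASS here means the test was expected to fail (it is marked XFAIL) but passed. To summarize the results for one test rather than eyeballing grep output, something like the following sketch could be used; the helper name summarize_test is hypothetical, and it assumes log lines of the form "STATUS: test-name" as shown above:

```shell
# Print "STATUS count" lines for one test name found in a gzip-compressed log.
# Usage: summarize_test LOG.gz TEST_NAME
summarize_test() {
    # Split each line on ": " and count statuses for exact test-name matches.
    gzip -dc "$1" |
        awk -F': ' -v t="$2" '$2 == t { n[$1]++ } END { for (s in n) print s, n[s] }' |
        sort
}
```

Calling `summarize_test log.gz stdlib/tst-getrandom` on a downloaded autopkgtest log prints one line per status, so unexpected FAIL lines for the same test would show up next to the XPASS count.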

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Steve Langasek (vorlon) wrote :

Hello Balint, or anyone else affected,

Accepted glibc into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/glibc/2.27-3ubuntu1.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in glibc (Ubuntu Bionic):
status: New → Fix Committed
tags: added: verification-needed verification-needed-bionic
removed: verification-done
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (glibc/2.27-3ubuntu1.3)

All autopkgtests for the newly accepted glibc (2.27-3ubuntu1.3) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

mysql-5.7/5.7.31-0ubuntu0.18.04.1 (armhf)
libsys-utmp-perl/1.8-1 (armhf)
libscope-upper-perl/0.30-1 (armhf)
octave-miscellaneous/1.2.1-4 (armhf, arm64, s390x, amd64, i386, ppc64el)
libsocket-multicast6-perl/unknown (armhf)
octave-strings/1.2.0-3 (armhf, arm64, s390x, amd64, i386, ppc64el)
libgnatcoll/unknown (armhf)
octave-econometrics/1:1.1.1-5 (armhf, arm64, s390x, amd64, i386, ppc64el)
octave-secs2d/0.0.8-9 (armhf, arm64, s390x, amd64, i386, ppc64el)
libb-hooks-parser-perl/unknown (armhf)
octave-general/2.0.0-3 (armhf, arm64, s390x, amd64, i386, ppc64el)
libcompress-raw-bzip2-perl/2.074-1build2 (armhf)
libunicode-casefold-perl/unknown (armhf)
mod-wsgi/4.5.17-1ubuntu1 (ppc64el)
libdata-alias-perl/unknown (armhf)
libdata-clone-perl/unknown (armhf)
libsort-key-perl/unknown (armhf)
linux-raspi-5.4/5.4.0-1018.20~18.04.1 (armhf)
ann/unknown (armhf)
icecast2/unknown (i386)
python-maxminddb/1.3.0-1 (armhf)
lua-torch-sundown/unknown (armhf)
libkf5mailcommon/4:17.12.3-0ubuntu1 (arm64, i386)
apport/2.20.9-0ubuntu7.17 (amd64)
linux-hwe-5.0/5.0.0-61.65 (armhf)
ffmpeg/7:3.4.8-0ubuntu0.2 (armhf, arm64, s390x, amd64, i386, ppc64el)
glibc/2.27-3ubuntu1.3 (armhf)
nut/2.7.4-5.1ubuntu2 (amd64)
mbuffer/unknown (armhf)
linux-aws-edge/5.0.0-1019.21~18.04.1 (amd64, arm64)
octave-ocs/0.1.5-6 (armhf, arm64, s390x, amd64, i386, ppc64el)
libx11-xcb-perl/unknown (armhf)
pgbouncer/1.8.1-1build1 (amd64)
indicator-session/17.3.20+17.10.20171006-0ubuntu1 (armhf)
gcc-6/6.5.0-2ubuntu1~18.04 (armhf)
vmtouch/unknown (armhf)
libhtml-gumbo-perl/0.17-1build1 (ppc64el)
octave-sparsersb/1.0.5-3 (armhf, arm64, s390x, amd64, i386, ppc64el)
octave-mpi/1.2.0-4 (armhf, arm64, s390x, amd64, i386, ppc64el)
libalgorithm-svm-perl/0.13-2build2 (s390x)
libconvert-binary-c-perl/0.78-1build2 (amd64)
kauth/5.44.0-0ubuntu1 (i386)
libkdegames-kde4/unknown (amd64)
openssh/1:7.6p1-4ubuntu0.3 (armhf, arm64, s390x, amd64, i386, ppc64el)
keditbookmarks/17.12.3-0ubuntu1 (ppc64el)
jovie/unknown (armhf)
kdepim-runtime/4:17.12.3-0ubuntu2 (armhf)
libscalar-util-numeric-perl/0.40-1build3 (s390x)
pgpdump/unknown (armhf)
libdevice-cdio-perl/0.4.0-3 (armhf)
octave-sockets/1.2.0-3 (armhf, arm64, s390x, amd64, i386, ppc64el)
octave-gsl/2.1.0-3 (armhf, arm64, s390x, amd64, i386, ppc64el)
libdbd-odbc-perl/1.56-1build1 (armhf)
libnet-dbus-perl/1.1.0-4build2 (armhf)
linux-aws-5.3/unknown (arm64)
libalgorithm-permute-perl/0.16-1 (s390x)
xdg-desktop-portal/1.0.3-0ubuntu0.2 (i386, ppc64el)
octave-ltfat/2.2.0+dfsg-7 (s390x, amd64, i386, ppc64el)
octave-geometry/3.0.0-6 (armhf, arm64, s390x, amd64, i386, ppc64el)
octave-linear-algebra/2.2.2-4 (armhf, arm64, s390x, amd64, i386, ppc64el)
octave-nurbs/1.3.13-4 (armhf, arm64, s390x, amd64, i386, ppc64el)
devscripts/2.17.12ubuntu1.1 (armhf, arm64, s390x, amd64, i386, ppc64el)
meliae/0.4.0+bzr199-3build1 (ppc64el)
libocas/unknown (armhf)
k3d/unknown (armhf)
firefox/80.0.1+build1-0ubuntu0.18.04.1 (armhf)
libb-hooks-op-check-perl/unknown (armhf)
octave-quaternion/2.4.0-4 (armhf, arm64, s390x, amd64, i38...


Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (glibc/2.31-0ubuntu9.1)

All autopkgtests for the newly accepted glibc (2.31-0ubuntu9.1) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

prometheus-blackbox-exporter/0.13.0+ds-2 (armhf, amd64, ppc64el, s390x, arm64)
prometheus-pushgateway/1.0.0+ds-1 (armhf, amd64, ppc64el, s390x, arm64)
systemd/245.4-4ubuntu3.2 (s390x, amd64, ppc64el)
gfs2-utils/unknown (amd64)
hugo/0.68.3-1 (armhf, amd64, ppc64el, s390x, arm64)
grubzfs-testsuite/0.4.10 (amd64)
glibc/2.31-0ubuntu9.1 (armhf)
badger/2.0.1-3 (armhf, amd64, ppc64el, s390x, arm64)
resource-agents/1:4.5.0-2ubuntu2 (armhf)
etcd/3.2.26+dfsg-6 (amd64, ppc64el)
postgresql-multicorn/1.3.4-31-g9ff7875-3 (armhf, amd64, ppc64el, s390x)
gfs2-utils/3.2.0-3 (ppc64el, s390x, arm64)
scipy/1.3.3-3build1 (ppc64el)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#glibc

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.31-0ubuntu9.1

---------------
glibc (2.31-0ubuntu9.1) focal; urgency=medium

  [ Michael Hudson-Doyle ]
  * Mark tst-getpw as XFAIL on arm64. (LP: #1869364)

  [ Matthias Klose ]
  * Copy the fully conditionalized x86 variant for math-vector-fortran.h
    to /usr/include/finclude. On all architectures. (LP: #1879092)

  [ Balint Reczey ]
  * debian/gbp.conf: Add initial configuration
  * debian/control.in/main: Add Vcs-* pointing to Ubuntu packaging repository
  * debian/debhelper.in/libc.preinst: Fix setting LDCONFIG_NOTRIGGER
    (LP: #1889190)
  * Fall back to calling nanosleep syscall when __clock_nanosleep returns
    EINVAL due to CLOCK_REALTIME not being supported (LP: #1871129)
  * debian/testsuite-xfail-debian.mk: XFAIL tst-getpw on armhf, too
    (LP: #1869364)
  * XFAIL stdlib/tst-getrandom (LP: #1891403)

  [ Dimitri John Ledkov ]
  * debian/patches/powerpc: Cherrypick upstream patches to support POWER10
    optimized library loading. LP: #1887989

 -- Balint Reczey <email address hidden> Mon, 17 Aug 2020 22:02:52 +0200

Changed in glibc (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for glibc has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Balint Reczey (rbalint) wrote :

Verified with 2.27-3ubuntu1.3 on Bionic:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/arm64/g/glibc/20200912_032704_94e70@/log.gz

$ zgrep tst-getrandom log.gz | grep X
XPASS: stdlib/tst-getrandom
XPASS: stdlib/tst-getrandom

tags: added: verification-done verification-done-bionic
removed: verification-needed verification-needed-bionic
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.27-3ubuntu1.3

---------------
glibc (2.27-3ubuntu1.3) bionic; urgency=medium

  [ Balint Reczey ]
  * debian/gbp.conf: Add initial configuration
  * debian/control.in/main: Add Vcs-* pointing to Ubuntu packaging repository
  * arm64: Enable searching shared libraries in atomics/ on LSE HW
  * Ship arm64 variant with LSE support in libc6-lse (LP: #1885012)
  * Run tests of libc6-lse on HW supporting LSE
  * debian/patches/git-updates.diff: update from upstream stable branch
    - pthread_cond_broadcast: Fix waiters-after-spinning case
    - Fix SSE2-based memmove corrupting memory (CVE-2017-18269)
    - Fix strstr() performance regression on Haswell processors
    - Support Japanese new era "令和 (Reiwa)"
    - io: Remove copy_file_range emulation
    (LP: #1851263, #1858203, #1838327, #1797335, #1756209, #1853193)
  * XFAIL stdlib/tst-getrandom (LP: #1891403)
  * debian/testsuite-xfail-debian.mk: XFAIL new tst-support_descriptors

  [ Thadeu Lima de Souza Cascardo ]
  * tests: Make preadwritev2 invalid flags tests unsupported (LP: #1770480)

  [ Andreas Hasenack ]
  * branch-pthread_rwlock_trywrlock-hang-23844.patch:
    nptl: Fix pthread_rwlock_try*lock stalls (Bug 23844) (LP: #1864864)

 -- Balint Reczey <email address hidden> Wed, 02 Sep 2020 11:18:37 +0200

Changed in glibc (Ubuntu Bionic):
status: Fix Committed → Fix Released