[SRU] New libepoxy0 causes hang at boot for AMDGPU-PRO

Bug #1698233 reported by Robert M. Muncrief
This bug affects 2 people
Affects             Status        Importance  Assigned to            Milestone
libepoxy (Ubuntu)   Fix Released  High        Gianfranco Costamagna
Xenial              Fix Released  High        Gianfranco Costamagna
Zesty               Fix Released  High        Gianfranco Costamagna

Bug Description

[Impact]
* Updating libepoxy with the updates pocket (xenial) and the -proposed pocket (zesty) breaks systems with AMDGPU-PRO proprietary drivers

[Test Case]
* Have that driver/card

[Regression Potential]
* None, I'm disabling a patch that I introduced in the last SRU.
* Debian opened an RC bug and will fix the same in the next point release,
and I'm discussing with upstream to revert that patch (they should have already reverted it)
* I just updated it to artful and Debian will follow in a few hours I guess

[Other info]

When updating my Xubuntu 16.04.2 system today the upgrade from libepoxy0 1.3.1-1 to libepoxy0 1.3.1-1ubuntu0.16.04.1 caused my system to hang right before the login screen.

Unfortunately, since this was a complete system lockup I can't provide any logs, as I had to restore the entire root partition from backup. I can't get system recovery mode to work (it's been broken for a few years and times out within two minutes), so it doesn't give me enough time to gather any logs before it exits. I understand the system recovery problem is known, but I've tried all the solutions I could over the years, on multiple systems, and just can't get it to work without timing out anymore.

In any case I verified the problem was libepoxy0 by applying all the available updates except libepoxy0 and rebooting without any problems. I then allowed the upgrade of libepoxy0, libepoxy0:i386, and libepoxy-dev and the system hung at boot as previously described. I went through the same procedure twice to confirm.

I'm running kernel 4.9.31-040931-lowlatency with AMDGPU-PRO 16.50 patched drivers, with all system updates. If you need more information please contact me and I'll get you anything I can. But as I said, since the system locks up I'm not sure what else I can garner.

lsb_release -rd:
Description: Ubuntu 16.04.2 LTS
Release: 16.04

apt-cache policy libepoxy0:
libepoxy0:
  Installed: 1.3.1-1
  Candidate: 1.3.1-1ubuntu0.16.04.1
  Version table:
     1.3.1-1ubuntu0.16.04.1 500
        500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
 *** 1.3.1-1 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
        100 /var/lib/dpkg/status

Revision history for this message
Michael (m-ad) wrote :

I can confirm this.

Linux Mint 18.1 Serena
Linux Kernel 4.4.0-79-generic
AMDGPU-PRO 16.40-348864

After libepoxy update from 1.3.1-1 to 1.3.1-1ubuntu0.16.04.1 I got a black screen on system boot. However, in contrast to the original submission, I was able to boot into recovery mode and downgrade libepoxy with
sudo apt-get install libepoxy0=1.3.1-1

This solved the problem for me.
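
For anyone repeating the downgrade, here is a minimal sketch of the package=version form that apt-get expects, using the package name and known-good version quoted in this report. The variable names are only illustrative, and the -s flag makes apt-get simulate the install rather than change anything:

```shell
#!/bin/sh
# Package and version as reported in this bug; adjust for your own system.
pkg="libepoxy0"
good_version="1.3.1-1"

# apt-get selects a specific version with the form package=version.
spec="${pkg}=${good_version}"

if command -v apt-get >/dev/null 2>&1; then
  # -s (simulate) only prints what would happen; nothing is installed.
  apt-get -s install "$spec" || echo "simulation failed; is ${good_version} still in the archive?"
else
  echo "apt-get not found; the real command would be: sudo apt-get install ${spec}"
fi
```

Once the simulation looks right, the same spec can be passed to sudo apt-get install without -s.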

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in libepoxy (Ubuntu):
status: New → Confirmed
Revision history for this message
Robert M. Muncrief (rmuncrief-9) wrote :

Yes Michael, that's all anyone confronted with this bug needs to do initially so they can boot.

After booting they should also open a terminal and issue one of the following commands:

For amd64 systems:
sudo apt-mark hold libepoxy-dev libepoxy0 libepoxy0:i386

For i386 systems:
sudo apt-mark hold libepoxy-dev libepoxy0

This will prevent libepoxy0 from being updated until this bug is fixed. Note that if you use programs like Synaptic libepoxy0 needs to be locked there as well.
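
The hold commands above can be wrapped in a small sketch that picks the package list by architecture and also prints how to inspect and release the holds later. The helper name hold_list is hypothetical, and treating every non-amd64 architecture like i386 is an assumption; the script only prints commands and changes nothing:

```shell
#!/bin/sh
# Hypothetical helper: echo the packages to hold for a given architecture.
# Any architecture other than amd64 is treated like i386 (an assumption).
hold_list() {
  case "$1" in
    amd64) echo "libepoxy-dev libepoxy0 libepoxy0:i386" ;;
    *)     echo "libepoxy-dev libepoxy0" ;;
  esac
}

# Default to amd64 when dpkg is not available to ask.
arch="amd64"
if command -v dpkg >/dev/null 2>&1; then
  arch="$(dpkg --print-architecture)"
fi

# Print the commands instead of running them, so they can be reviewed first.
echo "sudo apt-mark hold $(hold_list "$arch")"
echo "apt-mark showhold"
echo "sudo apt-mark unhold $(hold_list "$arch")"
```

apt-mark showhold lists the packages currently on hold, and apt-mark unhold releases them once a fixed libepoxy0 is available.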

And by the way, the only reason I wasn't able to use recovery mode initially was because I didn't know libepoxy0 was the culprit. About 15 updates appeared together, with several of them being for Mir, which is what I initially suspected caused the problem. Since I wasn't sure what was wrong it took a while for me to whittle it down to libepoxy0.

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

Does it hang without amdgpu-pro? Note that it's not shipped with ubuntu

Changed in libepoxy (Ubuntu):
assignee: nobody → Locutus (locutus)
assignee: Locutus (locutus) → LocutusOfBorg (costamagnagianfranco)
status: Confirmed → Incomplete
Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

Interestingly, I would add a question for you:
are you available to test some ppa libepoxy versions?
I admit, you are the first one reporting this issue, with a package that has thousands of installations.

Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

E.g. I uploaded a libepoxy that is the same as the xenial working one, just rebuilt on top of new dependencies.
I would like to ask you to test it, just to make sure the regression is not coming from somewhere else
(e.g. a bug in the toolchain can be spotted with a no-change rebuild)
please test
https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/locutusofborg-ppa/+packages
 libepoxy - 1.3.1-1ubuntu1

thanks!

Revision history for this message
Michael (m-ad) wrote :

@Timo: No, it only hangs with amdgpu-pro. I initially tried changing to amdgpu and the issue went away. I also tried the current amdgpu-pro 17.10, which also fixed this issue. Unfortunately both drivers don't work as they should on my system, so installing them isn't a long-term solution for me. But yeah, the problem is definitely related to the amdgpu-pro drivers in version 16.40 (me) and 16.50 (Robert).

@LocutusOfBorg: I tried your ppa and could upgrade to libepoxy 1.3.1-1ubuntu1 without observing the black screen on boot.

Revision history for this message
Robert M. Muncrief (rmuncrief-9) wrote :

I removed the libepoxy0 package holds, added the LocutusOfBorg repository, and did an apt-get update and apt-get dist-upgrade. However my system still hung on reboot. If I did something wrong please let me know.

Also the problem only occurs with AMDGPU-PRO, I should have been clear about that. Unfortunately I can't use the open source mesa though because I need OpenGL 4.5 compatibility mode and mesa doesn't support anything over 3.3. The developers say they don't have the resources to implement it, so anyone using Ubuntu who needs it will always have to use AMDGPU-PRO.

Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

so, even a no-change rebuild is bad... this is completely unrelated to my fix then.
Will try to have a look tomorrow

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

the rebuild was fine for michael, so doubt it's a build env issue

Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

@Robert, please dpkg -l |grep libepox
on the system that is not starting.
thanks

you are giving me two opposite answers, let's start from that :)

Revision history for this message
Robert M. Muncrief (rmuncrief-9) wrote :

@LocutusOfBorg, I tried adding your repository again today and received this error with sudo apt-get update:

Err:33 https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/locutusofborg-ppa/+packages xenial/main amd64 Packages
  404 Not Found

I don't know if this occurred before, but in any case I downloaded libepoxy-dev, libepoxy-dev:i386, libepoxy0, and libepoxy0:i386 from your repository and manually installed them with gdebi. I then placed the packages on hold and did a sudo apt-get update and sudo apt-get dist-upgrade.

After this the system booted without hanging, so I played No Man's Sky under wine for a few moments, as this is one of the things that requires OpenGL 4.5 compatibility mode. The game played fine, but after exiting it and Steam I tried opening Chrome and the system froze and the screen turned black.

However I rebooted the system and tried the same thing again a few times and it didn't crash. So I suspect there's some type of memory corruption that's difficult to replicate, but can't know for sure. In any case whatever you did solved the hang at boot problem, so you're correct about whatever you thought the problem was. Thank you for your work, and I appreciate you looking into a problem only a few of us have.

As to where to go from here, I don't know. The crux of the problem is that AMD abandoned their fglrx drivers before a replacement was ready, and threw those of us who had just bought $300+ graphics cards off the cliff. I'd like to abandon them now and get an Nvidia GPU, but at this moment there's a problem procuring medium to high end GPUs, of any brand, because of digital currency mining.

So if it's not a lot of effort and you know what the problem is and can fix it permanently, that would be awesome. On the other hand I know development resources are limited and I won't blame you if you decide it's not worth the effort because of the very few people it affects, and the fact that it's AMD's fault not Ubuntu's.

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

I don't know which GPU you have, but the OSS driver should support OpenGL 4.5, at least with the updates ppa enabled (ppa:ubuntu-x-swat/updates) which has Mesa 17.1.2 for xenial and zesty.

Revision history for this message
Robert M. Muncrief (rmuncrief-9) wrote :

The Mesa drivers only support OpenGL 4.5 Core mode, they don't support OpenGL 4.5 Compatibility mode. It's odd but for some reason they only support OpenGL 3.3 Compatibility mode, and the developers have stated they don't intend on changing it. So games and other software that require higher Compatibility modes will never run on Mesa.

There are some environment variables, MESA_GLSL_VERSION_OVERRIDE and MESA_GL_VERSION_OVERRIDE, that make the drivers report a higher compatibility mode but since the underlying functionality is still missing they don't work.

In any case I have a Sapphire Nitro R9-390 so I'm trying to work out a deal with a crypto-miner to exchange it for an Nvidia GPU. Unfortunately AMD is only fully functional for crypto-mining under Linux at this time, and since AMDGPU-PRO will be a mess for the next few years and the open source drivers will never be fully functional I'm giving up on AMD. Thank you for your help but I've lived with a crippled GPU for almost a year now and I have to acknowledge there's no viable option left other than surrender. It's not Ubuntu's fault, AMD simply abandoned its loyal customers.

Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

>On the other hand I know development resources are limited and I won't blame you if you decide it's not worth the effort because of the very few people it affects, and the fact that it's AMD's fault not Ubuntu's.

this is something I don't understand, sorry.
I caused a regression, and I will fix it.
I don't care about how difficult it might be, or how many times I'll have to ask you to test a fix, but if you agree, let's trace this issue to the bottom, and then forward the issue upstream.

I did a new test build. The difference between the working one and the bad one is 4 patches, and two of them are just null pointer checks, so I disabled one of the other two,
in particular:
"upstream/8d58c890646fc1f43bcab702bb9ed6bae94daefe.patch"

this patch slightly changes the API used to get the GL context, so since the proprietary drivers are old, it might be causing your issue.

The other patch that might be a source of trouble is this one:
debian/patches/upstream/b3b8bd9af7bf1fcfe544fd131f4d4f0d117ae7bc.patch

because of this change:

Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

(no edit button, sigh)

-ret = client < server ? client : server

that changed to:
+ret = client <= server ? client : server

now, if server is equal to client, it makes no difference whether the former or the latter is returned
(same value), so I would exclude this one too.
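
The reasoning above can be checked mechanically. This is a minimal sketch, assuming client and server are plain integers (the thread does not show their types); the helper names min_lt and min_le are hypothetical stand-ins for the old and new form of the comparison:

```shell
#!/bin/sh
# Old form: ret = client < server ? client : server
min_lt() { if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi; }

# New form: ret = client <= server ? client : server
min_le() { if [ "$1" -le "$2" ]; then echo "$1"; else echo "$2"; fi; }

# Compare both forms over a small grid of values, including the equal cases.
mismatches=0
for client in 0 1 2 3; do
  for server in 0 1 2 3; do
    if [ "$(min_lt "$client" "$server")" != "$(min_le "$client" "$server")" ]; then
      mismatches=$((mismatches + 1))
    fi
  done
done

echo "mismatches: $mismatches"
```

The two forms only take different branches when client equals server, and in that case both branches yield the same value, so no mismatch is counted.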

the package to test is uploaded on my ppa
 libepoxy - 1.3.1-1ubuntu1.16.04.2

and the url you have to add in sources.list to grab it is:
"deb http://ppa.launchpad.net/costamagnagianfranco/locutusofborg-ppa/ubuntu xenial main"

please test if possible, thanks!

Revision history for this message
Robert M. Muncrief (rmuncrief-9) wrote :

@LocutusOfBorg, this seems to work perfectly. I tried a few iterations of playing No Man's Sky, browsing with Chrome, and playing video with VLC and everything worked as expected. Thank you for the quick fix.

If anyone else needs this update before it's in the main repositories execute the following (substituting whatever text editor you use):

sudo mousepad /etc/apt/sources.list

Add:
deb http://ppa.launchpad.net/costamagnagianfranco/locutusofborg-ppa/ubuntu xenial main

sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 4E9F5DD9
sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 1BCB19E03C2A1859
sudo apt-get update
sudo apt-get install libepoxy0 libepoxy-dev libepoxy0:i386 libepoxy-dev:i386
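
As an alternative to editing /etc/apt/sources.list directly, the same deb line can live in its own file under /etc/apt/sources.list.d/, which makes the PPA easy to remove later. This is only a sketch: the file name locutusofborg-ppa.list is a hypothetical choice, and the file is staged in a temporary directory so nothing on the system changes until it is copied by hand:

```shell
#!/bin/sh
# The deb line is the one given in this thread.
ppa_line="deb http://ppa.launchpad.net/costamagnagianfranco/locutusofborg-ppa/ubuntu xenial main"

# Stage the fragment in a temporary directory; the file name is hypothetical.
stage_dir="$(mktemp -d)"
list_file="${stage_dir}/locutusofborg-ppa.list"
printf '%s\n' "$ppa_line" > "$list_file"

# Show the staged file and the manual step that would activate it.
cat "$list_file"
echo "After reviewing it: sudo cp ${list_file} /etc/apt/sources.list.d/"
```

The apt-key, apt-get update and apt-get install steps above stay the same; deleting the copied file and running apt-get update again removes the PPA.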

Revision history for this message
Michael (m-ad) wrote :

I can confirm this, libepoxy 1.3.1-1ubuntu1.16.04.2 works for me too.

Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

Can you please test libepoxy in this ppa?
https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/costamagnagianfranco-ppa/+packages

I upgraded to the latest 1.4 release, let's see if this is fixed upstream in a different way.

So, if testing is good, I'll try to apply some more patches;
if testing fails, I'll remove the above patch and re-upload into the archive.

In the meanwhile I'm opening an upstream issue, to trace down this problem

summary: - New libepoxy0 causes hang at boot
+ New libepoxy0 causes hang at boot for AMDGPU-PRO
Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote : Re: New libepoxy0 causes hang at boot for AMDGPU-PRO

https://github.com/anholt/libepoxy/issues/130
this is now the upstream issue

Revision history for this message
Michael (m-ad) wrote :

I tried the version from the new ppa (1.4.3-0ubuntu2) but unfortunately it re-introduced the black screen hang on boot. I reverted to 1.3.1-1 for now.

Revision history for this message
Robert M. Muncrief (rmuncrief-9) wrote :

I tried the 1.4.3-0ubuntu2 version as well and it also reintroduced the black screen hang on boot for me. I reverted back to 1.3.1-1ubuntu1.16.04.2.

Changed in libepoxy (Ubuntu):
importance: Undecided → High
status: Incomplete → Confirmed
Changed in libepoxy (Ubuntu Xenial):
status: New → Confirmed
summary: - New libepoxy0 causes hang at boot for AMDGPU-PRO
+ [SRU] New libepoxy0 causes hang at boot for AMDGPU-PRO
tags: added: regression-proposed regression-release
description: updated
Changed in libepoxy (Ubuntu Zesty):
status: New → Confirmed
Changed in libepoxy (Ubuntu Xenial):
importance: Undecided → High
Changed in libepoxy (Ubuntu Zesty):
importance: Undecided → High
Changed in libepoxy (Ubuntu Xenial):
assignee: nobody → LocutusOfBorg (costamagnagianfranco)
Changed in libepoxy (Ubuntu Zesty):
assignee: nobody → LocutusOfBorg (costamagnagianfranco)
Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

Hello, can you please follow up on the upstream issue?
Of course I'll be able to do test-builds on top of the new 1.4* release, even if I think this issue should be fixed by reverting this particular patch

https://github.com/anholt/libepoxy/issues/130

thanks, your contribution will solve this issue in a better way, because otherwise the new release will need to carry a patch and introduce possible issues somewhere else

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Robert, or anyone else affected,

Accepted libepoxy into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libepoxy/1.3.1-1ubuntu1.17.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in libepoxy (Ubuntu):
status: Confirmed → Fix Released
Changed in libepoxy (Ubuntu Zesty):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-zesty
Changed in libepoxy (Ubuntu Xenial):
status: Confirmed → Fix Committed
tags: added: verification-needed-xenial
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Hello Robert, or anyone else affected,

Accepted libepoxy into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libepoxy/1.3.1-1ubuntu0.16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Revision history for this message
Gianfranco Costamagna (costamagnagianfranco) wrote :

zesty seems to work correctly, Robert, can you please test xenial?

tags: added: verification-done-zesty
removed: verification-needed-zesty
Revision history for this message
Robert M. Muncrief (rmuncrief-9) wrote :

I tested libepoxy0_1.3.1-1ubuntu0.16.04.2 amd64 and i386 as requested and it works without error. My system is Xubuntu 16.04.2 with kernel 4.9.35-lowlatency. Thank you for your excellent work!

tags: added: verification-done-xenial
removed: verification-needed-xenial
tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libepoxy - 1.3.1-1ubuntu0.16.04.2

---------------
libepoxy (1.3.1-1ubuntu0.16.04.2) xenial; urgency=medium

  * Disable patch 8d58c890646fc1f43bcab702bb9ed6bae94daefe:
    - this patch is causing some troubles on proprietary drivers.
      LP: #1698233

 -- Gianfranco Costamagna <email address hidden> Sat, 24 Jun 2017 00:02:18 +0200

Changed in libepoxy (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for libepoxy has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libepoxy - 1.3.1-1ubuntu1.17.04.2

---------------
libepoxy (1.3.1-1ubuntu1.17.04.2) zesty; urgency=medium

  * Disable patch 8d58c890646fc1f43bcab702bb9ed6bae94daefe:
    - this patch is causing some troubles on proprietary drivers.
      LP: #1698233

 -- Gianfranco Costamagna <email address hidden> Sat, 24 Jun 2017 00:00:55 +0200

Changed in libepoxy (Ubuntu Zesty):
status: Fix Committed → Fix Released