FTBFS on powerpc

Bug #1570055 reported by Barry Warsaw
Affects          Status        Importance  Assigned to   Milestone
glibc (Ubuntu)   Invalid       Undecided   Unassigned
lua5.1 (Ubuntu)  Fix Released  Undecided   Barry Warsaw
lua5.2 (Ubuntu)  Fix Released  Undecided   Barry Warsaw
lua5.3 (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

On Matthias's archive rebuild for Xenial, we see build failures for lua5.1, lua5.2, and lua5.3 on powerpc:

http://people.ubuntuwire.org/~wgrant/rebuild-ftbfs-test/test-rebuild-20160401-xenial.html

I've done some investigation, but haven't found the source of the problem. I'm capturing what I know here.

The build fails because part of it runs `src/lua5.3 -v`, which segfaults. The crash happens after main() exits, in _IO_wsetb() at glibc's wgenops.c:105, because f->_wide_data points to bogus data. Setting a breakpoint in main() doesn't help because the data is already corrupted by then. Setting a breakpoint in _start or _init and then a watchpoint on that pointer shows that it gets corrupted in _IO_check_libio() in glibc's oldstdfiles.c. We then thought the likely culprit was compiling with g++ but linking with gcc; however, changing the build to compile and link *everything* with g++ doesn't solve the problem. That is the change we made in 5.3.1-1ubuntu1, which can be thrown away.

But the problem *is* related to lua5.3's d/patches/0001-build-system.patch because if you remove that from the quilt stack, you end up with a src/lua (not version numbered) for which `lua -v` doesn't segfault.

My only other thought was that maybe libtool was corrupting things, but I wasn't able to prove that. I tried various other transformations of that patch without success.

Tags: patch
Steve Langasek (vorlon) wrote :

I've tested building this with either all gcc or all g++, and the result is the same.

At this point I suspect a regression in glibc 2.23. I don't think lua is doing anything particularly fancy here, but in the built binary stdout points to a place in memory (possibly read-only) that's too small for a struct _IO_FILE_complete. glibc nevertheless casts to that type in the destructor and then gets very sad.

It seems that it is possible to work around this by not passing the flags '-Wl,--version-script -Wl,../debian/version-script' to the link command.
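To make the workaround concrete, here is a minimal sketch of removing those two flags from a link command. The helper name `drop_version_script` and the sample command line are made up for illustration; only the two `-Wl,...` flags come from this report, and the packaging fix itself would edit the build rules rather than filter arguments like this.

```python
def drop_version_script(link_args):
    """Return a copy of link_args without the linker version-script option."""
    kept = []
    skip_next = False
    for arg in link_args:
        if skip_next:
            # This is the "-Wl,<path>" argument that followed the option.
            skip_next = False
            continue
        if arg == "-Wl,--version-script":
            # Option and path were passed as two separate -Wl arguments.
            skip_next = True
            continue
        if arg.startswith(("-Wl,--version-script=", "-Wl,--version-script,")):
            # Option and path were combined into a single argument.
            continue
        kept.append(arg)
    return kept


# Hypothetical link command; only the two version-script flags are from the report.
example_link = [
    "gcc", "-o", "lua5.3", "lua.o",
    "-Wl,--version-script", "-Wl,../debian/version-script",
    "-lm",
]
print(drop_version_script(example_link))
```

Everything else on the command line is left alone, so the executable is linked as before, just without the symbol hiding that the version script applies.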

Comparing the symbols between the binaries linked with and without this version script shows the following important difference:
  10030ab0 g DO .rodata 00000004 Base _IO_stdin_used
Seems like we probably want the executable to not hide that symbol.

OTOH, it seems like hiding that symbol shouldn't break the binary, and the breakage happens only on powerpc.
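One way to check a built binary for this condition is to look for `_IO_stdin_used` in its dynamic symbol table. The following is a minimal sketch, assuming the input is text in the same format as the symbol line quoted above (as printed by `objdump -T`); the function name `exports_symbol` is hypothetical.

```python
def exports_symbol(dynsym_text, symbol="_IO_stdin_used"):
    """Return True if `symbol` appears as a defined entry in `objdump -T` style text."""
    for line in dynsym_text.splitlines():
        fields = line.split()
        # The symbol name is the last field; undefined references list *UND* as the section.
        if fields and fields[-1] == symbol and "*UND*" not in fields:
            return True
    return False


# The line quoted above, from the binary linked without the version script.
without_script = "10030ab0 g DO .rodata 00000004 Base _IO_stdin_used"
print(exports_symbol(without_script))  # True
print(exports_symbol(""))              # False
```

For a real binary, the text would come from running `objdump -T` on the executable, for example `src/lua5.3`.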

Steve Langasek (vorlon) wrote :

Patch that fixes lua5.3; should be applied also to lua5.{1,2}.

Changed in lua5.1 (Ubuntu):
assignee: nobody → Barry Warsaw (barry)
status: New → Triaged
Changed in lua5.2 (Ubuntu):
assignee: nobody → Barry Warsaw (barry)
status: New → Triaged
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lua5.3 - 5.3.1-1ubuntu2

---------------
lua5.3 (5.3.1-1ubuntu2) xenial; urgency=medium

  * Revert changes from previous upload, not needed and don't fix the issue.
  * debian/patches/0001-build-system.patch: never use libtool --quiet
    which is always the wrong answer for package builds.
  * debian/patches/0001-build-system.patch: do not pass a version script when
    building executables; this is meant for libraries, and its use appears to
    be breaking the build on powerpc (probably due to a toolchain regression).
    LP: #1570055.

 -- Steve Langasek <email address hidden> Wed, 13 Apr 2016 18:09:50 -0700

Changed in lua5.3 (Ubuntu):
status: New → Fix Released
tags: added: patch
Steve Langasek (vorlon) wrote :

sorry Barry, swiping this from you to unblock buildability of lua-* stuff in -proposed.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lua5.1 - 5.1.5-8ubuntu1

---------------
lua5.1 (5.1.5-8ubuntu1) xenial; urgency=medium

  * debian/patches/0001-debian_make.patch: never use libtool --quiet
    which is always the wrong answer for package builds.
  * debian/patches/0001-debian_make.patch: do not pass a version script when
    building executables; this is meant for libraries, and its use appears to
    be breaking the build on powerpc (probably due to a toolchain regression).
    LP: #1570055.

 -- Steve Langasek <email address hidden> Wed, 13 Apr 2016 22:08:12 -0700

Changed in lua5.1 (Ubuntu):
status: Triaged → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lua5.2 - 5.2.4-1ubuntu1

---------------
lua5.2 (5.2.4-1ubuntu1) xenial; urgency=medium

  * debian/patches/0001-build-system.patch: never use libtool --quiet
    which is always the wrong answer for package builds.
  * debian/patches/0001-build-system.patch: do not pass a version script when
    building executables; this is meant for libraries, and its use appears to
    be breaking the build on powerpc (probably due to a toolchain regression).
    LP: #1570055.

 -- Steve Langasek <email address hidden> Wed, 13 Apr 2016 22:05:31 -0700

Changed in lua5.2 (Ubuntu):
status: Triaged → Fix Released
Paul Gevers (paul-climbing) wrote :

Alexander Shishkin suggested that this bug and the FTBFS of fpc¹ ² on powerpc are really the same bug in glibc because both of them segfault during program exit in _IO_wsetb after output to stdout.

¹ https://bugs.launchpad.net/ubuntu/+source/fpc/+bug/1562480
² http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=826300

John Paul Adrian Glaubitz (glaubitz) wrote :

Hi!

Just a heads-up, I'm currently looking into the FTBFS of firebird3.0 [1] on powerpc and the backtrace seems to tell the same story:

(sid_powerpc-dchroot)glaubitz@partch:~/firebird3.0-3.0.1.32609.ds4$ gdb /home/glaubitz/firebird3.0-3.0.1.32609.ds4/gen/Release/firebird/bin/gpre_boot
GNU gdb (Debian 7.11.1-2) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/glaubitz/firebird3.0-3.0.1.32609.ds4/gen/Release/firebird/bin/gpre_boot...done.
(gdb) run
Starting program: /home/glaubitz/firebird3.0-3.0.1.32609.ds4/gen/Release/firebird/bin/gpre_boot
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/powerpc-linux-gnu/libthread_db.so.1".
gpre: no source file named.

Program received signal SIGSEGV, Segmentation fault.
0x0fa79960 in _IO_wsetb () from /lib/powerpc-linux-gnu/libc.so.6
(gdb) bt
#0 0x0fa79960 in _IO_wsetb () from /lib/powerpc-linux-gnu/libc.so.6
#1 0x0fa88dac in ?? () from /lib/powerpc-linux-gnu/libc.so.6
#2 0x0fa3cd58 in ?? () from /lib/powerpc-linux-gnu/libc.so.6
#3 0x0fa3ce30 in exit () from /lib/powerpc-linux-gnu/libc.so.6
#4 0x10027f28 in CPR_exit (stat=263831632) at ./src/gpre/gpre.cpp:978
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

Has anyone reported this to glibc upstream yet?

Adrian

> [1] https://buildd.debian.org/status/package.php?p=firebird3.0&suite=sid

John Paul Adrian Glaubitz (glaubitz) wrote :

OK, the issue with firebird3.0 in Debian has been resolved, see [1]. Also, this isn't actually a toolchain or glibc regression; the behavior is by design, see [2].

To cut a long story short, when using version scripts, an application **must** always export the _IO_stdin_used symbol as it is used by glibc to determine the libio ABI.

Adrian

> [1] https://bugs.debian.org/840666
> [2] http://lists.gnu.org/archive/html/bug-glibc/2001-12/msg00203.html

Steve Langasek (vorlon) wrote :

Thanks for the upstream reference, closing the glibc task as invalid.

Changed in glibc (Ubuntu):
status: New → Invalid