LTO overflows with 4.9 but succeeds with 4.8

Asked by Uwe Bonnes

Hello,

Compiling the airspy firmware from
https://github.com/airspy/firmware.git
succeeds with GCC 4.8 but fails with 4.9:
/usr/local/cross/gcc-arm-none-eabi-4_9-2014q4/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: airspy_m0.elf section `.bss' will not fit in region `ram_ahb2'
/usr/local/cross/gcc-arm-none-eabi-4_9-2014q4/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: region `ram_ahb2' overflowed by 1476 bytes

This is probably due to the use of LTO.

When reproducing this, I also had some problems with the makefile, which tries to generate a version tag with Python. Using the system python-GitPython package on openSUSE instead of the proposed "python -m pip install GitPython" download solved that obstacle.

Question information

Language: English
Status: Answered
For: GNU Arm Embedded Toolchain
Assignee: Terry Guo

Terry Guo (terry.guo) said :
#1

I can reproduce this.

Terry Guo (terry.guo) said :
#2

The link step in the project Makefile also needs the options "-flto -ffunction-sections -fdata-sections" in order to use LTO properly.

This project has a big array with a large alignment requirement:

/* - must be aligned on 64-byte boundaries. */
typedef struct {
        volatile uint32_t capabilities;
        volatile usb_transfer_descriptor_t *current_dtd_pointer;
        volatile usb_transfer_descriptor_t *next_dtd_pointer;
        volatile uint32_t total_bytes;
        volatile uint32_t buffer_pointer_page[5];
        volatile uint32_t _reserved_0;
        volatile uint8_t setup[8];
        volatile uint32_t _reserved_1[4];
} usb_queue_head_t;

usb_queue_head_t usb_qh[12] ATTR_ALIGNED(2048);

The final map file from the LTO build suggests there are problems handling the above array in the bss section:

.bss.set_freq_params
                0x000000002000f004 0x4 /tmp/ccgcenLk.ltrans0.ltrans.o
                0x000000002000f004 set_freq_params
 *fill* 0x000000002000f008 0x7f8
 .bss.usb_qh 0x000000002000f800 0x800 /tmp/ccgcenLk.ltrans0.ltrans.o
                0x000000002000f800 usb_qh
 .bss.vendor_request_handler
                0x0000000020010000 0x64 /tmp/ccgcenLk.ltrans0.ltrans.o

1). To meet the large alignment requirement, a rather large gap of 0x7f8 bytes is wasted. This is not fatal; it is just something I think we could improve in the linker. Maybe the linker could rearrange those blocks to reduce the waste.

2). The wrong size of usb_qh, 0x800, causes the overflow. The size of usb_qh should be 0x300 (768 bytes in decimal). I think LTO must be confused by the large alignment requirement. I will dig into this further.
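
The 0x300 versus 0x800 figures can be checked with a small host-side sketch. It is only a model, not code from the project: `qh_model_t`, `qh_array_size` and `qh_padded_size` are hypothetical names, and the two pointer members are replaced by `uint32_t` on the assumption of a 32-bit target, so that the layout is the same on a 64-bit host. The rounding in `qh_padded_size` only shows that 0x800 equals the array size rounded up to the 2048-byte alignment; it does not explain why the tools report that size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of usb_queue_head_t. The two pointer members are
 * replaced by uint32_t (an assumption) so the layout matches a 32-bit
 * target even when this is compiled on a 64-bit host. */
typedef struct {
    uint32_t capabilities;
    uint32_t current_dtd_pointer;    /* 32-bit pointer on the target */
    uint32_t next_dtd_pointer;       /* 32-bit pointer on the target */
    uint32_t total_bytes;
    uint32_t buffer_pointer_page[5];
    uint32_t _reserved_0;
    uint8_t  setup[8];
    uint32_t _reserved_1[4];
} qh_model_t;

/* One queue head is 64 bytes, matching the 64-byte comment in the source. */
static_assert(sizeof(qh_model_t) == 64, "queue head model must be 64 bytes");

enum { QH_COUNT = 12, QH_ALIGN = 2048 };

/* Bytes the array itself needs: 12 * 64 = 768 = 0x300. */
static size_t qh_array_size(void)
{
    return QH_COUNT * sizeof(qh_model_t);
}

/* Array size rounded up to a multiple of the alignment: 0x800. */
static size_t qh_padded_size(void)
{
    size_t a = QH_ALIGN;
    return (qh_array_size() + a - 1) / a * a;
}
```

The difference between the two values is 0x500 (1280) bytes, which is in the same range as the 1476-byte overflow reported by the linker.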

Uwe Bonnes (bon) said :
#3

Thanks for taking care!

As I am only a user of the above project, please let me know if there is any need for me to communicate with the author.

Terry Guo (terry.guo) said :
#4

Since the array below requires a large alignment, we sometimes have to waste some bytes to meet the alignment requirement.

usb_queue_head_t usb_qh[12] ATTR_ALIGNED(2048);

If the preceding section ends at a well-aligned address like 0xf000, we can put this array right after it with no waste at all. But if the preceding section ends at a badly aligned address like 0xf001, we have to waste bytes to reach an aligned address.
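
The amount of fill can be worked out with a few lines of C. This is a minimal sketch of the arithmetic only; `align_up` and `fill_before` are hypothetical helper names, not part of the project or the toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
 * align must be a non-zero power of two. */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Bytes of fill needed between the end of the preceding section
 * and an object that needs the given alignment. */
static uint32_t fill_before(uint32_t prev_end, uint32_t align)
{
    return align_up(prev_end, align) - prev_end;
}
```

With a 2048-byte alignment, `fill_before(0xf000, 2048)` is 0 and `fill_before(0xf001, 2048)` is 0x7ff. For the map file above, `fill_before(0x2000f008, 2048)` gives the 0x7f8 bytes shown in the `*fill*` line.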

Without -fdata-sections, the array usb_qh is put into the bss section as a block and its size stays at 0x300 bytes. With -fdata-sections, usb_qh is put into a separate section and gas promotes the section size from 0x300 to 0x800 bytes. I still don't know the reason.

IMHO the linker is not good at reorganizing sections with different alignment requirements to minimize the waste. So for your case, please do the two things below to help the linker and avoid the overflow:

1) The link step in the project Makefile also needs the options "-flto -ffunction-sections -fdata-sections" in order to use LTO properly.

2) Do not comment out the line below; we do need it.
//__attribute__ ((section(".bss_aligned")))
usb_queue_head_t usb_qh[12] ATTR_ALIGNED(2048);

Bill Dittmann (bill-dittmann) said :
#5

Note that you can minimize the memory wasted by the linker by telling it to sort sections by alignment in your .ld file, as shown below:

  *(SORT_BY_ALIGNMENT(.bss*))

or by passing this option to GCC:

-Wl,--sort-section=alignment
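
To illustrate why placing the largest alignment first reduces padding, here is a small host-side simulation. It is a sketch under stated assumptions, not a model of the linker's real algorithm: `section_t`, `layout_fill` and `layout_fill_sorted` are hypothetical names, the sizes are taken from the map file in this thread, and the 4-byte alignment of the two small sections is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Size and alignment of one input section (alignment is a power of two). */
typedef struct {
    uint32_t size;
    uint32_t align;
} section_t;

/* Place the sections one after another starting at base, in the given
 * order, and return the total number of fill bytes that were needed. */
static uint32_t layout_fill(uint32_t base, const section_t *s, size_t n)
{
    uint32_t addr = base, fill = 0;
    for (size_t i = 0; i < n; i++) {
        assert(s[i].align != 0 && (s[i].align & (s[i].align - 1)) == 0);
        uint32_t start = (addr + s[i].align - 1) & ~(s[i].align - 1);
        fill += start - addr;
        addr = start + s[i].size;
    }
    return fill;
}

/* qsort comparator: larger alignment first (descending order). */
static int by_align_desc(const void *a, const void *b)
{
    uint32_t x = ((const section_t *)a)->align;
    uint32_t y = ((const section_t *)b)->align;
    return (x < y) - (x > y);
}

/* Sort the sections by descending alignment in place, then lay them out. */
static uint32_t layout_fill_sorted(uint32_t base, section_t *s, size_t n)
{
    qsort(s, n, sizeof *s, by_align_desc);
    return layout_fill(base, s, n);
}
```

For the three sections `{4, 4}`, `{0x800, 0x800}`, `{0x64, 4}` starting at 0x2000e800, the unsorted order needs 0x7fc bytes of fill, while the sorted order needs none because the 2048-byte aligned section lands on the already aligned start address.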

Terry Guo (terry.guo) said :
#6

I did try this option, but it caused another issue.

Terry Guo (terry.guo) said :
#7

Here is the output with the option:

arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -DTHUMB -flto -ffunction-sections -fdata-sections -L../common -L../libopencm3/lib -L../libopencm3/lib/lpc43xx_m0 -T../common/LPC4370_M0_ram_only.ld -nostartfiles -lc -lnosys -Wl,--gc-sections -Xlinker -Map=airspy_m0.map -Wl,--sort-section=alignment -Wl,--print-gc-sections -o airspy_m0.elf airspy_m0.o airspy_rx.o airspy_usb_req.o usb_descriptor.o usb_device.o usb_endpoint.o airspy_m0_conf.o airspy_core.o fault_handler.o si5351c.o r820t.o r820t_conf.o w25q80bv.o rom_iap.o signal_mcu.o usb.o usb_queue.o usb_request.o usb_standard_request.o -lopencm3_lpc43xx_m0
/usr/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: Removing unused section '.text.ssp_disable' in file '../libopencm3/lib/libopencm3_lpc43xx_m0.a(ssp.o)'
/usr/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: Removing unused section '.text.nvic_disable_irq' in file '../libopencm3/lib/libopencm3_lpc43xx_m0.a(nvic.o)'
/usr/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: Removing unused section '.text.nvic_get_pending_irq' in file '../libopencm3/lib/libopencm3_lpc43xx_m0.a(nvic.o)'
/usr/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: Removing unused section '.text.nvic_set_pending_irq' in file '../libopencm3/lib/libopencm3_lpc43xx_m0.a(nvic.o)'
/usr/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: Removing unused section '.text.nvic_clear_pending_irq' in file '../libopencm3/lib/libopencm3_lpc43xx_m0.a(nvic.o)'
/usr/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: Removing unused section '.text.nvic_get_irq_enabled' in file '../libopencm3/lib/libopencm3_lpc43xx_m0.a(nvic.o)'
/usr/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: Removing unused section '.text.usb0_isr' in file '/tmp/ccHMeRIb.ltrans1.ltrans.o'
/usr/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/bin/ld: Removing unused section '.text.m4core_isr' in file '/tmp/ccHMeRIb.ltrans2.ltrans.o'
/tmp/ccHMeRIb.ltrans0.ltrans.o: In function `hard_fault_handler':
ccHMeRIb.ltrans0.o:(.text.hard_fault_handler+0xc): relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `hard_fault_handler_c' defined in .text.hard_fault_handler_c section in /tmp/ccHMeRIb.ltrans0.ltrans.o
/tmp/ccHMeRIb.ltrans0.ltrans.o: In function `_MSP':
ccHMeRIb.ltrans0.o:(.text.hard_fault_handler+0x12): relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `hard_fault_handler_c' defined in .text.hard_fault_handler_c section in /tmp/ccHMeRIb.ltrans0.ltrans.o
collect2: error: ld returned 1 exit status
make[1]: *** [airspy_m0.elf] Error 1
make[1]: Leaving directory `/work/terguo01/tasks/lp-262160/firmware/airspy_m0'
make: *** [airspy_fw] Error 2

Terry Guo (terry.guo) said :
#8

But this method works:

 *(SORT_BY_ALIGNMENT(.bss*))

The section with the large alignment requirement is placed at the beginning of the bss section, which is already aligned, so nothing is wasted.

.bss 0x000000002000e800 0xed8
                0x000000002000e800 _bss = .
 *(.bss_aligned*)
 *(SORT(.bss*))
 .bss.usb_qh 0x000000002000e800 0x800 /tmp/ccYN2Hij.ltrans0.ltrans.o
                0x000000002000e800 usb_qh
 .bss.usb_endpoint_bulk_in_transfers
                0x000000002000f000 0x80 /tmp/ccYN2Hij.ltrans2.ltrans.o
 .bss.usb_endpoint_control_out_transfers
                0x000000002000f080 0x200 /tmp/ccYN2Hij.ltrans2.ltrans.o

What I don't understand is that the size of usb_qh is promoted to 0x800 while its actual size is 0x300.

Uwe Bonnes (bon) said :
#9

The author still recommends gcc-4.7 and has seen several problems with 4.8 and 4.9.

I will forward the tip with *(SORT_BY_ALIGNMENT(.bss*)).
