Possible optimization bug with lambda and function returns

Asked by Baptiste Cartier on 2019-08-12

When a function or a lambda is called to build a function argument, the stack space used for the resulting temporary is not reused by later calls.
The following code reproduces the problem. Is there anything I missed, or is this a bug in ARM GCC?


#include <stdint.h>
#include <stdio.h>

    struct TestStruct
    {
        uint32_t field1;
        uint32_t field2;
        uint32_t field3;
        uint32_t field4;

        uint32_t field11;
        uint32_t field21;
        uint32_t field31;
        uint32_t field41;
    };

    struct TestStruct initStructure(uint32_t f1, uint32_t f2, uint32_t f3, uint32_t f4, uint32_t f11, uint32_t f21, uint32_t f31, uint32_t f41)
    {
        struct TestStruct myStruct;
        myStruct.field1 = f1;
        myStruct.field2 = f2;
        myStruct.field3 = f3;
        myStruct.field4 = f4;

        myStruct.field11 = f11;
        myStruct.field21 = f21;
        myStruct.field31 = f31;
        myStruct.field41 = f41;

        return myStruct;
    }

#define MACROLAMBDA(f1,f2,f3,f4,f5,f6,f7,f8) \
    [&]() -> struct TestStruct { struct TestStruct ${}; \
    $.field1 = f1; \
    $.field2 = f2; \
    $.field3 = f3; \
    $.field4 = f4; \
    $.field11 = f5; \
    $.field21 = f6; \
    $.field31 = f7; \
    $.field41 = f8; \
    return $; \
    }()

    void __attribute__((noinline)) doStuff(struct TestStruct myStruct)
    {
        printf("f1 = %d, f2 = %d, f3 = %d, f4 = %d, f11 = %d, f21 = %d, f31 = %d, f41 = %d", myStruct.field1, myStruct.field2, myStruct.field3, myStruct.field4, myStruct.field11, myStruct.field21, myStruct.field31, myStruct.field41);
    }

    void __attribute__((noinline)) wrapper2LAMBDA(void)

    void __attribute__((noinline)) wrapper1LAMBDA(void)

    void __attribute__((noinline)) wrapper1func(void)

    void __attribute__((noinline)) wrapper2func(void)

    int main(void)
    {


        return 0;
    }

wrapper1LAMBDA uses 52 bytes of stack whereas wrapper2LAMBDA uses 80. The same happens for wrapper1func and wrapper2func.
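
For readers who want to reproduce the comparison, here is a minimal sketch of the pattern being described: one function passes a single by-value temporary to a noinline callee, and another does the same thing twice in separate statements, so the two temporaries never live at the same time. All names here (Payload, makePayload, consume, oneCall, twoCalls) are hypothetical and are not the wrapper bodies from the question, which are not shown above.

```cpp
#include <cstdint>

// Hypothetical stand-in for TestStruct: eight 32-bit fields (32 bytes).
struct Payload {
    std::uint32_t v[8];
};

// Builds a Payload by value, in the spirit of initStructure.
static Payload makePayload(std::uint32_t base) {
    Payload p{};
    for (std::uint32_t i = 0; i < 8; ++i) {
        p.v[i] = base + i;
    }
    return p;
}

// Takes the struct by value so the caller must materialise a temporary.
// noinline keeps the call (and the caller's frame) visible in the output.
__attribute__((noinline)) std::uint32_t consume(Payload p) {
    std::uint32_t total = 0;
    for (std::uint32_t i = 0; i < 8; ++i) {
        total += p.v[i];
    }
    return total;
}

// One temporary is created and passed.
__attribute__((noinline)) std::uint32_t oneCall() {
    return consume(makePayload(1));
}

// Two temporaries, created in separate statements: the first one is
// destroyed before the second exists, so they could share stack space.
__attribute__((noinline)) std::uint32_t twoCalls() {
    std::uint32_t a = consume(makePayload(1));
    std::uint32_t b = consume(makePayload(100));
    return a + b;
}
```

Compiling a file like this with GCC's -fstack-usage option writes a .su file listing the stack size of each function, which gives a direct way to compare the frames of oneCall and twoCalls. The numbers will depend on the target and optimization level, so this sketch makes no claim about what they should be.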

The __attribute__((noinline)) is there to simplify debugging and should not affect the result in this case.

The tests were compiled both with and without the following options:

-mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 -mapcs-frame -std=c++17 -D"STM32" -D"STM32L4" -D"STM32L476xx" -D"STM32L476ZG" -D"ARM_MATH_CM4" -D"USE_STM32L4XX_NUCLEO_64" -D"STD_REPLACE" -D"NO_LIBC" -Wall -Wextra -Wno-comment -Wno-ignored-qualifiers -Wno-implicit-fallthrough -Wno-missing-field-initializers -Wno-overflow -Wno-sign-compare -Wno-type-limits -Wno-unused-parameter -Wpointer-arith -save-temps=obj -c -Wno-register -x c++ -fno-rtti -fno-exceptions -fpermissive -ggdb -fsingle-precision-constant -fmessage-length=0 -fvar-tracking -O2 -ffreestanding -fdata-sections -ffunction-sections -fno-builtin -nodefaultlibs -nostartfiles -nostdlib

Question information

English
GNU Arm Embedded Toolchain
No assignee
