Unexpected assembly generated with -Os

Asked by Manuel Pégourié-Gonnard

Hi,

/* BEGIN weird.c */
#include <stdint.h>

static const unsigned N = 42;

void clear( uint32_t x[N] )
{
    for( unsigned i = 0; i < N; i++ )
        x[i] = 0;
}
/* END weird.c */

% arm-none-eabi-gcc -std=c11 -march=armv7-m -mthumb weird.c -S -Os -o - | tail -n 14
clear:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        subs r3, r0, #4
        adds r0, r0, #164
.L2:
        movs r2, #0
        str r2, [r3, #4]!
        cmp r3, r0
        bne .L2
        bx lr
        .size clear, .-clear
        .ident "GCC: (GNU Tools for ARM Embedded Processors) 4.9.3 20141119 (release) [ARM/embedded-4_9-branch revision 218278]"

Observed behaviour: "movs r2, #0" is inside the loop, which hurts performance unless I'm really missing something.

Expected behaviour: this instruction should be outside the loop, yielding better performance without increasing the code size. (This is what we get with -O1 and -O2.)

Am I missing something big or is it a bug in some optimization pass enabled by -Os?

As a side note, in this case -O3 actually generates the most compact code for this function, i.e. a call to memset.
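For reference, here is a minimal sketch of the source-level equivalence behind that memset call. The names clear_loop, clear_memset and check_clear_equivalence are made up for this illustration and are not part of weird.c, and N is a macro here rather than a static const so that it can size the test arrays. It only shows that the two forms clear the same 42 * 4 = 168 bytes; it says nothing about what any particular GCC version emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N 42u /* same element count as weird.c (a macro here, by assumption) */

/* Loop form, equivalent to clear() in weird.c. */
void clear_loop(uint32_t *x)
{
    for (unsigned i = 0; i < N; i++)
        x[i] = 0;
}

/* memset form: zeroes the same N * sizeof(uint32_t) bytes. */
void clear_memset(uint32_t *x)
{
    memset(x, 0, N * sizeof x[0]);
}

/* Hypothetical self-check: both forms must leave identical buffers. */
void check_clear_equivalence(void)
{
    uint32_t a[N], b[N];

    /* Fill both buffers with a non-zero pattern first. */
    for (unsigned i = 0; i < N; i++)
        a[i] = b[i] = 0xdeadbeefu;

    clear_loop(a);
    clear_memset(b);

    assert(memcmp(a, b, sizeof a) == 0);
    for (unsigned i = 0; i < N; i++)
        assert(a[i] == 0);
}
```

Calling check_clear_equivalence() from a host-side main() is enough to run the check; the asserts fire only if the two forms disagree.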

Thanks,
Manuel.

Question information

Language: English
Status: Answered
For: GNU Arm Embedded Toolchain
Assignee: No assignee
Thomas Preud'homme (thomas-preudhomme) said :
#1

Hi Manuel,

I can reproduce the bug with our 4.9 toolchain. The good news is that it cannot be reproduced with the upcoming GCC 5. I will track down which commit led to this change, to make sure the bug is really fixed and not just hidden by luck through some weird interaction. My guess is that it was really fixed.

I will update this bug when I find out.

Best regards.

Tony Liu (mrtoniliu) said :
#2

Hi Manuel,

Could you try adding -fira-loop-pressure when compiling? It should solve this problem.

Regards,
Tony
