gcc-arm-none-eabi performance with redundant instructions

Asked by randy

Hi all,
I am trying to use gcc-arm-none-eabi, but I find that it always emits redundant instructions compared with other compilers such as armcc or aarch64-linux-android-gcc. On average, code compiled with gcc-arm-none-eabi 4.9 is more than twice the size of code compiled with the other compilers.
For instance, the following C code:

typedef struct list_data_s {
 signed short data16;
 signed short idx;
} list_data;

void copy_info(list_data *to,list_data *from) {
is disassembled with gcc-arm-none-eabi as:
00000db8 <copy_info>:
     db8: b480 push {r7}
     dba: b083 sub sp, #12
     dbc: af00 add r7, sp, #0
     dbe: 6078 str r0, [r7, #4]
     dc0: 6039 str r1, [r7, #0]
     dc2: 683b ldr r3, [r7, #0]
     dc4: 881a ldrh r2, [r3, #0]
     dc6: 687b ldr r3, [r7, #4]
     dc8: 801a strh r2, [r3, #0]
     dca: 683b ldr r3, [r7, #0]
     dcc: 885a ldrh r2, [r3, #2]
     dce: 687b ldr r3, [r7, #4]
     dd0: 805a strh r2, [r3, #2]
     dd2: 370c adds r7, #12
     dd4: 46bd mov sp, r7
     dd6: f85d 7b04 ldr.w r7, [sp], #4
     dda: 4770 bx lr

while compiled with armcc as:
        0x000002d4: 880a .. LDRH r2,[r1,#0]
        0x000002d6: 8002 .. STRH r2,[r0,#0]
        0x000002d8: 8849 I. LDRH r1,[r1,#2]
        0x000002da: 8041 A. STRH r1,[r0,#2]
        0x000002dc: 4770 pG BX lr

and compiled with aarch64-linux-android-gcc as:
00000000 <copy_info>:
   0: e1d120b0 ldrh r2, [r1]
   4: e1d130b2 ldrh r3, [r1, #2]
   8: e52db004 push {fp} ; (str fp, [sp, #-4]!)
   c: e28db000 add fp, sp, #0
  10: e1c020b0 strh r2, [r0]
  14: e1c030b2 strh r3, [r0, #2]
  18: e24bd000 sub sp, fp, #0
  1c: e49db004 pop {fp} ; (ldr fp, [sp], #4)
  20: e12fff1e bx lr

I use compile options like: -g -flto -ffunction-sections -fdata-sections -fno-inline --specs=nano.specs --specs=nosys.specs -mthumb -mcpu=cortex-m3
I don't know whether I am using unreasonable compile options or whether something is wrong with this compiler. Does anyone have a suggestion?
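
The body of copy_info is cut off in the listing above. Judging only from the two halfword loads and two halfword stores in each disassembly, it presumably copies the two fields one at a time. The sketch below spells out that assumption; it is not the original source, and check_copy_info is a hypothetical helper added purely for illustration.

```c
#include <assert.h>

typedef struct list_data_s {
    signed short data16;
    signed short idx;
} list_data;

/* ASSUMED body: inferred from the ldrh/strh pairs at offsets 0 and 2 in the
 * disassembly, not taken from the original post. */
void copy_info(list_data *to, list_data *from) {
    to->data16 = from->data16;  /* halfword at offset 0 */
    to->idx = from->idx;        /* halfword at offset 2 */
}

/* Hypothetical self-check: copies one record and verifies both fields. */
void check_copy_info(void) {
    list_data src = { 1234, -5 };
    list_data dst = { 0, 0 };
    copy_info(&dst, &src);
    assert(dst.data16 == 1234);
    assert(dst.idx == -5);
}
```

If this guess is right, building it with -mthumb -mcpu=cortex-m3 and no -O option should give code shaped like the long listing, because GCC defaults to -O0 and at that level spills both pointer arguments to the stack and reloads them before each access.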

Question information

English
GNU Arm Embedded Toolchain
No assignee

Thomas Preud'homme (thomas-preudhomme) said :

Hi Randy,

The testcase you provided is not complete: it does not contain the definition of the list_data type. However, I have two recommendations:

1) Try adding -Os when compiling
2) Try with the 2016Q2 release of GNU ARM Embedded Toolchain

Best regards,


randy (qiuxiaoyu) said :

Hi Thomas,
Thanks for your reply.
Sorry for my mistake. The list_data type is defined as:
typedef struct list_data_s {
 signed short data16;
 signed short idx;
} list_data;

randy (qiuxiaoyu) said :

Hi Thomas,
I tried -Os with the "gcc-arm-none-eabi-5_4-2016q2" release, but nothing changed.

randy (qiuxiaoyu) said :

It seems that the problem has been solved.
I had been using -o2/-os as the compile option; today I used -O2/-Os and the output is OK:
00000bf6 <copy_info>:
     bf6: f9b1 3000 ldrsh.w r3, [r1]
     bfa: f9b1 1002 ldrsh.w r1, [r1, #2]
     bfe: 8003 strh r3, [r0, #0]
     c00: 8041 strh r1, [r0, #2]
     c02: 4770 bx lr

So, what's the fxxxing difference between lowercase 'o' and uppercase 'O'?
If the lowercase 'o' is useless, why doesn't the compiler warn us?
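
For readers who land here with the same question: the two letters are separate GCC options. -O2 and -Os select an optimization level, whereas -o names the output file, and GCC accepts the file name joined directly to the flag. So -o2 is a valid request to write the output to a file named "2". Nothing is malformed, so no warning is issued, and the build stays at the default -O0, which spills every argument to the stack as in the first listing. One way to catch the slip, sketched below, is to test GCC's predefined __OPTIMIZE__ and __OPTIMIZE_SIZE__ macros. The names REQUIRE_OPTIMIZED_BUILD, optimization_mode and optimization_mode_name are hypothetical and not from this thread.

```c
#include <assert.h>

/* Hypothetical opt-in switch: build with -DREQUIRE_OPTIMIZED_BUILD to turn a
 * missing -O flag into a compile error instead of silently larger code. */
#if defined(REQUIRE_OPTIMIZED_BUILD) && !defined(__OPTIMIZE__)
#error "Built without optimization: was -o2/-os typed instead of -O2/-Os?"
#endif

enum { OPT_NONE = 0, OPT_ON = 1, OPT_SIZE = 2 };

/* Reports how this translation unit was compiled, using GCC's predefined
 * macros: __OPTIMIZE_SIZE__ is set by -Os, and __OPTIMIZE__ is set whenever
 * any optimization level is enabled. */
int optimization_mode(void) {
#if defined(__OPTIMIZE_SIZE__)
    return OPT_SIZE;
#elif defined(__OPTIMIZE__)
    return OPT_ON;
#else
    return OPT_NONE;  /* -O0, which is also what a mistyped -o2 leaves in place */
#endif
}

/* Maps a mode value from optimization_mode() to a printable label. */
const char *optimization_mode_name(int mode) {
    static const char *const names[] = {
        "none (-O0)", "optimized", "optimized for size (-Os)"
    };
    assert(mode >= OPT_NONE && mode <= OPT_SIZE);
    return names[mode];
}
```

Printing optimization_mode_name(optimization_mode()) once at startup, or defining the hypothetical REQUIRE_OPTIMIZED_BUILD macro in release builds, would have flagged the lowercase flag immediately instead of leaving it to be found in the disassembly.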