armcc vs gcc-arm-none-eabi performance
I want to see the performance difference between armcc and gcc, so I got a benchmark source project to test.
But the result gave me a big surprise: for the same code, the gcc build runs 2.9 times slower than the armcc build.
I don't know whether there is something wrong with my benchmark C code or my compile options, so I list some information here. Any suggestion is appreciated:
Test platform: FPGA, cpu,bus: 80M
armcc (O3 Otime)   0xeb1    0x92E2     0x1a8   0xbb    14.2k
gcc   (O3 Otime)   0x2a9f   0x1AA2C0   0x3ab   0x3c8   17.1k
The compile options I use for gcc look like: -g -flto -ffunction-sections -fdata-sections -fno-inline --specs=
update 2016/07/05:
I found something: when converting a 32-bit value to 16 bits, gcc emits an unnecessary uxth:
160e: 2001 movs r0, #1
1610: f001 fad4 bl 2bbc <get_seed_32>
1614: 4603 mov r3, r0
1616: b29b uxth r3, r3
1618: f8a7 37d4 strh.w r3, [r7, #2004] ; 0x7d4
while the armcc output looks like:
0x0000066c: 2001 . MOVS r0,#1
0x0000066e: f000fdd7 .... BL get_seed_32 ; 0x1220
0x00000672: f8ad07d8 .... STRH r0,[sp,#0x7d8]
armcc saves 2 instructions.
So, can I do anything to avoid this with some compiler option?
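For reference, here is a minimal C sketch of the kind of source that could produce the sequence above. It is an assumption, not the actual benchmark code: the body of get_seed_32 and the helpers seed_low16, store_halfword and check_uxth_is_redundant are made up for illustration. It shows why the uxth is redundant: strh only writes the low 16 bits of the register, so zero-extending first changes nothing in memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the benchmark's get_seed_32();
   the real function body is not shown in this post. */
int32_t get_seed_32(int i)
{
    return (int32_t)0x12345678 + i;
}

/* Narrowing the 32-bit result to 16 bits keeps only the low halfword.
   This is the conversion for which gcc emitted uxth. */
uint16_t seed_low16(int i)
{
    return (uint16_t)get_seed_32(i);
}

/* Hypothetical model of STRH: only the low 16 bits of the register
   reach memory. */
uint16_t store_halfword(uint32_t reg)
{
    return (uint16_t)(reg & 0xFFFFu);
}

/* Self-check: storing with or without a prior zero-extension
   writes the same halfword. */
void check_uxth_is_redundant(void)
{
    uint32_t r0 = (uint32_t)get_seed_32(1);
    uint32_t r3 = r0 & 0xFFFFu;   /* effect of UXTH r3, r3 */
    assert(store_halfword(r3) == store_halfword(r0));
}
```

One observation, offered as a guess: copying the return value through r3 and addressing locals through r7 is typical of unoptimized gcc output, so it may be worth checking that -O3 is really applied to this file.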
update 2016/07/06:
More unnecessary instructions with gcc:
18d6: f7ff f941 bl b5c <glbcnt_read>
18da: 4603 mov r3, r0
18dc: 460c mov r4, r1
18de: 4a57 ldr r2, [pc, #348] ; (1a3c <main+0x5d8>)
18e0: e9c2 3400 strd r3, r4, [r2]
while armcc output like the following:
0x000007fa: f7fffcdd .... BL glbcnt_read ; 0x1b8
0x000007fe: 465d ]F MOV r5,r11
0x00000800: e9cb0102 .... STRD r0,r1,[r11,#8]
It seems that moving the value from r0 to r3 and from r1 to r4 is unnecessary. Is this a bug?
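A minimal C sketch of source that could lead to this pattern, assuming glbcnt_read returns a 64-bit counter that is stored to a global. The function body, the global last_count, and the helpers sample_counter and check_sample_counter are hypothetical. Under the ARM procedure call standard a 64-bit return value arrives in the r0/r1 pair, so a single strd from r0, r1 (as armcc does) is enough to store it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for glbcnt_read(); the real one presumably
   reads a hardware counter. */
uint64_t glbcnt_read(void)
{
    return 0x0000000123456789ULL;
}

/* Hypothetical global that receives the 64-bit result. */
uint64_t last_count;

/* The call plus store: the result comes back in r0 (low word) and
   r1 (high word) and can be written directly with one STRD. */
void sample_counter(void)
{
    last_count = glbcnt_read();
}

/* Self-check that both words of the value are stored. */
void check_sample_counter(void)
{
    sample_counter();
    assert((uint32_t)last_count == 0x23456789u);
    assert((uint32_t)(last_count >> 32) == 0x1u);
}
```

Compiling a small file like this with each compiler and comparing the disassembly would show whether the extra mov instructions come from the compiler itself or from the options in use.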
Question information
- Language: English
- Status: Solved
- Assignee: No assignee
- Solved by: randy