Arm's speculative fetch on instruction with link register causes cpu stall

Asked by Guru Prasad on 2017-07-07

Hi,

We are seeing an issue where a speculative fetch on the Arm Cortex-M7 causes a load from the address held in the link register. At this point the link register holds an invalid address, for a valid reason (see the example source/disassembly below).

In function foo() there are two return points, which map to addresses 0x10039cfa and 0x10039d00 in the disassembly; they are one instruction apart. On the code path where bar() is called from foo() (x != 0), bar() clobbers the link register, but foo() is expected to return from 0x10039cfa, which loads the pc (ldmia) rather than branching to lr (bx lr). This is reasonable and works most of the time. But sometimes, although only the ldmia instruction at 0x10039cfa is expected to execute along this path, the Cortex-M7 issues a speculative fetch for the following instructions. At this point lr does not hold a valid address, and it is not expected to (at least by the compiler). This results in the CPU stalling, because the peripheral that the address in LR corresponds to does not respond.

We are compiling with the GNU ARM Embedded toolchain, version 6-2017-q1-update. Compiling with the -ffixed-lr option stops the compiler from using LR, and the issue seems to get resolved.

The question I have is what the correct solution to this would be. Is the compiler required to know about Arm's speculative fetches, and should it maybe always generate code that does a pop/load of LR followed by a branch to LR, rather than a pop/load of PC? (This would come at the cost of an extra instruction, though.) Or should we forbid the compiler from using LR as a general purpose register?

C source:
====
void bar(void)
{
    // This is a leaf function and does not call any functions
    // It clobbers LR here by using it as a general purpose register
    // LR is pushed to stack in the prolog
    // And PC is popped from stack in the epilog
}
int foo(int x)
{
    if (x == 0)
        return ERR_CODE1;

    bar();
    return 0;
}
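
For anyone who wants to reproduce the code shape, here is a minimal compilable sketch of the source above. The value of ERR_CODE1 and the body of bar() are placeholders (the real ones are not shown in this post), and check_foo() is a hypothetical helper. The empty asm statement with an "lr" clobber is only one assumed way to make GCC treat LR as overwritten inside a leaf function on 32-bit Arm targets, which typically leads to the push-LR/pop-PC prologue and epilogue described above; on other targets it is compiled out.

```c
#include <assert.h>

/* Placeholder: the real ERR_CODE1 value is not given in the post. */
#define ERR_CODE1 (-1)

/* Volatile sink so the placeholder work in bar() is not optimized away. */
static volatile unsigned sink;

void bar(void)
{
    /* Placeholder body: the real bar() is a large leaf function. */
    unsigned acc = 0;
    for (unsigned i = 0; i < 16; ++i)
        acc = (acc | i) & 0xffu;

#if defined(__arm__)
    /* Assumption: tell GCC that LR is overwritten here, standing in for
     * the real bar() using LR as a general purpose register. */
    __asm__ volatile ("" ::: "lr");
#endif

    sink = acc;
}

int foo(int x)
{
    if (x == 0)
        return ERR_CODE1;   /* early return path (bx lr in the listing) */

    bar();
    return 0;               /* normal return path (ldmia ... pc in the listing) */
}

/* Hypothetical host-side check that both return paths behave as written. */
void check_foo(void)
{
    assert(foo(0) == ERR_CODE1);
    assert(foo(1) == 0);
}
```

Whether the compiler actually emits the two adjacent return sequences depends on the optimization level and toolchain version, so the generated code should be checked with objdump rather than assumed.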

Disassembly:
====
void bar(void)
{
10031e84: e92d 4df0 stmdb sp!, {r4, r5, r6, r7, r8, sl, fp, lr}
10031e88: b0d4 sub sp, #336 ; 0x150
...
10032526: ea4e 020a orr.w r2, lr, sl
1003252a: ea02 020c and.w r2, r2, ip
1003252e: ea0e 010a and.w r1, lr, sl
...
}
1003257c: b054 add sp, #336 ; 0x150
1003257e: e8bd 8df0 ldmia.w sp!, {r4, r5, r6, r7, r8, sl, fp, pc}
10032582: bf00 nop

10039ca8 <foo>:
int foo(int x)
{
    if(x == 0)
10039ca8: 2800 cmp r0, #0
10039caa: d028 beq.n 10039cfe <foo+0x56>
{
10039cac: e92d 41f0 stmdb sp!, {r4, r5, r6, r7, r8, lr}
...
...
...
    return 0;
10039cf6: 2000 movs r0, #0
}
10039cf8: b008 add sp, #32
10039cfa: e8bd 81f0 ldmia.w sp!, {r4, r5, r6, r7, r8, pc}
        return ERR_CODE1;
10039cfe: 4803 ldr r0, [pc, #12] ; (10039d0c <foo+0x64>)
10039d00: 4770 bx lr

regards,
guru.

Question information

Language:
English
Status:
Answered
For:
GNU ARM Embedded Toolchain
Assignee:
No assignee
Last query:
2017-07-10
Last reply:
2017-07-10

Tejas Belagod (belagod-tejas) said : #1

Hi Guru Prasad,

I don't think the compiler is doing anything wrong here as it may not have the system memory map info or the runtime address to selectively schedule for speculation. I would encourage you to contact ARM support (https://www.arm.com/support/) with any queries you have about Cortex-M7's speculation machinery.

Thanks,
Tejas.

Guru Prasad (guruprasadsm) said : #2

Hi Tejas,

Thanks for the response. I agree that this is not a compiler bug. We have already reached out to the Arm support team and are awaiting a response. I wanted to check whether this is a known issue that others have reported before, and whether there is a recommended solution for it.

regards,
guru.

Tejas Belagod (belagod-tejas) said : #3

Hi Guru Prasad,

Do you have a case reference number from ARM support?

Thanks,
Tejas.

Tejas Belagod (belagod-tejas) said : #4

Hi Guru Prasad,

Never mind, I located it in our database.

Thanks,
Tejas.
