Optimization for size corrupt the stack

Asked by Brainless

Currently I'm developing firmware for STM32F105. Everything worked fine, until I enabled optimization for size.
I traced the problem and came to conclusion, that the stack get corrupted by one concrete function. At the end of the function there is a call to another function. Please see the disassembly:
.......................................................................................................
0800291a: adds r2, #12
0800291c: movs r3, #64 ; 0x40
738 }
0800291e: ldmia.w sp!, {r4, r5, r6, r7, r8, lr}
730 DCD_EP_Tx (pdev, 0x80 | epnum, (uint8_t *)(&CanRxReportBuffers[BufferIndex].ReportBuffer[CanRxReportBuffers[BufferIndex].ReadPointer]), 64);
08002922: b.w 0x8002dc4 <DCD_EP_Tx>
734 release_mutex(&HID_IN_EP_Mutexes[BufferIndex]);
08002926: mov r0, r6
0800292c: b.w 0x8002566 <release_mutex>
08002930: ldmia.w sp!, {r4, r5, r6, r7, r8, pc}
......................................................................................................
I omitted large part as it is not relevant for the moment. You can see, the branch to function DCD_EP_Tx(...) is OK, as the stack is properly cleaned up. But the branch to function release_mutex(...) is not correct, because the stack is leaved without popping previously stored values of the registers.
Is this a bug?
(p.s.: Sorry for my English.)

Question information

Language:
English Edit question
Status:
Solved
For:
GNU Arm Embedded Toolchain Edit question
Assignee:
No assignee Edit question
Solved by:
Brainless
Solved:
Last query:
Last reply:
Revision history for this message
Terry Guo (terry.guo) said :
#1

Which registers do you mean for "because the stack is leaved without popping previously stored values of the registers"? Looks to me that function release_mutex would be a very small function and doesn't use any caller saved registers, then I think we don't need to restore them after the call to release_mutex.

Revision history for this message
Brainless (peterdonchev) said :
#2

I mean r4, r5, r6, r7, r8. Their values will remain in the stack, because release_mutex will return directly to previous function (not to this function).

Revision history for this message
Terry Guo (terry.guo) said :
#3

Sorry that I am still not clear of your question. Would you please give some pseudo code to demonstrate your expected code sequence?

Revision history for this message
Brainless (peterdonchev) said :
#4

I will post code of the functions later. To work everything correct, call of the release_mutex function must be with "branch with link" instruction, so after returning from it, the instruction " ldmia.w sp!, {r4, r5, r6, r7, r8, pc}" will be executed to clean up the stack.
In current situation release_mutex will return to the previous function, not to the caller(this) function.

Revision history for this message
Terry Guo (terry.guo) said :
#5

Below code sequence is extracted from your description:

0800291e: ldmia.w sp!, {r4, r5, r6, r7, r8, lr}
08002922: b.w 0x8002dc4 <DCD_EP_Tx>
08002926: mov r0, r6
0800292c: b.w 0x8002566 <release_mutex>
08002930: ldmia.w sp!, {r4, r5, r6, r7, r8, pc}

My understanding is that after the execution of instruction at 91e, the register lr should hold value 08002926, then we can call 'bx lr' in function DCD_EP_Tx to return. Right after the return from DCD_EP_Tx, we should continue to update the lr to value 08002930, then it is possible to call 'bx lr' in release_mutex and correctly return.

If we don't correctly update the lr register, then the b.w should be replaced with blr just like what you said.

Is my understanding correct?

Revision history for this message
Brainless (peterdonchev) said :
#6

DCD_EP_Tx is called on exit from this function. Rigister lr must contain address for return into the previous function.
Function release_mutex is called on alternative exit from this function, so instruction at 291e is not executed before calling release_mutex.
Here is the full source of this function. I hope now will be clear.
...............................................................................................................................................
void TryToStartUSBTx(void *pdev, uint8_t epnum)
{
 uint8_t BufferIndex = (epnum & 0x07) - 1;

 if (((USB_OTG_CORE_HANDLE*)pdev)->dev.device_status == USB_OTG_CONFIGURED )
 {
  if(acquire_mutex(&HID_IN_EP_Mutexes[BufferIndex]))
  {
   // This check must be done, after acquiring the mutex, to prevent race condition
   if((Usb2Can_STATUS == USB2CAN_STATUS_ACTIVE)&&(CanRxReportBuffers[BufferIndex].ReportCount > 0))
   {
    if(CanMonitorReportBuffer.ReportCount > 0)
     CanRxReportBuffers[BufferIndex].ReportBuffer[CanRxReportBuffers[BufferIndex].ReadPointer].Usb2CanState.MonitorBufferNotEmpty = 1;
    else
     CanRxReportBuffers[BufferIndex].ReportBuffer[CanRxReportBuffers[BufferIndex].ReadPointer].Usb2CanState.MonitorBufferNotEmpty = 0;

    DCD_EP_Tx (pdev, 0x80 | epnum, (uint8_t *)(&CanRxReportBuffers[BufferIndex].ReportBuffer[CanRxReportBuffers[BufferIndex].ReadPointer]), 64);
   }
   else
   {
    release_mutex(&HID_IN_EP_Mutexes[BufferIndex]);
   }
  }
 }
}
...............................................................................................................................................

Revision history for this message
Terry Guo (terry.guo) said :
#7

Thanks. Now I am clear of the issue. The assembly code is incorrect. But I am unable to reproduce it. Is there any way that you can help us to reproduce it?

Revision history for this message
Brainless (peterdonchev) said :
#8

I will work on that tonight, and hope I will make simpler code to reproduce it.

Revision history for this message
Brainless (peterdonchev) said :
#9

Hi Terry,
I found the problem and this is not the compiler. In fact, the debugger misled me. The proper instruction is there, simply the debugger not showing it.
08002926: mov r0, r6
0800292c: b.w 0x8002566 <release_mutex>
As you can see, there is a gap between the two addresses.
I found and what corrupt the stack. It is my function with inline assembly code. Improper addressing of char variable (I guess alignment is changed, when optimization is enabled).
Anyway, I'm sorry I wasted your time and thank you for your help.