malloc: out of memory issue

Asked by Emeric POUPON on 2013-01-11

Hello,

I experience a very annoying malloc problem with my STM32F103 (Cortex M3).
Here is my software setup:
- gcc-arm-none-eabi-4_7-2012q4-20121208-linux.tar.bz2
- linker script based on the gcc-arm-embedded examples (contains text, data, bss, heap and dummy_stack sections). I use the toolchain libc.
- startup file based on gcc-arm-embedded, plus additional peripheral interrupt handler definitions

If I make too many malloc calls (say, 1000 calls of 400 bytes each), the system eventually ends up in Hardfault_Handler, instead of the malloc calls just returning 0.
I think the _sbrk used is in the newlib file "mallocr.c". Tracing the execution shows that the _sbrk function always returns a growing value for the end of the heap. It doesn't seem to successfully check for stack/heap overlap. So the end of the heap grows and goes above the stack: hardfault!
(BTW, why is the sp register used in this check, and not the StackLimit?)

Can you please tell me whether it works on your device, so that I know whether I have done something wrong in my linker script or elsewhere?

Best Regards

Question information

Language: English
Status: Answered
For: GNU Arm Embedded Toolchain
Assignee: No assignee
Last query: 2013-01-25
Last reply: 2013-05-08

Joey Ye (jinyun-ye) said : #1

It looks like the stack runs into the heap after a successful sbrk. Bare-metal
GCC has no efficient way of detecting a stack overrun into the heap, so it
doesn't matter whether or not the heap stops growing before a predefined
limit. I think that helps explain why a heap limit is not used in newlib.

- Joey

Emeric POUPON (pepehumbert) said : #2

Yes, I agree that there is always the possibility of the stack overwriting the heap section.
But in the newlib sbrk ARM code, there *is* a check: "heap_end + incr < stack_pointer", where heap_end refers to "end" and stack_pointer refers to "sp".
At the very least it should tell the user that there is no memory available, so that the user can take appropriate action, instead of leaving the whole system in an unpredictable state (silent data corruption, silent unwanted actions, various crashes, etc.)
My question is: why is this check not working?

Joey Ye (jinyun-ye) said : #3

Can you please share your kernel code or describe the program flow in brief?

It should work if sbrk finds that the heap would overlap the stack. It won't work if there is no overlap at the time sbrk is called, but the stack later grows down into the heap.

Emeric POUPON (pepehumbert) said : #4

The test case is *very* simple, something like this:

----
int i;
for (i = 0; i < 1000; ++i) {
    char* res = malloc( 500 );
    myprintf( "res = %p\n\r", res);
}
----
Note: myprintf uses static variables and writes characters through a UART peripheral.

I played a bit with available libc/libgloss libraries and found some interesting things:

- Case 1: Linking with nosys
_sbrk makes the heap grow continuously. It ends up in a HardFault somewhere inside the malloc call.

- Case 2: Linking with rdimon
Once memory is depleted, malloc fails as expected!

- Case 3: Linking with nosys, but implementing a _sbrk function that takes stack_pointer into account (same implementation as in newlib/newlib/sys/arm/syscalls.c)
Once memory is depleted, malloc fails as expected!

What I do not understand is that there are good implementations that check for heap/stack overlap here:
newlib/newlib/sys/arm/syscalls.c
newlib/libgloss/arm/syscalls.c

However, these ones do not:
newlib/libgloss/libnosys/sbrk.c
newlib/libgloss/sbrk.c

I have the feeling that there is no generic way for newlib to detect the "out of memory" scenario, but I still do not understand why a gcc-arm-embedded toolchain, which is tuned for ARM-based systems, would not use a safe implementation of _sbrk in all cases, especially in a "nosys" configuration.

As far as I can see, the only viable solution for me is to keep using libnosys and copy/paste a correct _sbrk implementation into my project. Am I correct?

Best Regards,

Joey Ye (jinyun-ye) said : #5

Emeric,

I would expect your test case not to work. I suspect that the stack is growing down into the heap, rather than the heap growing up into the stack.

Let's assume total available memory for heap and stack is 4096 bytes.

At a certain point, say after 7 loops, about 3,500 bytes have been allocated by malloc. Now, on the 8th loop, assuming that current stack usage is only 16 bytes, malloc attempts to allocate another 500 bytes and succeeds (4000 + 16 < 4096).

Then, after malloc returns successfully, myprintf starts to run, and it usually needs a lot more stack space than the 80 bytes left available. The stack of myprintf will quietly grow down into the heap space, with no efficient mechanism to detect it. The behavior is undefined in that case.
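
The arithmetic in this scenario can be checked with a small host-side model. This is only a sketch: the function names gap_bytes and sbrk_would_succeed are hypothetical, and the model assumes that heap and stack share one contiguous region and that the check happens only at the moment sbrk is called.

```c
#include <stddef.h>

/* Bytes left between the top of the heap and the bottom of the stack.
   Returns 0 if the two already overlap. */
size_t gap_bytes(size_t total, size_t heap_used, size_t stack_used)
{
    size_t used = heap_used + stack_used;
    return used < total ? total - used : 0;
}

/* Models a check made at sbrk time only: does the request fit right now?
   It says nothing about stack growth after the call returns. */
int sbrk_would_succeed(size_t total, size_t heap_used,
                       size_t stack_used, size_t request)
{
    return request <= gap_bytes(total, heap_used, stack_used);
}
```

With the numbers above, sbrk_would_succeed(4096, 3500, 16, 500) is true, and gap_bytes(4096, 4000, 16) afterwards is 80, which is all that remains for any function called next.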

So I'd suggest changing your test case to save the return values of malloc in an array, then print the array after freeing the heap.
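
One way the suggested test could look is sketched below. The helper names alloc_and_record and free_then_print are hypothetical, and printf stands in for the myprintf used in this thread. Addresses are stored as uintptr_t so that they can still be printed after the blocks have been freed.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocate up to `count` blocks of `size` bytes and record each address.
   Stops at the first NULL. Returns the number of successful allocations. */
size_t alloc_and_record(uintptr_t *addrs, size_t count, size_t size)
{
    size_t n = 0;
    while (n < count) {
        void *p = malloc(size);
        if (p == NULL)
            break;
        addrs[n++] = (uintptr_t)p;
    }
    return n;
}

/* Free every block first, then print the recorded addresses, so that the
   stack used by printing cannot collide with a nearly full heap. */
void free_then_print(const uintptr_t *addrs, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        free((void *)addrs[i]);
    for (size_t i = 0; i < n; ++i)
        printf("res[%zu] = 0x%" PRIxPTR "\n", i, addrs[i]);
}
```

On a small target the address array should be static rather than on the stack, since a large stack array would consume the very memory the test is trying to measure.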

Thanks - Joey

Emeric POUPON (pepehumbert) said : #6

Hello,

Thanks for your support.

Please disregard my previous example and focus on what I said before:
 - the hardfault comes from malloc (only when linking against libnosys)
 - the libnosys _sbrk implementation does not seem to check for heap overflow.

The following test can help you reproduce the bug:

-----
int i;
for (i = 0; i < 10000; ++i) {
    char* res = malloc( 500 );
}
-----

Again, malloc never returns NULL; it just crashes the whole system.

Joey Ye (jinyun-ye) said : #7

Reproduced.

The sbrk implementation in libnosys has no overflow detection. Libnosys is target-independent, unlike rdimon, and overflow detection isn't very portable there. Refer to the comments in http://sourceware.org/cgi-bin/cvsweb.cgi/src/libgloss/libnosys/sbrk.c?cvsroot=src

My suggestion is to retarget _sbrk by copying the implementation from libgloss/arm/syscalls.c.
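
For illustration, here is a minimal host-side sketch of an _sbrk-style function with the bounds check that is being discussed. It is written from scratch, not copied from libgloss. The names checked_sbrk, arena, heap_base and heap_limit are hypothetical stand-ins so that it builds on a PC; on a real target the base would typically come from a linker-script symbol such as "end", and the limit from the current stack pointer, as in the check quoted earlier in this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the target's memory layout. */
static char  arena[4096];
static char *heap_base  = arena;                 /* target: linker symbol */
static char *heap_limit = arena + sizeof arena;  /* target: stack pointer */

static char *heap_end;  /* current break; initialised on first call */

/* Same contract as _sbrk: returns the previous break, or (void *)-1 with
   errno set to ENOMEM when the request does not fit. */
void *checked_sbrk(ptrdiff_t incr)
{
    if (heap_end == NULL)
        heap_end = heap_base;

    /* Compare sizes rather than pointers, so that no out-of-bounds
       pointer is ever computed. */
    if (incr > heap_limit - heap_end || incr < heap_base - heap_end) {
        errno = ENOMEM;
        return (void *)-1;
    }

    char *prev = heap_end;
    heap_end += incr;
    return prev;
}
```

With a check of this kind, malloc gets a failure from sbrk and can return NULL. It still cannot detect the stack growing down into the heap after a successful call.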

Emeric POUPON (pepehumbert) said : #8

OK, thanks for reproducing the problem!

As I asked before, how can an ARM-tuned toolchain use such "dangerous" target-independent code? Is there nothing you can do about that? For example, why not just patch libnosys for this toolchain build?

Best Regards

Joey Ye (jinyun-ye) said : #9

Thanks for reporting this. I agree with you that it is dangerous, and we are
trying to provide a better solution in the next release.

Joey Ye (jinyun-ye) said : #10

Emeric,

This issue should have been solved in the 4.7 2013q1 update. Can you please verify and mark it as solved?

Thanks - Joey
