newlib allocator - extremely inefficient use of memory

Asked by Kevin Bracey on 2019-04-25

We've been struggling a little with some tests that fail when built with the GCC + newlib combination from these tools, but work fine with other toolchains and libraries.

The problem is that newlib is not able to make efficient use of the devices' limited memory for the heap. There are a few things that make it struggle.

The first is the reliance on the `sbrk` interface, which is poorly suited for small-memory devices with potentially discontiguous memory pools.

One example we saw fail was:

* Device with two discontiguous memory regions available for heap - 16K and 60K.
* Test requests 24K allocation.
* Allocator calls `sbrk` asking for 24K.
* `sbrk` implementation is forced to advance from the 16K region to the 60K region, and returns a pointer to the start of the second region.
* Test frees the 24K block.
* Test requests a 40K block.
* Allocator calls `sbrk` asking for 40K.
* `sbrk` can't honour this: 24K of the 60K has already been handed over, so only 36K remains.
* Allocator gives up and fails. It can't honour a 40K malloc, despite all 76K of RAM being free.
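
The sequence above can be replayed with a small model. This is only a sketch: `sim_sbrk`, `replay_example`, the `region_t` type and the base addresses are all invented for illustration, and the model tracks offsets rather than touching real memory. It is not newlib's `_sbrk` or any particular port's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KIB 1024u
#define NUM_REGIONS 2u
#define SBRK_FAIL ((uintptr_t)-1)

typedef struct {
    uintptr_t base;  /* made-up start address */
    size_t size;     /* region size in bytes */
} region_t;

/* Sizes come from the example above; the addresses are invented. */
static const region_t regions[NUM_REGIONS] = {
    { 0x20000000u, 16 * KIB },
    { 0x30000000u, 60 * KIB },
};

static size_t cur_region; /* region holding the current break */
static size_t cur_used;   /* bytes already handed out from it */

/* Grow-only sbrk model: once a region is skipped it is never revisited. */
uintptr_t sim_sbrk(size_t incr)
{
    for (;;) {
        const region_t *r = &regions[cur_region];
        if (r->size - cur_used >= incr) {
            uintptr_t p = r->base + cur_used;
            cur_used += incr;
            return p;
        }
        if (cur_region + 1 >= NUM_REGIONS)
            return SBRK_FAIL;      /* nowhere left to advance to */
        cur_region++;              /* abandon the rest of this region */
        cur_used = 0;
    }
}

/* Replays the two sbrk calls from the list; returns 1 if the 40K call fails. */
int replay_example(void)
{
    cur_region = 0;
    cur_used = 0;
    /* 24K does not fit in the 16K region, so it lands at the start of the 60K one. */
    uintptr_t first = sim_sbrk(24 * KIB);
    assert(first == regions[1].base);
    /* The allocator asks for a full 40K, but only 36K is left behind the break. */
    return sim_sbrk(40 * KIB) == SBRK_FAIL;
}
```

In this model the free itself never reaches `sim_sbrk`, which matches the description: the freed 24K stays inside the allocator, and the break does not move back.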

The allocator knows how to combine contiguous `sbrk` returns, but it works on the pessimistic assumption that the next `sbrk` return may not be contiguous with its current top. It therefore always asks for the full amount needed for a new block, and never tries asking for "amount needed minus current free space at top".
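
To put numbers on that, here is a rough sketch of the two request policies. The function names are hypothetical, and the arithmetic ignores chunk headers and page rounding, so it shows the idea rather than newlib's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* What the post describes: always request the whole block from sbrk. */
size_t full_request(size_t need, size_t free_at_top)
{
    (void)free_at_top; /* ignored: the next sbrk return might not be contiguous */
    return need;
}

/* The alternative: request only the shortfall and rely on contiguity. */
size_t shortfall_request(size_t need, size_t free_at_top)
{
    return need > free_at_top ? need - free_at_top : 0;
}

/* Numbers from the example: 24K free at top, 36K left for sbrk, 40K wanted. */
void check_example(void)
{
    const size_t kib = 1024, left = 36 * kib;
    assert(full_request(40 * kib, 24 * kib) > left);       /* 40K: sbrk must fail */
    assert(shortfall_request(40 * kib, 24 * kib) <= left); /* 16K: would fit */
}
```

The shortfall policy only pays off when the new memory really is contiguous with the top; an allocator using it would still have to cope with a non-contiguous return.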

The second issue is that "sbrk can never go back". In the above case, sbrk has to skip the first 16K totally to deliver 24K. And if asked for 8K later, it can't return a pointer within that 16K region, as the allocator interprets "sbrk return < current top" as a loss of memory.

Both of the above issues could be resolved if there were a separate API to add free regions to the allocator up front, which for many embedded applications could replace `sbrk` entirely. `sbrk` doesn't make a lot of sense on simple devices, as the memory usually isn't being "requested" from the system: there are no other users for not-yet-sbrked memory. The heap has been pre-assigned "all RAM not used by stack and static data", and `sbrk` only exists as an emulation.
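
As a sketch of what such an API might look like, here is a minimal first-fit allocator that is seeded with regions instead of calling `sbrk`. Everything in it (`heap_add_region`, `heap_alloc`, `heap_free`, `block_t`) is hypothetical and not part of newlib; it ignores thread safety and strict alignment requirements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct block {
    size_t size;         /* usable bytes following this header */
    struct block *next;  /* next free block, in address order */
} block_t;

#define HEAP_ALIGN sizeof(block_t)  /* 8 or 16 bytes, a power of two */

static block_t *free_list;

static uintptr_t block_end(const block_t *b)
{
    return (uintptr_t)(b + 1) + b->size;
}

/* Insert into the address-ordered free list, merging with adjacent neighbours. */
static void insert_free(block_t *b)
{
    block_t *prev = NULL, *cur = free_list;
    while (cur && (uintptr_t)cur < (uintptr_t)b) {
        prev = cur;
        cur = cur->next;
    }
    b->next = cur;
    if (prev) prev->next = b; else free_list = b;
    if (cur && block_end(b) == (uintptr_t)cur) {    /* merge with the block after */
        b->size += sizeof(block_t) + cur->size;
        b->next = cur->next;
    }
    if (prev && block_end(prev) == (uintptr_t)b) {  /* merge with the block before */
        prev->size += sizeof(block_t) + b->size;
        prev->next = b->next;
    }
}

/* Hand a raw memory region to the allocator; called once per region at startup. */
void heap_add_region(void *base, size_t size)
{
    assert(base != NULL);
    uintptr_t mask = ~(uintptr_t)(HEAP_ALIGN - 1);
    uintptr_t start = ((uintptr_t)base + HEAP_ALIGN - 1) & mask;
    uintptr_t end = ((uintptr_t)base + size) & mask;
    if (end <= start || end - start < sizeof(block_t) + HEAP_ALIGN)
        return;                                     /* too small to be useful */
    block_t *b = (block_t *)start;
    b->size = (size_t)(end - start) - sizeof(block_t);
    insert_free(b);
}

void *heap_alloc(size_t n)
{
    if (n > SIZE_MAX - HEAP_ALIGN)
        return NULL;                                /* rounding would overflow */
    n = (n + HEAP_ALIGN - 1) & ~(HEAP_ALIGN - 1);
    if (n == 0)
        n = HEAP_ALIGN;
    for (block_t **pp = &free_list; *pp; pp = &(*pp)->next) {
        block_t *b = *pp;
        if (b->size < n)
            continue;
        if (b->size - n >= sizeof(block_t) + HEAP_ALIGN) {
            /* Split: keep the tail of this block on the free list. */
            block_t *rest = (block_t *)((uint8_t *)(b + 1) + n);
            rest->size = b->size - n - sizeof(block_t);
            rest->next = b->next;
            b->size = n;
            *pp = rest;
        } else {
            *pp = b->next;                          /* use the whole block */
        }
        return b + 1;
    }
    return NULL;
}

void heap_free(void *p)
{
    if (p)
        insert_free((block_t *)p - 1);
}
```

With a 16K and a 60K region added at startup, a 24K allocation followed by a free leaves the 60K region whole again, so a later 40K request can be satisfied, and an 8K request can still come from the 16K region.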

The best current workaround I can see from outside the library is to run a "heap preload loop" at startup: allocate lots of small (<pagesize) blocks, forcing the allocator to make maximally efficient small `sbrk` calls, then release them all.
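
A preload loop along those lines might look roughly like this. `heap_preload` and its parameters are invented for the sketch; whether it actually helps depends on how the port's `sbrk` and the allocator interact, so treat it as an illustration of the idea only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Each preloaded block doubles as a list node so the blocks can be freed later. */
typedef struct preload_node {
    struct preload_node *next;
} preload_node_t;

/*
 * Allocate up to max_blocks blocks of `chunk` bytes (chunk is assumed to be
 * well below the allocator's page size), stop early when malloc fails, then
 * free everything. Returns the number of blocks that were obtained.
 */
size_t heap_preload(size_t chunk, size_t max_blocks)
{
    assert(chunk >= sizeof(preload_node_t));
    preload_node_t *head = NULL;
    size_t count = 0;

    while (count < max_blocks) {
        preload_node_t *n = malloc(chunk);
        if (n == NULL)
            break;              /* heap exhausted: every region has been visited */
        n->next = head;
        head = n;
        count++;
    }
    while (head != NULL) {      /* release the blocks back to the allocator */
        preload_node_t *next = head->next;
        free(head);
        head = next;
    }
    return count;
}
```

On a small device the cap could be set very high so that the loop runs until `malloc` fails; on a hosted system the cap keeps it from exhausting memory.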

Which moves us on to the final issue: page size. The allocator insists on allocating in 4K lumps. If newlib had been built with the `SMALL_MEMORY` option, it would have been using 128-byte units, which would be an awful lot more reasonable for ARM-M class devices. Some devices may have only 10K of RAM, and the current "page size" renders up to 4K unusable.
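
The cost of the growth unit can be estimated with a simplified model, assuming the heap can only grow in whole units and any partial unit at the end is unreachable. `reachable_heap` is a made-up helper, and the 7K heap size in the note below is an invented figure, not one from a real device.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes reachable when the heap can only grow in whole `page`-sized steps. */
size_t reachable_heap(size_t heap_bytes, size_t page)
{
    assert(page != 0);
    return heap_bytes - heap_bytes % page;
}
```

For a hypothetical 7K heap, `reachable_heap(7 * 1024, 4096)` gives 4096, leaving 3K unusable, while `reachable_heap(7 * 1024, 128)` gives the full 7168 bytes.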

Question information

Language: English
Status: Answered
For: GNU Arm Embedded Toolchain
Assignee: No assignee
Last query: 2019-04-25
Last reply: 2019-04-29

Hi Kevin,

This sounds like a query for the newlib community. I suggest you email <email address hidden>

Kind Regards,
Andre
