aligned_storage failure - any undefined behaviour here?

Asked by Kevin Bracey on 2019-12-10

While working on a std::function-like class, I managed to get incorrect code generation from GCC. As far as I can tell, the code is tricky, but I can't identify any undefined behaviour that would justify GCC failing.

In summary, the class layout is

    class Callback {
         using Store = std::aligned_storage_t<3 * sizeof(void *), alignof(void *)>;

         Store storage;
         void (*call)();

In the failing case, assignment into the Callback did

    memset(&storage, 0, sizeof(storage));
    call = X;
    new (&storage) F(std::move(f));

So zero-fill complete storage, then placement-new copy-or-move construct into (some of) the storage. F is trivially-copyable and suitably sized and aligned.

This Callback was then copied to another Callback. The stored object and Callback are both trivially-copyable, so that was using the defaulted (trivial) copy assignment operator.

The complete sequence is then

    memset(&, 0, sizeof(; = X;
    new (& F(std::move(f));
    cb2 = cb;

But the compiler reorders to:

    memset(&, 0, sizeof(; = X;
    cb2 = cb;
    new (& F(std::move(f));

So cb2 ends up with incorrect contents.

Some interaction of the memset, the placement new and the trivial copy apparently falls foul of some sort of aliasing problem - GCC decides the placement new and the trivial copy don't interact.

Many hours of staring at the C++ standard makes me think this should be fine.

1) You can store trivially-copyable objects in character arrays, if aligned enough.
2) aligned_storage is specified as being usable as a store (although it isn't explicitly a character array, and changing it to be a character array doesn't change anything).
3) Copying a trivially-copyable object as its component characters preserves its value.
4) The defaulted copy assignment of Callback should copy the characters of storage, and hence the value of the stored object
5) Performing a placement new into storage creates a new object with new lifetime that should persist as long as the Callback holding the storage.

The problem has gone away in my tests since I've removed the full-storage memset, replacing it with a memset of padding only, but I want to pin down if I'm doing anything wrong, or this is a GCC bug. The failing code worked fine with ARMC6 and IAR, fwiw.

Replacing the placement-new with `memcpy(_storage, &f, sizeof f)` also made the problem go away while I had the memset.

Failing shown here on compiler explorer:

        mov r0, r4 // r4 = sp+24
        bl mbed::detail::CallbackBase::clear()
        ldr r3, .L16+12
        str r3, [sp, #36]
        ldm r4, {r0, r1, r2, r3}
        stm r5, {r0, r1, r2, r3}
        mov r0, r4
        str r6, [sp, #24] // XXX these two stores should have happened
        str sp, [sp, #28] // XXX before the ldm r4, {} copy
        bl mbed::detail::CallbackBase::clear() // XXX and this clears what we just wrote - reusing the temporary space

Failure seen on both GCC 8.3 and 9.2. It appears to be A32/T32-specific. This may be because ARM32 code generation for trivial structures passed by value is markedly inferior to ARM64 or x64 - the copy seen here doesn't occur at all for them.

GCC performs far more copying when generating ARM32 code for things like `std::span` or this `Callback` passed by value, suggesting some backend small-structure-handling optimisation is not available or working for ARM32.

In Godbolt, GCC 8.2 produces apparently OK code.

Same basic failing code seen with Cortex-M4 and arm7tdmi (arm or thumb). Cortex-A7 (arm or thumb) reorders a bit due to scheduling, but still half-wrong.

Question information

English Edit question
GNU Arm Embedded Toolchain Edit question
No assignee Edit question
Solved by:
Kevin Bracey
Last query:
Last reply:
Launchpad Janitor (janitor) said : #1

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Kevin Bracey (kevin-bracey) said : #2

Related discussion found here:

GCC's std::function doesn't use aligned_storage, and has a may_alias attribute, apparently to avoid a problem in this area:

That seems to be relevant, although I remain unclear why an attribute should be necessary. I believe the language rules force a compiler to assume possible aliasing when copying a plain char array, as it could be holding an object.

Kevin Bracey (kevin-bracey) said : #3

More searching reveals someone who has hit the same problem, for the same reason, and written a good article on it:

The problem is subtle - it's legal to copy a `T` into a *real* `T` as chars, and then access it, as a `T`. But it is NOT legal to copy a `T` into a char array as chars, and then access it as a `T`.

It fails at point 4 in my list above - the implicit trivial copy operation indeed copies the representation as chars, but it doesn't actually make the array BE a T.

There is actually a WG paper on this:

I believe the proposed Defect Report would make my code legal, as long as `aligned_storage` contained a char array.

So marking problem solved, at least in as much as "understood". Not sure if I'll be able to achieve the effect I was aiming for of trivial-copy std::function-lite with current compilers though.