Are NEON and fmaf not supported together?

Asked by genadi on 2017-07-23

When -march=armv7-a -mfloat-abi=hard -mfpu=neon is specified, neither __ARM_FEATURE_FMA nor FP_FAST_FMAF is defined. Is that correct? Last checked on version 6-2017-q2-update.

Question information

Language: English
Status: Solved
For: GNU ARM Embedded Toolchain
Assignee: No assignee
Solved by: Tejas Belagod
Solved: 2017-07-24
Last query: 2017-07-24
Last reply: 2017-07-24

genadi (genaspb) said : #1

I use code like this:

 #if (__ARM_FP & 0x08)

  typedef double FLOAT_t;

  #define LOG10F log10
  #define LOGF log
  #define POWF pow
  #define SINF sin
  #define COSF cos
  #define ATAN2F atan2
  #define ATANF atan
  #define EXPF exp
  #define FABSF fabs
  #define SQRTF sqrt
  #define FMAXF fmax
  #define FMINF fmin
  #if defined (__ARM_FEATURE_FMA) || defined (FP_FAST_FMA)
   #define FMAF fma
  #endif /* defined (__ARM_FEATURE_FMA) || defined (FP_FAST_FMA) */
  #define DSP_FLOAT_BITSMANTISSA 53 /* IEEE 754 double: 53 significand bits (DBL_MANT_DIG) */

 #elif (__ARM_FP & 0x04)

  typedef float FLOAT_t;

  #define LOG10F log10f
  #define LOGF logf
  #define POWF powf
  #define SINF sinf
  #define COSF cosf
  #define ATAN2F atan2f
  #define ATANF atanf
  #define EXPF expf
  #define FABSF fabsf
  #define SQRTF sqrtf
  #define FMAXF fmaxf
  #define FMINF fminf
  #if defined (__ARM_FEATURE_FMA) || defined (FP_FAST_FMAF)
   #define FMAF fmaf
  #endif /* defined (__ARM_FEATURE_FMA) || defined (FP_FAST_FMAF) */
  #define DSP_FLOAT_BITSMANTISSA 24

 #else

  #error No floating point support

 #endif

Tejas Belagod (belagod-tejas) said : #2

Try -mfpu=neon-vfpv4.

genadi (genaspb) said : #3

I use a Cortex-A9 CPU (Renesas RZ/A1L). The recommended option in readme.txt is -mfpu=vfpv3. Can I use your suggestion without run-time problems?
An option like -mfpu=neon-vfpv3 is wrong.

genadi (genaspb) said : #4

My CPU does not have half-precision (16-bit) floating point; it has only VFPv3.

genadi (genaspb) said : #5

Code compiled with -mfpu=neon-vfpv4 fails to run in my test project.

genadi (genaspb) said : #6

It fails with an "Undefined" exception after start-up completes. I think not all of the emitted instructions are valid for my -mfpu=vfpv3-d16 CPU.

Best Tejas Belagod (belagod-tejas) said : #7

VFMA is a VFPv4 feature. Therefore __ARM_FEATURE_FMA will be undefined on Cortex-A9, which implements VFPv3-D16. Also, -mcpu=cortex-a9 will default to the VFP and AdvSIMD features it implements unless you explicitly disable them using +nofp or +nosimd.
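
Since the macro stays undefined on a VFPv3 target, code that relies on FMAF needs some fallback. A minimal sketch of one, assuming an unfused multiply-add is acceptable when no hardware FMA is advertised; the wrapper name dsp_fmaf is hypothetical and does not appear in the original code:

```c
#include <math.h>

/* Hypothetical helper, not part of the original code: multiply-add that
 * calls fmaf() only when the compiler advertises a fast (hardware) FMA. */
static inline float dsp_fmaf(float a, float b, float c)
{
#if defined(__ARM_FEATURE_FMA) || defined(FP_FAST_FMAF)
    return fmaf(a, b, c);   /* fused: one rounding of a*b + c */
#else
    return a * b + c;       /* unfused: product and sum are rounded separately */
#endif
}
```

The two branches can differ in the last bit of the result, so this only suits code that does not depend on the single rounding of a true fused multiply-add.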

genadi (genaspb) said : #8

With the recommended command line, __ARM_FEATURE_FMAF is defined and works fine on the listed CPU (if NEON is not used).

Sample of gcc command line:
arm-none-eabi-gcc -c -mcpu=cortex-a9 -march=armv7-a -mfloat-abi=hard -mfpu=neon -ftree-vectorize -fno-math-errno -funroll-loops -fgraphite-identity -ffunction-sections -fdata-sections -ffat-lto-objects -Ofast -flto -gdwarf-2 -fomit-frame-pointer -Wall -Wstrict-prototypes -DNDEBUG=1 -DCPUSTYLE_R7S721=1 -DCPUSTYLE_R7S721020=1 -MD -MP -MF ./dep/audio.o.d -I.. -I../rza1x_inc -I.. ../audio.c -o audio.o

If NEON is not used, __ARM_FEATURE_FMAF is defined.

BUT:

200000e0 <fmaf>:
200000e0: eeb7 6ac0 vcvt.f64.f32 d6, s0
200000e4: eeb7 7ae0 vcvt.f64.f32 d7, s1
200000e8: eeb7 1ac1 vcvt.f64.f32 d1, s2
200000ec: ee06 1b07 vmla.f64 d1, d6, d7
200000f0: eeb7 0bc1 vcvt.f32.f64 s0, d1
200000f4: 4770 bx lr
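
Read literally, the listing widens the three float arguments to double, does a multiply-accumulate in double precision (vmla.f64, not a fused vfma), and narrows the result back to float. In C terms that is roughly the sketch below. The function name fmaf_via_double is illustrative, and this is an interpretation of the disassembly, not the library's source:

```c
/* Illustrative equivalent of the listing: widen, multiply-accumulate in
 * double, narrow. The product of two floats is exact in double
 * (24 + 24 significand bits fit in 53), so only the addition and the
 * final conversion can round. */
static float fmaf_via_double(float x, float y, float z)
{
    double acc = (double)z;          /* vcvt.f64.f32 d1, s2 */
    acc += (double)x * (double)y;    /* two vcvt.f64.f32, then vmla.f64 d1, d6, d7 */
    return (float)acc;               /* vcvt.f32.f64 s0, d1 */
}
```

For example, with x = y = 1 + 2^-12 and z = -(1 + 2^-11), this returns 2^-24, whereas a plain float x * y + z would lose that term when the product is rounded to float.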

genadi (genaspb) said : #9

Thanks Tejas Belagod, that solved my question.

genadi (genaspb) said : #10

Sorry, the current version of gcc does not define __ARM_FEATURE_FMA, FP_FAST_FMA or FP_FAST_FMAF for Cortex-A9 in any case.