Are NEON and fmaf not supported together?

Asked by genadi on 2017-07-23

When building with -march=armv7-a -mfloat-abi=hard -mfpu=neon, neither __ARM_FEATURE_FMA nor FP_FAST_FMAF is defined. Is that correct? Last checked with version 6-2017-q2-update.

Question information

English
GNU Arm Embedded Toolchain
No assignee
Solved by:
Tejas Belagod
genadi (genaspb) said : #1

I use it in code like this:

 #if (__ARM_FP & 0x08)

  typedef double FLOAT_t;

  #define LOG10F log10
  #define LOGF log
  #define POWF pow
  #define SINF sin
  #define COSF cos
  #define ATAN2F atan2
  #define ATANF atan
  #define EXPF exp
  #define FABSF fabs
  #define SQRTF sqrt
  #define FMAXF fmax
  #define FMINF fmin
  #if defined (__ARM_FEATURE_FMA) || defined (FP_FAST_FMA)
   #define FMAF fma
  #endif /* defined (__ARM_FEATURE_FMA) || defined (FP_FAST_FMA) */

 #elif (__ARM_FP & 0x04)

  typedef float FLOAT_t;

  #define LOG10F log10f
  #define LOGF logf
  #define POWF powf
  #define SINF sinf
  #define COSF cosf
  #define ATAN2F atan2f
  #define ATANF atanf
  #define EXPF expf
  #define FABSF fabsf
  #define SQRTF sqrtf
  #define FMAXF fmaxf
  #define FMINF fminf
  #if defined (__ARM_FEATURE_FMA) || defined (FP_FAST_FMAF)
   #define FMAF fmaf
  #endif /* defined (__ARM_FEATURE_FMA) || defined (FP_FAST_FMAF) */


 #else

  #error No floating point support

 #endif /* __ARM_FP */


Tejas Belagod (belagod-tejas) said : #2

Try -mfpu=neon-vfpv4.

genadi (genaspb) said : #3

I'm using a Cortex-A9 CPU (Renesas RZ/A1L). The readme.txt recommends -mfpu=vfpv3. Can I use your suggested option without run-time problems?
An option like -mfpu=neon-vfpv3 is wrong.

genadi (genaspb) said : #4

My CPU does not have half-precision (16-bit) floating point, only vfpv3.

genadi (genaspb) said : #5

Code compiled with -mfpu=neon-vfpv4 fails to run in my test project.

genadi (genaspb) said : #6

It fails with an "Undefined" exception after start-up completes. I think not all of the emitted instructions are valid for my -mfpu=vfpv3-d16 CPU.

Best Tejas Belagod (belagod-tejas) said : #7

VFMA is a VFPv4 feature. Therefore __ARM_FEATURE_FMA will be undefined on Cortex-A9, which implements VFPv3-D16. Also, -mcpu=cortex-a9 defaults to the VFP and AdvSIMD features that the core implements unless you explicitly disable them using +nofp or +nosimd.
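
Since the macro may be absent on a VFPv3 target, calling code can select a fallback at compile time instead of leaving the wrapper undefined. The following is only a sketch under that assumption: the names MULADDF, MULADDF_IS_FUSED and dot3 are hypothetical and do not come from the thread or the toolchain.

```c
#include <math.h>

#if defined(__ARM_FEATURE_FMA) || defined(FP_FAST_FMAF)
  /* The target reports a fast fused multiply-add: use fmaf(). */
  #define MULADDF(a, b, c) fmaf((a), (b), (c))
  #define MULADDF_IS_FUSED 1
#else
  /* No hardware FMA reported (for example VFPv3): use a plain
     multiply followed by an add. The result is rounded twice,
     so it may differ from fmaf() in the last bit. */
  #define MULADDF(a, b, c) ((a) * (b) + (c))
  #define MULADDF_IS_FUSED 0
#endif

/* Example user of the wrapper: 3-element dot product. */
static float dot3(const float x[3], const float y[3])
{
    float acc = 0.0f;
    for (int i = 0; i < 3; ++i) {
        acc = MULADDF(x[i], y[i], acc);
    }
    return acc;
}
```

With this layout the same source builds for both -mfpu=vfpv3 and -mfpu=neon-vfpv4, and MULADDF_IS_FUSED can be checked where the rounding difference matters.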

genadi (genaspb) said : #8

With the recommended command line, __ARM_FEATURE_FMAF is defined and works fine on the listed CPU (if NEON is not used).

Sample of gcc command line:
arm-none-eabi-gcc -c -mcpu=cortex-a9 -march=armv7-a -mfloat-abi=hard -mfpu=neon -ftree-vectorize -fno-math-errno -funroll-loops -fgraphite-identity -ffunction-sections -fdata-sections -ffat-lto-objects -Ofast -flto -gdwarf-2 -fomit-frame-pointer -Wall -Wstrict-prototypes -DNDEBUG=1 -DCPUSTYLE_R7S721=1 -DCPUSTYLE_R7S721020=1 -MD -MP -MF ./dep/audio.o.d -I.. -I../rza1x_inc -I.. ../audio.c -o audio.o

If NEON is not used, __ARM_FEATURE_FMAF is defined.


200000e0 <fmaf>:
200000e0: eeb7 6ac0 vcvt.f64.f32 d6, s0
200000e4: eeb7 7ae0 vcvt.f64.f32 d7, s1
200000e8: eeb7 1ac1 vcvt.f64.f32 d1, s2
200000ec: ee06 1b07 vmla.f64 d1, d6, d7
200000f0: eeb7 0bc1 vcvt.f32.f64 s0, d1
200000f4: 4770 bx lr
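
The listing above appears to widen the three operands to double, do a multiply-accumulate in double, and narrow the result back to float. A rough C equivalent is sketched below. The function names are hypothetical, and this is an illustration of the idea rather than the library's actual source.

```c
#include <float.h>

/* Widen, multiply-add in double, narrow back to float. */
static float fmaf_via_double(float x, float y, float z)
{
    /* A float has a 24-bit significand, so the product of two
       floats needs at most 48 bits and is exact in a double. */
    double wide = (double)x * (double)y + (double)z;
    return (float)wide; /* second rounding, back to float */
}

/* Plain single-precision multiply-add for comparison. */
static float muladd_unfused(float x, float y, float z)
{
    /* volatile forces the product to be rounded to float
       before the addition, so the compiler cannot fuse it. */
    volatile float product = x * y;
    return product + z;
}
```

For x = y = 1 + FLT_EPSILON and z = -(1 + 2 * FLT_EPSILON), the widened version returns FLT_EPSILON * FLT_EPSILON, while the unfused version returns 0 because the low bits of the product are lost. The widened version can still differ from a true fused operation in rare cases, since the sum is rounded once in double and again when converted to float.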

genadi (genaspb) said : #9

Thanks Tejas Belagod, that solved my question.

genadi (genaspb) said : #10

Sorry, the current version of gcc does not define __ARM_FEATURE_FMA, FP_FAST_FMA or FP_FAST_FMAF for Cortex-A9 in any case.
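
One way to check this for a given set of compiler options is to compile a small helper that reports which of the three macros are visible. This is a sketch only, and the function name report_fma_macros is hypothetical.

```c
#include <math.h>   /* FP_FAST_FMA and FP_FAST_FMAF, if provided */
#include <stdio.h>

/* Writes a one-line summary into buf and returns how many of
   the three macros are defined for the current build options. */
static int report_fma_macros(char *buf, size_t size)
{
    int arm = 0, fast = 0, fastf = 0;

#ifdef __ARM_FEATURE_FMA
    arm = 1;
#endif
#ifdef FP_FAST_FMA
    fast = 1;
#endif
#ifdef FP_FAST_FMAF
    fastf = 1;
#endif

    snprintf(buf, size,
             "__ARM_FEATURE_FMA=%d FP_FAST_FMA=%d FP_FAST_FMAF=%d\n",
             arm, fast, fastf);
    return arm + fast + fastf;
}
```

Building this once per -mfpu setting and printing the buffer shows at a glance which options change the result.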