sqrt() missing potential VFP optimization

Asked by Bill Dittmann

From experience, I know that when using sqrt() with VFPv4, calls to sqrt() are not inlined. This is because the argument x to sqrt() may be negative, in which case the square root function may need to set __errno to EDOM (a domain error). That check is done inside the library functions sqrt() and sqrtf().

Even with __builtin_sqrt(), the code generator emits a call out to sqrt().

This "limitation" can be overridden with the GCC option -fno-math-errno, in which case sqrt is properly converted to the VFP instruction fsqrtd, as desired.

However, the -fno-math-errno switch is a pretty big (and scary) hammer to use globally for this issue.

I asked myself: could I do better?

In an attempt to guarantee performance with a priori knowledge that the argument x was positive, I tried passing fabs(x), or even __builtin_fabs(x), to __builtin_sqrt().

The optimizer did not pick up on the now guaranteed non-negative nature of the sqrt() argument, and did not use the fsqrt instruction.

In a similar fashion, I called __builtin_sqrt() with the square of x, as sqrt(x * x), which of course also always results in a non-negative argument to sqrt(), and the code still called out to the sqrt library function.

Reaching back into my tool pouch, I pulled out __builtin_unreachable() and added it before the call to sqrt(x):

if (x < 0) __builtin_unreachable();

Still no pleasure on my side: still a call out to the library function.

I looked but could not find a compile-time pragma that would let me disable errno updates locally.

The __promise() "keyword" supported on some compilers sounded promising, but it appeared to be unimplemented in the latest 2014-q3 release.

My question is whether I am expecting too much from the compiler: should it be able to recognize (even with hints) when the arguments to certain __builtin math functions such as sqrt() cannot result in errno updates, so that inlined instructions can be used?

The other possible reason for my failure to achieve max perf is that x could be NaN or INF, which may be why the compiler stubbornly calls library functions and does not inline. (My argument was actually an int cast to double, so there is no path to NaN or INF values either. Changing to an unsigned int argument cast to double also had no impact.) Regardless, I don't know of any means to convey the validity of x, e.g. that x is positive and neither NaN nor INF, using the "if () __builtin_unreachable()" method.

Could someone brilliant please shed some expert light on my dark plight?

PS: Using sqrt() or another such math function from an interrupt handler will/might update errno in the current task's reent context, which is probably not what you want. A simple runtime check for domain errors at the application (interrupt) level could avoid a random task inheriting a random errno from an interrupt.

Question information

Language: English
Status: Answered
For: GNU Arm Embedded Toolchain
Assignee: No assignee
Terry Guo (terry.guo) said :
#1

Issue confirmed. This could be an optimization opportunity.

Terry Guo (terry.guo) said :
#2

Documented in our internal system; we plan to handle it as an optimization opportunity.
