# Speed up expression evaluations with vectorized functions

Hi,

I would like to speed up the evaluation of my Expression implementation (which I can not express as an JIT-compiled expression).
My current code overloads the Expression class and implements the eval function. However, the main computational cost is
caused by the overhead of calling the eval function for each DOF. The attached Python code takes about 110s to interpolate a
simple Expression onto a 1000201 DOFs.

One easy way to speed up such loops is to use vectorized Python functions (a speed up of 10x - 200x is expected,
see e.g. http://technicaldiscovery.blogspot.co.uk/2011/06/speeding-up-python-numpy-cython-and.html).

I was wondering if there is a way if/how I can make use such vectorized functions in Expressions. For that, the
Expression class would need an evaluation function which evaluates at a whole list of (x, y) coordinates. I imagine this to
look like:

class Turbines(Expression):
def eval_all(self, values, x, y):
# x = [x0, x1, x2, ...])
# y = [y0, y1, y2, ...]
values[0] = exp(-1.0/(1.0-numpy.array(x)**2) - 1.0/(1.0-numpy.array(y)**2) + 2) # This is a simple vectorized function

I am happy for any hints.

Thanks very much,

Simon

## Question information

Language:
English Edit question
Status:
Solved
For:
DOLFIN Edit question
Assignee:
No assignee Edit question
Solved by:
Simon Funke
Solved:
Last query:
 Revision history for this message Simon Funke (simon-funke) said on 2013-02-07: #1

Here is the Python test code I used:

from dolfin import *
from math import exp
import numpy

class Turbines(Expression):
def eval(self, values, x):
if (x[0]**2 < 1.0) and (x[1]**2 < 1.0):
values[0] = exp(-1.0/(1.0-x[0]**2) - 1.0/(1.0-x[1]**2) + 2)
else:
values[0] = 0.0

if __name__ == "__main__":

mesh = RectangleMesh(-1, -1, 1, 1, 1000, 1000)
V = FunctionSpace(mesh, "CG", 1)
f = Function(V)
print "DOFs: ", len(f.vector().array())

t = Timer("Project expression")
turbine = Turbines()
f.interpolate(turbine)
print "Time to project expression: %f s." % t.stop()

 Revision history for this message Johan Hake (johan-hake) said on 2013-02-07: #2

Have you tried the extended compile expression interface?

Johan

from dolfin import *
class Turbines(Expression):
def eval(self, values, x):
if (x[0]**2 < 1.0) and (x[1]**2 < 1.0):
values[0] = exp(-1.0/(1.0-x[0]**2) - 1.0/(1.0-x[1]**2) + 2)
else:
values[0] = 0.0

code = '''
class MyFunc : public Expression
{
public:

MyFunc() : Expression()
{
}

void eval(Array<double>& values, const Array<double>& x,
const ufc::cell& c) const
{

const double x2 = x[0]*x[0];
const double y2 = x[1]*x[1];
if ((x2 < 1.0) && (y2 < 1.0))
values[0] = exp(-1.0/(1.0-x2) - 1.0/(1.0-y2) + 2);
else
values[0] = 0.0;
}
};
'''

if __name__ == "__main__":

mesh = RectangleMesh(-1, -1, 1, 1, 100, 100)
V = FunctionSpace(mesh, "CG", 1)
f = Function(V)
print "DOFs: ", len(f.vector().array())

t = Timer("Project expression")
turbine = Turbines()
f.interpolate(turbine)
print "Time to project expression: %f s." % t.stop()
t = Timer("Project C++ expression")
turbine = Expression(code)
f.interpolate(turbine)
print "Time to project expression: %f s." % t.stop()

On 02/07/2013 02:06 PM, Simon Funke wrote:
> Question #221284 on DOLFIN changed:
>
> Here is the Python test code I used:
>
>
> from dolfin import *
> from math import exp
> import numpy
>
> class Turbines(Expression):
> def eval(self, values, x):
> if (x[0]**2 < 1.0) and (x[1]**2 < 1.0):
> values[0] = exp(-1.0/(1.0-x[0]**2) - 1.0/(1.0-x[1]**2) + 2)
> else:
> values[0] = 0.0
>
> if __name__ == "__main__":
>
> mesh = RectangleMesh(-1, -1, 1, 1, 1000, 1000)
> V = FunctionSpace(mesh, "CG", 1)
> f = Function(V)
> print "DOFs: ", len(f.vector().array())
>
> t = Timer("Project expression")
> turbine = Turbines()
> f.interpolate(turbine)
> print "Time to project expression: %f s." % t.stop()
>

 Revision history for this message Mikael Mortensen (mikael-mortensen) said on 2013-02-07: #3

Hi Simon,

You should probably implement this in C++. Type help(Expression) in a Python shell to see how this can be done. The suggested vectorized approach is a bit tricky, but this should do the trick (not tested):

from numpy import exp
x = interpolate(Expression("x[0]"), V)
y = interpolate(Expression("x[1]"), V)
ff = exp(-1/(1-x.vector().array()**2) - 1./(1-y.vector().array()**2)+2)
f = Function(V)
f.vector().set_local(ff)

Best regards

Mikael

Den Feb 7, 2013 kl. 2:06 PM skrev Simon Funke:

> Question #221284 on DOLFIN changed:
>
> Here is the Python test code I used:
>
>
> from dolfin import *
> from math import exp
> import numpy
>
> class Turbines(Expression):
> def eval(self, values, x):
> if (x[0]**2 < 1.0) and (x[1]**2 < 1.0):
> values[0] = exp(-1.0/(1.0-x[0]**2) - 1.0/(1.0-x[1]**2) + 2)
> else:
> values[0] = 0.0
>
> if __name__ == "__main__":
>
> mesh = RectangleMesh(-1, -1, 1, 1, 1000, 1000)
> V = FunctionSpace(mesh, "CG", 1)
> f = Function(V)
> print "DOFs: ", len(f.vector().array())
>
> t = Timer("Project expression")
> turbine = Turbines()
> f.interpolate(turbine)
> print "Time to project expression: %f s." % t.stop()
>
> --
> You received this question notification because you are a member of
> DOLFIN Team, which is an answer contact for DOLFIN.

 Revision history for this message Simon Funke (simon-funke) said on 2013-02-07: #4

I benchmarked the three solutions:
1) A pure Python implementation
2) A vectorized version (with numpy)
3) A C++ version
They all produced bitwise the same answer.

The timing benchmark for a Function with 1002001 DOF is:
>> Time to interpolate Python expression: 328.061709 s.
>> Time to interpolate vectorized expression: 1.293245 s.
>> Time to interpolate C++ expression: 0.806163 s.

With MPI on 2 processors I get the following:
>> Time to project Python expression: 164.069351 s.
>> Time to project vectorized expression: 1.638153 s.
>> Time to project C++ expression: 0.514831 s.

Thanks again,

Simon

 Revision history for this message Mikael Mortensen (mikael-mortensen) said on 2013-02-07: #5

Interesting. I assume the numpy timing includes the two interpolations of x[0] and x[1]?
If so, can you also give us the timing with those two interpolations outside of the time loop? Just curious.

Best regards

Mikael

Den Feb 7, 2013 kl. 3:46 PM skrev Simon Funke:

> Question #221284 on DOLFIN changed:
>
>
> Simon Funke confirmed that the question is solved:
>
> I benchmarked the three solutions:
> 1) A pure Python implementation
> 2) A vectorized version (with numpy)
> 3) A C++ version
> They all produced bitwise the same answer.
>
> The timing benchmark for a Function with 1002001 DOF is:
>>> Time to interpolate Python expression: 328.061709 s.
>>> Time to interpolate vectorized expression: 1.293245 s.
>>> Time to interpolate C++ expression: 0.806163 s.
>
> With MPI on 2 processors I get the following:
>>> Time to project Python expression: 164.069351 s.
>>> Time to project vectorized expression: 1.638153 s.
>>> Time to project C++ expression: 0.514831 s.
>
> Thanks again,
>
> Simon
>
> --
> You received this question notification because you are a member of
> DOLFIN Team, which is an answer contact for DOLFIN.