python crash with Intel MKL - DLL issues with Intel Compiler

Asked by Chris Richardson

I have compiled and installed DOLFIN (latest development version), along with the latest ffc, ufc, ufl, fiat, instant etc.,
using the Intel compilers (icc, icpc) and Intel's BLAS (known as MKL).

C++ programs compile and run correctly.
Python crashes:

python: symbol lookup error: /usr/local/Cluster-Apps/intel/mkl/10.3.10.319/composer_xe_2011_sp1.10.319/mkl/lib/intel64/libmkl_avx.so: undefined symbol: mkl_serv_allocate

On investigation, this appears to be a Python library-loading issue: the MKL runtime library needs to be loaded
with RTLD_GLOBAL, which is not the Python default.
Inserting the following lines at the start of a Python script cures the problem:

import ctypes
ctypes.CDLL('libmkl_rt.so', ctypes.RTLD_GLOBAL)

Would it be possible to add this to the "site-packages/dolfin/importhandler" module, when compiling with Intel compilers?
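
For illustration, a guarded form of that preload might look like the sketch below. The helper name preload_global is hypothetical and is not part of DOLFIN; it only wraps the ctypes call above so that a missing library is ignored instead of raising.

```python
import ctypes


def preload_global(libname):
    """Try to load `libname` with RTLD_GLOBAL; return the handle, or None.

    Hypothetical helper for illustration only, not an existing DOLFIN function.
    """
    try:
        return ctypes.CDLL(libname, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        # The library is not on the loader path (for example a build
        # that does not use MKL), so carry on without it.
        return None


# Same library name as in the workaround above.
mkl_handle = preload_global("libmkl_rt.so")
```

On a build without MKL the call returns None and the rest of the import can continue unchanged.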

.....

Another Intel issue, which may be related, is shown below. A throw from C++ results in a SIGSEGV under Python.
This time, the problem can be avoided by preloading libstdc++ in Python, but I am not sure why that should help...
Any ideas appreciated.
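
The libstdc++ preload could be written along these lines. This is only a sketch: passing RTLD_GLOBAL is an assumption carried over from the MKL case, and the fallback name libstdc++.so.6 is taken from the backtrace in this post.

```python
import ctypes
import ctypes.util

# find_library may return None if the lookup tools are unavailable,
# so fall back to the soname seen in the backtrace.
stdcxx_name = ctypes.util.find_library("stdc++") or "libstdc++.so.6"

try:
    # Assumption: RTLD_GLOBAL is used here as in the MKL workaround.
    libstdcxx = ctypes.CDLL(stdcxx_name, mode=ctypes.RTLD_GLOBAL)
except OSError:
    # No libstdc++ found on this system; nothing to preload.
    libstdcxx = None
```

These lines would need to run before the first import of dolfin in the script.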

Thanks

Chris R.

[cnr12@login-sand1 python]$ gdb python
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-50.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python...(no debugging symbols found)...done.
Missing separate debuginfos, use: debuginfo-install python-2.6.6-29.el6_2.2.x86_64
(gdb) run convection_example.py
Starting program: /usr/bin/python convection_example.py
[Thread debugging using libthread_db enabled]

[New Thread 0x7fffd7be3700 (LWP 53568)]
[New Thread 0x7fffd71e2700 (LWP 53569)]

Program received signal SIGSEGV, Segmentation fault.
0x00007fffdf492774 in __cxa_allocate_exception () from /usr/lib64/libstdc++.so.6
(gdb)
(gdb) bt
#0 0x00007fffdf492774 in __cxa_allocate_exception () from /usr/lib64/libstdc++.so.6
#1 0x00007fffeb8fc220 in dolfin::Logger::dolfin_error (this=0x7ffff7fd2010, location=Cannot access memory at address 0x0
)
    at /home/cnr12/stable/FEniCS/src/dolfin.phdf5/dolfin/log/Logger.cpp:123
#2 0x00007fffeb8f9f1e in dolfin::dolfin_error (location=
    Traceback (most recent call last):
  File "/usr/lib64/../share/gdb/python/libstdcxx/v6/printers.py", line 558, in to_string
    return self.val['_M_dataplus']['_M_p'].lazy_string (length = len)
RuntimeError: Cannot access memory at address 0xffffffffffffffeb
, task=Cannot access memory at address 0x0
) at /home/cnr12/stable/FEniCS/src/dolfin.phdf5/dolfin/log/log.cpp:135
#3 0x00007fffeb53c5d0 in dolfin::File::File (this=0x1134b20, filename=Cannot access memory at address 0x0
)
    at /home/cnr12/stable/FEniCS/src/dolfin.phdf5/dolfin/io/File.cpp:88
#4 0x00007fffda6f403a in _wrap_new_File__SWIG_0 (nobjs=0, swig_obj=0x6)
    at /home/cnr12/stable/FEniCS/src/dolfin.phdf5/dolfin/swig/modules/io/modulePYTHON_wrap.cxx:21942
#5 0x00007fffda6f33f4 in _wrap_new_File (self=0x7ffff7fd2010, args=0x0)
    at /home/cnr12/stable/FEniCS/src/dolfin.phdf5/dolfin/swig/modules/io/modulePYTHON_wrap.cxx:22075
#6 0x00007ffff7b01706 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#7 0x00007ffff7b03797 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#8 0x00007ffff7a91db0 in ?? () from /usr/lib64/libpython2.6.so.1.0
#9 0x00007ffff7a67303 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#10 0x00007ffff7a7c70f in ?? () from /usr/lib64/libpython2.6.so.1.0
#11 0x00007ffff7a67303 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#12 0x00007ffff7abfa7e in ?? () from /usr/lib64/libpython2.6.so.1.0
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) list
1 // class template array -*- C++ -*-
2
3 // Copyright (C) 2007, 2008, 2009 Free Software Foundation, Inc.
4 //
5 // This file is part of the GNU ISO C++ Library. This library is free
6 // software; you can redistribute it and/or modify it under the
7 // terms of the GNU General Public License as published by the
8 // Free Software Foundation; either version 3, or (at your option)
9 // any later version.
10
(gdb) up
#1 0x00007fffeb8fc220 in dolfin::Logger::dolfin_error (this=0x7ffff7fd2010, location=Cannot access memory at address 0x0
)
    at /home/cnr12/stable/FEniCS/src/dolfin.phdf5/dolfin/log/Logger.cpp:123
123 throw std::runtime_error(s.str());
(gdb)

Question information

Language: English
Status: Answered
For: DOLFIN
Assignee: No assignee
Johan Hake (johan-hake) said :
#1

If it is possible to figure out at runtime, within Python, which compiler
DOLFIN has been compiled with, it should be possible to add such
conditional imports.

If I am not wrong, I think we already use pkg-config for this, so it
should be possible. Not sure we would like to use pkg-config for this
particular case though...
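
As a rough sketch of the pkg-config route, the snippet below asks pkg-config for the link flags of a package and looks for MKL among them. The package name "dolfin" and both helper names are assumptions for illustration; whether the installed .pc file actually lists the BLAS libraries depends on the build.

```python
import subprocess


def pkg_config_libs(package):
    """Return the `pkg-config --libs` flags for `package`, or None on failure."""
    try:
        proc = subprocess.Popen(["pkg-config", "--libs", package],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError:
        # pkg-config itself is not installed.
        return None
    out, _ = proc.communicate()
    if proc.returncode != 0:
        # Unknown package or broken .pc file.
        return None
    return out.decode().split()


def mentions_mkl(flags):
    """True if any link flag refers to an MKL library or directory."""
    return any("mkl" in flag.lower() for flag in flags or [])


# "dolfin" is the assumed pkg-config package name.
needs_mkl_preload = mentions_mkl(pkg_config_libs("dolfin"))
```

The result could then decide whether the RTLD_GLOBAL preload is attempted at import time.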

Johan

On 08/07/2012 03:26 PM, Chris Richardson wrote:
> New question #205219 on DOLFIN:
> https://answers.launchpad.net/dolfin/+question/205219

Garth Wells (garth-wells) said :
#2

On Thursday, 9 August 2012, Johan Hake wrote:

> Question #205219 on DOLFIN changed:
> https://answers.launchpad.net/dolfin/+question/205219
>
> Status: Open => Answered
>
> Johan Hake proposed the following answer:
> If it is possible to runtime, within Python, figure out what compiler
> DOLFIN has been compiled with runtime it should be possible to add such
> conditional imports.

It's a bit trickier, because we would need to check which BLAS is in use, not
the compiler.
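
One possible way to check the BLAS at runtime, sketched under the assumption of a Linux system, is to look at which shared objects are mapped into the Python process once the extension modules have been imported. Both function names are hypothetical, and matching on the "libmkl" file name prefix is a heuristic.

```python
import os


def loaded_libraries(maps_path="/proc/self/maps"):
    """Return the set of shared-object paths mapped into this process.

    Linux only; returns an empty set if the maps file cannot be read.
    """
    libs = set()
    try:
        with open(maps_path) as maps:
            for line in maps:
                # Fields: address perms offset dev inode pathname
                fields = line.split(None, 5)
                if len(fields) == 6 and ".so" in fields[5]:
                    libs.add(fields[5].strip())
    except (IOError, OSError):
        pass
    return libs


def uses_mkl(paths):
    """Heuristic: True if any path looks like an MKL library."""
    return any(os.path.basename(p).startswith("libmkl") for p in paths)


mkl_in_process = uses_mkl(loaded_libraries())
```

This only reports what is already loaded, so it says nothing about the compiler, which is the distinction made above.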

Garth

--
Garth N. Wells
Department of Engineering, University of Cambridge
http://www.eng.cam.ac.uk/~gnw20

Chris Richardson (chris-bpi) said :
#3

Is it possible to do anything during configuration with cmake?
