OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.

Asked by Damiaan

When I run this test script:

from dolfin import *
print 'got dolfin'
mesh = UnitCube(10,10,10)
print 'mesh ok'
V = VectorFunctionSpace(mesh, "CG", 1)
print 'Vector fcn space ok'
v = Function(V)

it works fine on 1 node, but if I use mpirun as follows:

mpirun --bynode -np 16 python test.py

then it stops with:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    V = VectorFunctionSpace(mesh, "CG", 1)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 505, in __init__
    FunctionSpaceBase.__init__(self, mesh, element)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 77, in __init__
    ufc_element, ufc_dofmap = jit(self._ufl_element)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 70, in mpi_jit
    output = local_jit(*args, **kwargs)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 102, in jit
    raise OSError, "Could not find swig installation. Pass an existing "\
OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.

swig is installed; I have tried setting swig_binary and swig_path without luck. How do I force it to look in the right location?

Question information

Language:
English
Status:
Solved
For:
DOLFIN
Assignee:
No assignee
Solved by:
Damiaan

This question was reopened

Revision history for this message
Damiaan (dhabets) said :
#1

Set:

parameters["swig_binary"] = "swig"
parameters["swig_path"] = "/home/whereveritisinstalled/FEniCS/bin"

Damiaan (dhabets) said :
#2

I spoke too soon, that doesn't work either:

[gpc-f135n041:30706] *** Process received signal ***
[gpc-f135n041:30706] Signal: Segmentation fault (11)
[gpc-f135n041:30706] Signal code: Address not mapped (1)
[gpc-f135n041:30706] Failing at address: 0xfbc128
[gpc-f135n041:30706] [ 0] /lib64/libpthread.so.0(+0xf4a0) [0x7ffff5f1e4a0]
[gpc-f135n041:30706] [ 1] /scinet/gpc/mpi/openmpi/1.4.4-gcc-v4.6.1/lib/libopen-pal.so.0(opal_memory_ptmalloc2_int_malloc+0x16a) [0x7ffff6c9af5a]
[gpc-f135n041:30706] [ 2] /scinet/gpc/mpi/openmpi/1.4.4-gcc-v4.6.1/lib/libopen-pal.so.0(+0x3e576) [0x7ffff6c9c576]
[gpc-f135n041:30706] [ 3] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(_PyObject_GC_Malloc+0x19) [0x7ffff7b2d6d9]
[gpc-f135n041:30706] [ 4] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(_PyObject_GC_NewVar+0x2e) [0x7ffff7b2d83e]
[gpc-f135n041:30706] [ 5] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyFrame_New+0x3c5) [0x7ffff7a79425]
[gpc-f135n041:30706] [ 6] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6244) [0x7ffff7af5324]
[gpc-f135n041:30706] [ 7] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855) [0x7ffff7af6175]
[gpc-f135n041:30706] [ 8] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(+0x72cac) [0x7ffff7a79cac]
[gpc-f135n041:30706] [ 9] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyObject_Call+0x53) [0x7ffff7a520f3]
[gpc-f135n041:30706] [10] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(+0x4b1cb) [0x7ffff7a521cb]
[gpc-f135n041:30706] [11] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyObject_CallMethod+0xc1) [0x7ffff7a524e1]
[gpc-f135n041:30706] [12] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_ReInitThreads+0x7d) [0x7ffff7aee37d]
[gpc-f135n041:30706] [13] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyOS_AfterFork+0xe) [0x7ffff7b3016e]
[gpc-f135n041:30706] [14] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(+0x12c6b6) [0x7ffff7b336b6]
[gpc-f135n041:30706] [15] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4ffa) [0x7ffff7af40da]
[gpc-f135n041:30706] [16] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855) [0x7ffff7af6175]
[gpc-f135n041:30706] [17] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5245) [0x7ffff7af4325]
[gpc-f135n041:30706] [18] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855) [0x7ffff7af6175]
[gpc-f135n041:30706] [19] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(+0x72da3) [0x7ffff7a79da3]
[gpc-f135n041:30706] [20] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyObject_Call+0x53) [0x7ffff7a520f3]
[gpc-f135n041:30706] [21] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(+0x5881f) [0x7ffff7a5f81f]
[gpc-f135n041:30706] [22] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyObject_Call+0x53) [0x7ffff7a520f3]
[gpc-f135n041:30706] [23] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(+0xabaf0) [0x7ffff7ab2af0]
[gpc-f135n041:30706] [24] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(+0xa78e8) [0x7ffff7aae8e8]
[gpc-f135n041:30706] [25] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyObject_Call+0x53) [0x7ffff7a520f3]
[gpc-f135n041:30706] [26] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x3e7d) [0x7ffff7af2f5d]
[gpc-f135n041:30706] [27] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855) [0x7ffff7af6175]
[gpc-f135n041:30706] [28] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5245) [0x7ffff7af4325]
[gpc-f135n041:30706] [29] /home/s/steinman/dhabets/Root/FEniCS/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855) [0x7ffff7af6175]
[gpc-f135n041:30706] *** End of error message ***
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    V = VectorFunctionSpace(mesh, "CG", 1)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 505, in __init__
    FunctionSpaceBase.__init__(self, mesh, element)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 77, in __init__
    ufc_element, ufc_dofmap = jit(self._ufl_element)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 70, in mpi_jit
    output = local_jit(*args, **kwargs)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 102, in jit
    raise OSError, "Could not find swig installation. Pass an existing "\
OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.

Damiaan (dhabets) said :
#3

swig is 2.0.5

Damiaan (dhabets) said :
#4

Ok, ignore the previous one; only problem left seems to be the swig issue:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    V = VectorFunctionSpace(mesh, "CG", 1)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 505, in __init__
    FunctionSpaceBase.__init__(self, mesh, element)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 77, in __init__
    ufc_element, ufc_dofmap = jit(self._ufl_element)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 70, in mpi_jit
    output = local_jit(*args, **kwargs)
  File "/home/s/steinman/dhabets/Root/FEniCS/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 102, in jit
    raise OSError, "Could not find swig installation. Pass an existing "\
OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.

Kent-Andre Mardal (kent-and) said :
#5

You can adjust the PATH variable to make sure it points to the directory
where swig is:
e.g. in bash
export PATH=$PATH:path_to_swig

Kent

Damiaan (dhabets) said :
#6

Hi Kent,

   the path to swig is already set and is correct. The version is 2.0.5.

thanks!

Johan Hake (johan-hake) said :
#7

You need to set the swig parameters before you do anything else, as these are
cached by Instant. So:

from dolfin import *
parameters["swig_path"] = "somewhere"

Also, it is always good practice to run a new file in serial first, generating
all the JIT-compiled code, and then re-run it in parallel.

Johan
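
The ordering pitfall can be illustrated without FEniCS at all. Below is a minimal sketch of the caching behaviour Johan describes; the names are invented for illustration and are not Instant's real API:

```python
# Hypothetical illustration: a module-level cache that is filled on first
# use, so a setting changed after the first lookup is silently ignored.
_cache = {}

def find_binary(settings):
    if "swig" not in _cache:                  # first call fills the cache...
        _cache["swig"] = settings.get("swig_path", "/usr/bin")
    return _cache["swig"]                     # ...later changes are ignored

settings = {}
early = find_binary(settings)                 # cache filled with the default
settings["swig_path"] = "/opt/FEniCS/bin"     # too late: already cached
late = find_binary(settings)
print(early == late)  # True
```

Setting `settings["swig_path"]` before the first `find_binary` call is the analogue of setting `parameters["swig_path"]` right after `from dolfin import *`.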

Damiaan (dhabets) said :
#8

Thanks Johan; I already did that; it works serially, it fails when going through MPI.

Johan Hake (johan-hake) said :
#9

Instead of just mpirun, try using

  mpirun -x PATH python

You can try adding -x PYTHONPATH and -x LD_LIBRARY_PATH too.

Johan

Damiaan (dhabets) said :
#10

No difference; note, this only occurs when I try to run it over more than 1 node. mpirun on 1 node just works fine.

What exactly does the code check for to determine if swig is available or not?

Kent-Andre Mardal (kent-and) said :
#11

Maybe you can modify the instant/config.py file to print out what the
difference is when running on one node and on two?
The check_and_set_swig_binary function is relatively small and straightforward.

Kent
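
For anyone reproducing this, the check boils down to a subprocess call of the shape below. This is a sketch of the helper in instant/config.py, with the Python interpreter standing in for the swig binary so the snippet runs anywhere; substitute your swig path when debugging:

```python
import subprocess
import sys

def get_status_output(cmd):
    # Same shape as instant's helper: (returncode, combined stdout/stderr).
    pipe = subprocess.Popen(cmd, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = pipe.communicate()
    return pipe.returncode, output.decode()

# Stand-in binary; replace with e.g. "/scratch/.../FEniCS/bin/swig -version".
status, output = get_status_output('"%s" --version' % sys.executable)
print(status)  # 0 on success; a negative value means killed by a signal
```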

Damiaan (dhabets) said :
#12

Kent, yes, will try that once the cluster comes back from maintenance and will post back.

Damiaan (dhabets) said :
#13

Ok, when running on a single node:

check_and_set_swig_binary has:
swig_binary=/scratch/s/steinman/dhabets/Root/FEniCS/bin/swig
result=
SWIG Version 2.0.3

Compiled with g++ [x86_64-unknown-linux-gnu]

Configured options: +pcre

Please see http://www.swig.org for reporting bugs and further information

output=0

when running on more than 1 node:

check_and_set_swig_binary has:
swig_binary=/scratch/s/steinman/dhabets/Root/FEniCS/bin/swig
result=
output=-11

Any ideas?
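
A note on that -11: get_status_output returns the child's returncode, and subprocess encodes death-by-signal as minus the signal number. Signal 11 is SIGSEGV, so the forked child is segfaulting before swig even runs. A minimal demonstration, plain Python with no FEniCS needed:

```python
import signal
import subprocess
import sys

# A child process that dies from signal 11 (SIGSEGV); subprocess then
# reports returncode -11, exactly the "output=-11" instant printed.
proc = subprocess.Popen([sys.executable, "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"])
proc.wait()
print(proc.returncode)  # -11 on Linux: minus the signal number
```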

Johan Hake (johan-hake) said :
#14

Try putting the following in your run script, as far up as possible.

Johan

#####################################################
# Is the directory where you have your SWIG binary
# available on the compute nodes?

import os
swig_path = "/scratch/s/steinman/dhabets/Root/FEniCS/bin"
print "YES" if os.path.isdir(swig_path) else "NO"

# Is the PATH environment variable the same on the
# compute nodes as on the front node?
print os.environ["PATH"]

# Let instant check your SWIG path directly:
import instant
print "FOUND" if instant.check_and_set_swig_binary(binary="swig",
                                                   path=swig_path) else "NOT FOUND"

Damiaan (dhabets) said :
#15

Single node:

YES
/home/s/steinman/dhabets/Root/FEniCS/bin:/scinet/gpc/mpi/openmpi/otpo/1.0.0//bin:/scinet/gpc/mpi/openmpi/1.4.4-gcc-v4.6.1/bin:/scinet/gpc/compilers/gcc-4.6.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/lpp/mmfs/bin:/opt/torque/bin:/opt/torque/sbin:/scinet/gpc/tools/editors/nano/nano-2.2.4/bin:/scinet/gpc/bin6:/home/s/steinman/dhabets/Root/bin
FOUND

Multi-node:

YES
/home/s/steinman/dhabets/Root/FEniCS/bin:/scinet/gpc/mpi/openmpi/otpo/1.0.0//bin:/scinet/gpc/mpi/openmpi/1.4.4-gcc-v4.6.1/bin:/scinet/gpc/compilers/gcc-4.6.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/lpp/mmfs/bin:/opt/torque/bin:/opt/torque/sbin:/scinet/gpc/tools/editors/nano/nano-2.2.4/bin:/scinet/gpc/bin6:/home/s/steinman/dhabets/Root/bin
FOUND

It prints the above for 16 processes, then goes through the computation of the partitions, and then prints out "mesh ok" 16 times.

after that it all fails with:

swig

False

It seems that here, at V = VectorFunctionSpace(mesh, "CG", 1),
it forgot about the path?

Johan Hake (johan-hake) said :
#16

Can you also confirm that:

   parameters.swig_binary = "swig"
   parameters.swig_path = "/scratch/s/steinman/dhabets/Root/FEniCS/bin"

is set before you compile your VectorFunctionSpace?

Johan

Damiaan (dhabets) said :
#17

Did that, now it fails with:

OSError: SWIG is not installed on the system.

Damiaan (dhabets) said :
#18

The problem is in instant/config.py:

def get_swig_version():
    """ Return the current swig version in a 'str'"""
    global _swig_version_cache
#    if _swig_version_cache is None:
#        # Check for swig installation
#        result, output = get_status_output("%s -version" % get_swig_binary())
#        if result != 0:
#            raise OSError("SWIG is not installed on the system according to get_swig_version, result=" + str(result))
#        pattern = "SWIG Version (.*)"
#        r = re.search(pattern, output)
#        _swig_version_cache = r.groups(0)[0]
    return _swig_version_cache

Hardcoding 2.0.3 as the return value forces everything to work. So why can't it find swig or the path at this point?

It's failing in either get_status_output() or get_swig_binary().
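
For reference, the version parsing in the commented-out block behaves like this against the healthy single-node output from comment #13 (the sample text below is reconstructed from that comment):

```python
import re

# Sample "swig -version" output, shaped like the single-node run above.
sample = """SWIG Version 2.0.3

Compiled with g++ [x86_64-unknown-linux-gnu]

Configured options: +pcre
"""
# The same pattern and group extraction the commented-out code applies:
r = re.search("SWIG Version (.*)", sample)
version = r.groups(0)[0]
print(version)  # 2.0.3
```

So the regex itself is fine; the failure must be upstream, in the subprocess call that produces the output.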

Damiaan (dhabets) said :
#19

---
# Taken from http://ivory.idyll.org/blog/mar-07/replacing-commands-with-subprocess
from subprocess import Popen, PIPE, STDOUT
def get_status_output(cmd, input=None, cwd=None, env=None):
    pipe = Popen(cmd, shell=True, cwd=cwd, env=env, stdout=PIPE, stderr=STDOUT)

    (output, errout) = pipe.communicate(input=input)
    assert not errout

    status = pipe.returncode

    return (status, output)
---

I'm guessing Popen isn't getting the PATH, etc.?
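
That guess is easy to test: with env=None, the default these calls use, Popen hands the child a full copy of the parent's environment. A quick check:

```python
import os
import subprocess
import sys

# With env=None, the child inherits the parent's os.environ wholesale,
# marker variable included; so a missing PATH in the child would point
# at fork itself failing, not at the environment handed to Popen.
os.environ["DEMO_MARKER"] = "inherited"
out = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['DEMO_MARKER'])"])
print(out.decode().strip())  # inherited
```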

Johan Hake (johan-hake) said :
#20

Instead of commenting out that region, could you print what output gives you?

Johan

Damiaan (dhabets) said :
#21

result=
output=-11

Damiaan (dhabets) said :
#22

Also, print env shows all the paths are ok in get_status_output.

Kent-Andre Mardal (kent-and) said :
#23

You can set cmd = "echo $PATH" to check the path variable, or even better cmd = "env" to print out all env variables.

Kent

Kent-Andre Mardal (kent-and) said :
#24

Are all the environment variables the same?
Maybe Popen is sometimes rotten....

Kent

Damiaan (dhabets) said :
#25

Hi Kent,

   yes, they are the same; Popen just triggers the error, but os.environ shows they're all set.

It's opening a sub process, yes? I'm just wondering if that's the issue.

thanks,
Damiaan

Kent-Andre Mardal (kent-and) said :
#26

Yes, maybe not all the sub processes are done (?).

You can try

import time
time.sleep(10)

just after Popen, to let it sleep for 10 s so that the process on the other machine has surely finished. This is of course not the proper way of doing it, but a simple hack to test with.

Kent
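
One caveat with that hack: communicate() itself already blocks until the child exits, so an extra sleep after Popen should not change the status. A quick check, using the Python interpreter as a stand-in child process:

```python
import subprocess
import sys
import time

# communicate() waits for the child; the elapsed time covers the child's
# full one-second sleep, so the call cannot have returned early.
t0 = time.time()
pipe = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"],
                        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
pipe.communicate()
print(pipe.returncode)        # 0
print(time.time() - t0 >= 1)  # True: communicate waited for the child
```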

Damiaan (dhabets) said :
#27

For what it's worth, the same problem occurs with pkg-config, which is also checked for using the get_status_output function.

Same error, -11.

Damiaan (dhabets) said :
#28

Credit goes to Scott Northrup at scinet:

---
I think I have tracked down the source of your problem. The key, as you had already tracked down, was the issue with python's popen(). popen uses system() and/or fork() calls and apparently this is a known issue when using infiniband's ofed openib communication.

http://www.open-mpi.org/faq/?category=openfabrics#ofa-fork

This is why you haven't had problems on other ethernet based clusters as it is specific to infiniband using openib.

It is supposed to be resolved in the versions we are using, but apparently it isn't. The reason it works on one node and not on two is that openmpi is smart enough use shared memory ( or sm) on node and then only uses the infiniband to communicate offnode (openib).

<<removed>>

A possible better option would be to try and use the newer openmpi-1.6.0 mpi we have, however you will need to recompile your python/mpi4py to use it as the 1.4.x and 1.6.x openmpi's are not compatible.

<<removed>>
---

Paul Constantine (paul-g-constantine) said :
#29

I'm experiencing the same issue, though even when running in serial. I set swig_path appropriately.

I'm using openmpi 1.6.1, and I get an interesting related message.

====
$ python demo_poisson.py
--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process. Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption. The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.

The process that invoked fork was:

  Local host: jhf-a.local (PID 1812)
  MPI_COMM_WORLD rank: 0

If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
Traceback (most recent call last):
  File "demo_poisson.py", line 42, in <module>
    V = FunctionSpace(mesh, "Lagrange", 1)
  File "/home/paulcon/local/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 381, in __init__
    FunctionSpaceBase.__init__(self, mesh, element)
  File "/home/paulcon/local/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 78, in __init__
    ufc_element, ufc_dofmap = jit(self._ufl_element)
  File "/home/paulcon/local/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 66, in mpi_jit
    return local_jit(*args, **kwargs)
  File "/home/paulcon/local/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 102, in jit
    raise OSError, "Could not find swig installation. Pass an existing "\
OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.
====

Why would this happen in serial as well? Am I *absolutely sure* I can survive a call to fork()? If so, can I ignore this warning?

Johan Hake (johan-hake) said :
#30

We need to get rid of the runtime check for the swig binary. I think this
should be checked when dolfin is compiled.

That said, I am quite sure you will survive a fork, as we only run the JIT
compiler on process 0; but then my knowledge of MPI is limited.

In the meantime you might want to comment out the test in
instant/config.py which breaks. We will soon commit a fix for this,
which uses CMake compile-time configuration information for the JIT
compilation, avoiding the runtime system calls, which seem to be fragile.

Johan

Damiaan (dhabets) said :
#31

Agreed, the swig, etc. binary checks shouldn't have to be done over and over again.

Paul, I'm not sure why it would happen in serial; in my case it was only when I crossed nodes and it was related to the popen() call used to check for binaries. Maybe try editing that code?

Paul Constantine (paul-g-constantine) said :
#32

I'm having a similar issue with header_and_libs_from_pkgconfig() in instant/config.py.

====
Traceback (most recent call last):
  File "demo_poisson.py", line 42, in <module>
    V = FunctionSpace(mesh, "Lagrange", 1)
  File "/home/paulcon/local/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 381, in __init__
    FunctionSpaceBase.__init__(self, mesh, element)
  File "/home/paulcon/local/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 78, in __init__
    ufc_element, ufc_dofmap = jit(self._ufl_element)
  File "/home/paulcon/local/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 66, in mpi_jit
    return local_jit(*args, **kwargs)
  File "/home/paulcon/local/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 154, in jit
    return jit_compile(form, parameters=p, common_cell=common_cell)
  File "/home/paulcon/local/lib/python2.7/site-packages/ffc/jitcompiler.py", line 71, in jit
    return jit_element(ufl_object, parameters)
  File "/home/paulcon/local/lib/python2.7/site-packages/ffc/jitcompiler.py", line 177, in jit_element
    compiled_form, module, form_data, prefix = jit_form(form, parameters)
  File "/home/paulcon/local/lib/python2.7/site-packages/ffc/jitcompiler.py", line 145, in jit_form
    cache_dir = cache_dir)
  File "/home/paulcon/local/lib/python2.7/site-packages/ufc_utils/build.py", line 60, in build_ufc_module
    configure_instant(swig_binary, swig_path)
  File "/home/paulcon/local/lib/python2.7/site-packages/ufc_utils/build.py", line 78, in configure_instant
    (path, dummy, dummy, dummy) = instant.header_and_libs_from_pkgconfig("ufc-1")
  File "/home/paulcon/local/lib/python2.7/site-packages/instant/config.py", line 153, in header_and_libs_from_pkgconfig
    raise OSError("The pkg-config file %s does not exist" % pack)
OSError: The pkg-config file ufc-1 does not exist
====

I can hard code _pkg_config_installed=True, but I'm not sure how to set _header_and_library_cache. What is a safe way around this one?

Kent-Andre Mardal (kent-and) said :
#33

We are switching to CMake in instant. Working code can be found under the
cmake-work branches of dolfin/ufc/instant/ffc, but it is not yet finished.

Kent
