OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.
When I run this test script:
from dolfin import *
print 'got dolfin'
mesh = UnitCube(10, 10, 10)
print 'mesh ok'
V = VectorFunctionSpace(mesh, ...)
print 'Vector fcn space ok'
v = Function(V)
it works fine on 1 node, but if I use mpirun as follows:
mpirun --bynode -np 16 python test.py
then it stops with:
Traceback (most recent call last):
File "test.py", line 6, in <module>
V = VectorFunctionS
File "/home/
FunctionSpa
File "/home/
ufc_element, ufc_dofmap = jit(self.
File "/home/
output = local_jit(*args, **kwargs)
File "/home/
raise OSError, "Could not find swig installation. Pass an existing "\
OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.
swig is installed, I tried setting swig_binary and swig_path without luck. How do I force it to look in the right location?
Question information
- Language: English
- Status: Solved
- For: DOLFIN
- Assignee: No assignee
- Solved by: Damiaan
#1
Set:
parameters["swig_binary"] = ...
parameters["swig_path"] = ...
#2
I spoke too soon, that doesn't work either:
[gpc-f135n041: ...]  (truncated MPI error output from node gpc-f135n041, repeated for each rank)
Traceback (most recent call last):
File "test.py", line 10, in <module>
V = VectorFunctionS
File "/home/
FunctionSpa
File "/home/
ufc_element, ufc_dofmap = jit(self.
File "/home/
output = local_jit(*args, **kwargs)
File "/home/
raise OSError, "Could not find swig installation. Pass an existing "\
OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.
#4
Ok, ignore the previous one; only problem left seems to be the swig issue:
Traceback (most recent call last):
File "test.py", line 10, in <module>
V = VectorFunctionS
File "/home/
FunctionSpa
File "/home/
ufc_element, ufc_dofmap = jit(self.
File "/home/
output = local_jit(*args, **kwargs)
File "/home/
raise OSError, "Could not find swig installation. Pass an existing "\
OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.
#5
You can adjust the PATH variable to make sure it points to the directory where swig is, e.g. in bash:
export PATH=$PATH:/path/to/swig/bin
Kent
#6
Hi Kent,
The path to swig is already set and is correct; the version is 2.0.5.
Thanks!
#7
You need to set the swig parameters before you do anything else, as these are cached by Instant. So:

from dolfin import *
parameters["swig_binary"] = ...
parameters["swig_path"] = ...

Also, it is always good practice to run a new file in serial first, generating all the JIT-compiled code, and then re-run it in parallel.
Johan
#8
Thanks Johan; I already did that; it works serially, it fails when going through MPI.
#9
Try, instead of just mpirun:
mpirun -x PATH python
You can try adding -x PYTHONPATH and -x LD_LIBRARY_PATH too.
Johan
#10
No difference; note, this only occurs when I try to run it over more than 1 node. mpirun on 1 node just works fine.
What exactly does the code check for to determine if swig is available or not?
#11
Maybe you can modify the instant/config.py file to print out what the difference is when running on one and two nodes? The check_and_set_swig_binary function there is the place to look.
Kent
#12
Kent, yes, will try that once the cluster comes back from maintenance and will post back.
#13
Ok, when running on a single node:

check_and_set_swig_binary:
swig_binary=
result=
SWIG Version 2.0.3
Compiled with g++ [x86_64-
Configured options: +pcre
Please see http://
output=0

when running on more than 1 node:

check_and_set_swig_binary:
swig_binary=
result=
output=-11

Any ideas?
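Worth noting: a negative status from this kind of subprocess helper means the child process was killed by that signal number, so output=-11 says the swig child died with SIGSEGV (signal 11) rather than swig being absent. A standalone sketch (Python 3 syntax, unlike the thread's Python 2 snippets):

```python
import signal
import sys
from subprocess import PIPE, Popen

# Spawn a child that deliberately kills itself with SIGSEGV (signal 11),
# mimicking what apparently happens to the swig probe on multi-node runs.
child = Popen(
    [sys.executable, "-c",
     "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"],
    stdout=PIPE,
)
child.communicate()

# Popen reports death-by-signal as a negative return code:
print(child.returncode)  # -11 on Linux, i.e. -signal.SIGSEGV
```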
#14
Try putting the following in your run script, as far up as possible.
Johan

#######
# Is the directory where you have your SWIG binary
# available on the compute nodes?
import os
swig_path = "/scratch/..."
print "YES" if os.path.isdir(swig_path) else "NO"

# Is the PATH environment variable the same on the
# compute nodes as on the front node?
print os.environ["PATH"]

# Let instant check your SWIG path directly:
import instant
print "FOUND" if instant.check_and_set_swig_binary(path=swig_path) else "NOT FOUND"
#15
Single node:
YES
/home/s/
FOUND
Multi-node:
YES
/home/s/
FOUND
It prints the above for 16 processes, then goes through the computation of the partitions, and then prints out, 16 times:
mesh ok
After that it all fails with:
swig
False
It seems that here, at V = VectorFunctionSpace(...), it forgot about the path?
#16
Can you also confirm that:
parameters["swig_binary"] = ...
parameters["swig_path"] = ...
are set before you compile your VectorFunctionSpace?
Johan
#17
Did that; now it fails with:
OSError: SWIG is not installed on the system.
#18
The problem is in instant/config.py:

def get_swig_version():
    """ Return the current swig version in a 'str'"""
    global _swig_version_cache
    # if _swig_version_cache is None:
    #     # Check for swig installation
    #     result, output = get_status_output("%s -version" % get_swig_binary())
    #     if result != 0:
    #         raise OSError("SWIG is not installed on the system according to get_swig_version, result=%s" % result)
    #     pattern = "SWIG Version (.*)"
    #     r = re.search(pattern, output)
    #     _swig_version_cache = r.groups(0)[0]
    return _swig_version_cache

Hardcoding 2.0.3 as the returned value forces everything to work. So why can't it find swig or the path at this point?
It's failing in get_status_output or get_swig_binary().
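The version-parsing step in the commented-out block is easy to check in isolation; feeding it a banner like the single-node output from comment #13 shows the regex works, which narrows the failure down to the get_status_output call itself (banner abbreviated):

```python
import re

# A SWIG -version banner as captured on the single-node run (abbreviated):
output = "SWIG Version 2.0.3\n\nConfigured options: +pcre\n"

# Same extraction as get_swig_version performs on a successful probe:
pattern = "SWIG Version (.*)"
r = re.search(pattern, output)
version = r.groups(0)[0]
print(version)  # 2.0.3
```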
#19
---
# Taken from http://
from subprocess import Popen, PIPE, STDOUT

def get_status_output(cmd, input=None, cwd=None, env=None):
    pipe = Popen(cmd, shell=True, cwd=cwd, env=env, stdout=PIPE, stderr=STDOUT)
    (output, errout) = pipe.communicate(input=input)
    assert not errout
    status = pipe.returncode
    return (status, output)
---
I'm guessing Popen isn't getting the PATH, etc.?
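For reference, with env=None a Popen child does inherit the parent's environment, which can be confirmed with the same helper in isolation (rewritten in Python 3 syntax; the thread's snippets are Python 2):

```python
import os
from subprocess import PIPE, STDOUT, Popen

def get_status_output(cmd, input=None, cwd=None, env=None):
    # Same helper as above; env=None means the child inherits os.environ.
    pipe = Popen(cmd, shell=True, cwd=cwd, env=env, stdout=PIPE, stderr=STDOUT)
    output, _ = pipe.communicate(input=input)
    return pipe.returncode, output

status, output = get_status_output("echo $PATH")
assert status == 0
# The child saw the same PATH as the parent:
assert output.decode().strip() == os.environ["PATH"]
```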
#20
Instead of commenting out that region, could you print what output gives you?
Johan
#22
Also, print env shows all the paths are ok in get_status_output.
#23
You can set cmd = "echo $PATH" to check the path variable, or even better cmd = "env" to print out all env variables.
Kent
#24
Are all the environment variables the same?
Maybe Popen is sometimes rotten....
Kent
#25
Hi Kent,
yes they are the same; popen just triggers the error, but os.environ shows they're all set.
It's opening a sub process, yes? I'm just wondering if that's the issue.
thanks,
Damiaan
#26
Yes, maybe not all sub-processes are done(?). You can try:

import time
time.sleep(10)

just after Popen, to let it sleep for 10 s so that the process on the other machine has surely finished. This is of course not the proper way of doing it, but a simple hack to test.
Kent
#27
For what it's worth, the same problem occurs with pkg-config, which is also checked for using the get_status_output function. Same error, -11.
#28
Credit goes to Scott Northrup at SciNet:
---
I think I have tracked down the source of your problem. The key, as you had already tracked down, was the issue with Python's popen(). popen uses system() and/or fork() calls, and apparently this is a known issue when using InfiniBand's OFED openib communication.
http://
This is why you haven't had problems on other ethernet-based clusters, as it is specific to InfiniBand using openib.
It is supposed to be resolved in the versions we are using, but apparently it isn't. The reason it works on one node and not on two is that Open MPI is smart enough to use shared memory (sm) on node, and then only uses InfiniBand (openib) to communicate off-node.
<<removed>>
A possibly better option would be to try the newer openmpi-1.6.0 MPI we have; however, you will need to recompile your python/mpi4py to use it, as the 1.4.x and 1.6.x openmpis are not compatible.
<<removed>>
---
#29
I'm experiencing the same issue, though even when running in serial. I set swig_path appropriately.
I'm using openmpi 1.6.1, and I get an interesting related message.
====
$ python demo_poisson.py
-------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process. Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption. The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.
The process that invoked fork was:
Local host: jhf-a.local (PID 1812)
MPI_COMM_WORLD rank: 0
If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
-------
Traceback (most recent call last):
File "demo_poisson.py", line 42, in <module>
V = FunctionSpace(mesh, "Lagrange", 1)
File "/home/
FunctionSpa
File "/home/
ufc_element, ufc_dofmap = jit(self.
File "/home/
return local_jit(*args, **kwargs)
File "/home/
raise OSError, "Could not find swig installation. Pass an existing "\
OSError: Could not find swig installation. Pass an existing swig binary or install SWIG version 2.0 or higher.
====
Why would this happen in serial as well? Am I *absolutely sure* I can survive a call to fork()? If so, can I ignore this warning?
#30
We need to get rid of the run-time check for the swig binary. I think this should be checked when dolfin is compiled.
That said, I am quite sure you will survive a fork, as we are only running the JIT compiler on process 0, but then my knowledge of MPI is limited.
In the meantime you might want to comment out the test in instant/config.py. We are moving toward using CMake compile-time configuration information for the JIT compilation, avoiding runtime system calls, which seem to be fragile.
Johan
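Johan's point about checking once, rather than at every JIT call, can be sketched with a small memoizing helper (hypothetical names, not instant's actual API):

```python
# Hypothetical sketch: run an expensive probe (e.g. shelling out to
# `swig -version`) once, cache the answer, and reuse it afterwards.
_config_cache = {}

def cached_config(key, probe):
    """Return the cached value for key, calling probe() only the first time."""
    if key not in _config_cache:
        _config_cache[key] = probe()
    return _config_cache[key]

calls = []
def fake_swig_probe():
    calls.append(1)  # count how often the expensive probe actually runs
    return "2.0.3"

print(cached_config("swig_version", fake_swig_probe))  # 2.0.3
print(cached_config("swig_version", fake_swig_probe))  # 2.0.3 (from cache)
print(len(calls))  # 1 -- the probe ran only once
```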
#31
Agreed, the swig, etc. binary checks shouldn't have to be done over and over again.
Paul, I'm not sure why it would happen in serial; in my case it was only when I crossed nodes and it was related to the popen() call used to check for binaries. Maybe try editing that code?
#32
I'm having a similar issue with header_and_libs_from_pkgconfig in instant/config.py.
====
Traceback (most recent call last):
File "demo_poisson.py", line 42, in <module>
V = FunctionSpace(mesh, "Lagrange", 1)
File "/home/
FunctionSpa
File "/home/
ufc_element, ufc_dofmap = jit(self.
File "/home/
return local_jit(*args, **kwargs)
File "/home/
return jit_compile(form, parameters=p, common_
File "/home/
return jit_element(
File "/home/
compiled_form, module, form_data, prefix = jit_form(form, parameters)
File "/home/
cache_dir = cache_dir)
File "/home/
configure_
File "/home/
(path, dummy, dummy, dummy) = instant.
File "/home/
raise OSError("The pkg-config file %s does not exist" % pack)
OSError: The pkg-config file ufc-1 does not exist
====
I can hard code _pkg_config_
#33
We are switching to cmake in instant. Working code can be found under the cmake-work branches of dolfin/
Kent