Yade in PBS

Asked by Carl B. Kissell on 2014-07-23

Yade version 1.0 was installed successfully on our cluster by compiling it from source.
It runs fine when started interactively.
However, when we submit the Yade job through PBS, Yade strangely receives an exit signal. The following log shows that Yade executes part of our script, but not all of it.

From the error file KM-MKR.e47892:
[user99@login1 test]$ cat KM-MKR.e47892
Welcome to Yade Unknown
TCP python prompt on localhost:9000, auth cookie `eksyau'
Running script tBed30.py

From the output file:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
XMLRPC info provider on http://localhost:21000
5.18508611496e-08
[[ ^L clears screen, ^U kills line. F8 plot. ]]
Yade [1]: 
Do you really want to exit ([y]/n)?

My output files are created but left empty.
The same script runs fine interactively on that particular node.
We have tried several other ways to keep Yade from terminating itself, but somehow Yade still processes the kill signal.

PBS script:

[user99@login1 test]$ cat yade_pbsrun.sh
#! /bin/tcsh
#PBS -l walltime=48:00:00
#PBS -N KM-MKR
#PBS -q workq
#PBS -l select=1:ncpus=16
##PBS -l place=scatter:excl
#PBS -V

# Go to the directory from which you submitted the job
#cd $PBS_O_WORKDIR
cd /scratch/user99/test
/home/user99/build_7/YADE/bin/yade-Unknown tBed30.py

#./run_sin.sh && echo 1 > logfile &
#while [ ! -s $logfile ]; do sleep 1; done
#PID=$! #catch the last PID
#wait $PID
#/home/user99/build_7/YADE/bin/yade-Unknown tBed30.py

Looking forward to your suggestions.
Thanks.

Question information

Language: English
Status: Answered
For: Yade
Assignee: No assignee
Last query: 2014-07-23
Last reply: 2015-12-31

Christian Jakob (jakob-ifgt) said : #1

Hi Carl,

Nice to hear that you got Yade running on your cluster. We are using Yade on HPC in Freiberg too.

It is hard to say what the problem is. Your pbs script looks fine. But you should check where the problem is:

* error in the script?
* problem with your yade version?
* problem on the cluster (missing/wrong libraries, compilation problem, ...)?
* problem with pbs?

* Try to run the script on another computer (same Yade version).
* Try to run the script on the cluster without PBS.
* Try to run the script with another Yade version.

Good luck!

Christian

Anton Gladky (gladky-anton) said : #2

Hi,

It looks like the problem is not with Yade, but with your
script. Please test it on your local machine first.

Here is my PBS script for Yade:
===================================================
#!/bin/bash
#PBS -N YADERhmLiqConf
#PBS -l select=1:ncpus=12:mem=10gb
#PBS -l walltime=167:59:00
#PBS -m abe
#PBS -o YADERheomLiqMHom.Out
#PBS -e YADERheomLiqMHom.err

. /etc/profile.d/modules.sh
module load vtk/5.10.1
module load openmpi/gcc/64/1.4.5

PBS_O_WORKDIR=$HOME/yade/simulation/2014_07_17_RheomLiqMHom
cd $PBS_O_WORKDIR

$HOME/yade/inst/bin/yade-trunk-batch -j12 multy-test rheometer_yade_batch.py

===================================================

Can you run yade --test and yade --check in that environment?

Cheers

Anton

Klaus Thoeni (klaus.thoeni) said : #3

I am running Yade under PBS on our cluster without any trouble. After trying Anton's suggestion, maybe try running one of our example scripts in batch mode:

#!/bin/bash
#
#PBS -l select=1:ncpus=1
#PBS -l walltime=100:00:00
#PBS -k oe
cd /home/user/YADE/example/concrete/
/home/user/YADE-git/master/install/bin/yade-batch confined.table uniax.py
exit 0

HTH
Klaus

Carl B. Kissell (tempmapmail) said : #4

Many thanks for your suggestions. Let me run the suggested steps and get back to you with more details.

Carl

Carl B. Kissell (tempmapmail) said : #5

Please find below the logs of --check and --test, obtained in the PBS environment.

cat KM-MKR.o48414
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
###################################
running: checkGravity.py
Status: success
___________________________________
###################################
Skipping DEM-PFV-check.py, because it is in SkipScripts
###################################
Skipping checkList.py, because it is in SkipScripts
###################################
running: checkTestTriax.py
checkTestTriax.py failure
###################################
running: checkTestDummy.py
checkTestDummy.py failure
###################################
running: checkWeight.py
Precalculated weight 4536.560323
Obtained weight 4536.560322
Status: success
___________________________________
###################################
running: checkWirePM.py
Status: success
___________________________________
###################################
running: checkTestNormalInelasticity.py
Status: success
___________________________________

 cat KM-MKR.e48414*
Traceback (most recent call last):
  File "/home/user99/build_7/YADE/bin/yade-Unknown", line 83, in <module>
    import yade.tests
  File "/home2/user99/build_7/YADE/lib64/yade-Unknown/py/yade/tests/__init__.py", line 10, in <module>
    import yade.export,yade.linterpolation,yade.pack,yade.plot,yade.post2d,yade.timing,yade.utils,yade.ymport,yade.geom
  File "/home2/user99/build_7/YADE/lib64/yade-Unknown/py/yade/plot.py", line 13, in <module>
    import mtTkinter as Tkinter
  File "/home2/user99/build_7/YADE/lib64/yade-Unknown/py/mtTkinter.py", line 41, in <module>
    from Tkinter import *
ImportError: No module named Tkinter
Welcome to Yade Unknown

Carl

Anton Gladky (gladky-anton) said : #6

Hi,

It looks like you do not have python-tk installed [1]. I am not sure
whether you need it, or the GUI, on a cluster.

[1] https://yade-dem.org/doc/installation.html#prerequisites
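
One quick way to confirm this from inside the PBS job is to attempt the import directly, using the same Python interpreter that Yade uses. The following is only a minimal sketch: the helper name `tkinter_available` is made up for illustration and is not part of Yade. It tries both the Python 2 module name (`Tkinter`, as in the traceback) and the Python 3 name (`tkinter`).

```python
import importlib


def tkinter_available():
    """Return (ok, detail) for importing the Tk bindings in this interpreter.

    Hypothetical helper, not part of Yade. "Tkinter" is the Python 2 module
    name seen in the traceback; "tkinter" is the Python 3 name.
    """
    errors = []
    for name in ("Tkinter", "tkinter"):
        try:
            importlib.import_module(name)
            return True, name
        except ImportError as exc:
            # Keep the message: it says which module (or dependency)
            # could not be found.
            errors.append("%s: %s" % (name, exc))
    return False, "; ".join(errors)


if __name__ == "__main__":
    ok, detail = tkinter_available()
    print("Tk bindings importable: %s (%s)" % (ok, detail))
```

Running this once on the login node and once inside a PBS job should show whether the two environments differ.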

Regards

Anton

Anton Gladky (gladky-anton) said : #7

It is actually strange that you could compile Yade without python-tk.
It is required to build Yade [1]. Did you disable that check?

[1] https://github.com/yade/trunk/blob/master/CMakeLists.txt#L460

Gary Pekmezi (gpekmezi) said : #8

Please forgive me for resurrecting an old thread, but I would like to propose a solution for posterity.

I managed to compile Yade (daily) on an SGI ICE X with RHEL 6. The test would work fine if I ran Yade from the login node. However, when logged in interactively to a compute node, or when I submitted a PBS job (to the compute node), I received the exact same error (missing Tkinter) and had the same problem where Yade would just seem to hang.

1) Eventually I tracked the issue down to a missing library, specifically libXss.so.1. This library exists on the login node, but not on the compute node. I don't believe any sysadmin ever imagined this library would cause any issues, as it is a screensaver extension library, but for some reason that I have not had the time to track down, it breaks the Tkinter import. Once I copied the library to my Yade directory and added that directory to my $LD_LIBRARY_PATH, everything seemed to work fine.

2) The hanging/unresponsiveness seems to be a separate issue altogether. The solution is to add sys.stdout.flush() to the input script after every O.run(). Otherwise the output written to sys.stdout gets lost in the aether.
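
Both points can be checked or applied from Python. What follows is a rough sketch under stated assumptions, not Yade API: `can_load` and `run_and_flush` are hypothetical helper names, and `step` stands in for whatever call advances the simulation (in a Yade script this would wrap `O.run(...)`, which is not imported here so that the snippet stays self-contained).

```python
import ctypes
import sys


def can_load(libname):
    """Return True if the dynamic loader can open `libname`.

    Hypothetical helper. The lookup uses the loader's normal search path,
    including LD_LIBRARY_PATH as it was set when the interpreter started.
    """
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False


def run_and_flush(step, stream=None):
    """Call `step()` and then flush `stream` (default: sys.stdout).

    `step` is a stand-in for the simulation call, e.g. a function that
    wraps O.run(...) in a Yade script.
    """
    stream = sys.stdout if stream is None else stream
    result = step()
    stream.flush()  # write buffered output out now instead of at exit
    return result


if __name__ == "__main__":
    print("libXss.so.1 loadable:", can_load("libXss.so.1"))
    run_and_flush(lambda: print("step finished"))
```

Running `can_load("libXss.so.1")` on the login node and on a compute node should show whether the library is visible in each environment. Inside a Yade input script, the second helper would be used along the lines of `run_and_flush(lambda: O.run(1000, True))`, where the step count is only an example value.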

Anton Gladky (gladky-anton) said : #9

Thanks for the feedback!

Anton
