Running escript on Savanna with srun

Asked by Zhenyu Han

Hi,

I use a ".sh" file to submit escript jobs to Savanna.

For example:

----------------------------------------

#!/bin/bash -l
#SBATCH --job-name=esys_tr1
#SBATCH --comment="esctry1"
#SBATCH --partition=workq
#SBATCH --mail-type=FAIL,TIME_LIMIT_80
#SBATCH --mail-user=xxxxx
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=10000
#SBATCH --time=16:00:00

srun run-escript -n 1 -p 2 ./regula.py

----------------------------------------

Escript did not run, and srun printed the error "srun: Job step creation temporarily disabled, retrying".

Why did that happen?

However, if I remove "srun" and run "run-escript -n 1 -p 2 ./regula.py" directly in the ".sh" file, escript works fine.

Thanks in advance.

Question information

Language: English
Status: Solved
For: esys-escript
Assignee: No assignee
Solved by: Bob
Joel Fenwick (j-fenwick1) said :
#1

You would be better off contacting your system administrators directly about machine-specific issues.
If srun is not queueing a job for you, then you should try again later.

I do not recommend dropping srun from job submission.
This means you are running your job on the service node rather than submitting it to a queue to run elsewhere on the cluster.
This may well interfere with other users trying to use the cluster.

Joel Fenwick (j-fenwick1) said :
#2

Please ignore my previous response.
I hadn't read the source carefully enough.

Bob (caltinay) said :
#3

Zhenyu,

While this question is specific to your environment, as Joel pointed out, please note that escript can be configured so that 'srun' (or any other runner) is included in the launcher script 'run-escript'.
This has been done on the HPC cluster Savanna as well, so you need to leave out srun when launching escript on this cluster.
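
For illustration, here is a minimal sketch of the submission script with the extra srun removed. It assumes, as described above, that run-escript on Savanna already calls srun internally; the reported "Job step creation temporarily disabled" message would be consistent with a second, nested srun waiting for resources that the outer step already holds, although that is an inference rather than something confirmed in this thread. The #SBATCH settings are copied unchanged from the question. The `launch` function is a hypothetical addition that only checks that the launcher is on the PATH before calling it.

```shell
#!/bin/bash -l
#SBATCH --job-name=esys_tr1
#SBATCH --comment="esctry1"
#SBATCH --partition=workq
#SBATCH --mail-type=FAIL,TIME_LIMIT_80
#SBATCH --mail-user=xxxxx
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=10000
#SBATCH --time=16:00:00

# Hypothetical wrapper (not part of escript): report a clear error if the
# launcher is missing, otherwise start escript WITHOUT a leading srun.
launch() {
    if ! command -v run-escript >/dev/null 2>&1; then
        echo "run-escript not found on PATH" >&2
        return 127
    fi
    # Assumption: run-escript on this cluster invokes srun itself,
    # so prefixing srun here would create a nested job step.
    run-escript -n 1 -p 2 ./regula.py
}

# Last command in the script, so its status becomes the job's exit status.
launch
```

On a cluster where run-escript has not been configured to include srun, the launch line may need a different form, so it is worth checking with the system administrators.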

Zhenyu Han (z-han2) said :
#4

Thanks Cihan Altinay, that solved my question.