change TMPDIR from /tmp

Asked by Daniele Barducci

Hi,
I am trying to run MadGraph on a cluster and I'm facing a problem that, at first sight, looked as if it had been solved.
When generating events I keep getting the following error:

At line 111 of file rw_events.f (unit = 27, file = '/tmp/gfortrantmpPAsDrr')
Fortran runtime error: Bad address

The administrators of the server I use told me that this happens because the /tmp directory fills up quickly (it is 100 MB), so I need to point gfortran (and therefore MG) to a different temporary directory.
They suggested that I add the following line to the bash script that I use to submit jobs to the server:

export TMPDIR=/scratch/db3e11/tmp

where this tmp folder is one that I created in my own directory.
However, this doesn't work: MadGraph still gives the same error, so it looks like gfortran is not picking up the exported variable.

They told me this might be because the script I submit to the server invokes a perl script that then runs MadGraph, so the exported variable might not be preserved.
They therefore suggested that I ask on the MG Launchpad whether there is a way around this problem,
and that I specifically mention environment variables.
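
One way to check the environment-variable hypothesis is to see whether a child process sees the exported value. The sketch below is only an illustration: the directory name mg_tmp_check is a hypothetical stand-in for /scratch/db3e11/tmp, and a plain sh child stands in for the perl wrapper. An exported variable is normally inherited by every child process unless something in the chain resets it.

```shell
#!/bin/bash
# Hypothetical stand-in for /scratch/db3e11/tmp so the sketch runs anywhere.
export TMPDIR="${PWD}/mg_tmp_check"
mkdir -p "$TMPDIR"

# A child process (here a plain shell, standing in for the perl wrapper)
# inherits exported variables from its parent.
child_view="$(sh -c 'printf "%s" "$TMPDIR"')"
echo "parent TMPDIR: $TMPDIR"
echo "child  TMPDIR: $child_view"

# A variable that is set but not exported is invisible to the child.
NOT_EXPORTED="only in parent"
child_unexported="$(sh -c 'printf "%s" "${NOT_EXPORTED:-}"')"
echo "child sees NOT_EXPORTED: '${child_unexported}'"
```

If the two TMPDIR lines match, the wrapper script itself is unlikely to be what drops the variable.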

I also link here the discussion that we are having on the other forum:
https://cmg.soton.ac.uk/community/boards/32/topics/show/2762

I hope this can be solved,

thanks

Daniele

Question information

Language: English
Status: Solved
For: MadGraph5_aMC@NLO
Assignee: Olivier Mattelaer
Solved by: Daniele Barducci
Olivier Mattelaer (olivier-mattelaer) said :
#1

Hi Daniele,

I don't see which perl script you are talking about.
But I suppose that you are running in cluster mode, and therefore parts of the code are run on different machines.
If I'm correct, on a Condor cluster all environment variables are passed to each node.
So could you tell me which type of cluster you are using?

Cheers,

Olivier

In reply to question #236483 on MadGraph5, posted by Daniele Barducci on Sep 27, 2013, at 7:11 PM:
https://answers.launchpad.net/madgraph5/+question/236483

Daniele Barducci (db3e11) said :
#2

Hi Olivier,
yes, sorry, I wasn't clear at all.
I wrote a perl script that loops over different param_cards automatically, so that I don't have to do everything by hand.
The perl script just loops to copy the Template into a new directory, generate the diagrams, copy in a new param_card and then generate events, and then does the same for a different process or param_card.

Actually I'm not running in cluster mode, but in local mode on a cluster. I'm just submitting my perl script, which does the loops, to our server (PBS batch), and MadGraph is then run locally.

In the script that I'm submitting to the queue I added three lines to try to set TMPDIR to a different folder (see below), but this does not seem to be preserved during the run, since the errors still refer to the /tmp directory...

#!/bin/bash
#PBS -l walltime=50:00:00
#PBS -l nodes=1:ppn=12
#PBS -N testjob
#PBS -V
export PBS_O_WORKDIR=""
rm -r /scratch/db3e11/tmp
mkdir /scratch/db3e11/tmp
export TMPDIR=/scratch/db3e11/tmp
cd /scratch/db3e11/MSSM_focus_point/MadGraph5_v1_5_11/focus_point_scripts
./running_events.pl
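
Before launching running_events.pl, the job script could verify and log the temporary directory it actually uses, so that the job's output file shows what the compute node saw. This is a minimal sketch, not a tested PBS recipe: the MG_TMPDIR variable and the ./mg_tmp fallback are hypothetical, chosen so the snippet runs anywhere. On the cluster the value would be /scratch/db3e11/tmp, assuming that path is visible from the compute node.

```shell
#!/bin/bash
# Hypothetical default so the sketch runs anywhere; on the cluster this
# would be the directory from the job script, e.g. /scratch/db3e11/tmp.
export TMPDIR="${MG_TMPDIR:-$PWD/mg_tmp}"

# mkdir -p succeeds whether or not the directory already exists.
mkdir -p "$TMPDIR" || { echo "cannot create $TMPDIR" >&2; exit 1; }

# Confirm the directory is writable by creating and removing a probe file.
probe="$(mktemp "$TMPDIR/probe.XXXXXX")" || { echo "$TMPDIR is not writable" >&2; exit 1; }
rm -f "$probe"

# Log what the job actually sees, so the job's output file can be checked later.
echo "host:   $(uname -n)"
echo "TMPDIR: $TMPDIR"
df -P "$TMPDIR"
```

The df line also shows how much space is left on the filesystem that holds the new temporary directory.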

Olivier Mattelaer (olivier-mattelaer) said :
#3

OK, now I understand.

Obviously there is nothing that I can do in MG.

Personally, I would have done it the other way around:
define the following in the shell
> export TMPDIR=/scratch/db3e11/tmp
and not in the script.
Indeed the line
> #PBS -V
should pass the local environment to the jobs, so I'm not sure that it makes sense to overwrite it in the script.
In addition, it might make sense to compile the code with this environment variable set. (But this is just a guess.)
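
The suggestion above could be sketched as follows: export the variable in the login shell, then submit with the environment passed through. The file name job.sh, the MG_TMPDIR variable and the ./mg_tmp fallback are hypothetical. The sketch only calls qsub when it is available, and otherwise prints the command it would run.

```shell
#!/bin/bash
# Set TMPDIR in the submitting shell rather than inside the job script.
# Hypothetical default; on the cluster this would be /scratch/db3e11/tmp.
export TMPDIR="${MG_TMPDIR:-$PWD/mg_tmp}"
mkdir -p "$TMPDIR"

job_script="job.sh"   # hypothetical name of the PBS job script

if command -v qsub >/dev/null 2>&1 && [ -f "$job_script" ]; then
    # -V asks qsub to export the submitting shell's environment to the job.
    qsub -V "$job_script"
else
    echo "would run: qsub -V $job_script (with TMPDIR=$TMPDIR)"
fi
```

Passing -V on the command line is equivalent in intent to the `#PBS -V` directive already present in the job script.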

Cheers,

Olivier

In reply to Daniele Barducci's comment of Sep 30, 2013, at 10:06 AM on question #236483.

Daniele Barducci (db3e11) said :
#4

Hi Olivier,
thanks for the tip, I'll try this.
Since this is not a problem related to MG, I'll thank you again for the answers and close the topic.

Cheers

Daniele

Marc Escalier (escalier) said :
#5

Hello,

I have the same problem, but the solution does not work for me.
At line 111 of file rw_events.f (unit = 27, file = '/tmp/gfortrantmpPAsDrr')
Fortran runtime error: Bad address

The program tries to write to the /tmp directory,
but it is full.

How can I fix that?

I tried to change TMPDIR but it does not change anything.

thank you

Marc Escalier (escalier) said :
#6

Remark: I use MadGraph5_aMC@NLO in an interactive session.
When generating a lot of events, it crashes because the /tmp directory is full.
Thanks

Olivier Mattelaer (olivier-mattelaer) said :
#7

Hi,

The solution is indeed to change your /tmp path to a place where you have more space.
The solution described above works for a PBS cluster. Are you using that kind of cluster?

If you are running locally, then you probably have to run something like
export TMPDIR=

Cheers,

Olivier

Marc Escalier (escalier) said :
#8

Hello,

I run in an interactive session.
I confirm that using export to set TMPDIR to a directory other than /tmp
does not fix the problem:
the program still tries to access the machine's /tmp, which is full.
So it is impossible to run on a computer where we don't have the administrative rights to add more space to the /tmp directory.

This is related to the Fortran 'open' statement with the scratch option.

Using the environment variable TMPDIR does not fix the problem.
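
A small probe can show where the gfortran runtime really puts scratch files on a given machine. The sketch below rests on an assumption to be checked against the documentation of the installed gfortran version: some older gfortran runtimes are documented to consult GFORTRAN_TMPDIR, TMP and TEMP rather than TMPDIR, so all four are set to the same directory here. The MG_TMPDIR variable, the ./mg_tmp fallback and the scratch_probe program are hypothetical, and the probe is skipped when gfortran is not installed.

```shell
#!/bin/bash
# Hypothetical default; use a directory with enough free space on your machine.
export TMPDIR="${MG_TMPDIR:-$PWD/mg_tmp}"
mkdir -p "$TMPDIR"

# Assumption: depending on its version, the gfortran runtime may read one
# of these instead of TMPDIR, so point them all at the same directory.
export GFORTRAN_TMPDIR="$TMPDIR"
export TMP="$TMPDIR"
export TEMP="$TMPDIR"

if command -v gfortran >/dev/null 2>&1; then
    src="$TMPDIR/scratch_probe.f90"
    cat > "$src" <<'EOF'
program scratch_probe
  implicit none
  character(len=512) :: fname
  logical :: has_name
  ! Open a scratch file on the same unit number as in the error message.
  open(unit=27, status='scratch')
  inquire(unit=27, named=has_name, name=fname)
  if (has_name) then
     print '(a)', trim(fname)
  else
     print '(a)', 'scratch file has no reported name'
  end if
  close(27)
end program scratch_probe
EOF
    # Print the path the runtime chose for the scratch file, if it reports one.
    gfortran -o "$TMPDIR/scratch_probe" "$src" && "$TMPDIR/scratch_probe"
else
    echo "gfortran not found; skipping the scratch-file probe"
fi
```

If the printed path still starts with /tmp, the runtime is ignoring all of these variables and the question belongs with the compiler installation rather than with MadGraph.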

thanks

Olivier Mattelaer (olivier-mattelaer) said :
#9

Hi,

Looks like the following link might help you:
http://publib.boulder.ibm.com/infocenter/comphelp/v101v121/index.jsp?topic=/com.ibm.xlf121.aix.doc/proguide/scrname.html

According to:
http://docs.oracle.com/cd/E19957-01/805-4939/6j4m0vnaf/index.html
setting TMPDIR should work.
So I guess that the way you set that environment variable doesn’t work and/or that it depends on your compiler.

I would actually suggest that you discuss this point with your local IT team, since this is clearly not something that is fixable on our side.

Cheers,

Olivier

In reply to Marc Escalier's comment of Jun 12, 2014, at 12:51 PM on question #236483.

Marc Escalier (escalier) said :
#10

thank you

The IT people say that MadGraph doesn't handle TMPDIR,
but I'm not completely convinced by their answer.
So for the moment I have no firm hypothesis to explain the problem.
I have been investigating for a week and am still looking.

thank you