Parallel running and running errors

Asked by Huanian Zhang

Hi Professor,

Recently I have been working on my university's HPC (High Performance Computing) account, but each time I run MG5 I get errors like the one below:

Command "generate_events " interrupted with error:
error : can't start new thread
Please report this bug on https://bugs.launchpad.net/madgraph5
More information is found in '/gsfs1/xdisk/fantasyzhn/MG5_aMC_v2_2_2/ttz/run_03_tag_1_debug.log'.
Please attach this file to your report.

The compiler on the machine is gcc (gfortran) 4.4.4 and the Python version is 2.7.3. How do I solve this problem?

I have another question, about parallel running. I need to generate a huge number of events, for instance 100 million ttbar events. If I use the multi_run command, it will take a few weeks, and it does not make full use of the CPUs on the node. I can request tens of GB of memory on one node, but a serial run uses less than 1 GB. How do I run MG5 so that it makes the most of the CPUs and runs faster?

Thank you very much.

Huanian Zhang

Question information

Language: English
Status: Solved
For: MadGraph5_aMC@NLO
Assignee: No assignee
Solved by: Olivier Mattelaer

Olivier Mattelaer (olivier-mattelaer) said :
#1

Hi,

We do not support HPC-type clusters. We would actually need someone with access to this kind of cluster to write the implementation.

Cheers,

Olivier
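
The thread does not diagnose the "can't start new thread" error. In general, Python raises it when the operating system refuses to create another thread, and on shared login nodes one common cause is a low per-user limit on processes and threads. As a hedged diagnostic, assuming a Linux node and that this limit is the culprit (neither is confirmed here), the sketch below compares the number of cores Python detects with that limit. The names `fits_limit` and `report` are hypothetical helpers, not part of MG5.

```python
import multiprocessing
import resource


def fits_limit(n_workers, soft_limit):
    """Rough check: could n_workers extra threads fit under soft_limit?

    The limit counts everything the user already runs, so True is only
    a necessary condition, not a guarantee.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return n_workers < soft_limit


def report():
    """Collect the detected core count and the per-user process limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    cores = multiprocessing.cpu_count()
    return {
        "cores": cores,
        "nproc_soft": soft,
        "nproc_hard": hard,
        "fits": fits_limit(cores, soft),
    }


if __name__ == "__main__":
    for name, value in sorted(report().items()):
        print("%s: %s" % (name, value))
```

If the soft limit turns out to be small, the cluster documentation or the `ulimit -u` setting in the job environment is the place to look next.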
On 02 Jun 2015, at 02:07, Huanian Zhang <email address hidden> wrote:

> New question #267654 on MadGraph5_aMC@NLO:
> https://answers.launchpad.net/mg5amcnlo/+question/267654

Huanian Zhang (fantasyzhn) said :
#2

Hi Olivier,

Thanks a lot. But how do I speed it up on my local computer? Since I need to generate an enormous number of events, speed matters a lot to me.

My local computer has 8 processors, each processor has 4 cores, and the total memory is 32 GB. The CPU is an Intel Core i7-3770 @ 3.4GHz.

Thank you very much.
Huanian

Best Olivier Mattelaer (olivier-mattelaer) said :
#3

Hi,

The default running mode is multi-core, using the maximum number of cores available.
If the code fails to detect that number of cores, you can force it by editing the file
input/mg5_configuration.txt

and changing the line
# nb_core = None
to
nb_core = 12

Also check that the line defining "run_mode" is either commented out or set to "2".

Cheers,

Olivier
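
The edit described above can also be scripted. The following is a minimal Python sketch, not part of MadGraph5_aMC@NLO: the helper names `set_option` and `force_multicore` are hypothetical, and it assumes the configuration file uses the `# key = value` layout quoted in the reply. When no core count is given, it takes the value from `multiprocessing.cpu_count()`.

```python
import multiprocessing
import re


def set_option(config_text, key, value):
    """Return config_text with `key = value` active.

    A commented-out line such as "# nb_core = None" is replaced in place;
    if the key is absent, the setting is appended at the end.
    """
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = "%s = %s" % (key, value)
    if pattern.search(config_text):
        return pattern.sub(lambda match: new_line, config_text, count=1)
    return config_text.rstrip("\n") + "\n" + new_line + "\n"


def force_multicore(path, nb_core=None):
    """Set nb_core and run_mode = 2 in the configuration file at `path`."""
    if nb_core is None:
        nb_core = multiprocessing.cpu_count()
    with open(path) as handle:
        text = handle.read()
    text = set_option(text, "nb_core", nb_core)
    text = set_option(text, "run_mode", 2)
    with open(path, "w") as handle:
        handle.write(text)
    return nb_core


if __name__ == "__main__":
    sample = "# run_mode = 2\n# nb_core = None\n"
    print(set_option(sample, "nb_core", 12))
```

Calling `force_multicore("input/mg5_configuration.txt", 12)` from the MG5 directory would reproduce the manual edit. Keeping a backup copy of the file first is prudent, since the sketch overwrites it.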


Huanian Zhang (fantasyzhn) said :
#5

Thanks Olivier Mattelaer, that solved my question.

Juan Carlos Vasquez Carmona (juancarlos8866) said :
#6

Hi,

I have access to an HPC cluster at my institution. I am interested in this possibility because I need to generate a large number of events, and it would be perfect if I could use that cluster.

Bests,
Juan

Olivier Mattelaer (olivier-mattelaer) said :
#7

Hi Juan,

It depends on the type of job scheduler that your HPC cluster uses.
We support PBS, SGE, Condor, LSF, …

If yours is not in the list, it might be possible to implement it by following these instructions:
https://answers.launchpad.net/mg5amcnlo/+faq/2249
Note that if your infrastructure is of the Blue Gene type, that might be a different matter.

Cheers,

Olivier
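
To see which of the schedulers named above a cluster offers, one can look for their submission commands. This is a minimal sketch under stated assumptions: it relies on the conventional command names (`qsub` for PBS and SGE, `condor_submit` for Condor, `bsub` for LSF), the function name `detect_schedulers` is hypothetical, and it is not how MadGraph5_aMC@NLO itself selects a cluster type.

```python
import shutil

# Conventional submission commands; PBS and SGE both ship a `qsub`.
SUBMIT_COMMANDS = [
    ("qsub", "PBS or SGE"),
    ("condor_submit", "Condor"),
    ("bsub", "LSF"),
]


def detect_schedulers(which=shutil.which):
    """Return the scheduler families whose submit command is on PATH."""
    return [name for command, name in SUBMIT_COMMANDS if which(command)]


if __name__ == "__main__":
    found = detect_schedulers()
    print(", ".join(found) if found else "no known scheduler command found")
```

The `which` parameter exists only so the lookup can be replaced in tests. An empty result suggests the cluster uses a scheduler outside this list, which is the case the FAQ linked above addresses.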
