Optimizing NLO merged/matched calculation on cluster

Asked by Michele Lupattelli on 2020-03-09

Hi,

I am trying to generate 300K events for the process tt+0,1,2 jets @ NLO merged and matched to a parton shower, using MadSpin to decay particles. I am running it on a large cluster. As a test I generated 10K events, which took around 7 hours, so I was worried that generating 300K events would take ages.

The compilation, the grid setup, the upper envelope and the event generation took "just" around 14 hours. For 300K events I set nevt_job=2000, so that the main job would submit many secondary jobs and speed up the computation.

The decay and shower steps, by contrast, run on just one core. Since MadSpin took only 6 minutes to decay 10K events, I expected 300K events to take around 3 hours. Instead, after 3 hours it had decayed only 40K events; the pace was 10K events per 50 minutes.
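For reference, the arithmetic behind these estimates can be written out as a short sketch. It only uses the figures quoted in this post and assumes the pace stays constant over the whole run, which may not hold in practice.

```python
# Figures quoted in the post above.
TEST_MIN_PER_10K = 6        # MadSpin time per 10K events in the small test
OBSERVED_MIN_PER_10K = 50   # pace seen during the 300K-event run
TARGET_EVENTS = 300_000


def hours_needed(n_events, minutes_per_10k):
    """Wall-clock hours for a single-core pass at a constant pace."""
    return n_events / 10_000 * minutes_per_10k / 60


expected_hours = hours_needed(TARGET_EVENTS, TEST_MIN_PER_10K)
observed_hours = hours_needed(TARGET_EVENTS, OBSERVED_MIN_PER_10K)
slowdown = OBSERVED_MIN_PER_10K / TEST_MIN_PER_10K

print(f"expected from the test run: {expected_hours:.1f} h")
print(f"at the observed pace:       {observed_hours:.1f} h")
print(f"slowdown factor:            {slowdown:.1f}x")
```

At the observed pace the MadSpin step alone would take about a day for 300K events, which is why it dominates the total time.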

How can I improve the whole event generation? I would like to be able to generate 1M events, also because the number of events after the parton shower is around 1/3 of the original one.

Thank you in advance.
Cheers,

Michele

Question information

Language: English
Status: Answered
For: MadGraph5_aMC@NLO
Assignee: No assignee
Last query: 2020-03-10
Last reply: 2020-03-10

Hi,

If the bottleneck is MadSpin or the shower, the only option is to run them in parallel.
Those codes do not offer an internal option for parallel running (which makes sense, since they are IO bound and not CPU bound).
To run them in parallel, you need to split your LHE event file (or generate the events separately)
and then run MadSpin/shower on each part in parallel.

Be careful about the IO impact on other users if you do that: it would not be surprising if the load on your disk blows up and affects everyone (e.g. a simple ls taking a long time).
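The splitting step could look roughly like the sketch below. It is not part of MadGraph5_aMC@NLO: the function name `split_lhe` and the `chunk_NNNN.lhe` naming are hypothetical, and it assumes a standard Les Houches layout in which everything before the first `<event>` line is header/init material and every event is closed by `</event>`. The header is copied verbatim into each chunk, so any event count recorded in the banner will not match the chunk size; whether MadSpin or the shower interface minds that should be checked on a small sample first.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzipped LHE file for reading as text."""
    path = Path(path)
    return gzip.open(path, "rt") if path.suffix == ".gz" else open(path, "r")


def split_lhe(src, out_dir, events_per_chunk):
    """Split an LHE file into chunks of at most events_per_chunk events.

    Each chunk gets a verbatim copy of the header and its own closing tag.
    Returns the list of chunk paths (hypothetical naming scheme).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    header, chunk_paths = [], []
    out, n_in_chunk, in_event = None, 0, False
    with open_text(src) as fin:
        for line in fin:
            stripped = line.strip()
            if not in_event and stripped.startswith("<event"):
                in_event = True
                if out is None:  # start a new chunk with the saved header
                    path = out_dir / f"chunk_{len(chunk_paths):04d}.lhe"
                    chunk_paths.append(path)
                    out = open(path, "w")
                    out.writelines(header)
                    n_in_chunk = 0
            if in_event:
                out.write(line)
                if stripped.startswith("</event"):
                    in_event = False
                    n_in_chunk += 1
                    if n_in_chunk == events_per_chunk:
                        out.write("</LesHouchesEvents>\n")
                        out.close()
                        out = None
            elif not chunk_paths:
                header.append(line)  # still before the first event
            # Lines outside events after the first event (the original
            # closing tag) are dropped; each chunk writes its own.
    if out is not None:  # close the last, possibly shorter, chunk
        out.write("</LesHouchesEvents>\n")
        out.close()
    return chunk_paths
```

A call such as `split_lhe("events.lhe.gz", "chunks", 10000)` would then produce one file per 10K events. The file is streamed line by line, so memory use stays small, but writing many chunks to a shared disk is exactly the kind of IO load the warning above refers to.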

Cheers,

Olivier

> On 9 Mar 2020, at 10:42, Michele Lupattelli <email address hidden> wrote:
>
> New question #689219 on MadGraph5_aMC@NLO:
> https://answers.launchpad.net/mg5amcnlo/+question/689219

Hi,

Do you mean that I have to split the events.lhe file produced by the MadEvent event generation, and then run MadSpin and the shower on the pieces? How do I do that? In the bin directory I have the "shower" executable, but not a MadSpin one.

Cheers,

Michele

Hi,

The easiest way is to use different directories or to run multiple "generate_events" in a row.
Other methods are possible, but since you run with nb_core set to one they would not be practical anyway.
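As a rough illustration of the "different directories" route, the sketch below writes one MG5_aMC command file per independent run. Everything here is an assumption to adapt: the directory names `my_dir_0`, `my_dir_1`, ... are hypothetical copies of the process directory that must already exist, and `nevents` and `iseed` are assumed to be the run_card names for the event count and the random seed (check them against your own run_card). Each run needs a different seed, otherwise the runs would produce identical events.

```python
from pathlib import Path


def write_run_scripts(n_runs, events_per_run, out_dir="mg5_cmds",
                      process_dir_pattern="my_dir_{i}", first_seed=1):
    """Write one command file per independent run and return their paths.

    process_dir_pattern is a hypothetical naming scheme for copies of the
    process directory; 'nevents' and 'iseed' are assumed run_card names.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    scripts = []
    for i in range(n_runs):
        lines = [
            f"launch {process_dir_pattern.format(i=i)}",
            f"set nevents {events_per_run}",
            f"set iseed {first_seed + i}",  # distinct seed for every run
        ]
        path = out_dir / f"cmd_{i:03d}.txt"
        path.write_text("\n".join(lines) + "\n")
        scripts.append(path)
    return scripts
```

For example, `write_run_scripts(10, 100000)` would prepare ten command files of 100K events each; each file could then be passed to `./mg5_aMC` in its own single-core batch job, so that the MadSpin and shower steps of the different runs proceed side by side.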

Cheers,

Olivier

Hi,

Otherwise, I could submit the job in multicore mode:

#!/bin/bash

#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

./mg5_aMC cmd

where cmd contains

set run_mode 2
set nb_core 48
launch my_dir

In doing so, I send the job to a node and it uses 48 cores of this node, right?
If I am right, how can I optimize the event generation?

Cheers,
Michele

Hi,

No, this is a bad idea, since most of the time you will only use one core...
This is the best way to get blacklisted by your IT team.
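One way to read this remark is through Amdahl's law: if part of the run is single-core (as MadSpin and the shower are), reserving many cores gives little speedup and leaves most of them idle. The sketch below is generic arithmetic, not something taken from MadGraph5_aMC@NLO, and the parallel fractions in the loop are purely illustrative rather than measured for this process.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def core_utilisation(parallel_fraction, cores):
    """Average fraction of the reserved cores that is actually busy."""
    return amdahl_speedup(parallel_fraction, cores) / cores


if __name__ == "__main__":
    cores = 48  # the nb_core value proposed above
    for fraction in (0.25, 0.5, 0.75):  # illustrative values only
        print(f"parallel fraction {fraction:.2f}: "
              f"speedup {amdahl_speedup(fraction, cores):.2f}x, "
              f"utilisation {core_utilisation(fraction, cores):.1%}")
```

With a fully serial workload the speedup is 1 and only 1/48 of the reservation is used, which is the situation during the MadSpin and shower steps.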

Cheers,

Olivier

