Terminated ../madevent_mintMC > log.txt < input_app.txt

Asked by George Uttley

Hi,

I am trying to launch the process p p > h1 h3 [QCD] in the 2HDM_NLO model with CP-conserving and flavour-symmetry restrictions. I am getting the following errors (this is also the case in the 2HDMtII_NLO model):

WARNING: program /vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_ucx_h2h3/ajob1 3 F 0 0 launch ends with non zero status: 1. Stop all computation
/vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_uux_h2h3/ajob1: line 50: 11950 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: Idle: 237, Running: 15, Completed: 2 [ 39.5s ]
INFO: Idle: 237, Running: 14, Completed: 3 [ 39.5s ]
INFO: Idle: 237, Running: 13, Completed: 4 [ 39.7s ]
/vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_uux_h2h3/ajob1: line 50: 11944 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: Idle: 237, Running: 12, Completed: 5 [ 39.9s ]
/vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_uux_h2h3/ajob1: line 50: 11952 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: Idle: 237, Running: 11, Completed: 6 [ 40s ]
INFO: Idle: 237, Running: 10, Completed: 7 [ 40.2s ]
INFO: Idle: 237, Running: 9, Completed: 8 [ 40.2s ]
INFO: Idle: 237, Running: 8, Completed: 9 [ 40.3s ]
/vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_uux_h2h3/ajob1: line 50: 11953 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: Idle: 237, Running: 6, Completed: 11 [ 40.5s ]
/vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_uux_h2h3/ajob1: line 50: 11960 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: Idle: 237, Running: 5, Completed: 12 [ 40.6s ]
/vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_ucx_h2h3/ajob1: line 50: 11955 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: Idle: 237, Running: 4, Completed: 13 [ 40.7s ]
date: write error: Broken pipe
INFO: Idle: 237, Running: 3, Completed: 14 [ 40.8s ]
INFO: Idle: 237, Running: 2, Completed: 15 [ 40.8s ]
/vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_ucx_h2h3/ajob1: line 50: 12567 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: Idle: 237, Running: 0, Completed: 17 [ 40.9s ]

Could you please help me figure out the problem? Many thanks!

Question information

Language: English
Status: Solved
For: MadGraph5_aMC@NLO
Assignee: No assignee
Solved by: Olivier Mattelaer
Revision history for this message
Olivier Mattelaer (olivier-mattelaer) said :
#1

The file to look at here is
/vols/cms/gu18/4tau_v3/genproductions/bin/MadGraph5_aMCatNLO/phi500A500To4Tau/phi500A500To4Tau_gridpack/work/processtmp/SubProcesses/P0_uux_h2h3/all_G1/log.txt

Cheers,

Olivier

George Uttley (gputtley) said :
#2

The log file finishes:

 Collider parameters:
 --------------------

 Running at P P machine @ 13000.000000000000 GeV
 PDF set = nn23nlo
 alpha_s(Mz)= 0.1190 running at 2 loops.
 alpha_s(Mz)= 0.1190 running at 2 loops.
 Renormalization scale set on event-by-event basis
 Factorization scale set on event-by-event basis

 Diagram information for clustering has been set-up for nFKSprocess 1
 Diagram information for clustering has been set-up for nFKSprocess 2
 Diagram information for clustering has been set-up for nFKSprocess 3
 Diagram information for clustering has been set-up for nFKSprocess 4
 getting user params
Enter number of events and iterations:
 Number of events and iterations -1 12
Enter desired fractional accuracy:
 Desired fractional accuracy: 2.9999999999999999E-002
 Enter alpha, beta for G_soft
   Enter alpha<0 to set G_soft=1 (no ME soft)
 for G_soft: alpha= 1.0000000000000000 , beta= -0.10000000000000001
 Enter alpha, beta for G_azi
   Enter alpha>0 to set G_azi=0 (no azi corr)
 for G_azi: alpha= -1.0000000000000000 , beta= -0.10000000000000001
 Doing the S and H events together
Suppress amplitude (0 no, 1 yes)?
 Using suppressed amplitude.
Exact helicity sum (0 yes, n = number/event)?
 Do MC over helicities for the virtuals
Enter Configuration Number:
Running Configuration Number: 1
Enter running mode for MINT:
0 to set-up grids, 1 to integrate, 2 to generate events
 MINT running mode: 0
Set the three folding parameters for MINT
xi_i, phi_i, y_ij
           1 1 1
 'all ', 'born', 'real', 'virt', 'novi' or 'grid'?
 Enter 'born0' or 'virt0' to perform
  a pure n-body integration (no S functions)
 doing the all of this channel
 Normal integration (Sfunction != 1)
 about to integrate 7 -1 12 1
 imode is 0
channel 1 : 1 T 0 0 0.1000E+01 0.0000E+00 0.1000E+01
Error: Status code 143

George Uttley (gputtley) said :
#3

This is the log in GF1, by the way, as I do not have an all_G1 folder.

Olivier Mattelaer (olivier-mattelaer) said :
#4

So error 143 means (according to Google):
"Exit code 143 can happen for multiple reasons, and one of them is related to memory/GC issues."

So it looks like your cluster has a strict (small) default limit for jobs and that this limit is enforced by cgroups.
You will likely need to update the cluster submission mechanism (potentially via a plugin) to relax this limit.
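
For reference, shells conventionally report an exit status of 128 plus the signal number when a process is killed by a signal, so 143 corresponds to signal 15 (SIGTERM). That is consistent with the "Terminated" messages in the terminal output above. A minimal Python sketch of this decoding follows; the function name is hypothetical and the 128 + N rule is the usual shell convention, not something specific to MadGraph:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (0-255)."""
    if status == 0:
        return "success"
    if status > 128:
        # Shell convention: 128 + N means "killed by signal N".
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited with error code {status}"


print(describe_exit_status(143))  # on Linux: terminated by signal 15 (SIGTERM)
print(describe_exit_status(1))    # exited with error code 1
```

A SIGTERM can come from a batch system enforcing a limit, but also from any other process that cancels the job, so the status alone does not identify the cause.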

Cheers,

Olivier

George Uttley (gputtley) said :
#5

After adding some fixes for the memory issues, I still get the same error on the terminal, and the log file is the same minus the "Error: Status code 143" line.

Olivier Mattelaer (olivier-mattelaer) said :
#6

Then I would advise checking the cluster-related logs to see why those jobs are being cancelled.
Such errors are infrastructure-related, and therefore I cannot really help you with them.

Cheers,

Olivier

George Uttley (gputtley) said :
#7

Hi Olivier,

I am unsure if this is the case as I am also getting this error when running locally and we are no longer getting memory exit code errors.

I also noticed that in P0_uux_h2h3/GF4/log.txt we have an error code:

ERROR: INTEGRAL APPEARS TO BE ZERO.
 TRIED 100352 PS POINTS AND ONLY 0 GAVE A NON-ZERO INTEGRAND.
Error: Status code 1

Presumably this is the reason it is failing. I could not see any error codes in any of the other log files. What can cause this error and why does it only appear in one specific log?
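
One way to find which job failed first, as opposed to the jobs that were merely terminated afterwards, is to scan every log.txt for its reported status code. Below is a minimal Python sketch, assuming the SubProcesses/P*/G*/log.txt layout seen in this thread and the "Error: Status code N" line quoted above; the function names are hypothetical:

```python
import re
from pathlib import Path

# Pattern taken from the log excerpts in this thread.
STATUS_RE = re.compile(r"Error: Status code (\d+)")


def find_failed_logs(subprocesses_dir):
    """Return {log_path: status_code} for every log that reports a status code."""
    failures = {}
    for log in sorted(Path(subprocesses_dir).glob("P*/G*/log.txt")):
        match = STATUS_RE.search(log.read_text(errors="replace"))
        if match:
            failures[log] = int(match.group(1))
    return failures


def primary_failures(failures):
    """Keep only logs whose status is not 143 (i.e. not simply terminated)."""
    return [path for path, status in failures.items() if status != 143]
```

Calling primary_failures(find_failed_logs(".../SubProcesses")) on the process directory would list the logs worth reading first; a log that was terminated before writing any status line will not appear at all.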

Many thanks for your help.

Cheers,
George

Olivier Mattelaer (olivier-mattelaer) said :
#8

OK,

I guess, then, that the issue is that GF4 is crashing, and that this triggers a cluster cancellation of all the other jobs, leading to error code 143 in the other directories.

So the question here is why that contribution is zero.
If this is because some couplings are zero, the solution is to use a restricted model to remove the zero contributions from the start.
If this is due to cuts that are too hard, then the solution is to use softer cuts.

Cheers,

Olivier

George Uttley (gputtley) said :
#9

Hi Olivier,

Many thanks for your previous help.

So it seems this crash comes from vertices with zero couplings. Testing individual quark initial states, I found the following:

u u~ > h2 h3 [QCD] - crashes
u u~ > h2 h3 / a h1 h2 h3 [QCD] - crashes
u u~ > h2 h3 / c t [QCD] - crashes
u u~ > h2 h3 / c t a h1 h2 h3 [QCD] - runs fine and gives non zero cross-section

Excluding these intermediate particles excludes any diagrams containing vertices with zero couplings (set deliberately in the param_card). So my question is: why doesn't it work when there is a mixture of diagrams without zero couplings and diagrams with some zero couplings?

Cheers,
George

George Uttley (gputtley) said :
#10

Hi Olivier,

Sorry to ask again. I was wondering if you had any idea why this is the case?

Many thanks for your help.

Cheers,
George

Best answer: Olivier Mattelaer (olivier-mattelaer) said :
#11

Hi,

The NLO code is quite conservative about zero cross-sections.
The issue with zero cross-sections is that they are often a sign of an error in the code.
So the choice here was to make the code crash if any occur.
It looks like in your case those are "genuine", since you want them to be zero.

The solution is to use an optimized model in which such couplings are removed:
See https://answers.launchpad.net/mg5amcnlo/+faq/2312
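
To decide what such a model should remove, it can help to list the param_card entries that are exactly zero. The following is a minimal Python sketch, assuming the usual SLHA-style param_card layout (Block headers followed by "index(es) value # comment" lines). The function name and the sample card contents are hypothetical, and this is not part of MadGraph itself:

```python
def zero_parameters(param_card_text):
    """Return (block, indices, comment) for every block entry equal to zero."""
    zeros = []
    block = None
    for raw in param_card_text.splitlines():
        body, _, comment = raw.partition("#")
        tokens = body.split()
        if not tokens:
            continue
        head = tokens[0].lower()
        if head == "block":
            block = tokens[1].lower() if len(tokens) > 1 else None
            continue
        if head == "decay":
            block = None  # widths and decay tables are skipped in this sketch
            continue
        if block is None or len(tokens) < 2:
            continue
        try:
            indices = tuple(int(t) for t in tokens[:-1])
            value = float(tokens[-1])
        except ValueError:
            continue  # not an "indices value" line
        if value == 0.0:
            zeros.append((block, indices, comment.strip()))
    return zeros


# Hypothetical sample card, for illustration only.
sample_card = """\
Block mass
   25 1.250000e+02 # mh1
Block yukawa
    4 0.000000e+00 # ymc
    6 1.730000e+02 # ymt
DECAY 6 1.500000e+00
"""
print(zero_parameters(sample_card))
```

The entries it reports are only candidates: whether a zero parameter actually switches off a coupling depends on how the model defines that coupling.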

Cheers,

Olivier

George Uttley (gputtley) said :
#12

Thanks Olivier Mattelaer, that solved my question.