"CRITICAL: Fail to run correctly job" when running VBF on lemaitre3 cluster

Asked by matteo maltoni

Dear MadGraph experts,

I'm trying to run a fixed-order analysis for (p p > l+ l- j j QCD=0 NP=2 [QCD]) on the lemaitre3 cluster, but the computation repeatedly fails when only a few jobs are left to run.

However, I cannot spot any error in the log.txt file of the SubProcesses/P0_uu_uuemep/all_G1 folder: you can find it pasted below, together with the log file of the whole run.

Is there a problem with one of my MadGraph settings? This error never occurred before with other processes on the same cluster.
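In case it helps to narrow things down: the traceback in the run log further down stops on a missing all_G1/log_MINT0.txt, so a quick check is to list every job folder that lacks that file. The snippet below is only a minimal sketch. The function name find_missing_logs and the glob pattern "P*/*_G*" are my own assumptions about how the job folders are laid out (based on the SubProcesses/P0_uu_uuemep/all_G1 path above); they are not part of MadGraph.

```python
from pathlib import Path


def find_missing_logs(subproc_dir, log_name="log_MINT0.txt"):
    """Return the job folders under subproc_dir that do not contain log_name.

    Assumption: job folders look like SubProcesses/P*/<something>_G<n>,
    e.g. P0_uu_uuemep/all_G1. Adjust the glob pattern if the layout differs.
    """
    root = Path(subproc_dir)
    missing = []
    for job_dir in sorted(root.glob("P*/*_G*")):
        # Only directories are job folders; skip stray files matching the pattern.
        if job_dir.is_dir() and not (job_dir / log_name).is_file():
            missing.append(job_dir)
    return missing
```

Calling find_missing_logs("fixed_order/SubProcesses") from the process directory would then return the folders whose jobs never wrote the expected log, which shows whether all_G1 is the only affected job.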

Thank you for any help,

Matteo

Here's the P0_uu_uuemep/all_G1/log.txt:

===============================================================
 INFO: MadFKS read these parameters from FKS_params.dat
 ===============================================================
  > IRPoleCheckThreshold = -1.0000000000000000
  > PrecisionVirtualAtRunTime = -1.0000000000000000
  > SelectedContributionTypes = All
  > VetoedContributionTypes = None
  > QCD_squared_selected = All
  > QED_squared_selected = All
  > SelectedCouplingOrders = All
  > NHelForMCoverHels = 4
  > VirtualFraction = 1.0000000000000000
  > MinVirtualFraction = 5.0000000000000001E-003
  > SeparateFlavourConfigs = F
  > UsePolyVirtual = F
 ===============================================================
 SPLIT TYPE USED: F T F
 A PDF is used, so alpha_s(MZ) is going to be modified
 Old value of alpha_s from param_card: 0.11839999999999999
  ****************************************

       NNPDFDriver version 1.0.3
   Grid: NNPDF23nlo_as_0119_qed_mem0.grid
  ****************************************
 New value of alpha_s from PDF nn23nlo: 0.11899999999999999
WARNING: the value of maxjetflavor specified in the run_card ( 4) is inconsistent with the number of light flavours in the model. Hence it will be set to: 5
 *****************************************************
 * MadGraph/MadEvent *
 * -------------------------------- *
 * http://madgraph.hep.uiuc.edu *
 * http://madgraph.phys.ucl.ac.be *
 * http://madgraph.roma2.infn.it *
 * -------------------------------- *
 * *
 * PARAMETER AND COUPLING VALUES *
 * *
 *****************************************************

  External Params
  ---------------------------------

 MU_R = 91.188000000000002
 mdl_Lambda = 1000.0000000000000
 mdl_cWWW = 1.1000000000000001
 mdl_mueft = 91.188000000000002
 mdl_Gf = 1.1663700000000000E-005
 aS = 0.11839999999999999
 mdl_ymt = 172.00000000000000
 mdl_MZ = 91.187600000000003
 mdl_MW = 79.824399999999997
 mdl_MT = 172.00000000000000
 mdl_MH = 125.00000000000000
 mdl_WZ = 2.4160230000000000
 mdl_WW = 2.0029499999999998
 mdl_WT = 1.4708000000000001
 mdl_WH = 4.0879999999999996E-003
  Internal Params
  ---------------------------------

 mdl_dlam = 0.0000000000000000
 mdl_dWB = 0.0000000000000000
 mdl_dv = 0.0000000000000000
 mdl_dT = 0.0000000000000000
 mdl_cw0 = 0.87538656571726847
 mdl_sqrt__2 = 1.4142135623730951
 mdl_muH0 = 88.388347648318430
 mdl_MW__exp__2 = 6371.9348353599999
 mdl_MZ__exp__2 = 8315.1783937600012
 mdl_sw0 = 0.48342358295983689
 mdl_nb__2__exp__0_25 = 1.1892071150027210
 mdl_Lambda__exp__2 = 1000000.0000000000
 mdl_cw0__exp__2 = 0.76630163943827356
 mdl_sw0__exp__2 = 0.23369836056172630
 mdl_MH__exp__2 = 15625.000000000000
 mdl_complexi = (0.0000000000000000,1.0000000000000000)
 mdl_sw0__exp__3 = 0.11297529879458956
 mdl_cw0__exp__3 = 0.67081016045138286
 mdl_MT__exp__2 = 29584.000000000000
 mdl_MT__exp__3 = 5088448.0000000000
 mdl_mueft__exp__2 = 8315.2513440000002
 mdl_vev0 = 246.22056907348588
 mdl_vev0__exp__2 = 60624.568634871226
 mdl_ee0 = 0.31345063981313520
 mdl_cw = 0.87538656571726847
 mdl_muH = 88.388347648318430
 mdl_sw = 0.48342358295983689
 mdl_g1 = 0.35807111062562641
 mdl_gw = 0.64839749416853931
 mdl_lam = 0.12886689630821146
 mdl_vev = 246.22056907348588
 mdl_ee = 0.31345063981313520
 mdl_ee__exp__2 = 9.8251303599263817E-002
 mdl_aEW = 7.8185903165226816E-003
 aEWM1 = 127.90029398096792
 mdl_ee0__exp__2 = 9.8251303599263817E-002
 mdl_ee0__exp__3 = 3.0796933975663836E-002
 mdl_vev0__exp__3 = 14927015.789112596
  Internal Params evaluated point by point
  ----------------------------------------

 mdl_sqrt__aS = 0.34409301068170506
 mdl_G__exp__2 = 1.4878582807401259
 mdl_G__exp__3 = 1.8148567439626970
 mdl_G__exp__4 = 2.2137222635669636
 mdl_MU_R__exp__2 = 8315.2513440000002
  Couplings of SMEFTatNLO-loopWWW
  ---------------------------------

        GC_11 = 0.00000E+00 0.12198E+01
 R2GC_1684_1173 0.00000E+00 0.12563E-01
 R2GC_1685_1174 -0.00000E+00 -0.52504E-02
 R2GC_1707_1193 0.00000E+00 0.93051E-02
 R2GC_1713_1195 -0.00000E+00 -0.28995E-02
 R2GC_2315_1626 -0.00000E+00 -0.11520E-01
 R2GC_639_1821 = 0.00000E+00 0.26252E-02
 R2GC_663_1839 = 0.00000E+00 0.14497E-02
 UVGC_2315_794_1 0.00000E+00 0.28799E-02
 UVGC_2315_795_1 -0.00000E+00 -0.57598E-02
         GC_1 = -0.00000E+00 -0.10448E+00
       GC_195 = 0.00000E+00 0.45849E+00
       GC_196 = -0.00000E+00 -0.28380E+00
       GC_198 = -0.00000E+00 -0.56760E+00
         GC_2 = 0.00000E+00 0.20897E+00
       GC_253 = 0.00000E+00 0.28850E-01
       GC_254 = -0.00000E+00 -0.86550E-01
         GC_3 = -0.00000E+00 -0.31345E+00
       GC_124 = -0.00000E+00 -0.57776E-05
       GC_258 = 0.00000E+00 0.31906E-05

 Collider parameters:
 --------------------

 Running at P P machine @ 13000.000000000000 GeV
 PDF set = nn23nlo
 alpha_s(Mz)= 0.1190 running at 2 loops.
 alpha_s(Mz)= 0.1190 running at 2 loops.
 Renormalization scale set on event-by-event basis
 Factorization scale set on event-by-event basis

 Diagram information for clustering has been set-up for nFKSprocess 1
 Diagram information for clustering has been set-up for nFKSprocess 2
 Diagram information for clustering has been set-up for nFKSprocess 3
 Diagram information for clustering has been set-up for nFKSprocess 4
 Diagram information for clustering has been set-up for nFKSprocess 5
INFO: orders_tag_plot is computed as: + NP * 1 + QCD * 100 + QED * 10000
 orders_tag_plot= 80000 for NP,QCD,QED, = 0 , 0 , 8 ,
 AMP_SPLIT: 1 correspond to S.O. 0 0 8
 orders_tag_plot= 80200 for NP,QCD,QED, = 0 , 2 , 8 ,
 AMP_SPLIT: 2 correspond to S.O. 0 2 8
 getting user params
 Number of phase-space points per iteration: -1
 Maximum number of iterations is: 6
 Desired accuracy is: 5.0000000000000003E-002
 Using adaptive grids: 2
 Using Multi-channel integration
 Do MC over helicities for the virtuals
 Number of channels to integrate together: 20
 Running Configuration Number(s): 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10
 initial-or-final 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2
 Splitting channel: 0
 Weight multiplier: 1.0000000000000000
 doing the all of this channel
 Normal integration (Sfunction != 1)
 RESTART: Fresh run
 about to integrate 13 -1 6
 imode is 0
channel 1 : 1 T 0 0 0.1000E+01 0.0000E+00 0.1000E+01
channel 2 : 1 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 3 : 2 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 4 : 2 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 5 : 3 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 6 : 3 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 7 : 4 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 8 : 4 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 9 : 5 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 10 : 5 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 11 : 6 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 12 : 6 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 13 : 7 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 14 : 7 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 15 : 8 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 16 : 8 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 17 : 9 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 18 : 9 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 19 : 10 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
channel 20 : 10 T 0 0 0.0000E+00 0.0000E+00 0.1000E+01
#--------------------------------------------------------------------------
# FastJet release 3.1.3 [fjcore]
# M. Cacciari, G.P. Salam and G. Soyez
# A software package for jet finding and analysis at colliders
# http://fastjet.fr
#
# Please cite EPJC72(2012)1896 [arXiv:1111.6097] if you use this package
# for scientific work and optionally PLB641(2006)57 [hep-ph/0512210].
#
# FastJet is provided without warranty under the terms of the GNU GPLv2.
# It uses T. Chan's closest pair algorithm, S. Fortune's Voronoi code
# and 3rd party plugin jet algorithms. See COPYING file for details.
#--------------------------------------------------------------------------
 ------- iteration 1
 Update # PS points (even_rn): 364 --> 364
Using random seed offsets: 0 , 1 , 0
  with seed 35
 Ranmar initialization seeds 14386 9410
 initial-final FKS maps:
           0 : 5 1 2 3 4 5
           1 : 1 3 0 0 0 0
           2 : 4 1 2 4 5 0
 Total number of FKS directories is 5
 For the Born we use nFKSprocesses:
           1 2 3 1 2
tau_min 1 1 : 0.00000E+00 -- 0.13120E+03
tau_min 2 1 : 0.00000E+00 -- 0.13120E+03
tau_min 3 1 : 0.13120E+03 0.13120E+03 0.13120E+03
tau_min 4 1 : 0.00000E+00 -- 0.13120E+03
tau_min 5 1 : 0.00000E+00 -- 0.13120E+03
 Scale values (may change event by event):
 muR, muR_reference: 0.966858D+02 0.966858D+02 1.00
 muF1, muF1_reference: 0.966858D+02 0.966858D+02 1.00
 muF2, muF2_reference: 0.966858D+02 0.966858D+02 1.00
 QES, QES_reference: 0.966858D+02 0.966858D+02 1.00

 muR_reference [functional form]:
    H_T/2 := sum_i mT(i)/2, i=final state
 muF1_reference [functional form]:
    H_T/2 := sum_i mT(i)/2, i=final state
 muF2_reference [functional form]:
    H_T/2 := sum_i mT(i)/2, i=final state
 QES_reference [functional form]:
    H_T/2 := sum_i mT(i)/2, i=final state

 alpha_s= 0.11794967187592105
 BORN: keeping split order 1
 counterterm S.O 1 NP
 BORN: keeping split order 1
 counterterm S.O 2 QCD
 BORN: keeping split order 1
 counterterm S.O 3 QED
 BORN: keeping split order 1
 Charge-linked born are not used
 Color-linked born are used
 alpha_s value used for the virtuals is (for the first PS point): 0.11794967187592105
  ==========================================================================================
 { }
 {   }
 {  ,,  }
 { `7MMM. ,MMF' `7MM `7MMF'  }
 {  MMMb dPMM MM MM  }
 {  M YM ,M MM ,6"Yb. ,M""bMM MM ,pW"Wq. ,pW"Wq.`7MMpdMAo.  }
 {  M Mb M' MM 8) MM ,AP MM MM 6W' `Wb 6W' `Wb MM `Wb  }
 {  M YM.P' MM ,pm9MM 8MI MM MM , 8M M8 8M M8 MM M8  }
 {  M `YM' MM 8M MM `Mb MM MM ,M YA. ,A9 YA. ,A9 MM ,AP  }
 { .JML. `' .JMML.`Moo9^Yo.`Wbmd"MML..JMMmmmmMMM `Ybmd9' `Ybmd9' MMbmmd'  }
 {  MM  }
 {  .JMML.  }
 { v3.1.0 (2021-03-30), Ref: arXiv:1103.0621v2, arXiv:1405.0301  }
 {   }
 { }
  ==========================================================================================
 ===============================================================
 INFO: MadLoop read these parameters from ../MadLoop5_resources/MadLoopParams.dat

  +----------------------------------------------------------------+
  | |
  | Ninja - version 1.1.0 |
  | |
  | Author: Tiziano Peraro |
  | |
  | Based on: |
  | |
  | P. Mastrolia, E. Mirabella and T. Peraro, |
  | "Integrand reduction of one-loop scattering amplitudes |
  | through Laurent series expansion," |
  | JHEP 1206 (2012) 095 [arXiv:1203.0291 [hep-ph]]. |
  | |
  | T. Peraro, |
  | "Ninja: Automated Integrand Reduction via Laurent |
  | Expansion for One-Loop Amplitudes," |
  | Comput.Phys.Commun. 185 (2014) [arXiv:1403.1229 [hep-ph]] |
  | |
  +----------------------------------------------------------------+

 ===============================================================
  > MLReductionLib = 6|7|1
  > CTModeRun = -1
  > MLStabThres = 1.0000000000000000E-003
  > NRotations_DP = 0
  > NRotations_QP = 0
  > CTStabThres = 1.0000000000000000E-002
  > CTLoopLibrary = 2
  > CTModeInit = 1
  > CheckCycle = 3
  > MaxAttempts = 10
  > UseLoopFilter = F
  > HelicityFilterLevel = 2
  > ImprovePSPoint = 2
  > DoubleCheckHelicityFilter = T
  > LoopInitStartOver = F
  > HelInitStartOver = F
  > ZeroThres = 1.0000000000000001E-009
  > OSThres = 1.0000000000000000E-008
  > WriteOutFilters = T
  > UseQPIntegrandForNinja = T
  > UseQPIntegrandForCutTools = T
  > IREGIMODE = 2
  > IREGIRECY = T
  > COLLIERMode = 1
  > COLLIERRequiredAccuracy = 1.0000000000000000E-008
  > COLLIERCanOutput = F
  > COLLIERComputeUVpoles = T
  > COLLIERComputeIRpoles = T
  > COLLIERGlobalCache = -1
  > COLLIERUseCacheForPoles = F
  > COLLIERUseInternalStabilityTest = T
 ===============================================================

------------------------------------------------------------------------
| You are using CutTools - Version 1.9.3 |
| Authors: G. Ossola, C. Papadopoulos, R. Pittau |
| Published in JHEP 0803:042,2008 |
| http://www.ugr.es/~pittau/CutTools |
| |
| Compiler with 34 significant digits detetected |
 ----------------------------------------------------------------------

########################################################################
# #
# You are using OneLOop-3.6 #
# #
# for the evaluation of 1-loop scalar 1-, 2-, 3- and 4-point functions #
# #
# author: Andreas van Hameren <email address hidden> #
# date: 18-02-2015 #
# #
# Please cite #
# A. van Hameren, #
# Comput.Phys.Commun. 182 (2011) 2427-2438, arXiv:1007.4716 #
# A. van Hameren, C.G. Papadopoulos and R. Pittau, #
# JHEP 0909:106,2009, arXiv:0903.4665 #
# in publications with results obtained with the help of this program. #
# #
########################################################################
########################################################################
# #
# You are using OneLOop-3.6 #
# #
# for the evaluation of 1-loop scalar 1-, 2-, 3- and 4-point functions #
# #
# author: Andreas van Hameren <email address hidden> #
# date: 18-02-2015 #
# #
# Please cite #
# A. van Hameren, #
# Comput.Phys.Commun. 182 (2011) 2427-2438, arXiv:1007.4716 #
# A. van Hameren, C.G. Papadopoulos and R. Pittau, #
# JHEP 0909:106,2009, arXiv:0903.4665 #
# in publications with results obtained with the help of this program. #
# #
########################################################################
 VIRT: keeping split order 1

 Sum of all split-orders
 ---- POLES CANCELLED ----
  COEFFICIENT DOUBLE POLE:
        MadFKS: -1.5143712932570514E-011 OLP: -1.5143712932570495E-011
  COEFFICIENT SINGLE POLE:
        MadFKS: -1.6147346519205082E-011 OLP: -1.4838994742105130E-011
  FINITE:
           OLP: -2.5371448851782186E-011
           BORN: 1.5125744906824726E-010
  MOMENTA (Exyzm):
           1 122.18223148529495 0.0000000000000000 0.0000000000000000 122.18223148529495 0.0000000000000000
           2 122.18223148529495 -0.0000000000000000 -0.0000000000000000 -122.18223148529495 0.0000000000000000
           3 72.117932911235158 57.113603952643288 34.145124135423352 27.805448903165608 0.0000000000000000
           4 67.672051241298689 -64.918277690184254 13.267521560682361 13.751240401575046 0.0000000000000000
           5 34.623847569745735 21.547478300943471 -17.895004321220270 20.354012374636756 0.0000000000000000
           6 69.950631248310316 -13.742804563402505 -29.517641374885443 -61.910701679377397 0.0000000000000000

 Splitorders 2
        NP: 0
       QCD: 2
       QED: 8
 ---- POLES CANCELLED ----
  COEFFICIENT DOUBLE POLE:
        MadFKS: -1.5143712932570514E-011 OLP: -1.5143712932570495E-011
  COEFFICIENT SINGLE POLE:
        MadFKS: -1.6147346519205079E-011 OLP: -1.4838994742105130E-011
 REAL 1: keeping split order 1

And here's the log file of the whole run:

launch auto
Traceback (most recent call last):
  File "/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/madgraph/interface/extended_cmd.py", line 1544, in onecmd
    return self.onecmd_orig(line, **opt)
  File "/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/madgraph/interface/extended_cmd.py", line 1493, in onecmd_orig
    return func(arg, **opt)
  File "/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/madgraph/interface/amcatnlo_run_interface.py", line 1780, in do_launch
    evt_file = self.run(mode, options)
  File "/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/madgraph/interface/amcatnlo_run_interface.py", line 1933, in run
    self.collect_log_files(jobs_to_run,integration_step)
  File "/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/madgraph/interface/amcatnlo_run_interface.py", line 3050, in collect_log_files
    with open(log) as l:
FileNotFoundError: [Errno 2] No such file or directory: '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G1/log_MINT0.txt'
Related File: /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G1/log_MINT0.txt
Value of current Options:
             pythia8_path : None
                hwpp_path : None
              thepeg_path : None
               hepmc_path : None
         madanalysis_path : None
        madanalysis5_path : None
          pythia-pgs_path : None
                  td_path : None
             delphes_path : None
      exrootanalysis_path : None
             syscalc_path : None
                  timeout : 60
              web_browser : None
               eps_viewer : None
              text_editor : None
         fortran_compiler : None
            f2py_compiler : None
        f2py_compiler_py2 : None
        f2py_compiler_py3 : None
             cpp_compiler : None
              auto_update : 7
             cluster_type : slurm
            cluster_queue : None
    cluster_status_update : (900, 60)
                  fastjet : None
                    golem : None
                  samurai : None
                    ninja : /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/HEPTools/lib
                  collier : ./HEPTools/lib
                   lhapdf : /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/HEPTools/lhapdf6_py3/bin/lhapdf-config
                 pineappl : pineappl
               lhapdf_py2 : None
               lhapdf_py3 : /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/HEPTools/lhapdf6_py3/bin/lhapdf-config
        cluster_temp_path : None
mg5amc_py8_interface_path : None
       cluster_local_path : None
                      OLP : MadLoop
         cluster_nb_retry : 1
       cluster_retry_wait : 900
             cluster_size : 150
      output_dependencies : external
           crash_on_error : False
       auto_convert_model : False
       group_subprocesses : Auto
ignore_six_quark_processes : False
low_mem_multicore_nlo_generation : False
      complex_mass_scheme : False
include_lepton_initiated_processes : False
                    gauge : unitary
             stdout_level : 20
    loop_optimized_output : True
         loop_color_flows : False
   max_npoint_for_channel : 0
  default_unset_couplings : 99
        max_t_for_channel : 99
       zerowidth_tchannel : True
      nlo_mixed_expansion : True
   automatic_html_opening : False
                 run_mode : 1
                  nb_core : 5
      notification_center : True
                 mg5_path : /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0
#************************************************************
#* MadGraph5_aMC@NLO *
#* *
#* * * *
#* * * * * *
#* * * * * 5 * * * * *
#* * * * * *
#* * * *
#* *
#* *
#* VERSION 3.1.0 2021-03-30 *
#* *
#* The MadGraph5_aMC@NLO Development Team - Find us at *
#* https://server06.fynu.ucl.ac.be/projects/madgraph *
#* *
#************************************************************
#* *
#* Command File for MadGraph5_aMC@NLO *
#* *
#* run as ./bin/mg5_aMC filename *
#* *
#************************************************************
set group_subprocesses Auto
set ignore_six_quark_processes False
set low_mem_multicore_nlo_generation False
set complex_mass_scheme False
set include_lepton_initiated_processes False
set gauge unitary
set loop_optimized_output True
set loop_color_flows False
set max_npoint_for_channel 0
set default_unset_couplings 99
set max_t_for_channel 99
set zerowidth_tchannel True
set nlo_mixed_expansion True
import model SMEFTatNLO-loopWWW
define p = g u c d s u~ c~ d~ s~
define j = g u c d s u~ c~ d~ s~
define l+ = e+ mu+
define l- = e- mu-
define vl = ve vm vt
define vl~ = ve~ vm~ vt~
define p = 21 2 4 1 3 -2 -4 -1 -3 5 -5 # pass to 5 flavors
define j = p
generate p p > l+ l- j j QCD=0 NP=2 [QCD]
output /home/ucl/cp3/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_or\
der
######################################################################
## PARAM_CARD AUTOMATICALY GENERATED BY MG5 FOLLOWING UFO MODEL ####
######################################################################
## ##
## Width set on Auto will be computed following the information ##
## present in the decay.py files of the model. ##
## See arXiv:1402.1178 for more details. ##
## ##
######################################################################

###################################
## INFORMATION FOR DIM6
###################################
Block dim6
    1 1.000000e+03 # Lambda
    6 1.100000e+00 # cWWW

###################################
## INFORMATION FOR MASS
###################################
Block mass
    6 1.720000e+02 # MT
   23 9.118760e+01 # MZ
   24 7.982440e+01 # MW
   25 1.250000e+02 # MH
## Dependent parameters, given by model restrictions.
## Those values should be edited following the
## analytical expression. MG5 ignores those values
## but they are important for interfacing the output of MG5
## to external program such as Pythia.
  1 0.000000e+00 # d : 0.0
  2 0.000000e+00 # u : 0.0
  3 0.000000e+00 # s : 0.0
  4 0.000000e+00 # c : 0.0
  5 0.000000e+00 # b : 0.0
  11 0.000000e+00 # e- : 0.0
  12 0.000000e+00 # ve : 0.0
  13 0.000000e+00 # mu- : 0.0
  14 0.000000e+00 # vm : 0.0
  15 0.000000e+00 # ta- : 0.0
  16 0.000000e+00 # vt : 0.0
  21 0.000000e+00 # g : 0.0
  22 0.000000e+00 # a : 0.0
  9000002 9.118760e+01 # ghz : MZ
  9000003 7.982440e+01 # ghwp : MW
  9000004 7.982440e+01 # ghwm : MW

###################################
## INFORMATION FOR RENOR
###################################
Block renor
    1 9.118800e+01 # mueft

###################################
## INFORMATION FOR SMINPUTS
###################################
Block sminputs
    2 1.166370e-05 # Gf
    3 1.184000e-01 # aS (Note that Parameter not used if you use a PDF set)

###################################
## INFORMATION FOR YUKAWA
###################################
Block yukawa
    6 1.720000e+02 # ymt

###################################
## INFORMATION FOR DECAY
###################################
DECAY 6 1.470800e+00 # WT
DECAY 23 2.416023e+00 # WZ
DECAY 24 2.002950e+00 # WW
DECAY 25 4.088000e-03 # WH
## Dependent parameters, given by model restrictions.
## Those values should be edited following the
## analytical expression. MG5 ignores those values
## but they are important for interfacing the output of MG5
## to external program such as Pythia.
DECAY 1 0.000000e+00 # d : 0.0
DECAY 2 0.000000e+00 # u : 0.0
DECAY 3 0.000000e+00 # s : 0.0
DECAY 4 0.000000e+00 # c : 0.0
DECAY 5 0.000000e+00 # b : 0.0
DECAY 11 0.000000e+00 # e- : 0.0
DECAY 12 0.000000e+00 # ve : 0.0
DECAY 13 0.000000e+00 # mu- : 0.0
DECAY 14 0.000000e+00 # vm : 0.0
DECAY 15 0.000000e+00 # ta- : 0.0
DECAY 16 0.000000e+00 # vt : 0.0
DECAY 21 0.000000e+00 # g : 0.0
DECAY 22 0.000000e+00 # a : 0.0
DECAY 9000002 2.416023e+00 # ghz : WZ
DECAY 9000003 2.002950e+00 # ghwp : WW
DECAY 9000004 2.002950e+00 # ghwm : WW
#===========================================================
# QUANTUM NUMBERS OF NEW STATE(S) (NON SM PDG CODE)
#===========================================================

Block QNUMBERS 9000001 # gha
        1 0 # 3 times electric charge
        2 1 # number of spin states (2S+1)
        3 1 # colour rep (1: singlet, 3: triplet, 8: octet)
        4 1 # Particle/Antiparticle distinction (0=own anti)
Block QNUMBERS 9000002 # ghz
        1 0 # 3 times electric charge
        2 1 # number of spin states (2S+1)
        3 1 # colour rep (1: singlet, 3: triplet, 8: octet)
        4 1 # Particle/Antiparticle distinction (0=own anti)
Block QNUMBERS 9000003 # ghwp
        1 3 # 3 times electric charge
        2 1 # number of spin states (2S+1)
        3 1 # colour rep (1: singlet, 3: triplet, 8: octet)
        4 1 # Particle/Antiparticle distinction (0=own anti)
Block QNUMBERS 9000004 # ghwm
        1 -3 # 3 times electric charge
        2 1 # number of spin states (2S+1)
        3 1 # colour rep (1: singlet, 3: triplet, 8: octet)
        4 1 # Particle/Antiparticle distinction (0=own anti)
Block QNUMBERS 9000005 # ghg
        1 0 # 3 times electric charge
        2 1 # number of spin states (2S+1)
        3 8 # colour rep (1: singlet, 3: triplet, 8: octet)
        4 1 # Particle/Antiparticle distinction (0=own anti)
#***********************************************************************
# MadGraph5_aMC@NLO *
# *
# run_card.dat aMC@NLO *
# *
# This file is used to set the parameters of the run. *
# *
# Some notation/conventions: *
# *
# Lines starting with a hash (#) are info or comments *
# *
# mind the format: value = variable ! comment *
# *
# Some of the values of variables can be list. These can either be *
# comma or space separated. *
# *
# To display additional parameter, you can use the command: *
# update to_full *
#***********************************************************************
#
#*******************
# Running parameters
#*******************
#
#***********************************************************************
# Tag name for the run (one word) *
#***********************************************************************
  tag_1 = run_tag ! name of the run
#***********************************************************************
# Number of LHE events (and their normalization) and the required *
# (relative) accuracy on the Xsec. *
# These values are ignored for fixed order runs *
#***********************************************************************
 10000 = nevents ! Number of unweighted events requested
 -1.0 = req_acc ! Required accuracy (-1=auto determined from nevents)
 -1 = nevt_job! Max number of events per job in event generation.
                 ! (-1= no split).
#***********************************************************************
# Normalize the weights of LHE events such that they sum or average to *
# the total cross section *
#***********************************************************************
 average = event_norm ! valid settings: average, sum, bias
#***********************************************************************
# Number of points per itegration channel (ignored for aMC@NLO runs) *
#***********************************************************************
 0.1 = req_acc_FO ! Required accuracy (-1=ignored, and use the
                     ! number of points and iter. below)
# These numbers are ignored except if req_acc_FO is equal to -1
 5000 = npoints_FO_grid ! number of points to setup grids
 4 = niters_FO_grid ! number of iter. to setup grids
 10000 = npoints_FO ! number of points to compute Xsec
 6 = niters_FO ! number of iter. to compute Xsec
#***********************************************************************
# Random number seed *
#***********************************************************************
 0 = iseed ! rnd seed (0=assigned automatically=default))
#***********************************************************************
# Collider type and energy *
#***********************************************************************
 1 = lpp1 ! beam 1 type (0 = no PDF)
 1 = lpp2 ! beam 2 type (0 = no PDF)
 6500.0 = ebeam1 ! beam 1 energy in GeV
 6500.0 = ebeam2 ! beam 2 energy in GeV
#***********************************************************************
# PDF choice: this automatically fixes also alpha_s(MZ) and its evol. *
#***********************************************************************
 nn23nlo = pdlabel ! PDF set
 244600 = lhaid ! If pdlabel=lhapdf, this is the lhapdf number. Only
              ! numbers for central PDF sets are allowed. Can be a list;
              ! PDF sets beyond the first are included via reweighting.
#***********************************************************************
# Include the NLO Monte Carlo subtr. terms for the following parton *
# shower (HERWIG6 | HERWIGPP | PYTHIA6Q | PYTHIA6PT | PYTHIA8) *
# WARNING: PYTHIA6PT works only for processes without FSR!!!! *
#***********************************************************************
  HERWIG6 = parton_shower
  1.0 = shower_scale_factor ! multiply default shower starting
                                  ! scale by this factor
#***********************************************************************
# Renormalization and factorization scales *
# (Default functional form for the non-fixed scales is the sum of *
# the transverse masses divided by two of all final state particles *
# and partons. This can be changed in SubProcesses/set_scales.f or via *
# dynamical_scale_choice option) *
#***********************************************************************
 False = fixed_ren_scale ! if .true. use fixed ren scale
 False = fixed_fac_scale ! if .true. use fixed fac scale
 91.118 = muR_ref_fixed ! fixed ren reference scale
 91.118 = muF_ref_fixed ! fixed fact reference scale
 -1 = dynamical_scale_choice ! Choose one (or more) of the predefined
           ! dynamical choices. Can be a list; scale choices beyond the
           ! first are included via reweighting
 1.0 = muR_over_ref ! ratio of current muR over reference muR
 1.0 = muF_over_ref ! ratio of current muF over reference muF
#***********************************************************************
# Reweight variables for scale dependence and PDF uncertainty *
#***********************************************************************
 1.0, 2.0, 0.5 = rw_rscale ! muR factors to be included by reweighting
 1.0, 2.0, 0.5 = rw_fscale ! muF factors to be included by reweighting
 True = reweight_scale ! Reweight to get scale variation using the
            ! rw_rscale and rw_fscale factors. Should be a list of
            ! booleans of equal length to dynamical_scale_choice to
            ! specify for which choice to include scale dependence.
 False = reweight_PDF ! Reweight to get PDF uncertainty. Should be a
            ! list booleans of equal length to lhaid to specify for
            ! which PDF set to include the uncertainties.
#***********************************************************************
# Store reweight information in the LHE file for off-line model- *
# parameter reweighting at NLO+PS accuracy *
#***********************************************************************
 False = store_rwgt_info ! Store info for reweighting in LHE file
#***********************************************************************
# ickkw parameter: *
# 0: No merging *
# 3: FxFx Merging - WARNING! Applies merging only at the hard-event *
# level. After showering an MLM-type merging should be applied as *
# well. See http://amcatnlo.cern.ch/FxFx_merging.htm for details. *
# 4: UNLOPS merging (with pythia8 only). No interface from within *
# MG5_aMC available, but available in Pythia8. *
# -1: NNLL+NLO jet-veto computation. See arxiv:1412.8408 [hep-ph]. *
#***********************************************************************
 0 = ickkw
#***********************************************************************
#
#***********************************************************************
# BW cutoff (M+/-bwcutoff*Gamma). Determines which resonances are *
# written in the LHE event file *
#***********************************************************************
 15.0 = bwcutoff
#***********************************************************************
# Cuts on the jets. Jet clustering is performed by FastJet. *
# - If gamma_is_j, photons are also clustered *
# - When matching to a parton shower, these generation cuts should be *
# considerably softer than the analysis cuts. *
# - More specific cuts can be specified in SubProcesses/cuts.f *
#***********************************************************************
  -1.0 = jetalgo ! FastJet jet algorithm (1=kT, 0=C/A, -1=anti-kT)
  0.4 = jetradius ! The radius parameter for the jet algorithm
 25.0 = ptj ! Min jet transverse momentum
 -1.0 = etaj ! Max jet abs(pseudo-rap) (a value .lt.0 means no cut)
 True = gamma_is_j! Whether to cluster photons as jets or not
#***********************************************************************
# Cuts on the charged leptons (e+, e-, mu+, mu-, tau+ and tau-) *
# More specific cuts can be specified in SubProcesses/cuts.f *
#***********************************************************************
  25.0 = ptl ! Min lepton transverse momentum
  2.5 = etal ! Max lepton abs(pseudo-rap) (a value .lt.0 means no cut)
  0.0 = drll ! Min distance between opposite sign lepton pairs
  0.0 = drll_sf ! Min distance between opp. sign same-flavor lepton pairs
  0.0 = mll ! Min inv. mass of all opposite sign lepton pairs
  81.2 = mll_sf ! Min inv. mass of all opp. sign same-flavor lepton pairs
  101.2 = mll_max_sf
#***********************************************************************
# Fermion-photon recombination parameters *
# If Rphreco=0, no recombination is performed *
#***********************************************************************
 0.1 = Rphreco ! Minimum fermion-photon distance for recombination
 -1.0 = etaphreco ! Maximum abs(pseudo-rap) for photons to be recombined (a value .lt.0 means no cut)
 True = lepphreco ! Recombine photons and leptons together
 True = quarkphreco ! Recombine photons and quarks together
#***********************************************************************
# Photon-isolation cuts, according to hep-ph/9801442 *
# Not applied if gamma_is_j *
# When ptgmin=0, all the other parameters are ignored *
# More specific cuts can be specified in SubProcesses/cuts.f *
#***********************************************************************
  20.0 = ptgmin ! Min photon transverse momentum
  -1.0 = etagamma ! Max photon abs(pseudo-rap)
  0.4 = R0gamma ! Radius of isolation cone
  1.0 = xn ! n parameter of eq.(3.4) in hep-ph/9801442
  1.0 = epsgamma ! epsilon_gamma parameter of eq.(3.4) in hep-ph/9801442
 True = isoEM ! isolate photons from EM energy (photons and leptons)
#***********************************************************************
# Cuts associated to MASSIVE particles identified by their PDG codes. *
# All cuts are applied to both particles and anti-particles, so use *
# POSITIVE PDG CODES only. Example of the syntax is {6 : 100} or *
# {6:100, 25:200} for multiple particles *
#***********************************************************************
  {} = pt_min_pdg ! Min pT for a massive particle
  {} = pt_max_pdg ! Max pT for a massive particle
  {} = mxx_min_pdg ! inv. mass for any pair of (anti)particles
#***********************************************************************
# Use PineAPPL to generate PDF-independent fast-interpolation grid *
# (https://zenodo.org/record/3992765#.X2EWy5MzbVo) *
#***********************************************************************
 False = pineappl ! PineAPPL switch
#***********************************************************************

Question information

Language: English
Status: Solved
For: MadGraph5_aMC@NLO
Assignee: No assignee
Solved by: matteo maltoni

Olivier Mattelaer (olivier-mattelaer) said :
#1

Is it possible that you hit the walltime limit of the cluster?
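
One rough way to check this (a sketch, not something MadGraph provides): the traceback reports a missing log_MINT0.txt in a folder where log.txt does exist, so listing every job folder in that state shows which jobs stopped early. The function name, and the assumption that an interrupted job leaves log.txt without log_MINT0.txt, are hypothetical and based only on the paths in the traceback.

```python
from pathlib import Path


def find_unfinished_jobs(subprocesses_dir, running_log="log.txt",
                         final_log="log_MINT0.txt"):
    """Return job folders that contain running_log but lack final_log.

    Assumption: job folders follow the P*/all_G* layout seen in the
    traceback, and a job that was killed early never writes final_log.
    """
    unfinished = []
    for job_dir in sorted(Path(subprocesses_dir).glob("P*/all_G*")):
        if not job_dir.is_dir():
            continue
        has_running = (job_dir / running_log).exists()
        has_final = (job_dir / final_log).exists()
        if has_running and not has_final:
            unfinished.append(job_dir)
    return unfinished


# Example call (the path is a placeholder for the run's SubProcesses folder):
# for job in find_unfinished_jobs("fixed_order/SubProcesses"):
#     print(job)
```

If the folders it lists match the jobs that the scheduler stopped, the walltime explanation becomes more plausible. On a SLURM cluster, `sacct` can show whether those jobs ended in the TIMEOUT state.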

> On 26 May 2021, at 18:11, matteo maltoni <email address hidden> wrote:
>
> New question #697261 on MadGraph5_aMC@NLO:
> https://answers.launchpad.net/mg5amcnlo/+question/697261
> # MG5_aMC available, but available in Pythia8. *
> # -1: NNLL+NLO jet-veto computation. See arxiv:1412.8408 [hep-ph]. *
> #***********************************************************************
> 0 = ickkw
> #***********************************************************************
> #
> #***********************************************************************
> # BW cutoff (M+/-bwcutoff*Gamma). Determines which resonances are *
> # written in the LHE event file *
> #***********************************************************************
> 15.0 = bwcutoff
> #***********************************************************************
> # Cuts on the jets. Jet clustering is performed by FastJet. *
> # - If gamma_is_j, photons are also clustered *
> # - When matching to a parton shower, these generation cuts should be *
> # considerably softer than the analysis cuts. *
> # - More specific cuts can be specified in SubProcesses/cuts.f *
> #***********************************************************************
> -1.0 = jetalgo ! FastJet jet algorithm (1=kT, 0=C/A, -1=anti-kT)
> 0.4 = jetradius ! The radius parameter for the jet algorithm
> 25.0 = ptj ! Min jet transverse momentum
> -1.0 = etaj ! Max jet abs(pseudo-rap) (a value .lt.0 means no cut)
> True = gamma_is_j! Wether to cluster photons as jets or not
> #***********************************************************************
> # Cuts on the charged leptons (e+, e-, mu+, mu-, tau+ and tau-) *
> # More specific cuts can be specified in SubProcesses/cuts.f *
> #***********************************************************************
> 25.0 = ptl ! Min lepton transverse momentum
> 2.5 = etal ! Max lepton abs(pseudo-rap) (a value .lt.0 means no cut)
> 0.0 = drll ! Min distance between opposite sign lepton pairs
> 0.0 = drll_sf ! Min distance between opp. sign same-flavor lepton pairs
> 0.0 = mll ! Min inv. mass of all opposite sign lepton pairs
> 81.2 = mll_sf ! Min inv. mass of all opp. sign same-flavor lepton pairs
> 101.2 = mll_max_sf
> #***********************************************************************
> # Fermion-photon recombination parameters *
> # If Rphreco=0, no recombination is performed *
> #***********************************************************************
> 0.1 = Rphreco ! Minimum fermion-photon distance for recombination
> -1.0 = etaphreco ! Maximum abs(pseudo-rap) for photons to be recombined (a value .lt.0 means no cut)
> True = lepphreco ! Recombine photons and leptons together
> True = quarkphreco ! Recombine photons and quarks together
> #***********************************************************************
> # Photon-isolation cuts, according to hep-ph/9801442 *
> # Not applied if gamma_is_j *
> # When ptgmin=0, all the other parameters are ignored *
> # More specific cuts can be specified in SubProcesses/cuts.f *
> #***********************************************************************
> 20.0 = ptgmin ! Min photon transverse momentum
> -1.0 = etagamma ! Max photon abs(pseudo-rap)
> 0.4 = R0gamma ! Radius of isolation code
> 1.0 = xn ! n parameter of eq.(3.4) in hep-ph/9801442
> 1.0 = epsgamma ! epsilon_gamma parameter of eq.(3.4) in hep-ph/9801442
> True = isoEM ! isolate photons from EM energy (photons and leptons)
> #***********************************************************************
> # Cuts associated to MASSIVE particles identified by their PDG codes. *
> # All cuts are applied to both particles and anti-particles, so use *
> # POSITIVE PDG CODES only. Example of the syntax is {6 : 100} or *
> # {6:100, 25:200} for multiple particles *
> #***********************************************************************
> {} = pt_min_pdg ! Min pT for a massive particle
> {} = pt_max_pdg ! Max pT for a massive particle
> {} = mxx_min_pdg ! inv. mass for any pair of (anti)particles
> #***********************************************************************
> # Use PineAPPL to generate PDF-independent fast-interpolation grid *
> # (https://zenodo.org/record/3992765#.X2EWy5MzbVo) *
> #***********************************************************************
> False = pineappl ! PineAPPL switch
> #***********************************************************************
>
>
>

matteo maltoni (matteo-maltoni) said :
#2

Hi Olivier,

Thank you for your prompt answer.

The runs usually last one hour or so before failing, while I set a maximum time of 24 hours in the .sh file.

Best,

Matteo

matteo maltoni (matteo-maltoni) said :
#3

This is the last part of the .out file:

INFO: Setting up grids
INFO: Idle: 79, Running: 113, Completed: 0 [ 0.24s ]
INFO: Idle: 51, Running: 113, Completed: 28 [ 1m 0s ]
INFO: Idle: 18, Running: 113, Completed: 61 [ 2m 1s ]
INFO: Idle: 0, Running: 82, Completed: 110 [ 3m 1s ]
INFO: Idle: 0, Running: 36, Completed: 156 [ 4m 1s ]
INFO: Idle: 0, Running: 18, Completed: 174 [ 5m 1s ]
INFO: Idle: 0, Running: 10, Completed: 182 [ 6m 2s ]
INFO: Idle: 0, Running: 7, Completed: 185 [ 7m 2s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 8m 2s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 9m 2s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 10m 3s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 11m 3s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 12m 3s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 13m 3s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 14m 3s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 15m 4s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 16m 4s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 17m 4s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 18m 4s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 19m 4s ]
INFO: Idle: 0, Running: 6, Completed: 186 [ 20m 5s ]
WARNING: resubmit job (for the 1 times) 
WARNING: resubmit job (for the 1 times) 
WARNING: resubmit job (for the 1 times) 
WARNING: resubmit job (for the 1 times) 
INFO: Idle: 4, Running: 2, Completed: 190 [ 21m 6s ]
WARNING: resubmit job (for the 1 times) 
INFO: Idle: 1, Running: 5, Completed: 191 [ 22m 6s ]
INFO: Idle: 0, Running: 6, Completed: 191 [ 23m 6s ]
WARNING: resubmit job (for the 1 times) 
INFO: Idle: 1, Running: 5, Completed: 192 [ 24m 7s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 25m 7s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 26m 7s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 27m 7s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 28m 7s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 29m 8s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 30m 8s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 31m 8s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 32m 8s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 33m 8s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 34m 9s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 35m 9s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 36m 9s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 37m 9s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 38m 10s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 39m 10s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 40m 10s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 41m 10s ]
INFO: Idle: 0, Running: 6, Completed: 192 [ 42m 10s ]
CRITICAL: Fail to run correctly job 69893949.
            with option: {'prog': 'ajob1', 'argument': ['1', 'all', '0', '0'], 'cwd': '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep', 'stdout': None, 'stderr': None, 'log': None, 'input_files': ['/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/randinit', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/symfact.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/iproc.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/initial_states_map.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/configs_and_props_info.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/leshouche_info.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/FKS_params.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/MadLoop5_resources.tar.gz', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/madevent_mintFO', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G1', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/lib/Pdfdata/NNPDF23nlo_as_0119_qed_mem0.grid'], 'output_files': ['all_G1'], 'required_output': ['all_G1/results.dat', 'all_G1/res_0.dat', 'all_G1/log_MINT0.txt', 'all_G1/mint_grids', 'all_G1/grid.MC_integer'], 'nb_submit': 1, 'time_check': 1622029197.3903582}
            file missing: /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G1/results.dat
            Fails 1 times
            No resubmition. 
CRITICAL: Fail to run correctly job 69893950.
            with option: {'prog': 'ajob1', 'argument': ['3', 'all', '0', '0'], 'cwd': '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep', 'stdout': None, 'stderr': None, 'log': None, 'input_files': ['/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/randinit', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/symfact.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/iproc.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/initial_states_map.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/configs_and_props_info.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/leshouche_info.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/FKS_params.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/MadLoop5_resources.tar.gz', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/madevent_mintFO', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G3', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/lib/Pdfdata/NNPDF23nlo_as_0119_qed_mem0.grid'], 'output_files': ['all_G3'], 'required_output': ['all_G3/results.dat', 'all_G3/res_0.dat', 'all_G3/log_MINT0.txt', 'all_G3/mint_grids', 'all_G3/grid.MC_integer'], 'nb_submit': 1, 'time_check': 1622029197.3905582}
            file missing: /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G3/results.dat
            Fails 1 times
            No resubmition. 
CRITICAL: Fail to run correctly job 69893951.
            with option: {'prog': 'ajob1', 'argument': ['2', 'all', '0', '0'], 'cwd': '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep', 'stdout': None, 'stderr': None, 'log': None, 'input_files': ['/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/randinit', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/symfact.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/iproc.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/initial_states_map.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/configs_and_props_info.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/leshouche_info.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/FKS_params.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/MadLoop5_resources.tar.gz', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/madevent_mintFO', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/all_G2', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/lib/Pdfdata/NNPDF23nlo_as_0119_qed_mem0.grid'], 'output_files': ['all_G2'], 'required_output': ['all_G2/results.dat', 'all_G2/res_0.dat', 'all_G2/log_MINT0.txt', 'all_G2/mint_grids', 'all_G2/grid.MC_integer'], 'nb_submit': 1, 'time_check': 1622029197.3908014}
            file missing: /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uux_uuxemep/all_G2/results.dat
            Fails 1 times
            No resubmition. 
CRITICAL: Fail to run correctly job 69893952.
            with option: {'prog': 'ajob1', 'argument': ['2', 'all', '0', '0'], 'cwd': '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep', 'stdout': None, 'stderr': None, 'log': None, 'input_files': ['/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/randinit', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/symfact.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/iproc.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/initial_states_map.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/configs_and_props_info.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/leshouche_info.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/FKS_params.dat', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/MadLoop5_resources.tar.gz', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/madevent_mintFO', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/all_G2', '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/lib/Pdfdata/NNPDF23nlo_as_0119_qed_mem0.grid'], 'output_files': ['all_G2'], 'required_output': ['all_G2/results.dat', 'all_G2/res_0.dat', 'all_G2/log_MINT0.txt', 'all_G2/mint_grids', 'all_G2/grid.MC_integer'], 'nb_submit': 1, 'time_check': 1622029197.391044}
            file missing: /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uxu_uuxemep/all_G2/results.dat
            Fails 1 times
            No resubmition. 
INFO: Idle: 0, Running: 2, Completed: 196 [ 43m 11s ]
INFO: All jobs finished
INFO: Idle: 0, Running: 0, Completed: 198 [ 44m 11s ]
Command "launch auto " interrupted with error:
FileNotFoundError : [Errno 2] No such file or directory: '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G1/log_MINT0.txt'
Please report this bug on https://bugs.launchpad.net/mg5amcnlo
More information is found in '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/run_02_tag_1_debug.log'.
Please attach this file to your report.
INFO:
quit
INFO:
quit
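
To see how long the run sat with the same jobs still running, the "Idle / Running / Completed" status lines above can be scanned automatically. This is only a minimal sketch: the function names are made up for illustration, and the time-stamp format (optional "h", "m", "s" fields) is assumed from the lines shown in this .out file.

```python
import re

# Matches status lines such as
# "INFO: Idle: 0, Running: 6, Completed: 186 [ 9m 2s ]"
STATUS_RE = re.compile(
    r"Idle:\s*(\d+),\s*Running:\s*(\d+),\s*Completed:\s*(\d+)\s*\[\s*(.*?)\s*\]"
)


def to_seconds(stamp):
    """Convert a stamp like '9m 2s' or '0.24s' to seconds (format assumed)."""
    total = 0.0
    for value, unit in re.findall(r"([\d.]+)\s*([hms])", stamp):
        total += float(value) * {"h": 3600, "m": 60, "s": 1}[unit]
    return total


def longest_stall(lines):
    """Return (seconds, completed, running) for the longest stretch of
    status lines during which the Completed counter did not change."""
    best = (0.0, None, None)
    last_completed = None
    start_time = 0.0
    for line in lines:
        match = STATUS_RE.search(line)
        if not match:
            continue  # WARNING/CRITICAL lines and anything else are skipped
        running = int(match.group(2))
        completed = int(match.group(3))
        now = to_seconds(match.group(4))
        if completed != last_completed:
            # Progress was made: restart the clock from this line.
            last_completed, start_time = completed, now
        elif now - start_time > best[0]:
            best = (now - start_time, completed, running)
    return best
```

Calling `longest_stall(open("ppZlljjEW.out"))` on the file named in the submission script would return the longest plateau in seconds, together with the Completed and Running counts during it.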

Olivier Mattelaer (olivier-mattelaer) said :
#4

Hi,

I see in your log file a huge number of lines like:
"""
rplotNPQCDQED
       80200
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       80000
 orderplotNPQCDQED
       80000
 orderplotNPQCDQED
       80000
 orderplotNPQCDQED
       80000
 orderplotNPQCDQED
       80000
 orderplotNPQCDQED
       80000
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       80200
 orderplotNPQCDQED
       802
"""

I guess that you are responsible for those lines. I could be wrong, but I would bet that they are what is causing the issue:
either they slow down the NFS filesystem or, if those files are stored in RAM, your program is killed for lack of RAM.

So I would advise you to fix your code to remove those printouts.

Cheers,

Olivier

> On 26 May 2021, at 22:15, matteo maltoni <email address hidden> wrote:
>
> /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G1/
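
A quick way to check how much of a log.txt is taken up by such printouts is to count the lines carrying the marker string. The sketch below is only an illustration: the function name is made up, and the marker "orderplotNPQCDQED" is simply copied from the excerpt above.

```python
from collections import Counter


def count_marker_lines(lines, markers=("orderplotNPQCDQED",)):
    """Return (total_lines, Counter of lines containing each marker)."""
    counts = Counter()
    total = 0
    for line in lines:
        total += 1
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return total, counts


# Example use on one channel log (path taken from the thread):
# with open("SubProcesses/P0_uu_uuemep/all_G1/log.txt", errors="replace") as fh:
#     total, counts = count_marker_lines(fh)
#     print(total, dict(counts))
```

If the marker count is a large fraction of the total, the printouts dominate the file and are worth removing before looking for other causes.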

matteo maltoni (matteo-maltoni) said :
#5

Dear Olivier,

I removed the printouts, but the same error still shows up.
However, I think that a lack of RAM may indeed be the cause: in the subprocess folders that are missing the results.dat file, the log.txt file seems to be incomplete.

Do you have any advice to solve this?

I paste my .sh script below.

Thank you,

Matteo

#!/bin/bash
# Submission script for Lemaitre3
#SBATCH --job-name=ppZlljjEW
#SBATCH --array=1
#SBATCH --time=0-05:00:00 # days-hh:mm:ss
#
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=95000 # megabytes
#SBATCH --partition=batch
#
#SBATCH --<email address hidden>
#SBATCH --mail-type=END,FAIL
#
#SBATCH --comment=ppZlljjEW
#SBATCH --output=ppZlljjEW.out

echo "Task ID: $SLURM_ARRAY_TASK_ID"

module load Python/3.7.4-GCCcore-8.3.0
module load gnuplot/5.2.8-GCCcore-8.3.0
./MG5_aMC_v3_1_0/bin/mg5_aMC /home/ucl/cp3/mmaltoni/MG5_aMC_v3_1_0/bin/ppZlljj.txt
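
To list every channel folder that is missing output, and not only the ones named in the CRITICAL messages, the SubProcesses directory can be scanned for the files that appear under 'required_output' in the error above. This is a minimal sketch: the function name is made up, and the P*/all_G* layout is assumed from the paths in the log.

```python
from pathlib import Path

# File names copied from the 'required_output' list in the CRITICAL message.
REQUIRED = (
    "results.dat",
    "res_0.dat",
    "log_MINT0.txt",
    "mint_grids",
    "grid.MC_integer",
)


def missing_outputs(subprocesses_dir, required=REQUIRED):
    """Map each P*/all_G* folder to the required outputs it does not contain."""
    root = Path(subprocesses_dir)
    report = {}
    for channel in sorted(root.glob("P*/all_G*")):
        if not channel.is_dir():
            continue
        missing = [name for name in required if not (channel / name).exists()]
        if missing:
            report[str(channel.relative_to(root))] = missing
    return report


# Example use (directory name taken from the thread):
# for folder, files in missing_outputs("pp_Z_lljj_EW/fixed_order/SubProcesses").items():
#     print(folder, "is missing", ", ".join(files))
```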

Olivier Mattelaer (olivier-mattelaer) said :
#6

So it likely means that they are killed (by Slurm?).

Maybe you should modify madgraph/various/cluster.py to receive an email for each failed subjob,
by asking for
 --<email address hidden>
--mail-type=FAIL

This might contain information on why those jobs are killed.
If not, ask (in the same file) for the Slurm log of each job to be written to disk.

Cheers,

Olivier

> On 27 May 2021, at 14:15, matteo maltoni <email address hidden> wrote:
>
> 5_aMC_v3_1_0/bin/
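
Besides the email, Slurm's accounting can also tell whether a subjob was killed. Assuming accounting is enabled on the cluster, the output of `sacct -j <jobid> --parsable2 --format=JobID,State,ExitCode,MaxRSS,Elapsed` could be filtered as in the sketch below. The function name is made up for illustration, and the text is assumed to be the pipe-separated table that `--parsable2` prints, with a header row first.

```python
def abnormal_jobs(sacct_text):
    """Parse `sacct --parsable2` output and return the rows (as dicts)
    whose State is anything other than COMPLETED."""
    rows = [
        line.split("|")
        for line in sacct_text.strip().splitlines()
        if line.strip()
    ]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    records = [dict(zip(header, row)) for row in body]
    # States such as TIMEOUT, OUT_OF_MEMORY, FAILED or "CANCELLED by <uid>"
    # all point to a job that did not finish normally.
    return [r for r in records if not r.get("State", "").startswith("COMPLETED")]
```

A state of OUT_OF_MEMORY or TIMEOUT for one of the failing job IDs would confirm that Slurm stopped it; if every row is COMPLETED, the problem is more likely inside the job itself.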

matteo maltoni (matteo-maltoni) said :
#7

Dear Olivier,

It doesn't seem that the jobs are killed by the cluster: they run for much longer than the others and are resubmitted, until they fail.
There's no evident error in the log files of the subprocesses whose jobs fail; they just seem to be incomplete.

I paste the end of one of them below: the final part, with the final results, is missing.
I tried to reduce the requested accuracy, but nothing changed.

Do you know something else I could try?

Best,

Matteo

Splitorders 6
        NP: 4
       QCD: 2
       QED: 8
 ---- POLES CANCELLED ----
  COEFFICIENT DOUBLE POLE:
        MadFKS: -3.4137760189331631E-010 OLP: -3.4137760189327913E-010
  COEFFICIENT SINGLE POLE:
        MadFKS: -5.9687950538506703E-010 OLP: -5.9687950538500137E-010
 REAL 1: keeping split order 1
 REAL 1: keeping split order 2
 REAL 1: keeping split order 3
tau_min 1 2 : 0.00000E+00 -- 0.14119E+03
tau_min 2 2 : 0.00000E+00 -- 0.14119E+03
tau_min 3 2 : 0.13120E+03 0.13120E+03 0.14119E+03
tau_min 4 2 : 0.13120E+03 0.13120E+03 0.14119E+03
tau_min 5 2 : 0.00000E+00 -- 0.14119E+03
tau_min 6 2 : 0.00000E+00 -- 0.14119E+03
 REAL 2: keeping split order 1
 REAL 2: keeping split order 2
 REAL 2: keeping split order 3
 REAL 3: keeping split order 1
 REAL 3: keeping split order 2
 REAL 3: keeping split order 3
tau_min 1 3 : 0.00000E+00 -- 0.14119E+03
tau_min 2 3 : 0.00000E+00 -- 0.14119E+03
tau_min 3 3 : 0.13120E+03 0.13120E+03 0.14119E+03
tau_min 4 3 : 0.13120E+03 0.13120E+03 0.14119E+03
tau_min 5 3 : 0.00000E+00 -- 0.14119E+03
tau_min 6 3 : 0.00000E+00 -- 0.14119E+03
tau_min 1 4 : 0.00000E+00 -- 0.14119E+03
tau_min 2 4 : 0.00000E+00 -- 0.14119E+03
tau_min 3 4 : 0.13120E+03 0.13120E+03 0.14119E+03
tau_min 4 4 : 0.13120E+03 0.13120E+03 0.14119E+03
tau_min 5 4 : 0.00000E+00 -- 0.14119E+03
.... (similar lines)
tau_min 3 18 : 0.13120E+03 0.13120E+03 0.14119E+03
tau_min 4 18 : 0.13120E+03 0.13120E+03 0.14119E+03
tau_min 5 18 : 0.00000E+00 -- 0.14119E+03
tau_min 6 18 : 0.00000E+00 -- 0.14119E+03
ABS integral = 0.1804E+00 +/- 0.6590E-01 ( 36.540 %)
Integral = 0.1716E+00 +/- 0.6590E-01 ( 38.401 %)
Virtual = -.6087E-01 +/- 0.2384E-01 ( 39.160 %)
Virtual ratio = -.1190E+01 +/- 0.1934E-01 ( 1.625 %)
ABS virtual = 0.7784E-01 +/- 0.2940E-01 ( 37.777 %)
Born = 0.1545E+00 +/- 0.6856E-01 ( 44.387 %)
V 4 = -.4992E-01 +/- 0.2442E-01 ( 48.911 %)
B 4 = 0.1401E+00 +/- 0.7407E-01 ( 52.851 %)
V 5 = 0.3595E-02 +/- 0.4756E-02 ( 132.309 %)
B 5 = -.1128E-01 +/- 0.1065E-01 ( 94.393 %)
V 6 = -.1455E-01 +/- 0.6139E-02 ( 42.202 %)
B 6 = 0.2559E-01 +/- 0.1001E-01 ( 39.131 %)
Chi^2 per d.o.f. 0.0000E+00
accumulated results ABS integral = 0.1804E+00 +/- 0.6590E-01 ( 36.540 %)
accumulated results Integral = 0.1716E+00 +/- 0.6590E-01 ( 38.401 %)
accumulated results Virtual = -.6087E-01 +/- 0.2384E-01 ( 39.160 %)
accumulated results Virtual ratio = -.1190E+01 +/- 0.1934E-01 ( 1.625 %)
accumulated results ABS virtual = 0.7784E-01 +/- 0.2940E-01 ( 37.777 %)
accumulated results Born = 0.1545E+00 +/- 0.6856E-01 ( 44.387 %)
accumulated results V 4 = -.4992E-01 +/- 0.2442E-01 ( 48.911 %)
accumulated results B 4 = 0.1401E+00 +/- 0.7407E-01 ( 52.851 %)
accumulated results V 5 = 0.3595E-02 +/- 0.4756E-02 ( 132.309 %)
accumulated results B 5 = -.1128E-01 +/- 0.1065E-01 ( 94.393 %)
accumulated results V 6 = -.1455E-01 +/- 0.6139E-02 ( 42.202 %)
accumulated results B 6 = 0.2559E-01 +/- 0.1001E-01 ( 39.131 %)
accumulated result Chi^2 per DoF = 0.0000E+00
  1: 0 1 2
  2: 0 1 2 3 4
channel 1 : 20 T 3636 0 0.3794E-02 0.3734E-02 0.6534E+00
channel 2 : 20 T 2424 0 0.1228E-02 0.1136E-02 0.1000E+01
channel 3 : 21 T 3636 0 0.6767E-01 0.6499E-01 0.1000E+01
channel 4 : 21 T 2020 0 0.3467E-01 0.3254E-01 0.6613E+00
channel 5 : 22 T 10100 0 0.2420E-04 0.1479E-04 0.8100E+00
channel 6 : 22 T 8888 0 0.2280E-04 0.1094E-04 0.1000E+01
channel 7 : 23 T 10100 0 0.1453E-04 0.1272E-04 0.4861E+00
channel 8 : 23 T 7676 0 0.3918E-05 0.3096E-05 0.1000E+01
channel 9 : 24 T 4040 0 0.2658E-02 0.2368E-02 0.1000E+01
channel 10 : 24 T 2020 0 0.1704E-02 0.1384E-02 0.4100E+00
channel 11 : 25 T 3636 0 0.3315E-02 0.2422E-02 0.1000E+01
channel 12 : 25 T 2828 0 0.5284E-01 0.5218E-01 0.3495E+00
channel 13 : 26 T 9696 0 0.9613E-05 0.9178E-05 0.1000E+01
channel 14 : 26 T 8484 0 0.3962E-05 0.3219E-05 0.3743E+00
channel 15 : 27 T 8080 0 0.3196E-04 0.2847E-04 0.1000E+01
channel 16 : 27 T 7272 0 0.1230E-03 -.4530E-04 0.4051E+00
channel 17 : 28 T 3636 0 0.9552E-02 0.9150E-02 0.1000E+01
channel 18 : 28 T 2424 0 0.2676E-02 0.1675E-02 0.9674E+00
 ------- iteration 2
 Update # PS points (even_rn): 14544 --> 14544
ABS integral = 0.1217E+00 +/- 0.1727E-01 ( 14.183 %)
Integral = 0.1038E+00 +/- 0.1727E-01 ( 16.636 %)
Virtual = -.7242E-03 +/- 0.2800E-02 ( 386.581 %)
Virtual ratio = -.3812E+00 +/- 0.2611E-02 ( 0.685 %)
ABS virtual = 0.2228E-01 +/- 0.4702E-02 ( 21.106 %)
Born = 0.7508E-01 +/- 0.1609E-01 ( 21.427 %)
V 4 = -.2854E-02 +/- 0.1318E-02 ( 46.181 %)
B 4 = 0.5162E-01 +/- 0.8050E-02 ( 15.596 %)
V 5 = 0.2587E-02 +/- 0.3096E-02 ( 119.653 %)
B 5 = 0.4972E-02 +/- 0.6394E-02 ( 128.598 %)
V 6 = -.4573E-03 +/- 0.1800E-02 ( 393.529 %)
B 6 = 0.1849E-01 +/- 0.5917E-02 ( 32.005 %)
Chi^2= 0.4965E+00
accumulated results ABS integral = 0.1339E+00 +/- 0.1670E-01 ( 12.474 %)
accumulated results Integral = 0.1179E+00 +/- 0.1671E-01 ( 14.171 %)
accumulated results Virtual = -.7046E-02 +/- 0.2781E-02 ( 39.463 %)
accumulated results Virtual ratio = -.4774E+00 +/- 0.2587E-02 ( 0.542 %)
accumulated results ABS virtual = 0.2994E-01 +/- 0.4643E-02 ( 15.509 %)
accumulated results Born = 0.9016E-01 +/- 0.1566E-01 ( 17.370 %)
accumulated results V 4 = -.5265E-02 +/- 0.1316E-02 ( 24.999 %)
accumulated results B 4 = 0.6030E-01 +/- 0.8003E-02 ( 13.273 %)
accumulated results V 5 = 0.2985E-02 +/- 0.2595E-02 ( 86.934 %)
accumulated results B 5 = -.1126E-02 +/- 0.5481E-02 ( 486.943 %)
accumulated results V 6 = -.3651E-02 +/- 0.1727E-02 ( 47.299 %)
accumulated results B 6 = 0.2112E-01 +/- 0.5094E-02 ( 24.114 %)
accumulated result Chi^2 per DoF = 0.4965E+00
  1: 0 1 2
  2: 0 1 2 3 4
channel 1 : 20 F 990 3636 0.4660E-02 0.2856E-02 0.2653E+00
channel 2 : 20 F 291 2424 0.5523E-02 0.5499E-02 0.3868E+00
channel 3 : 21 T 16351 3636 0.4056E-01 0.3508E-01 0.6381E+00
channel 4 : 21 T 8324 2020 0.4634E-01 0.4143E-01 0.1695E+00
channel 5 : 22 F 8 10100 0.6215E-05 0.4262E-05 0.8240E+00
channel 6 : 22 F 5 8888 0.9544E-05 -.2537E-05 0.2500E+00
channel 7 : 23 F 3 10100 0.3018E-05 0.2641E-05 0.9722E+00
channel 8 : 23 F 1 7676 0.8134E-06 0.6429E-06 0.1000E+01
channel 9 : 24 F 623 4040 0.2808E-02 0.2630E-02 0.2517E+00
channel 10 : 24 F 421 2020 0.3135E-02 0.2426E-02 0.2518E+00
channel 11 : 25 F 784 3636 0.4271E-02 0.3676E-02 0.7307E+00
channel 12 : 25 T 12817 2828 0.1973E-01 0.1854E-01 0.8737E-01
channel 13 : 26 F 2 9696 0.7984E-05 0.7894E-05 0.2500E+00
channel 14 : 26 F 0 8484 0.8226E-06 0.6684E-06 0.7486E+00
channel 15 : 27 F 7 8080 0.2743E-04 0.2670E-04 0.4048E+00
channel 16 : 27 F 32 7272 0.2620E-04 -.8745E-05 0.2862E+00
channel 17 : 28 F 2307 3636 0.4223E-02 0.4064E-02 0.3662E+00
channel 18 : 28 F 666 2424 0.2581E-02 0.1669E-02 0.2419E+00
 ------- iteration 3
 Update # PS points (even_rn): 29088 --> 24576
ABS integral = 0.1268E+00 +/- 0.7388E-02 ( 5.824 %)
Integral = 0.8245E-01 +/- 0.7396E-02 ( 8.971 %)
Virtual = -.4737E-02 +/- 0.2135E-02 ( 45.070 %)
Virtual ratio = -.4174E+00 +/- 0.2636E-02 ( 0.632 %)
ABS virtual = 0.1993E-01 +/- 0.2249E-02 ( 11.283 %)
Born = 0.2780E-01 +/- 0.1854E-02 ( 6.669 %)
V 4 = -.2319E-02 +/- 0.8041E-03 ( 34.679 %)
B 4 = 0.2409E-01 +/- 0.1725E-02 ( 7.161 %)
V 5 = -.8549E-04 +/- 0.4565E-03 ( 533.939 %)
B 5 = -.8336E-03 +/- 0.5285E-03 ( 63.397 %)
V 6 = -.2333E-02 +/- 0.1703E-02 ( 73.019 %)
B 6 = 0.4538E-02 +/- 0.5436E-03 ( 11.980 %)
Chi^2= 0.8607E-01
accumulated results ABS integral = 0.1290E+00 +/- 0.6757E-02 ( 5.237 %)
accumulated results Integral = 0.9333E-01 +/- 0.6763E-02 ( 7.247 %)
accumulated results Virtual = -.5740E-02 +/- 0.1693E-02 ( 29.502 %)
accumulated results Virtual ratio = -.4477E+00 +/- 0.1846E-02 ( 0.412 %)
accumulated results ABS virtual = 0.2320E-01 +/- 0.2024E-02 ( 8.725 %)
accumulated results Born = 0.3440E-01 +/- 0.1841E-02 ( 5.352 %)
accumulated results V 4 = -.3436E-02 +/- 0.6862E-03 ( 19.970 %)
accumulated results B 4 = 0.3051E-01 +/- 0.1686E-02 ( 5.527 %)
accumulated results V 5 = 0.3738E-03 +/- 0.4496E-03 ( 120.264 %)
accumulated results B 5 = -.8593E-03 +/- 0.5260E-03 ( 61.218 %)
accumulated results V 6 = -.2987E-02 +/- 0.1213E-02 ( 40.594 %)
accumulated results B 6 = 0.6138E-02 +/- 0.5406E-03 ( 8.808 %)
accumulated result Chi^2 per DoF = 0.2913E+00
  1: 0 1 2
  2: 0 1 2 3 4
channel 1 : 20 T 3535 3636 0.4656E-02 0.2741E-02 0.6632E-01
channel 2 : 20 T 3364 2424 0.3146E-02 0.3064E-02 0.1972E+00
channel 3 : 21 T 22352 16351 0.3479E-01 0.2464E-01 0.5238E+00
channel 4 : 21 T 25492 8324 0.5002E-01 0.3755E-01 0.1061E+00
channel 5 : 22 F 14 10100 0.2188E-05 0.1590E-05 0.2060E+00
channel 6 : 22 F 12 8888 0.4353E-05 0.6472E-06 0.5000E+00
channel 7 : 23 F 3 10100 0.9254E-06 0.8103E-06 0.1000E+01
channel 8 : 23 F 1 7676 0.2494E-06 0.1973E-06 0.1000E+01
channel 9 : 24 F 2182 4040 0.3381E-02 0.1522E-02 0.6293E-01
channel 10 : 24 T 2154 2020 0.5759E-02 0.4527E-02 0.6296E-01
channel 11 : 25 F 3105 3636 0.6534E-02 0.2729E-02 0.1827E+00
channel 12 : 25 F 10822 12817 0.1421E-01 0.1142E-01 0.6189E-01
channel 13 : 26 F 4 9696 0.2448E-05 0.2422E-05 0.5000E+00
channel 14 : 26 F 3 8484 0.8302E-05 0.8253E-05 0.3047E+00
channel 15 : 27 F 22 8080 0.1118E-04 0.1096E-04 0.1012E+00
channel 16 : 27 F 46 7272 0.1124E-04 0.5225E-06 0.3260E+00
channel 17 : 28 T 4612 3636 0.3078E-02 0.2838E-02 0.1224E+00
channel 18 : 28 F 2145 2424 0.3398E-02 0.2268E-02 0.9765E-01
 ------- iteration 4
 Update # PS points (even_rn): 58176 --> 57344
ABS integral = 0.2061E+00 +/- 0.3103E-01 ( 15.061 %)
Integral = 0.1498E+00 +/- 0.3104E-01 ( 20.716 %)
Virtual = -.6913E-03 +/- 0.2128E-02 ( 307.786 %)
Virtual ratio = -.4412E+00 +/- 0.2504E-02 ( 0.568 %)
ABS virtual = 0.2201E-01 +/- 0.2608E-02 ( 11.845 %)
Born = 0.2249E-01 +/- 0.1615E-02 ( 7.180 %)
V 4 = -.1017E-02 +/- 0.9095E-03 ( 89.466 %)
B 4 = 0.1808E-01 +/- 0.1206E-02 ( 6.674 %)
V 5 = 0.4908E-03 +/- 0.3475E-03 ( 70.815 %)
B 5 = -.3895E-03 +/- 0.2050E-03 ( 52.646 %)
V 6 = -.1656E-03 +/- 0.2130E-02 ( ******* %)
B 6 = 0.4801E-02 +/- 0.1020E-02 ( 21.248 %)
Chi^2= 0.4156E+01
accumulated results ABS integral = 0.1428E+00 +/- 0.6602E-02 ( 4.624 %)
accumulated results Integral = 0.1034E+00 +/- 0.6608E-02 ( 6.389 %)
accumulated results Virtual = -.3503E-02 +/- 0.1325E-02 ( 37.829 %)
accumulated results Virtual ratio = -.4449E+00 +/- 0.1486E-02 ( 0.334 %)
accumulated results ABS virtual = 0.2268E-01 +/- 0.1599E-02 ( 7.050 %)
accumulated results Born = 0.2805E-01 +/- 0.1214E-02 ( 4.327 %)
accumulated results V 4 = -.2396E-02 +/- 0.5478E-03 ( 22.866 %)
accumulated results B 4 = 0.2326E-01 +/- 0.9812E-03 ( 4.218 %)
accumulated results V 5 = 0.4398E-03 +/- 0.2750E-03 ( 62.522 %)
accumulated results B 5 = -.5212E-03 +/- 0.1910E-03 ( 36.652 %)
accumulated results V 6 = -.1964E-02 +/- 0.1054E-02 ( 53.669 %)
accumulated results B 6 = 0.5675E-02 +/- 0.4777E-03 ( 8.417 %)
accumulated result Chi^2 per DoF = 0.1580E+01
accumulated results last 3 iterations ABS integral = 0.1398E+00 +/- 0.6635E-02 ( 4.746 %)
accumulated results last 3 iterations Integral = 0.9981E-01 +/- 0.6642E-02 ( 6.654 %)
accumulated result last 3 iterrations Chi^2 per DoF = 0.2299E+01
  1: 0 1 2
  2: 0 1 2 3 4
channel 1 : 20 T 4084 3535 0.4595E-02 0.2443E-02 0.1326E+00
channel 2 : 20 F 2764 3364 0.2973E-02 0.2871E-02 0.1174E+00
channel 3 : 21 T 31120 22352 0.3519E-01 0.2471E-01 0.2159E+00
channel 4 : 21 T 44415 25492 0.4995E-01 0.3772E-01 0.5130E-01
channel 5 : 22 F 16 10100 0.1796E-05 0.1305E-05 0.4120E+00
channel 6 : 22 F 14 8888 0.3575E-05 0.5314E-06 0.1250E+00
channel 7 : 23 F 4 10100 0.7599E-06 0.6653E-06 0.1000E+01
channel 8 : 23 F 1 7676 0.2048E-06 0.1620E-06 0.1000E+01
channel 9 : 24 T 5102 4040 0.3687E-02 0.1670E-02 0.1573E-01
channel 10 : 24 T 5088 2154 0.5759E-02 0.4547E-02 0.1743E-01
channel 11 : 25 T 8807 3636 0.9338E-02 0.5891E-02 0.5116E-01
channel 12 : 25 T 23583 12817 0.1906E-01 0.1306E-01 0.1547E-01
channel 13 : 26 F 7 9696 0.2011E-05 0.1989E-05 0.1000E+01
channel 14 : 26 F 6 8484 0.6818E-05 0.6776E-05 0.6095E+00
channel 15 : 27 F 33 8080 0.9702E-05 0.9519E-05 0.2530E-01
channel 16 : 27 F 57 7272 0.9485E-05 0.6816E-06 0.8151E-01
channel 17 : 28 F 2770 4612 0.3834E-02 0.3185E-02 0.3059E-01
channel 18 : 28 T 5173 2424 0.8361E-02 0.7315E-02 0.2441E-01
 ------- iteration 5
 Update # PS points (even_rn): 116352 --> 114688
ABS integral = 0.1795E+00 +/- 0.1646E-01 ( 9.170 %)
Integral = 0.1250E+00 +/- 0.1646E-01 ( 13.173 %)
Virtual = -.5114E-02 +/- 0.3884E-02 ( 75.954 %)
Virtual ratio = -.4515E+00 +/- 0.3912E-02 ( 0.866 %)
ABS virtual = 0.2566E-01 +/- 0.4363E-02 ( 17.001 %)
Born = 0.9500E-02 +/- 0.8450E-03 ( 8.895 %)
V 4 = -.2095E-02 +/- 0.2518E-02 ( 120.183 %)
B 4 = 0.8249E-02 +/- 0.8521E-03 ( 10.330 %)
V 5 = -.6724E-04 +/- 0.4826E-03 ( 717.708 %)
B 5 = -.2291E-03 +/- 0.9342E-04 ( 40.782 %)
V 6 = -.2951E-02 +/- 0.2961E-02 ( 100.323 %)
B 6 = 0.1480E-02 +/- 0.1758E-03 ( 11.880 %)
Chi^2= 0.2530E+01
accumulated results ABS integral = 0.1533E+00 +/- 0.6127E-02 ( 3.997 %)
accumulated results Integral = 0.1096E+00 +/- 0.6133E-02 ( 5.595 %)
accumulated results Virtual = -.3912E-02 +/- 0.1254E-02 ( 32.053 %)
accumulated results Virtual ratio = -.4467E+00 +/- 0.1389E-02 ( 0.311 %)
accumulated results ABS virtual = 0.2348E-01 +/- 0.1501E-02 ( 6.394 %)
accumulated results Born = 0.1711E-01 +/- 0.6935E-03 ( 4.052 %)
accumulated results V 4 = -.2342E-02 +/- 0.5352E-03 ( 22.855 %)
accumulated results B 4 = 0.1523E-01 +/- 0.6433E-03 ( 4.225 %)
accumulated results V 5 = 0.2558E-03 +/- 0.2389E-03 ( 93.412 %)
accumulated results B 5 = -.3250E-03 +/- 0.8393E-04 ( 25.821 %)
accumulated results V 6 = -.2223E-02 +/- 0.9928E-03 ( 44.664 %)
accumulated results B 6 = 0.2609E-02 +/- 0.1650E-03 ( 6.325 %)
accumulated result Chi^2 per DoF = 0.1817E+01
accumulated results last 3 iterations ABS integral = 0.1534E+00 +/- 0.6586E-02 ( 4.292 %)
accumulated results last 3 iterations Integral = 0.1044E+00 +/- 0.6593E-02 ( 6.315 %)
accumulated result last 3 iterrations Chi^2 per DoF = 0.3375E+01
  1: 0 1 2
  2: 0 1 2 3 4
channel 1 : 20 T 3743 4084 0.4016E-02 0.2211E-02 0.7878E-01
channel 2 : 20 T 5180 3364 0.2670E-02 0.2383E-02 0.2088E+00
channel 3 : 21 T 28031 31120 0.3522E-01 0.2409E-01 0.2429E+00
channel 4 : 21 T 40170 44415 0.5419E-01 0.4023E-01 0.3042E-01
channel 5 : 22 F 17 10100 0.1528E-05 0.1178E-05 0.1030E+00
channel 6 : 22 F 16 8888 0.2551E-05 0.3792E-06 0.2500E+00
channel 7 : 23 F 4 10100 0.5424E-06 0.4748E-06 0.1000E+01
channel 8 : 23 F 1 7676 0.1462E-06 0.1156E-06 0.1000E+01
channel 9 : 24 F 2898 5102 0.4563E-02 0.2120E-02 0.8761E-02
channel 10 : 24 T 4632 5088 0.6264E-02 0.4849E-02 0.5472E-02
channel 11 : 25 F 7427 8807 0.9375E-02 0.6094E-02 0.3114E-01
channel 12 : 25 F 15430 23583 0.1909E-01 0.1192E-01 0.2405E-01
channel 13 : 26 F 8 9696 0.1435E-05 0.1419E-05 0.1000E+01
channel 14 : 26 F 10 8484 0.4907E-05 0.4795E-05 0.1709E+00
channel 15 : 27 F 42 8080 0.9375E-05 0.9244E-05 0.6325E-02
channel 16 : 27 F 66 7272 0.7156E-05 0.8737E-06 0.2038E-01
channel 17 : 28 T 5898 4612 0.5296E-02 0.4356E-02 0.2146E-01
channel 18 : 28 T 6790 5173 0.1258E-01 0.1133E-01 0.6103E-02

and here it's over...

Olivier Mattelaer (olivier-mattelaer) said :
#8

Resubmissions are normally done if a job is reported as finished by Slurm but the output is not present.
So the question is why your job is considered done by Slurm.
Did you edit the code to receive the Slurm email associated with that job?

Cheers,

Olivier

matteo maltoni (matteo-maltoni) said :
#9

Dear Olivier,

I asked in the .sh file to receive emails if the job finishes running, or if it fails:

#SBATCH --<email address hidden>
#SBATCH --mail-type=END,FAIL

The mails I receive say the job is completed:

Slurm Array Summary Job_id=69897747_* (69897747) Name=ppZlljjEW Ended, COMPLETED, ExitCode [0-0]

Indeed the last lines of the .out file say all the jobs finished, but that some of them failed and no log_MINT0 can be found for the subprocess:

file missing: /home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G1/results.dat
            Fails 1 times
            No resubmition. 
INFO: Idle: 0, Running: 2, Completed: 196 [ 43m 11s ]
INFO: All jobs finished
INFO: Idle: 0, Running: 0, Completed: 198 [ 44m 11s ]
Command "launch auto " interrupted with error:
FileNotFoundError : [Errno 2] No such file or directory: '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_uu_uuemep/all_G1/log_MINT0.txt'
Please report this bug on https://bugs.launchpad.net/mg5amcnlo
More information is found in '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/run_02_tag_1_debug.log'.
Please attach this file to your report.
INFO:
quit
INFO:
quit
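
To see at a glance which channels never wrote their output, one can walk the SubProcesses tree and list the integration directories that have no results.dat. This is only a minimal sketch, assuming the layout seen in the paths above (SubProcesses/P*/all_G*/results.dat); the function name is hypothetical and is not part of MG5aMC:

```python
from pathlib import Path


def find_incomplete_channels(subprocesses_dir):
    """Return channel directories (P*/all_G*) that lack a results.dat file.

    The directory pattern is an assumption taken from the paths in the log
    above; adapt it if the run uses a different layout.
    """
    root = Path(subprocesses_dir)
    missing = []
    for channel in sorted(root.glob("P*/all_G*")):
        # A finished channel is expected to have written results.dat.
        if channel.is_dir() and not (channel / "results.dat").is_file():
            missing.append(channel)
    return missing
```

Calling it on the fixed_order/SubProcesses directory of the run returns the list of channels to inspect first.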

Best,

Matteo

Olivier Mattelaer (olivier-mattelaer) said :
#10

Yes, but you are asking that job to submit many jobs to the cluster.
Those are the jobs that are failing and that we need to monitor.
For that you need to modify MG5aMC to trigger those emails.

Olivier

matteo maltoni (matteo-maltoni) said :
#11

Dear Olivier,

I'm sorry to bother you with this, but I don't know how to modify the madgraph/various/cluster.py file to get an email when some subjobs fail. I tried to change some lines around line 424, but I'm getting a "TabError".

Can you please give me some tips?

Thank you,

Matteo

Olivier Mattelaer (olivier-mattelaer) said :
#12

Python interprets tabs and spaces differently.
The emacs editor corrects for that by default, but vi(m) does not.
So when editing such a file you need to preserve its convention, which should be spaces only.
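
A quick way to locate the offending lines is to scan the file for indentation that contains a tab character. The helper below is a small illustrative sketch (the function name is made up for this example), not something shipped with MG5aMC:

```python
def find_tab_indented_lines(source):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace is everything before the first non-space/tab char.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged
```

Passing it the text of the edited file lists the lines whose tabs should be replaced by spaces. The standard-library tabnanny module (python -m tabnanny file.py) performs a similar check.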

Cheers,

Olivier

matteo maltoni (matteo-maltoni) said :
#13

Dear Olivier,

In order to write the logs for each job, is it sufficient to change the

if log is None:
            log = '/dev/null'

at line 1679 of the madgraph/various/cluster.py, into something else, like

log='/home//home/users/m/m/mmaltoni/all_logs.txt' ?

Is there something else I should do?

Sorry for these trivial questions, but I'm not used to working on clusters.

Best,

Matteo

Olivier Mattelaer (olivier-mattelaer) said :
#14

It does not sound like a good idea to have a hardcoded path there, since plenty of jobs will write to the same file.
I would first just add
--<email address hidden> and --mail-type=END,FAIL

If this is not enough then we might need to play with the log as well, but let's avoid that.
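
As a rough illustration of what adding these options could look like, the sketch below inserts e-mail flags into an sbatch command list. It assumes that the hidden option above is sbatch's standard --mail-user, and that the submission command is built as a Python list starting with "sbatch"; the helper name and the example address are placeholders, and the actual code in cluster.py may be organised differently:

```python
def add_mail_options(command, address, events=("END", "FAIL")):
    """Return a copy of an sbatch command list with e-mail notification flags.

    `command` is assumed to start with "sbatch"; the flags are inserted right
    after it so that they precede the script name and its arguments.
    """
    if not command or command[0] != "sbatch":
        raise ValueError("expected a command list starting with 'sbatch'")
    flags = ["--mail-user=" + address, "--mail-type=" + ",".join(events)]
    return [command[0]] + flags + list(command[1:])
```

For example, add_mail_options(["sbatch", "-o", "/dev/null", "ajob1"], "user@example.org") places both flags directly after "sbatch" and leaves the rest of the command untouched.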

Olivier

matteo maltoni (matteo-maltoni) said :
#15

Dear Olivier,

This is the content of the mails I received, for a generation in which 3 jobs failed:

slurmstepd: error: *** JOB 69899818 ON lm3-w080 CANCELLED AT 2021-05-28T17:43:34 DUE TO TIME LIMIT ***
slurmstepd: error: *** JOB 69899822 ON lm3-w079 CANCELLED AT 2021-05-28T17:44:05 DUE TO TIME LIMIT ***
slurmstepd: error: *** JOB 69899820 ON lm3-w080 CANCELLED AT 2021-05-28T17:44:05 DUE TO TIME LIMIT ***
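
When many such mails arrive, the affected job IDs can be collected automatically. The snippet below is a small sketch that relies only on the message format shown above; the function name is invented for this example:

```python
import re

# Matches the slurmstepd message format quoted above.
_TIME_LIMIT = re.compile(
    r"JOB (?P<job>\d+) ON (?P<node>\S+) CANCELLED AT (?P<when>\S+) DUE TO TIME LIMIT"
)


def jobs_killed_by_time_limit(text):
    """Return (job_id, node, timestamp) tuples for jobs cancelled on time limit."""
    return [(m["job"], m["node"], m["when"]) for m in _TIME_LIMIT.finditer(text)]
```

Feeding it the concatenated mail bodies gives one tuple per cancelled job, which makes it easy to check whether all cancellations happen after a similar running time.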

Best,

Matteo

Olivier Mattelaer (olivier-mattelaer) said :
#16

The default wall-time is 6h00 on lemaitre3.
Looks like you hit that limit. Does this make sense?

Olivier

matteo maltoni (matteo-maltoni) said :
#17

I don't think so, since the cluster ran for about 50 minutes...
This is what is reported in the mail:

Timings:
           JobID                 Start                   End        Elapsed
---------------- --------------------- --------------------- --------------
      69899591_1   2021-05-28T17:09:24   2021-05-28T18:00:01       00:50:37
69899591_1.batch   2021-05-28T17:09:24   2021-05-28T18:00:01       00:50:37
69899591_1.exte+   2021-05-28T17:09:24   2021-05-28T18:00:01       00:50:37

CPU:
           JobID  NNodes  NCPUS  NTasks    UserCPU  SystemCPU   TotalCPU    CPUTime    Elapsed
---------------- ------- ------ ------- ---------- ---------- ---------- ---------- ----------
      69899591_1       1      1          05:09.775  00:31.423  05:41.198   00:50:37   00:50:37
69899591_1.batch       1      1       1  05:09.775  00:31.422  05:41.197   00:50:37   00:50:37
69899591_1.exte+       1      1       1   00:00:00   00:00:00   00:00:00   00:50:37   00:50:37

I also set a maximum time of 5 hours for the job in the .sh file.
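
For reference, the numbers in this report can be turned into a CPU-usage fraction: TotalCPU 05:41.198 against Elapsed 00:50:37 is roughly 11%. The sketch below shows the arithmetic; the parser is a hypothetical helper that handles the time formats appearing in the table, with an optional leading "days-" part:

```python
def accounting_time_to_seconds(text):
    """Convert '[D-][HH:]MM:SS[.mmm]' accounting times to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    fields = [float(f) for f in text.split(":")]
    if len(fields) > 3:
        raise ValueError("unexpected time format: %r" % text)
    # Pad on the left so that MM:SS is read as 0 hours.
    while len(fields) < 3:
        fields.insert(0, 0.0)
    hours, minutes, seconds = fields
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_fraction(total_cpu, elapsed, ncpus=1):
    """Fraction of the allocated CPU time that was actually used."""
    return accounting_time_to_seconds(total_cpu) / (
        accounting_time_to_seconds(elapsed) * ncpus
    )
```

With the values above, cpu_fraction("05:41.198", "00:50:37") is about 0.11, so this particular job spent most of its 50 minutes not computing.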

Cheers,

Matteo

Olivier Mattelaer (olivier-mattelaer) said :
#18

The walltime in your .sh job is the one for the .sh job itself,
not the one for the job(s) submitted by that job.

Maybe the default walltime differs between the frontend and the other nodes,
which would explain why the walltime for the jobs is only 50 minutes.

The solution is then to force another walltime for those jobs via the same method you used to get the emails (maybe there is a way to do it via input/mg5_configuration.txt, but I do not remember that option).
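
A minimal sketch of forcing a walltime on the submitted jobs, under the assumption that the submission command is available as a Python list starting with "sbatch". The helper name is hypothetical; --time is the standard sbatch option, written here in the unambiguous hours:minutes:seconds form:

```python
def add_walltime(command, hours, minutes=0):
    """Return a copy of an sbatch command list with an explicit --time flag."""
    if not command or command[0] != "sbatch":
        raise ValueError("expected a command list starting with 'sbatch'")
    total_minutes = int(hours) * 60 + int(minutes)
    if total_minutes <= 0:
        raise ValueError("walltime must be positive")
    # Always emit H:MM:SS so the value cannot be mistaken for plain minutes.
    spec = "%d:%02d:00" % divmod(total_minutes, 60)
    return [command[0], "--time=" + spec] + list(command[1:])
```

For instance, add_walltime(["sbatch", "ajob1"], 6) yields a command requesting six hours for each submitted job.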

Cheers,

olivier

matteo maltoni (matteo-maltoni) said :
#19

Dear Olivier,

I managed to extend the time limit, but the error occurs anyway:

INFO: Idle: 2, Running: 0, Completed: 200 [ 3h 22m ]
INFO: Idle: 0, Running: 2, Completed: 200 [ 3h 23m ]
INFO: Idle: 0, Running: 2, Completed: 200 [ 3h 24m ]
INFO: Idle: 0, Running: 2, Completed: 200 [ 3h 25m ]
INFO: Idle: 0, Running: 2, Completed: 200 [ 3h 26m ]
INFO: Start to wait 900s between checking status.
Note that you can change this time in the configuration file.
Press ctrl-C to force the update.
INFO: Idle: 0, Running: 2, Completed: 200 [ 3h 41m ]
INFO: Idle: 0, Running: 2, Completed: 200 [ 3h 56m ]
CRITICAL: Fail to run correctly job 69906954.
            with option: {'prog': 'ajob1', 'argument': ['1', 'all', '0', '0'], 'cwd': '/home/users/m/m/mmaltoni/MG5_aMC_v3_1_0/bin/pp_Z_lljj_EW/fixed_order/SubProcesses/P0_du_duemep',
...

I also tried increasing the number of resubmissions and the time between them, but the result is the same. The emails say the jobs were cancelled because of the time limit, as before.

Do you have any idea?

Best,

Matteo

Olivier Mattelaer (olivier-mattelaer) said :
#20

Hi,

I discussed this with the Slurm expert of lemaitre3 (who read your egs-cism email),
and he says that you are requesting only 6 minutes for the subjobs.
Which command did you use to extend the running time?

Also, I would advise disabling resubmission for the moment, since it will not resolve the issue in any way
(it only increases the total running time of your job and wastes resources).

Cheers,

Olivier

matteo maltoni (matteo-maltoni) said :
#21

Dear Olivier,

Thank you for contacting him.

I added the line:

--time=0-00:20:00 # days-hh:mm:ss

in the function submit2, inside the "text" variable at line 163 of madgraph/various/cluster.py; however, I just realised that the class SLURMCluster, at line 1653, does not use that function.

Is that the correct line? Where should I add it, instead?

Best,

Matteo
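
The `--time` value above uses Slurm's days-hh:mm:ss form. Slurm also accepts shorter forms (minutes, minutes:seconds, hours:minutes:seconds, days-hours, days-hours:minutes), which are easy to misread. The following is a minimal sketch for checking what a given string means in seconds; the helper name `slurm_time_to_seconds` is hypothetical and is not part of MadGraph or Slurm.

```python
def slurm_time_to_seconds(spec):
    """Convert a Slurm time-limit string to seconds (illustrative helper)."""
    days = 0
    if "-" in spec:
        # With a day prefix the remaining fields read hours[:minutes[:seconds]].
        day_part, rest = spec.split("-", 1)
        days = int(day_part)
        parts = [int(p) for p in rest.split(":")]
        if len(parts) > 3:
            raise ValueError("too many fields in %r" % spec)
        hours, minutes, seconds = parts + [0] * (3 - len(parts))
    else:
        parts = [int(p) for p in spec.split(":")]
        if len(parts) == 1:
            # A bare number is a number of minutes.
            hours, minutes, seconds = 0, parts[0], 0
        elif len(parts) == 2:
            # Two fields are minutes:seconds, not hours:minutes.
            hours, minutes, seconds = 0, parts[0], parts[1]
        elif len(parts) == 3:
            hours, minutes, seconds = parts
        else:
            raise ValueError("too many fields in %r" % spec)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


if __name__ == "__main__":
    for spec in ("0-00:20:00", "20:00", "5:00:00"):
        print(spec, "->", slurm_time_to_seconds(spec), "seconds")
```

With this reading, `0-00:20:00` and `20:00` both request 20 minutes, while a 5 hour limit would be written `5:00:00`.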

Olivier Mattelaer (olivier-mattelaer) said :
#22

Hi,

Indeed, I would have expected that you need to modify the code around line 1664.
But I'm surprised that you are in doubt about that, since it should be the same place as the one you edited to get the email notification.

Cheers,

Olivier

matteo maltoni (matteo-maltoni) said :
#23

Dear Olivier,

I had indeed confused the two functions, sorry for that.

I added the time flag at line 1682, as follows:

command = ['sbatch', '-o', stdout,
           '-J', me_dir,
           '-t', '20:00',
           '-e', stderr, prog] + argument

and now it works.

Thank you so much for your time and patience.

Best,

Matteo
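
For anyone hitting the same problem, the fix above can be sketched as a standalone helper that makes the time limit a parameter instead of a hard-coded value. This is only an illustration under stated assumptions: the function name `build_sbatch_command`, its parameters, and the file names in the usage example are hypothetical and are not part of MadGraph's cluster.py. Only the sbatch options `-o`, `-e`, `-J` and `-t` come from the post above.

```python
def build_sbatch_command(prog, argument, stdout, stderr, job_name,
                         time_limit="20:00"):
    """Return the sbatch argument list for one subjob (illustrative helper).

    time_limit is passed to sbatch's -t option; "20:00" means 20 minutes.
    Pass None to fall back on the partition's default limit.
    """
    command = ['sbatch', '-o', stdout, '-J', job_name]
    if time_limit:
        # Without an explicit limit the cluster default applies, which may
        # be too short for the slowest subjobs.
        command += ['-t', time_limit]
    command += ['-e', stderr, prog] + list(argument)
    return command


if __name__ == "__main__":
    # Hypothetical log file and job names, for illustration only.
    cmd = build_sbatch_command('ajob1', ['1', 'all', '0', '0'],
                               'out.log', 'err.log', 'myrun')
    print(' '.join(cmd))
```

The resulting list can be handed to `subprocess.Popen` or a similar call, in the same way as the `command` list in the post above.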