NLO process terminates due to " 2 F 2 launch ends with non zero status: 134."

Asked by James

Hi everyone! I am currently trying to simulate the process p p > t t ~ b b~ [QCD] with 500,000 events. I am running it on the local super cluster at our university. I have tested it multiple times with 10,000 events, and it works flawlessly.

However, EVERY time I try it with 500,000 events (I have changed nothing but the number of events), it terminates about 29 hours into the simulation while "Generating Events" on process 31 out of 49. I have tried it 4 times now, and it has always terminated at the same place, with this error:

INFO: Idle: 11, Running: 8, Completed: 30 [ 9h 51m ]
/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob29: line 24: 7946 Aborted ../madevent_mintMC > log.txt < input_app.txt 2>&1
WARNING: program /nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob29 2 F 2 launch ends with non zero status: 134. Stop all computation
WARNING: Last 15 lines of logfile /nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/*/log.txt:
   Unknown return code (10): 0
   Unit return code distribution (1):
 #Unit 1 = 9514
 #Unit 3 = 9
 #Unit 9 = 58
 Time spent in clustering : 0.00000000
 Time spent in PDF_Engine : 51.9575424
 Time spent in Reals_evaluation: 3289.76855
 Time spent in IS_evaluation : 4539.56787
 Time spent in OneLoop_Engine : 4506.91846
 Time spent in PS_Generation : 24.1109848
 Time spent in other_tasks : 736.281738
 Time spent in Total : 13148.6055
Time in seconds: 13356

INFO: remove job currently running
/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob35: line 24: 789 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: remove job currently running
/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob10: line 24: 846 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob23: line 24: 9201 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob26: line 24: 13150 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob27: line 24: 13732 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob6: line 24: 16078 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob7: line 24: 22371 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: remove job currently running
Command "launch auto " interrupted with error:
Exception : program /nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/ajob29 2 F 2 launch ends with non zero status: 134. Stop all computation
Please report this bug on https://bugs.launchpad.net/madgraph5

Has anyone seen this happening? I have no clue what could be happening, or how to fix it.

Thanks in advance!

Question information

Language: English
Status: Answered
For: MadGraph5_aMC@NLO
Assignee: Rikkert Frederix
Rikkert Frederix (frederix) said :
#1

Dear James,

Could you please copy the contents of

/nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/GF29/log.txt

into here?

Best,
Rikkert
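
For readers hitting the same failure: the warning names the failing job script (ajob29 here), and the matching log sits in a per-channel directory such as GF29 under the same subprocess directory. Below is a minimal Python sketch for locating every log.txt that contains an abort message. It assumes the SubProcesses/P*/G*/log.txt layout seen in the paths in this thread; the function name and the marker strings are illustrative and are not part of MadGraph5_aMC@NLO.

```python
import glob
import os

# Strings that appear in the crash output quoted in this thread.
CRASH_MARKERS = ("glibc detected", "SIGABRT", "double free or corruption")


def find_crashed_logs(process_dir):
    """Return (path, first matching line) for each log.txt with a crash marker."""
    # Assumed layout: <process_dir>/SubProcesses/P*/G*/log.txt
    pattern = os.path.join(process_dir, "SubProcesses", "P*", "G*", "log.txt")
    hits = []
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            for line in handle:
                if any(marker in line for marker in CRASH_MARKERS):
                    hits.append((path, line.strip()))
                    break  # one hit per file is enough to flag it
    return hits
```

Calling find_crashed_logs on the process directory (the one that contains SubProcesses) returns the flagged logs, which can then be pasted in full.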

James (zrc2hs) said :
#2

Hi Rikkert,

Sorry for the delay! I accidentally let it overwrite the directory when I tried the simulation again, so I had to wait for it to fail again. Here are the contents of /nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/GF29/log.txt:

===============================================================
 INFO: MadFKS read these parameters from FKS_params.dat
 ===============================================================
  > IRPoleCheckThreshold = 1.0000000000000001E-005
  > PrecisionVirtualAtRunTime = 1.0000000000000000E-003
  > NHelForMCoverHels = 4
  > VirtualFraction = 1.0000000000000000
  > MinVirtualFraction = 5.0000000000000001E-003
 ===============================================================
 Process in group number 0
 A PDF is used, so alpha_s(MZ) is going to be modified
 Old value of alpha_s from param_card: 0.11799999999999999
  ****************************************

       NNPDFDriver version 1.0.3
   Grid: NNPDF23nlo_as_0119_qed_mem0.grid
  ****************************************
 New value of alpha_s from PDF nn23nlo: 0.11899999999999999
 *****************************************************
 * MadGraph/MadEvent *
 * -------------------------------- *
 * http://madgraph.hep.uiuc.edu *
 * http://madgraph.phys.ucl.ac.be *
 * http://madgraph.roma2.infn.it *
 * -------------------------------- *
 * *
 * PARAMETER AND COUPLING VALUES *
 * *
 *****************************************************

  External Params
  ---------------------------------

 MU_R = 91.188000000000002
 aEWM1 = 132.50700000000001
 mdl_Gf = 1.1663900000000000E-005
 aS = 0.11799999999999999
 mdl_ymb = 4.7000000000000002
 mdl_ymt = 173.00000000000000
 mdl_ymtau = 1.7769999999999999
 mdl_MT = 173.00000000000000
 mdl_MB = 4.7000000000000002
 mdl_MZ = 91.188000000000002
 mdl_MH = 125.00000000000000
 mdl_MTA = 1.7769999999999999
 mdl_WT = 0.0000000000000000
 mdl_WZ = 2.4414039999999999
 mdl_WW = 2.0476000000000001
 mdl_WH = 6.3823389999999999E-003
  Internal Params
  ---------------------------------

 mdl_conjg__CKM3x3 = 1.0000000000000000
 mdl_CKM22 = 1.0000000000000000
 mdl_lhv = 1.0000000000000000
 mdl_CKM3x3 = 1.0000000000000000
 mdl_conjg__CKM22 = 1.0000000000000000
 mdl_conjg__CKM33 = 1.0000000000000000
 mdl_Ncol = 3.0000000000000000
 mdl_CA = 3.0000000000000000
 mdl_TF = 0.50000000000000000
 mdl_CF = 1.3333333333333333
 mdl_complexi = ( 0.0000000000000000 , 1.0000000000000000 )
 mdl_MZ__exp__2 = 8315.2513440000002
 mdl_MZ__exp__4 = 69143404.913893804
 mdl_sqrt__2 = 1.4142135623730951
 mdl_MH__exp__2 = 15625.000000000000
 mdl_Ncol__exp__2_m_1 = 8.0000000000000000
 mdl_MB__exp__2 = 22.090000000000003
 mdl_MT__exp__2 = 29929.000000000000
 mdl_Ncol__exp__2_m_1_0 = 8.0000000000000000
 mdl_aEW = 7.5467711139788835E-003
 mdl_MW = 80.419002445756163
 mdl_sqrt__aEW = 8.6872153846781555E-002
 mdl_ee = 0.30795376724436879
 mdl_MW__exp__2 = 6467.2159543705357
 mdl_sw2 = 0.22224648578577766
 mdl_cw = 0.88190334743339216
 mdl_sqrt__sw2 = 0.47143025548407230
 mdl_sw = 0.47143025548407230
 mdl_g1 = 0.34919219678733299
 mdl_gw = 0.65323293034757990
 mdl_v = 246.21845810181637
 mdl_v__exp__2 = 60623.529110035903
 mdl_lam = 0.12886910601690263
 mdl_yb = 2.6995554250465490E-002
 mdl_yt = 0.99366614581500623
 mdl_ytau = 1.0206617000654717E-002
 mdl_muH = 88.388347648318430
 mdl_AxialZUp = -0.18517701861793787
 mdl_AxialZDown = 0.18517701861793787
 mdl_VectorZUp = 7.5430507588273299E-002
 mdl_VectorZDown = -0.13030376310310560
 mdl_VectorAUp = 0.20530251149624587
 mdl_VectorADown = -0.10265125574812294
 mdl_VectorWmDxU = 0.23095271737156670
 mdl_AxialWmDxU = -0.23095271737156670
 mdl_VectorWpUxD = 0.23095271737156670
 mdl_AxialWpUxD = -0.23095271737156670
 mdl_I1x33 = ( 2.6995554250465490E-002, 0.0000000000000000 )
 mdl_I2x33 = ( 0.99366614581500623 , 0.0000000000000000 )
 mdl_I3x33 = ( 0.99366614581500623 , 0.0000000000000000 )
 mdl_I4x33 = ( 2.6995554250465490E-002, 0.0000000000000000 )
 mdl_Vector_tbGp = (-0.96667059156454072 , 0.0000000000000000 )
 mdl_Axial_tbGp = ( -1.0206617000654716 , -0.0000000000000000 )
 mdl_Vector_tbGm = ( 0.96667059156454072 , 0.0000000000000000 )
 mdl_Axial_tbGm = ( -1.0206617000654716 , -0.0000000000000000 )
 mdl_gw__exp__2 = 0.42671326129048615
 mdl_cw__exp__2 = 0.77775351421422245
 mdl_ee__exp__2 = 9.4835522759998875E-002
 mdl_sw__exp__2 = 0.22224648578577769
 mdl_yb__exp__2 = 7.2875994928982540E-004
 mdl_yt__exp__2 = 0.98737240933884918
  Internal Params evaluated point by point
  ----------------------------------------

 mdl_MU_R__exp__2 = 8315.2513440000002
 mdl_sqrt__aS = 0.34351128074635334
 mdl_G__exp__2 = 1.4828317324943823
 mdl_G__exp__3 = 1.8056676068262196
 mdl_G__exp__4 = 2.1987899468922913
  Couplings of loop_sm
  ---------------------------------

       UV_3Gb = -0.22605E-01 -0.00000E+00
       UV_3Gt = 0.48815E-02 0.00000E+00
       UV_4Gb = 0.00000E+00 0.55053E-01
      UV_4Ggt = 0.00000E+00 -0.11889E-01
      UV_GQQb = 0.00000E+00 0.22605E-01
      UV_GQQt = 0.00000E+00 -0.48815E-02
     UV_bMass = 0.00000E+00 0.12824E+01
     UV_tMass = 0.00000E+00 0.34177E+00
   UVWfct_b_0 = -0.13642E+00 -0.00000E+00
   UVWfct_t_0 = -0.98778E-03 -0.00000E+00
   UVWfct_G_2 = 0.40088E-02 0.00000E+00
   UVWfct_G_1 = -0.18563E-01 0.00000E+00
         GC_4 = -0.12177E+01 0.00000E+00
         GC_5 = 0.00000E+00 0.12177E+01
         GC_6 = 0.00000E+00 0.14828E+01
       R2_3Gq = 0.76230E-02 0.00000E+00
       R2_3Gg = 0.31445E-01 0.00000E+00
  R2GC_137_43 = 0.00000E+00 0.11603E-02
  R2GC_137_44 = -0.00000E+00 -0.34810E-02
  R2GC_138_45 = -0.00000E+00 -0.11603E-02
  R2GC_138_46 = 0.00000E+00 0.34810E-02
  R2GC_139_47 = -0.00000E+00 -0.46413E-02
  R2GC_140_48 = 0.00000E+00 0.77356E-03
  R2GC_140_49 = -0.00000E+00 -0.69620E-02
  R2GC_141_50 = -0.00000E+00 -0.13924E-01
  R2GC_141_51 = -0.00000E+00 -0.48734E-01
  R2GC_142_52 = 0.00000E+00 0.13924E-01
  R2GC_142_53 = 0.00000E+00 0.48734E-01
  R2GC_143_54 = 0.00000E+00 0.12764E-01
  R2GC_143_55 = 0.00000E+00 0.52215E-01
  R2GC_144_56 = -0.00000E+00 -0.10443E-01
  R2GC_144_57 = -0.00000E+00 -0.59177E-01
  R2GC_145_58 = -0.11603E-02 0.00000E+00
  R2GC_145_59 = 0.34810E-02 0.00000E+00
       R2_GQQ = -0.00000E+00 -0.30492E-01
       R2_GGq = 0.00000E+00 0.62601E-02
       R2_GGb = -0.00000E+00 -0.82971E+00
       R2_GGt = -0.00000E+00 -0.11242E+04
     R2_GGg_1 = 0.00000E+00 0.28170E-01
     R2_GGg_2 = -0.00000E+00 -0.18780E-01
       R2_QQq = 0.00000E+00 0.12520E-01
       R2_QQb = 0.00000E+00 0.11769E+00
       R2_QQt = 0.00000E+00 0.43320E+01
  UV_3Gg_1eps = 0.62890E-01 0.00000E+00
  UV_3Gb_1eps = -0.38115E-02 0.00000E+00
  UV_4Gg_1eps = 0.00000E+00 -0.15316E+00
  UV_4Gb_1eps = 0.00000E+00 0.92827E-02
 UV_GQQg_1eps = 0.00000E+00 -0.62890E-01
 UV_GQQq_1eps = 0.00000E+00 0.38115E-02
 UV_bMass_1eps = 0.00000E+00 0.17653E+00
 UV_tMass_1eps = 0.00000E+00 0.64980E+01
 UVWfct_b_0_1eps -0.18780E-01 0.00000E+00
 UVWfct_G_2_1eps -0.31300E-02 0.00000E+00

 Collider parameters:
 --------------------

 Running at P P machine @ 13000.000000000000 GeV
 PDF set = nn23nlo
 alpha_s(Mz)= 0.1190 running at 2 loops.
 alpha_s(Mz)= 0.1190 running at 2 loops.
 Renormalization scale set on event-by-event basis
 Factorization scale set on event-by-event basis

 Diagram information for clustering has been set-up for nFKSprocess 1
 Diagram information for clustering has been set-up for nFKSprocess 2
 Diagram information for clustering has been set-up for nFKSprocess 3
 Diagram information for clustering has been set-up for nFKSprocess 4
 Diagram information for clustering has been set-up for nFKSprocess 5
 Diagram information for clustering has been set-up for nFKSprocess 6
 Diagram information for clustering has been set-up for nFKSprocess 7
 Diagram information for clustering has been set-up for nFKSprocess 8
 Diagram information for clustering has been set-up for nFKSprocess 9
 Diagram information for clustering has been set-up for nFKSprocess 10
 getting user params
Enter number of events and iterations:
 Number of events and iterations -1 12
Enter desired fractional accuracy:
 Desired fractional accuracy: 2.9999999999999999E-002
 Enter alpha, beta for G_soft
   Enter alpha<0 to set G_soft=1 (no ME soft)
 for G_soft: alpha= 1.0000000000000000 , beta= -0.10000000000000001
 Enter alpha, beta for G_azi
   Enter alpha>0 to set G_azi=0 (no azi corr)
 for G_azi: alpha= -1.0000000000000000 , beta= -0.10000000000000001
 Doing the S and H events together
Suppress amplitude (0 no, 1 yes)?
 Using suppressed amplitude.
Exact helicity sum (0 yes, n = number/event)?
 Summing over 1 helicities/event for virt
Enter Configuration Number:
Running Configuration Number: 29
Enter running mode for MINT:
0 to set-up grids, 1 to integrate, 2 to generate events
 MINT running mode: 2
 Generating events, doing only one iteration
Set the three folding parameters for MINT
xi_i, phi_i, y_ij
           1 1 1
 'all ', 'born', 'real', 'virt', 'novi' or 'grid'?
 Enter 'born0' or 'virt0' to perform
  a pure n-body integration (no S functions)
 doing the all of this channel
 Normal integration (Sfunction != 1)
 Not subdividing B.W.
 about to integrate 13 -1 1 29
 Generating 9077 events
 Generating virt :: novi approx. 335 8742
 imode is 2
Using random seed offsets: 29 , 1 , 0
  with seed 33
 Ranmar initialization seeds 11977 9408
 Total number of FKS directories is 10
 FKS process map (sum= 3 ) :
           1 --> 3 : 1 7 8
           2 --> 3 : 2 9 10
           3 --> 1 : 3
           4 --> 1 : 4
           5 --> 1 : 5
           6 --> 1 : 6
 ================================
 process combination map (specified per FKS dir):
  1 map 1
  1 inv. map 1
  2 map 1
  2 inv. map 1
  3 map 1
  3 inv. map 1
  4 map 1
  4 inv. map 1
  5 map 1
  5 inv. map 1
  6 map 1
  6 inv. map 1
  7 map 1 1 1 1
  7 inv. map 4
  8 map 1 1 1 1
  8 inv. map 4
  9 map 1 1 1 1
  9 inv. map 4
 10 map 1 1 1 1
 10 inv. map 4
 ================================
nFKSprocess: 1. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 1. Lower bound for tau is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 1. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 2. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 2. Lower bound for tau is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 2. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 3. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 3. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 4. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 4. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 5. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 5. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 6. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 6. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 7. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 7. Lower bound for tau is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 7. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 8. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 8. Lower bound for tau is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 8. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 9. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 9. Lower bound for tau is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 9. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 10. Absolute lower bound for tau at the Born is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 10. Lower bound for tau is 0.74739E-03 0.35540E+03 0.13000E+05
nFKSprocess: 10. Lower bound for tau is (taking resonances into account) 0.74739E-03 0.35540E+03 0.13000E+05
 bpower is 4.0000000000000000
 Scale values (may change event by event):
 muR, muR_reference: 0.320523D+03 0.320523D+03 1.00
 muF1, muF1_reference: 0.320523D+03 0.320523D+03 1.00
 muF2, muF2_reference: 0.320523D+03 0.320523D+03 1.00
 QES, QES_reference: 0.320523D+03 0.320523D+03 1.00

 muR_reference [functional form]:
    H_T/2 := sum_i mT(i)/2, i=final state
 muF1_reference [functional form]:
    H_T/2 := sum_i mT(i)/2, i=final state
 muF2_reference [functional form]:
    H_T/2 := sum_i mT(i)/2, i=final state
 QES_reference [functional form]:
    H_T/2 := sum_i mT(i)/2, i=final state

 alpha_s= 9.9957815411085338E-002
 A PDF is used, so alpha_s(MZ) is going to be modified
 Old value of alpha_s from param_card: 0.11799999999999999
  ****************************************

       NNPDFDriver version 1.0.3
   Grid: NNPDF23nlo_as_0119_qed_mem0.grid
  ****************************************
 New value of alpha_s from PDF nn23nlo: 0.11899999999999999
 alpha_s value used for the virtuals is (for the first PS point): 0.10402807101926997
  ==========================================================================================
 { }
 {   }
 {  ,,  }
 { `7MMM. ,MMF' `7MM `7MMF'  }
 {  MMMb dPMM MM MM  }
 {  M YM ,M MM ,6"Yb. ,M""bMM MM ,pW"Wq. ,pW"Wq.`7MMpdMAo.  }
 {  M Mb M' MM 8) MM ,AP MM MM 6W' `Wb 6W' `Wb MM `Wb  }
 {  M YM.P' MM ,pm9MM 8MI MM MM , 8M M8 8M M8 MM M8  }
 {  M `YM' MM 8M MM `Mb MM MM ,M YA. ,A9 YA. ,A9 MM ,AP  }
 { .JML. `' .JMML.`Moo9^Yo.`Wbmd"MML..JMMmmmmMMM `Ybmd9' `Ybmd9' MMbmmd'  }
*** glibc detected *** ../madevent_mintMC: double free or corruption (out): 0x0000000015764af0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x75e66)[0x2ae192690e66]
/lib64/libc.so.6(+0x789b3)[0x2ae1926939b3]
../madevent_mintMC[0x9e4065]
../madevent_mintMC[0x9e2f11]
../madevent_mintMC[0x9e2532]
../madevent_mintMC[0x9e4471]
../madevent_mintMC[0x987f8e]
../madevent_mintMC[0x98a4a3]
../madevent_mintMC[0x982927]
../madevent_mintMC[0x633007]
../madevent_mintMC[0x602468]
../madevent_mintMC[0x5a8213]
../madevent_mintMC[0x512ff2]
../madevent_mintMC[0x5157d5]
../madevent_mintMC[0x515822]
../madevent_mintMC[0x4edfe6]
../madevent_mintMC[0x4ac8cc]
../madevent_mintMC[0x4b4037]
../madevent_mintMC[0x4f888e]
../madevent_mintMC[0x4f54f2]
../madevent_mintMC[0x4fdf95]
../madevent_mintMC[0x4feeac]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2ae192639d5d]
../madevent_mintMC[0x40a819]
======= Memory map: ========
00400000-00c13000 r-xp 00000000 00:1f 2147853218 /nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/madevent_mintMC
00e13000-00e5a000 rw-p 00813000 00:1f 2147853218 /nv/blue/zrc2hs/private/ttbb500k/SubProcesses/P0_gg_tbbxtx/madevent_mintMC
00e5a000-1287d000 rw-p 00000000 00:00 0
141c7000-15cec000 rw-p 00000000 00:00 0 [heap]
2ae1916de000-2ae1916fe000 r-xp 00000000 00:11 4592875 /lib64/ld-2.12.so
2ae1916fe000-2ae1916ff000 rw-p 00000000 00:00 0
2ae1918fd000-2ae1918fe000 r--p 0001f000 00:11 4592875 /lib64/ld-2.12.so
2ae1918fe000-2ae1918ff000 rw-p 00020000 00:11 4592875 /lib64/ld-2.12.so
2ae1918ff000-2ae191900000 rw-p 00000000 00:00 0
2ae191900000-2ae1919eb000 r-xp 00000000 00:1d 828787 /sfs/nfs/apps/gcc/4.8.2/lib64/libstdc++.so.6
2ae1919eb000-2ae191bea000 ---p 000eb000 00:1d 828787 /sfs/nfs/apps/gcc/4.8.2/lib64/libstdc++.so.6
2ae191bea000-2ae191bf2000 r--p 000ea000 00:1d 828787 /sfs/nfs/apps/gcc/4.8.2/lib64/libstdc++.so.6
2ae191bf2000-2ae191bf4000 rw-p 000f2000 00:1d 828787 /sfs/nfs/apps/gcc/4.8.2/lib64/libstdc++.so.6
2ae191bf4000-2ae191c0a000 rw-p 00000000 00:00 0
2ae191c0a000-2ae191d1f000 r-xp 00000000 00:1d 828737 /sfs/nfs/apps/gcc/4.8.2/lib64/libgfortran.so.3
2ae191d1f000-2ae191f1f000 ---p 00115000 00:1d 828737 /sfs/nfs/apps/gcc/4.8.2/lib64/libgfortran.so.3
2ae191f1f000-2ae191f21000 rw-p 00115000 00:1d 828737 /sfs/nfs/apps/gcc/4.8.2/lib64/libgfortran.so.3
2ae191f45000-2ae191fc8000 r-xp 00000000 00:11 4593019 /lib64/libm-2.12.so
2ae191fc8000-2ae1921c7000 ---p 00083000 00:11 4593019 /lib64/libm-2.12.so
2ae1921c7000-2ae1921c8000 r--p 00082000 00:11 4593019 /lib64/libm-2.12.so
2ae1921c8000-2ae1921c9000 rw-p 00083000 00:11 4593019 /lib64/libm-2.12.so
2ae1921c9000-2ae1921de000 r-xp 00000000 00:1d 828722 /sfs/nfs/apps/gcc/4.8.2/lib64/libgcc_s.so.1
2ae1921de000-2ae1923de000 ---p 00015000 00:1d 828722 /sfs/nfs/apps/gcc/4.8.2/lib64/libgcc_s.so.1
2ae1923de000-2ae1923df000 rw-p 00015000 00:1d 828722 /sfs/nfs/apps/gcc/4.8.2/lib64/libgcc_s.so.1
2ae1923df000-2ae1923e0000 rw-p 00000000 00:00 0
2ae1923e0000-2ae19241b000 r-xp 00000000 00:1d 828775 /sfs/nfs/apps/gcc/4.8.2/lib64/libquadmath.so.0
2ae19241b000-2ae19261a000 ---p 0003b000 00:1d 828775 /sfs/nfs/apps/gcc/4.8.2/lib64/libquadmath.so.0
2ae19261a000-2ae19261b000 rw-p 0003a000 00:1d 828775 /sfs/nfs/apps/gcc/4.8.2/lib64/libquadmath.so.0
2ae19261b000-2ae1927a5000 r-xp 00000000 00:11 4592994 /lib64/libc-2.12.so
2ae1927a5000-2ae1929a5000 ---p 0018a000 00:11 4592994 /lib64/libc-2.12.so
2ae1929a5000-2ae1929a9000 r--p 0018a000 00:11 4592994 /lib64/libc-2.12.so
2ae1929a9000-2ae1929aa000 rw-p 0018e000 00:11 4592994 /lib64/libc-2.12.so
2ae1929aa000-2ae1929b4000 rw-p 00000000 00:00 0
7fff33303000-7fff3334d000 rw-p 00000000 00:00 0 [stack]
7fff33387000-7fff33388000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
#0 0x2AE191C232E7
#1 0x2AE191C238EE
#2 0x2AE19264D69F
#3 0x2AE19264D625
#4 0x2AE19264EE04
#5 0x2AE19268B536
#6 0x2AE192690E65
#7 0x2AE1926939B2
#8 0x9E4064
#9 0x9E2F10
#10 0x9E2531
#11 0x9E4470
#12 0x987F8D
#13 0x98A4A2
#14 0x982926
#15 0x633006
#16 0x602467
#17 0x5A8212
#18 0x512FF1
#19 0x5157D4
#20 0x515821
#21 0x4EDFE5
#22 0x4AC8CB
#23 0x4B4036
#24 0x4F888D
#25 0x4F54F1
#26 0x4FDF94
#27 0x4FEEAB
#28 0x2AE192639D5C
#29 0x40A818
Time in seconds: 30337

Thanks so much for your help!

Rikkert Frederix (frederix) said :
#3

Dear James,

Which version of the MadGraph5_aMC@NLO code are you using? If you are not using the latest version (2.3.0), could you please try upgrading?

Best regards,
Rikkert

James (zrc2hs) said :
#6

Hi Rikkert,

For these simulations I've been using 2.2.3, because every time I use 2.3.0 I get this compilation error when launching the process:

INFO: Compiling P0_uxu_tbbxtx...
WARNING: fct <function compile_dir at 0x2717ed8> does not return 0. Starts to stop the code in a clean way.
WARNING: fct <function compile_dir at 0x2717ed8> does not return 0. Starts to stop the code in a clean way.
WARNING: fct <function compile_dir at 0x2717ed8> does not return 0. Starts to stop the code in a clean way.

This seems to be a related problem. The only difference is that this error occurs much earlier, before the simulation even launches.

Thanks,
James

James (zrc2hs) said :
#7

Update: I just tried it with 100k events, and it works fine. Is there some limit to how many events aMC@NLO can handle?

Rikkert Frederix (frederix) said :
#8

Dear James,

There is no such limit.

I think it's related to the use of IREGI for the tensor decomposition in the virtual corrections. IREGI is only rarely used, so the code may crash only when you ask for a large number of events.

This problem has been fixed in the latest version of the code (2.3.0). I do not understand why the code does not compile correctly. There is no real difference between 2.2.3 and 2.3.0 in terms of dependencies and the like. Can you go to SubProcesses/P0_uxu_tbbxtx and try to compile there by hand using "make madevent_mintMC"? What is the error message?

Best regards,
Rikkert

James (zrc2hs) said :
#9

Hi Rikkert,

Hmm, that's interesting. I compiled madevent_mintMC by hand as you suggested, and I got this error:

../../lib//libpdf.a(Ctq6Pdf.o): In function `setctq6_':
Ctq6Pdf.f:(.text+0x1892): undefined reference to `_gfortran_transfer_integer_write'
Ctq6Pdf.f:(.text+0x19b9): undefined reference to `_gfortran_transfer_character_write'
Ctq6Pdf.f:(.text+0x19ce): undefined reference to `_gfortran_transfer_integer_write'
../../lib//libpdf.a(opendata.o): In function `opendata_':
opendata.f:(.text+0x7b7): undefined reference to `_gfortran_transfer_character_write'
opendata.f:(.text+0x7ca): undefined reference to `_gfortran_transfer_character_write'
opendata.f:(.text+0x7e1): undefined reference to `_gfortran_transfer_character_write'
../../lib//libpdf.a(mrst2001.o): In function `mrst2001_':
mrst2001.f:(.text+0x21ce): undefined reference to `_gfortran_transfer_real_write'
mrst2001.f:(.text+0x225a): undefined reference to `_gfortran_transfer_real_write'
../../lib//libpdf.a(mrst2002.o): In function `mrst2002_':
mrst2002.f:(.text+0x8e1): undefined reference to `_gfortran_transfer_real_write'
mrst2002.f:(.text+0x96d): undefined reference to `_gfortran_transfer_real_write'
collect2: ld returned 1 exit status
make: *** [madevent_mintMC] Error 1

There were many more of those "In function '...':" blocks above these.

Also, for some reason the 100k event processes are now also giving me the non-zero status: 134 error code, without me changing any parameters of the run.

Thanks,
James
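
A long linker log like the one above is easier to read once the undefined symbols are grouped. The following is a minimal Python sketch, not part of MadGraph5_aMC@NLO, that counts each symbol reported as an "undefined reference" and picks out the ones with the `_gfortran_` prefix, which belong to the GNU Fortran runtime library (libgfortran). The function names are hypothetical, and the sample text is copied from the error message quoted in this post.

```python
import re
from collections import Counter

# Matches the GNU ld message: undefined reference to `symbol'
UNDEFINED_REF = re.compile(r"undefined reference to `([^']+)'")


def summarize_undefined_symbols(linker_output):
    """Count how often each undefined symbol is reported by the linker."""
    return Counter(UNDEFINED_REF.findall(linker_output))


def missing_gfortran_runtime_symbols(linker_output):
    """Return the sorted undefined symbols that carry the _gfortran_ prefix."""
    counts = summarize_undefined_symbols(linker_output)
    return sorted(name for name in counts if name.startswith("_gfortran_"))


sample = """\
Ctq6Pdf.f:(.text+0x1892): undefined reference to `_gfortran_transfer_integer_write'
opendata.f:(.text+0x7b7): undefined reference to `_gfortran_transfer_character_write'
mrst2001.f:(.text+0x21ce): undefined reference to `_gfortran_transfer_real_write'
mrst2001.f:(.text+0x225a): undefined reference to `_gfortran_transfer_real_write'
"""
print(missing_gfortran_runtime_symbols(sample))
```

If every missing symbol carries that prefix, one thing worth checking (an assumption, not something confirmed in this thread) is whether the gfortran that compiled the object files and the libgfortran found at link time come from the same GCC installation.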

Rikkert Frederix (frederix) said :
#10

Can you first try compiling in <YourProcess>/Source using 'make'? Are there any errors?

best,
Rikkert

James (zrc2hs) said :
#12

Nope, no errors here. Here's the output:

-bash-4.1$ make
rm -f ../lib/libdhelas.a
cd DHELAS; make
make[1]: Entering directory `/sfs/gluster/bigtmp/zrc2hs/ttbb100k1/Source/DHELAS'
ar cru ../../lib/libdhelas.a aloha_functions.o FFV1_2.o MP_FFV1_2.o VVVV3L2P0_1.o MP_VVVV3L2P0_1.o VVV1L2P0_1.o MP_VVV1L2P0_1.o VVV1_0.o MP_VVV1_0.o R2_QQ_1_R2_QQ_2_0.o MP_R2_QQ_1_R2_QQ_2_0.o FFV1L2P0_3.o MP_FFV1L2P0_3.o R2_GG_1_0.o MP_R2_GG_1_0.o R2RGA_VVVV2_R2RGA_VVVV3_R2RGA_VVVV5_0.o MP_R2RGA_VVVV2_R2RGA_VVVV3_R2RGA_VVVV5_0.o FFV1L1P0_3.o MP_FFV1L1P0_3.o FFV1L3_2.o MP_FFV1L3_2.o R2RGA_VVVV10_0.o MP_R2RGA_VVVV10_0.o VVVV1_0.o MP_VVVV1_0.o R2RGA_VVVV2_R2RGA_VVVV3_0.o MP_R2RGA_VVVV2_R2RGA_VVVV3_0.o R2RGA_VVVV3_0.o MP_R2RGA_VVVV3_0.o R2_QQ_2_0.o MP_R2_QQ_2_0.o FFV1_1.o MP_FFV1_1.o R2_GG_1_R2_GG_3_0.o MP_R2_GG_1_R2_GG_3_0.o VVVV1L2P0_1.o MP_VVVV1L2P0_1.o FFV1P0_3.o MP_FFV1P0_3.o R2RGA_VVVV2_0.o MP_R2RGA_VVVV2_0.o R2_QQ_1_0.o MP_R2_QQ_1_0.o FFV1_0.o MP_FFV1_0.o R2RGA_VVVV5_0.o MP_R2RGA_VVVV5_0.o VVVV4P0_1.o MP_VVVV4P0_1.o VVVV3_0.o MP_VVVV3_0.o FFV1L2_1.o MP_FFV1L2_1.o GHGHGL2_1.o MP_GHGHGL2_1.o VVVV3P0_1.o MP_VVVV3P0_1.o VVVV1P0_1.o MP_VVVV1P0_1.o VVVV4_0.o MP_VVVV4_0.o R2_GG_3_0.o MP_R2_GG_3_0.o FFV1L1_2.o MP_FFV1L1_2.o GHGHGL1_2.o MP_GHGHGL1_2.o R2_GG_1_R2_GG_2_0.o MP_R2_GG_1_R2_GG_2_0.o VVV1P0_1.o MP_VVV1P0_1.o FFV1L3_1.o MP_FFV1L3_1.o VVVV4L2P0_1.o MP_VVVV4L2P0_1.o R2_GG_2_0.o MP_R2_GG_2_0.o
ranlib ../../lib/libdhelas.a
make[1]: Leaving directory `/sfs/gluster/bigtmp/zrc2hs/ttbb100k1/Source/DHELAS'
gfortran -O -fno-automatic -ffixed-line-length-132 -c alfas_functions.f
rm -f ../lib/libgeneric.a
ar cru libgeneric.a alfas_functions.o invarients.o hfill.o pawgraphs.o ran1.o rw_events.o rw_routines.o kin_functions.o open_file.o basecode.o setrun.o run_printout.o dgauss.o ranmar.o getissud.o
ranlib libgeneric.a
mv libgeneric.a ../lib/
rm -f alfas_functions.o
rm -f ../lib/libpdf.a
cd PDF; make
make[1]: Entering directory `/sfs/gluster/bigtmp/zrc2hs/ttbb100k1/Source/PDF'
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o Ctq4Fn.o Ctq4Fn.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o Ctq5Par.o Ctq5Par.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o Ctq5Pdf.o Ctq5Pdf.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o Partonx5.o Partonx5.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o Ctq6Pdf.o Ctq6Pdf.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o cteq3.o cteq3.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o mrs98.o mrs98.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o mrs98lo.o mrs98lo.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o mrs98ht.o mrs98ht.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o mrs99.o mrs99.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o mrst2001.o mrst2001.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o mrst2002.o mrst2002.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o jeppe02.o jeppe02.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o pdfwrap.o pdfwrap.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o opendata.o opendata.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o pdf.o pdf.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o PhotonFlux.o PhotonFlux.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o pdg2pdf.o pdg2pdf.f
gfortran -O -fno-automatic -ffixed-line-length-132 -c -o NNPDFDriver.o NNPDFDriver.f
ar cru ../../lib/libpdf.a Ctq4Fn.o Ctq5Par.o Ctq5Pdf.o Partonx5.o Ctq6Pdf.o cteq3.o mrs98.o mrs98lo.o mrs98ht.o mrs99.o mrst2001.o mrst2002.o jeppe02.o pdfwrap.o opendata.o pdf.o PhotonFlux.o pdg2pdf.o NNPDFDriver.o
ranlib ../../lib/libpdf.a
make[1]: Leaving directory `/sfs/gluster/bigtmp/zrc2hs/ttbb100k1/Source/PDF'
rm -f ../lib/libmodel.a
cd MODEL; make
make[1]: Entering directory `/sfs/gluster/bigtmp/zrc2hs/ttbb100k1/Source/MODEL'
ar cru ../../lib/libmodel.a get_mass_width_fcts.o couplings.o lha_read.o printout.o rw_para.o model_functions.o couplings1.o couplings2.o couplings3.o couplings4.o mp_couplings2.o mp_couplings3.o mp_couplings4.o
ranlib ../../lib/libmodel.a
make[1]: Leaving directory `/sfs/gluster/bigtmp/zrc2hs/ttbb100k1/Source/MODEL'
rm -f ../lib/libcernlib.a
cd CERNLIB; make
make[1]: Entering directory `/sfs/gluster/bigtmp/zrc2hs/ttbb100k1/Source/CERNLIB'
ar cru libcernlib.a abend.o dlsqp2.o lenocc.o mtlprt.o mtlset.o radmul.o
ranlib libcernlib.a
mv libcernlib.a ../../lib/
make[1]: Leaving directory `/sfs/gluster/bigtmp/zrc2hs/ttbb100k1/Source/CERNLIB'
rm -f PDF/*.o

Thanks,
Zack

Rikkert Frederix (frederix) said :
#13

Hi,

I don't know. Your compilation problems don't look very consistent. I'm not sure I can help you further here.

Sorry.

Best regards,
Rikkert

James (zrc2hs) said :
#14

Alright, thanks for your help. Just one last question. This is from the debug.log file of the latest test (it is now giving me this error on every process I run, no matter the size):

File "/nv/blue/zrc2hs/private/MG5_aMC_v2_3_0/madgraph/various/cluster.py", line 777, in wait
    raise Exception, self.fail_msg
Exception: program /sfs/gluster/bigtmp/zrc2hs/testing/SubProcesses/P0_gg_tbbxtx/ajob34 2 F 2 launch ends with non zero status: 134. Stop all computation

Does this mean anything? Could it have something to do with the cluster I'm using? I am using SLURM instead of CONDOR, and I'm not sure whether that could affect it or not.

Thanks,
James

Rikkert Frederix (frederix) said :
#15

It means that that particular job did not finish correctly. You copied the log.txt file above which showed this error in that job:

*** glibc detected *** ../madevent_mintMC: double free or corruption (out): 0x0000000015764af0 ***

Cheers,
Rikkert
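
A note on the status code itself: a shell reports a process killed by signal N as exit status 128 + N, so 134 corresponds to signal 6, which is SIGABRT on Linux. That matches both the "Aborted" line printed for ajob29 and the glibc abort in log.txt. A minimal Python sketch of this arithmetic follows; the helper name `describe_exit_status` is hypothetical and is not part of MadGraph5_aMC@NLO.

```python
import signal


def describe_exit_status(status):
    """Interpret a shell-style exit status (0-255)."""
    if status == 0:
        return "success"
    if status > 128:
        # Shell convention: 128 + signal number for a signal-terminated process.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (signum, name)
    return "exited with error code %d" % status


print(describe_exit_status(134))
```

So the status code only says that the job aborted. The reason for the abort has to come from the job's own log.txt, as in the "double free or corruption" message quoted above.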
