Hi,
I am trying to generate the process p p > l+ l+ vl vl j j QED=6 QCD=0 [QCD] using MG5 v2.2.2.
In order for the process to run at all, I have turned off the pole-cancellation check. I am aware that the missing diagrams may introduce some inaccuracies into the output, assuming I get far enough to have any output.
When I attempt to run the process, setting up the grid takes an excessive amount of time. There are a lot of diagrams involved, so the grid setup is split into 1932 jobs. Each individual job takes hours, though (with 16 running at a time, the first one finished after about 2.5 hours), so completing all of them at this rate would take weeks.
To what extent is it possible to parallelise the grid creation? Any other suggestions to speed it up?
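To put a rough number on "weeks", here is a back-of-the-envelope sketch. The job count (1932) and the number of concurrent jobs (16) come from the log below; the per-job durations of 3 and 10 hours are only my assumptions, loosely based on the first job completing after 2h 33m and the crashed job reporting 35784 seconds. It also assumes the jobs run in simple waves of 16, which is cruder than what the real scheduler does.

```python
import math

def estimate_wall_time_hours(total_jobs, slots, hours_per_job):
    """Rough wall time if jobs run in waves of `slots`, each taking `hours_per_job`."""
    waves = math.ceil(total_jobs / slots)
    return waves * hours_per_job

total_jobs = 1932  # from the log: 1916 idle + 16 running at the start
slots = 16         # jobs running concurrently in the log

# Assumed per-job durations in hours (not measured averages).
for hours_per_job in (3.0, 10.0):
    hours = estimate_wall_time_hours(total_jobs, slots, hours_per_job)
    weeks = hours / 24 / 7
    print(f"{hours_per_job:4.1f} h/job -> {hours:.0f} h, about {weeks:.1f} weeks")
```

Under these assumptions the grid setup alone lands somewhere between roughly two and seven weeks, which is why I am asking about running more jobs in parallel.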
Further, even if I were willing to wait weeks for the job, it crashes relatively early on, with the error message:
Exception : program /imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob28 2 F 0 launch ends with non zero status: 134. Stop all computation
The full error output is as follows:
INFO: Setting up grid
INFO: Idle: 1916, Running: 16, Completed: 0 [ current time: 07h38 ]
INFO: Idle: 1915, Running: 16, Completed: 1 [ 2h 33m ]
INFO: Idle: 1914, Running: 16, Completed: 2 [ 2h 50m ]
INFO: Idle: 1913, Running: 16, Completed: 3 [ 2h 55m ]
INFO: Idle: 1912, Running: 16, Completed: 4 [ 2h 59m ]
INFO: Idle: 1911, Running: 16, Completed: 5 [ 3h 5m ]
INFO: Idle: 1910, Running: 16, Completed: 6 [ 3h 24m ]
INFO: Idle: 1909, Running: 16, Completed: 7 [ 3h 36m ]
INFO: Idle: 1908, Running: 16, Completed: 8 [ 4h 8m ]
INFO: Idle: 1907, Running: 16, Completed: 9 [ 7h 55m ]
INFO: Idle: 1906, Running: 16, Completed: 10 [ 8h 28m ]
INFO: Idle: 1905, Running: 16, Completed: 11 [ 9h 39m ]
INFO: Idle: 1904, Running: 16, Completed: 12 [ 9h 56m ]
*** glibc detected *** ../madevent_mintMC: double free or corruption (out): 0x000000000e203a10 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x75e66)[0x7fd269035e66]
/lib64/libc.so.6(+0x789b3)[0x7fd2690389b3]
../madevent_mintMC[0x8fa1f0]
../madevent_mintMC[0x8f90d5]
../madevent_mintMC[0x8f8a2f]
../madevent_mintMC[0x8fa606]
../madevent_mintMC[0x89e0f5]
../madevent_mintMC[0x8a0633]
../madevent_mintMC[0x898a77]
../madevent_mintMC[0x548d53]
../madevent_mintMC[0x5201f1]
../madevent_mintMC[0x50e1f9]
../madevent_mintMC[0x4e470b]
../madevent_mintMC[0x4e701f]
../madevent_mintMC[0x4bf1be]
../madevent_mintMC[0x47a3ad]
../madevent_mintMC[0x483a39]
../madevent_mintMC[0x4ca021]
../madevent_mintMC[0x4c4688]
../madevent_mintMC[0x4cce35]
../madevent_mintMC[0x4d064c]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fd268fded5d]
../madevent_mintMC[0x40a819]
======= Memory map: ========
00400000-00cfb000 r-xp 00000000 00:17 7799417 /imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/madevent_mintMC
00efb000-01153000 rw-p 008fb000 00:17 7799417 /imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/madevent_mintMC
01153000-0cd84000 rw-p 00000000 00:00 0
0e07a000-0e23f000 rw-p 00000000 00:00 0 [heap]
7fd268fc0000-7fd26914a000 r-xp 00000000 fd:01 44 /lib64/libc-2.12.so
7fd26914a000-7fd26934a000 ---p 0018a000 fd:01 44 /lib64/libc-2.12.so
7fd26934a000-7fd26934e000 r--p 0018a000 fd:01 44 /lib64/libc-2.12.so
7fd26934e000-7fd26934f000 rw-p 0018e000 fd:01 44 /lib64/libc-2.12.so
7fd26934f000-7fd269354000 rw-p 00000000 00:00 0
7fd269358000-7fd269393000 r-xp 00000000 00:18 165676190 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libquadmath.so.0.0.0
7fd269393000-7fd269592000 ---p 0003b000 00:18 165676190 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libquadmath.so.0.0.0
7fd269592000-7fd269593000 rw-p 0003a000 00:18 165676190 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libquadmath.so.0.0.0
7fd269598000-7fd2695ad000 r-xp 00000000 00:18 362192 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libgcc_s.so.1
7fd2695ad000-7fd2697ad000 ---p 00015000 00:18 362192 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libgcc_s.so.1
7fd2697ad000-7fd2697ae000 rw-p 00015000 00:18 362192 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libgcc_s.so.1
7fd2697b0000-7fd269833000 r-xp 00000000 fd:01 278 /lib64/libm-2.12.so
7fd269833000-7fd269a32000 ---p 00083000 fd:01 278 /lib64/libm-2.12.so
7fd269a32000-7fd269a33000 r--p 00082000 fd:01 278 /lib64/libm-2.12.so
7fd269a33000-7fd269a34000 rw-p 00083000 fd:01 278 /lib64/libm-2.12.so
7fd269a38000-7fd269b4d000 r-xp 00000000 00:18 165676162 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libgfortran.so.3.0.0
7fd269b4d000-7fd269d4d000 ---p 00115000 00:18 165676162 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libgfortran.so.3.0.0
7fd269d4d000-7fd269d4f000 rw-p 00115000 00:18 165676162 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libgfortran.so.3.0.0
7fd269d50000-7fd269e3b000 r-xp 00000000 00:18 362227 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libstdc++.so.6.0.18
7fd269e3b000-7fd26a03a000 ---p 000eb000 00:18 362227 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libstdc++.so.6.0.18
7fd26a03a000-7fd26a042000 r--p 000ea000 00:18 362227 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libstdc++.so.6.0.18
7fd26a042000-7fd26a044000 rw-p 000f2000 00:18 362227 /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/Gcc/gcc481_x86_64_slc6/slc6/x86_64-slc6-gcc48-opt/lib64/libstdc++.so.6.0.18
7fd26a044000-7fd26a059000 rw-p 00000000 00:00 0
7fd26a060000-7fd26a080000 r-xp 00000000 fd:01 120 /lib64/ld-2.12.so
7fd26a25e000-7fd26a260000 rw-p 00000000 00:00 0
7fd26a276000-7fd26a27f000 rw-p 00000000 00:00 0
7fd26a27f000-7fd26a280000 r--p 0001f000 fd:01 120 /lib64/ld-2.12.so
7fd26a280000-7fd26a281000 rw-p 00020000 fd:01 120 /lib64/ld-2.12.so
7fd26a281000-7fd26a283000 rw-p 00000000 00:00 0
7fd26a283000-7fd26a288000 rw-p 00000000 00:00 0
7fffde0f2000-7fffde13c000 rw-p 00000000 00:00 0 [stack]
7fffde200000-7fffde202000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
/imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob28: line 24: 8153 Aborted ../madevent_mintMC > log.txt < input_app.txt 2>&1
WARNING: program /imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob28 2 F 0 launch ends with non zero status: 134. Stop all computation
WARNING: Last 15 lines of logfile /imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/*/log.txt:
Unknown return code (100): 0
Unknown return code (10): 0
Unit return code distribution (1):
#Unit 1 = 137827
#Unit 9 = 1
Time spent in clustering : 2955.85400
Time spent in PDF_Engine : 1812.70825
Time spent in Reals_evaluation: 7093.41797
Time spent in IS_evaluation : 13954.2695
Time spent in OneLoop_Engine : 5236.48340
Time spent in PS_Generation : 1238.74316
Time spent in other_tasks : 776.335938
Time spent in Total : 33067.8125
Time in seconds: 35784
INFO: remove job currently running
/imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob1: line 24: 20806 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
INFO: remove job currently running
date: write error: Broken pipe
/imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob5: line 24: 20801 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
date: write error: Broken pipe
/imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob9: line 24: 20811 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
date: write error: Broken pipe
/imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob10: line 24: 20817 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
date: write error: Broken pipe
/imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob11: line 24: 20832 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
date: write error: Broken pipe
/imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob14: line 24: 20850 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
date: write error: Broken pipe
/imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob15: line 24: 20848 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
date: write error: Broken pipe
INFO: remove job currently running
Command "launch auto " interrupted with error:
Exception : program /imports/home/pttaylor/MonteCarlo/MADGRAPH/MG5_aMC_v2_2_2/lp_lp_v_v_j_j_EWK_NLO/SubProcesses/P0_dxu_epmupvevmscx/ajob28 2 F 0 launch ends with non zero status: 134. Stop all computation
Please report this bug on https://bugs.launchpad.net/madgraph5
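In case it helps with the speed question, the "Time spent in ..." summary from the log tail above can be turned into fractions of the total. This is only a small sketch: the numbers are copied from that log excerpt, while the helper name `timing_fractions` and the regular expression are my own and not part of MG5.

```python
import re

# Timing lines copied verbatim from the log tail above.
LOG_TAIL = """\
Time spent in clustering : 2955.85400
Time spent in PDF_Engine : 1812.70825
Time spent in Reals_evaluation: 7093.41797
Time spent in IS_evaluation : 13954.2695
Time spent in OneLoop_Engine : 5236.48340
Time spent in PS_Generation : 1238.74316
Time spent in other_tasks : 776.335938
Time spent in Total : 33067.8125
"""

def timing_fractions(text):
    """Return {task: fraction of the 'Total' time} parsed from the log text."""
    pattern = re.compile(r"Time spent in\s+(\w+)\s*:\s*([0-9.]+)")
    times = {name: float(value) for name, value in pattern.findall(text)}
    total = times.pop("Total")
    return {name: seconds / total for name, seconds in times.items()}

fractions = timing_fractions(LOG_TAIL)
# Print the tasks from most to least expensive.
for name, frac in sorted(fractions.items(), key=lambda item: -item[1]):
    print(f"{name:<18} {100 * frac:5.1f}%")
```

For this one job, IS_evaluation is the largest entry at about 42% of the total, followed by Reals_evaluation and OneLoop_Engine.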