/SubProcesses/P0_uu_vevexuu/ajob1 2 F 0 0 launch ends with non zero status: 1. Stop all computation

Asked by David Vannerom

Dear all,

I am generating VBF processes for different Z boson pT ranges (the pT cut is enforced with a customized_cut.f card). I have successfully generated a 100k-event LHE file for the first region (100 to 250 GeV), but I am encountering issues for the other three. I have checked the cards and they are exactly the same, except of course for the names and the pT cuts. My question is then: how is this possible?

I copy here the process definition:

##################################
import model loop_sm-ckm_no_b_mass
#switch to diagonal ckm matrix if needed for speed
#import model loop_sm-no_b_mass

define p = p b b~
define j = j b b~

generate p p > ve ve~ j j QED=4 QCD=0 [QCD]

output ZToNuNu_VBF_5f_NLO_pT100To250_CKM -nojpeg
##################################

And the end of the output, up to the failure:

##################################
INFO: P0_bxcx_vevexbxcx
INFO: Result for test_ME:
INFO: Passed.
INFO: Result for test_MC:
INFO: Passed.
INFO: Result for check_poles:
INFO: Poles successfully cancel for 20 points over 20 (tolerance=-1.0e+00)
INFO: Starting run
INFO: Using 8 cores
INFO: Cleaning previous results
INFO: Generating events without running the shower.
INFO: Setting up grids
INFO: Idle: 3016, Running: 8, Completed: 0 [ current time: 11h28 ]
INFO: Idle: 3015, Running: 8, Completed: 1 [ 4m 8s ]
WARNING: program /scratch/11019548.cream02.iihe.ac.be/ZToNuNu_VBF_5f_NLO_pT650ToInf_CKM/SubProcesses/P0_uu_vevexuu/ajob1 2 F 0 0 launch ends with non zero status: 1. Stop all computation
WARNING: program /scratch/11019548.cream02.iihe.ac.be/ZToNuNu_VBF_5f_NLO_pT650ToInf_CKM/SubProcesses/P0_uu_vevexuu/ajob1 1 F 0 0 launch ends with non zero status: 1. Stop all computation
INFO: Idle: 3014, Running: 7, Completed: 3 [ 8m 13s ]
INFO: Idle: 3014, Running: 6, Completed: 4 [ 8m 13s ]
INFO: Idle: 3014, Running: 3, Completed: 7 [ 8m 13s ]
INFO: Idle: 3014, Running: 2, Completed: 8 [ 8m 14s ]
INFO: Idle: 3014, Running: 0, Completed: 10 [ 8m 14s ]
##################################

I am generating the events in two steps, first, creating the process folder:

./bin/mg5_aMC XXX_proc_card.dat

and then, generating the events:

./bin/generate_events -p -m --nb_core=8 -n run_100kEvents_1

where I enforce the generation to stop at parton level (-p), and to use 8 cores (-m --nb_core=8). The failure happens at the second step.

I had already encountered such an issue but since then, I have modified two things, following previous discussions with MadGraph experts:

I have properly set:

#IRPoleCheckThreshold
-1.0d0

and

#PrecisionVirtualAtRunTime
-1.0d0

in the _FKS_params.dat card and, since I am looking at a fixed-order process, I have set the merging parameter to 0 in the _run.dat card:

 0 = ickkw ! 0 no merging, 3 FxFx merging

Do you have any idea how the generation can succeed for the first pT region and fail for the other three?

Thank you very much,
David

Question information

Language: English
Status: Answered
For: MadGraph5_aMC@NLO
Assignee: marco zaro

marco zaro (marco-zaro) said :
#1

Dear David,
thanks for reporting this.
Can you please paste the last ~20 lines of the log.txt file inside
/scratch/11019548.cream02.iihe.ac.be/ZToNuNu_VBF_5f_NLO_pT650ToInf_CKM/SubProcesses/P0_uu_vevexuu/GF1/?
Thanks,

Marco

David Vannerom (david.vannerom) said :
#2

Dear Marco,

As you see, I am copying the entire process folder (ZToNuNu_VBF_5f_NLO_pT650ToInf_CKM for instance) on a computing node where the computation is done and at the very end I am copying back the folder in the MadGraph area. Since the process crashed without properly finishing, it did not copy the GF* file. So I relaunched the generate_events function locally on my machine, and it gave this error:

##################################
INFO: P0_bxcx_vevexbxcx
INFO: Result for test_ME:
INFO: Passed.
INFO: Result for test_MC:
INFO: Passed.
INFO: Result for check_poles:
INFO: Poles successfully cancel for 20 points over 20 (tolerance=-1.0e+00)
INFO: Starting run
INFO: Using 8 cores
INFO: Cleaning previous results
INFO: Generating events without running the shower.
INFO: Setting up grids
INFO: Idle: 3016, Running: 8, Completed: 0 [ current time: 15h01 ]
WARNING: program /storage_mnt/storage/user/vannerom/Monojet/MadGraph/CMSSW_8_0_20/src/MG5_aMC_v2_5_1/ZToNuNu_VBF_5f_NLO_pT650ToInf_CKM/SubProcesses/P0_uu_vevexuu/ajob1 1 F 0 0 launch ends with non zero status: 1. Stop all computation
/storage_mnt/storage/user/vannerom/Monojet/MadGraph/CMSSW_8_0_20/src/MG5_aMC_v2_5_1/ZToNuNu_VBF_5f_NLO_pT650ToInf_CKM/SubProcesses/P0_uu_vevexuu/ajob1: line 34: 9368 Terminated ../madevent_mintMC > log.txt < input_app.txt 2>&1
##################################

The last lines of ZToNuNu_VBF_5f_NLO_pT650ToInf_CKM/SubProcesses/P0_uu_vevexuu/GF1/log.txt read:

##################################
STOP 1
Thanks for using LHAPDF 6.1.6. Please make sure to cite the paper:
  Eur.Phys.J. C75 (2015) 3, 132 (http://arxiv.org/abs/1412.7420)
# JHEP 0909:106,2009, arXiv:0903.4665 #
# in publications with results obtained with the help of this program. #
# #
########################################################################
########################################################################
# #
# You are using OneLOop-3.6 #
# #
# for the evaluation of 1-loop scalar 1-, 2-, 3- and 4-point functions #
# #
# author: Andreas van Hameren <email address hidden> #
# date: 18-02-2015 #
# #
# Please cite #
# A. van Hameren, #
# Comput.Phys.Commun. 182 (2011) 2427-2438, arXiv:1007.4716 #
# A. van Hameren, C.G. Papadopoulos and R. Pittau, #
# JHEP 0909:106,2009, arXiv:0903.4665 #
# in publications with results obtained with the help of this program. #
# #
########################################################################
 ---- POLES CANCELLED ----
 ERROR: INTEGRAL APPEARS TO BE ZERO.
 TRIED 100880 PS POINTS AND ONLY 11 GAVE A NON-ZERO INTEGRAND.
Time in seconds: 19
##################################

Cheers,
David

Rikkert Frederix (frederix) said :
#3

Dear David,

The problem is that your cuts are so stringent that, of all the random phase-space points it tries, only 11 passed the cuts. This is not enough for a proper integration with a correct error estimate, and therefore the code stops with the error you got.

The easiest way to deal with this issue is to hack the 'set_tau_min' subroutine in SubProcesses/setcuts.f. Essentially, you have to set the 'taumin_j(iFKS)' variable to the minimal energy (as in sqrt(s-hat)) needed for the Born-level contribution to pass the cuts. For example, if you require 2 jets with pT>30 GeV and a Z-boson pT of more than 250 GeV, taumin_j(iFKS) should be set to 310 GeV. The place to do this is around line 357, just before the line
            tau_lower_bound=taumin_j(iFKS)**2/stot

Hence, you would have something like (note that launchpad removes trailing spaces):

...
            enddo
            stot = 4d0*ebeam(1)*ebeam(2)
            tau_Born_lower_bound=taumin(iFKS)**2/stot

            taumin_j(iFKS) = 310d0

            tau_lower_bound=taumin_j(iFKS)**2/stot
c
c Also find the minimum lower bound if all internal s-channel particles
c were on-shell
...
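
To make the arithmetic above explicit, here is a minimal Python sketch (not MG5_aMC code) of how the 310 GeV value and the resulting tau lower bound come about. The function names are hypothetical, and the 6500 GeV beam energy is an assumption for illustration only; the thread does not state the collider energy.

```python
def naive_taumin(pt_thresholds):
    """Sum of the pT thresholds, as in the 30 + 30 + 250 = 310 GeV example."""
    return float(sum(pt_thresholds))


def tau_lower_bound(taumin, ebeam1, ebeam2):
    """Mirrors tau_lower_bound = taumin_j(iFKS)**2 / stot from the snippet."""
    stot = 4.0 * ebeam1 * ebeam2   # squared collider energy, as in setcuts.f
    return taumin ** 2 / stot


# Two jets with pT > 30 GeV plus a Z boson with pT > 250 GeV
taumin = naive_taumin([30.0, 30.0, 250.0])

# ASSUMPTION: 6500 GeV per beam; replace with the ebeam values of your run card
tau = tau_lower_bound(taumin, 6500.0, 6500.0)
print(taumin, tau)
```

The smaller the value of tau, the closer to zero the sampled sqrt(s-hat) is allowed to start, which is why raising taumin_j moves the sampling toward the region that can actually pass the cuts.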

Let me know if this works.

Best regards,
Rikkert

David Vannerom (david.vannerom) said :
#4

Dear Rikkert,

I relaunched the job and it seems to be working (I will let you know if it fails at some point). Would you mind explaining further what exactly this "taumin_j(iFKS)" parameter is and why I should set it to 310 GeV instead of zero?
Also, why has the generation worked for the lowest pT bin (100To250 GeV) and failed for the higher ones (250To400, 400To650 and 650ToInf)? Is it because the pT cuts were looser and allowed more events to pass, and therefore gave a better integration?

Thanks a lot,
David

Rikkert Frederix (frederix) said :
#5

Dear David,

taumin_j is used in the phase-space generation. Most of the points are thrown just above taumin_j, which means very close to threshold. This is also where the cross section is expected to be largest. If none of these points pass the cuts, the code does not know what to do, because it doesn't know where the phase-space is non-zero. Note that the total volume for a 20+ dimensional integral is way too large for simply throwing points completely at random and hoping something good comes out.
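
The effect described above can be illustrated with a toy Monte Carlo in Python. This is only a sketch under made-up assumptions: the power-law density, the 20 GeV and 500 GeV values, and the function names are all hypothetical, and the real MG5_aMC phase-space sampling is adaptive and far more sophisticated. Only the number of points (100880) is taken from the log above.

```python
import random


def sample_mass(m_min, rng, power=4.0):
    """Draw m >= m_min from a density proportional to m**(-power) (toy choice)."""
    u = rng.random()   # uniform in [0, 1)
    return m_min * (1.0 - u) ** (-1.0 / (power - 1.0))


def count_passing(m_min, m_cut, n_points, seed=1):
    """Count how many sampled points satisfy the cut m >= m_cut."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_points) if sample_mass(m_min, rng) >= m_cut)


n_points = 100880   # number of phase-space points quoted in log.txt

# Lower bound far below the region allowed by the cuts: almost nothing passes
passed_far = count_passing(m_min=20.0, m_cut=500.0, n_points=n_points)

# Lower bound placed where the cuts start to be satisfied: every point passes
passed_at = count_passing(m_min=500.0, m_cut=500.0, n_points=n_points)

print(passed_far, passed_at)
```

In this toy the expected passing fraction in the first case is (20/500)^3, i.e. a handful of points out of 100880, which is the same qualitative situation as the "ONLY 11 GAVE A NON-ZERO INTEGRAND" message.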

best,
Rikkert

David Vannerom (david.vannerom) said :
#6

Dear all,

It seems the fix proposed by Rikkert did not work as expected (or I implemented it wrong).

I relaunched the second pT bin (ZToNuNU_pT250To400) and got this error:

##################################
INFO: Idle: 2862, Running: 8, Completed: 154 [ 6h 53m ]
INFO: Idle: 2861, Running: 8, Completed: 155 [ 7h 16m ]
WARNING: program /scratch/11190274.cream02.iihe.ac.be/ZToNuNu_VBF_5f_NLO_pT250To400_CKM/SubProcesses/P0_uux_vevexuux/ajob1 5 F 0 0 launch ends with non zero status: 1. Stop all computation
INFO: Idle: 2861, Running: 7, Completed: 156 [ 7h 16m ]
INFO: Idle: 2861, Running: 6, Completed: 157 [ 7h 16m ]
INFO: Idle: 2861, Running: 5, Completed: 158 [ 7h 16m ]
INFO: Idle: 2861, Running: 4, Completed: 159 [ 7h 16m ]
INFO: Idle: 2861, Running: 3, Completed: 160 [ 7h 16m ]
INFO: Idle: 2861, Running: 2, Completed: 161 [ 7h 16m ]
INFO: Idle: 2861, Running: 1, Completed: 162 [ 7h 16m ]
INFO: Idle: 2861, Running: 0, Completed: 163 [ 7h 16m ]
##################################

The end of the SubProcesses/P0_uux_vevexuux/GF5/log.txt reads:

##################################
 ---- POLES CANCELLED ----
 ERROR: INTEGRAL APPEARS TO BE ZERO.
 TRIED 100880 PS POINTS AND ONLY 4 GAVE A NON-ZERO INTEGRAND.
Time in seconds: 13
##################################

I notice that this is the same SubProcess that failed without the "tau_min=310" fix, but now it's only 4 events passing the cuts instead of 11 (see my first message on this thread).

Thanks, David

Rikkert Frederix (frederix) said :
#7

Dear David,

Apparently, your other cuts are such that the minimum sqrt(s-hat) energy for a Born phase-space point to pass the cuts is much larger than 310 GeV. Please update taumin_j such that it is equal to that value.
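
One possible way to estimate such a value is sketched below in Python. It is not taken from MG5_aMC or from this thread. It assumes Born kinematics (zero total transverse momentum), massless jets, and uses the fact that the invariant mass of a system is at least the sum of the transverse masses of its constituents. The function name and arguments are hypothetical.

```python
import math


def min_sqrt_shat(pt_v_min, n_jets, ptj_min, m_v=0.0):
    """Lower bound on sqrt(s-hat) for a boson (or lepton pair) of mass m_v
    with pT >= pt_v_min recoiling against n_jets massless jets with
    pT >= ptj_min each.

    The jets must balance the boson pT, so their scalar pT sum is at least
    pt_v_min, and it is also at least n_jets * ptj_min.
    """
    mt_v = math.hypot(pt_v_min, m_v)               # transverse mass of the boson
    jets_pt_sum = max(pt_v_min, n_jets * ptj_min)  # scalar pT sum of the jets
    return mt_v + jets_pt_sum


# Z pT > 250 GeV, two jets with pT > 10 GeV (the cuts quoted in this thread)
bound_massless = min_sqrt_shat(250.0, 2, 10.0)          # m_v = 0: conservative
bound_onshell = min_sqrt_shat(250.0, 2, 10.0, 91.1876)  # ASSUMES an on-shell Z
print(bound_massless, bound_onshell)
```

With m_v = 0 the estimate stays conservative when the neutrino pair can be off-shell. A value below the true minimum should only cost efficiency, whereas a value above it would presumably cut away allowed phase space, so erring on the low side looks safer.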

best,
Rikkert

David Vannerom (david.vannerom) said :
#8

Dear Rikkert,

I looked at the cuts I am applying in the run card and they are quite loose actually:

- no lepton pT cut
- photon pT>20
- jet pT>10

The only thing is the hard coded pT cut on the Z boson (in the process_cuts.f card).

How can I know what value to set taumin_j to?

Cheers,
David

Rikkert Frederix (frederix) said :
#9

Dear David,

I'm confused. Why do you have a cut on a photon? Which photon?

How exactly do you apply the cut on the Z-boson pT?

Best,
Rikkert

David Vannerom (david.vannerom) said :
#10

Dear Rikkert,

Indeed there is no photon in the final state of my process, I am just listing the default cuts appearing in the cuts block of the run card. For each process, I have 4 cards: process, run, FKS (see the first message in this thread) and a cuts card. In the latter, the Z boson pT cut is implemented as follows:

##################################
c
c Z PT CUTS
c
      do i=0,nexternal
         do j=i+1,nexternal
            if ((abs(ipdg(i)).eq.12.or.abs(ipdg(i)).eq.14.or.
     & abs(ipdg(i)).eq.16).and.(ipdg(i).eq.-ipdg(j))) then
              if (ptZ(p(0,i),p(0,j)).lt.250) then
                  passcuts_user=.false.
                  return
              endif
              if (ptZ(p(0,i),p(0,j)).gt.400) then
                  passcuts_user=.false.
                  return
              endif
            endif
         enddo
      enddo
##################################

Rikkert Frederix (frederix) said :
#11

What is a 'cuts card' ?
Do you mean the fortran source code file <YourProcesses>/SubProcesses/cuts.f ?

Best,
Rikkert

David Vannerom (david.vannerom) said :
#12

Yes exactly, I have for each process a cuts.f card that is copied in the SubProcesses directory to replace the default cuts.f. It is used to enforce Z boson pT range.

Cheers,
David

Rikkert Frederix (frederix) said :
#13

In your code snippet for the cut on the Z-boson pT above, there are potential out-of-bounds errors, and therefore unexpected behaviour, because ipdg(0) is not defined. I think the loops should be

      do i=1,nexternal-1
         do j=i+1,nexternal

is that correct?

I assume that "ptZ" is a function that computes the pT from two four-momenta?
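
For reference, the intended logic of the cut can be written as a short Python sketch. It is an illustration, not a replacement for cuts.f: it assumes that "ptZ" returns the pT of the sum of two four-momenta stored as (E, px, py, pz), and the function names are hypothetical. Iterating over each distinct pair exactly once corresponds to the loop bounds suggested above.

```python
import math
from itertools import combinations

NEUTRINO_IDS = {12, 14, 16}   # PDG codes of ve, vm, vt


def pt_pair(p1, p2):
    """pT of the sum of two four-momenta given as (E, px, py, pz)."""
    return math.hypot(p1[1] + p2[1], p1[2] + p2[2])


def pass_z_pt_cut(pdg_ids, momenta, pt_min, pt_max):
    """False if any neutrino/antineutrino pair has pT outside [pt_min, pt_max]."""
    for i, j in combinations(range(len(pdg_ids)), 2):   # each pair once, i < j
        if abs(pdg_ids[i]) in NEUTRINO_IDS and pdg_ids[i] == -pdg_ids[j]:
            pt = pt_pair(momenta[i], momenta[j])
            if pt < pt_min or pt > pt_max:
                return False
    return True


# Toy input, NOT a physical event: only px and py of the neutrinos matter here
pdg_ids = [2, 2, 12, -12, 2, 2]
placeholder = (0.0, 0.0, 0.0, 0.0)
momenta = [placeholder, placeholder,
           (200.0, 200.0, 0.0, 0.0),   # ve
           (100.0, 100.0, 0.0, 0.0),   # ve~
           placeholder, placeholder]

in_window = pass_z_pt_cut(pdg_ids, momenta, 250.0, 400.0)   # pair pT is 300
below_window = pass_z_pt_cut(pdg_ids, momenta, 100.0, 250.0)
print(in_window, below_window)
```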

Rikkert Frederix (frederix) said :
#14

Dear David,

Thinking a bit more about this, I think the problem is actually a more fundamental one. You are trying to compute QCD corrections to a process that is purely QED-like. The latter you enforce by setting 'QCD=0' when you generate the process. In this way the coupling order of the Born in your process is alpha^4.
When computing QCD corrections to this process, you have to compute all contributions of order alpha^4*alpha_S. This means the usual alpha_s corrections to the alpha^4 Born process (i.e. the ones that you are including), but also alpha corrections to the alpha^3*alpha_S Born contributions. Since the latter are not available (yet) in MG5_aMC, this process can currently not be computed correctly and you should actually get a result that is formally equal to infinity, because the cancelation of the IR poles between the virtual and real emission corrections is no longer there if you do not include the full corrections.
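
The coupling-order bookkeeping behind this argument can be spelled out with a small Python sketch. It is pure counting, not MG5_aMC code. The alpha^4 and alpha^3*alpha_S Born orders are the ones named above; the pure-QCD-like alpha^2*alpha_S^2 Born order is added here for completeness and is an assumption of this sketch, as are the variable names.

```python
from collections import defaultdict

# Born-level orders as (power of alpha, power of alpha_s) for p p > ve ve~ j j
born_orders = [(4, 0), (3, 1), (2, 2)]

# For every NLO order, record which Born order feeds it and through which
# type of correction (one extra power of alpha_s or one extra power of alpha)
nlo_sources = defaultdict(list)
for n_alpha, n_alphas in born_orders:
    nlo_sources[(n_alpha, n_alphas + 1)].append(("QCD", (n_alpha, n_alphas)))
    nlo_sources[(n_alpha + 1, n_alphas)].append(("EW", (n_alpha, n_alphas)))

for order in sorted(nlo_sources, reverse=True):
    print(order, nlo_sources[order])
```

The entry for (4, 1), i.e. alpha^4*alpha_S, lists two sources: the QCD correction to the alpha^4 Born and the EW correction to the alpha^3*alpha_S Born. Keeping only the first one is what makes the result ill-defined.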

Best,
Rikkert

P.S. A couple of more general remarks concerning your process that you should keep in mind for the future (and, given the remark above, are not really relevant right now anymore):

-- Why are you explicitly using a model including the off-diagonal CKM matrix elements? Will you be sensitive to the flavours of the light jets in your analysis (in particular charm jets or strange jets)? If not, it's much faster to use a diagonal CKM matrix.

-- It's probably also significantly faster to use a stable Z-boson, and only afterwards decay the Z-boson to neutrinos using MadSpin.

David Vannerom (david.vannerom) said :
#15

Dear Rikkert,

Thanks a lot for your help and sorry for the late answer.

I was able to run the event generation for the higher pT region (Z pT between 250 and 400) for a simpler case: no Z decay and diagonal CKM, as you suggested. My question is then: how come this works for this case but not for the full simulation with Z decay to neutrinos and full CKM matrix?

As I was able to generate events with the full simulation for the first bin (Z pT between 100 and 250), I will compare that to what I get for the simpler case. This will tell me how off I am.

Cheers,
David

Rikkert Frederix (frederix) said :
#16

Dear David,

Let me repeat myself:

You are trying to compute QCD corrections to a process that is purely QED-like. The latter you enforce by setting 'QCD=0' when you generate the process. In this way the coupling order of the Born in your process is alpha^4.
When computing QCD corrections to this process, you have to compute all contributions of order alpha^4*alpha_S. This means the usual alpha_s corrections to the alpha^4 Born process (i.e. the ones that you are including), but also alpha corrections to the alpha^3*alpha_S Born contributions. Since the latter are not available (yet) in MG5_aMC, this process can currently not be computed correctly and you should actually get a result that is formally equal to infinity, because the cancelation of the IR poles between the virtual and real emission corrections is no longer there if you do not include the full corrections.

best,
Rikkert

David Vannerom (david.vannerom) said :
#17

Dear Rikkert,

Let me try to rephrase. By not asking for the Z decay, I have now reduced the order of the process to alpha^3 with corrections like alpha^3*alpha_S. So is what you are saying that MadGraph can handle alpha corrections to alpha^2*alpha_S processes but not alpha corrections to alpha^3*alpha_S? Is that correct?

Thanks,
David

Rikkert Frederix (frederix) said :
#18

Dear David,

No, I'm saying that MadGraph5_aMC@NLO cannot calculate NLO QCD corrections to a process that you force to be QED-type, even though there also exists a QCD-Born. In this case NLO QCD corrections are not well-defined, because they are of the same order as QED corrections to the QCD-Born. This is much more general than your process, and it holds whether or not you include the decay.

Note that you must have gotten a warning:

"WARNING: Some loop diagrams contributing to this process are discarded because they are not pure (QCD)-perturbation.
Make sure you did not want to include them. "

and, moreover, the cancelation of the IR poles must have given an error for your process. I don't understand why you think that simply bypassing these warnings and errors must yield some sensible result at the end.

best,
Rikkert

David Vannerom (david.vannerom) said :
#19

Dear Rikkert,

But then, if asking for Z decay or not doesn't change anything, how should I understand the fact that the event generation works without decay (and with diagonal CKM) and does not work with the decay (and full CKM)? Is it then CKM related?

The reason I think I can bypass those warnings is that I have modified the

#IRPoleCheckThreshold
-1.0d0

and

#PrecisionVirtualAtRunTime
-1.0d0

parameters in the _FKS_params.dat card, as prescribed here: https://answers.launchpad.net/mg5amcnlo/+question/261881 (answer #1 from Marco). Then again, if I should not bypass those warnings, what do you suggest I should do to take care of them properly?

Best,
David

Rikkert Frederix (frederix) said :
#20

Dear David,

What do you mean by "event generation works"? You mean that you get an event file? Sure. But how are you sure that what you get makes sense? You bypassed a ton of warnings and errors, and without a careful assessment of the contributions that are not included in your process, you cannot be certain that things are sensible. Obviously, you didn't make this careful assessment, because then you would have realised that your process is not only VBF-like, but rather different from the processes considered at https://answers.launchpad.net/mg5amcnlo/+question/261881

Let me say it one more time: NLO QCD results are not well-defined without also taking NLO QED/EW corrections into account for your process. The latter are not yet available in MadGraph5_aMC@NLO, hence, you cannot use this tool for your process.

Best,
Rikkert

Steven Lowette (lowette) said :
#21

Dear Rikkert,

Sorry to jump in, I'm working with David on this from the sideline.
So, point taken. The fact that it (didn't) work(ed) with(out) full CKM / Z decay just distracted from the main point of the inherent theoretical incompleteness.

As a non-expert experimentalist, I'm left with the following questions though, and I hope you'll have time to help out:
 * maybe very naive, but is there any way to have cuts on final state particles make us stay away from the problem?
 * I see no diagrams with an EW loop correction that can give Z + 2 jets below order alpha^3 alpha_s^2 (Z undecayed). The fact that this is already an order higher makes no difference? Sorry if this is a stupid question...
 * is there maybe a madgraph beta version that has VBF NLO, or is it in the plans? If not, do you know of another code that could get us to NLO VBF?

Many thanks,
Steven.

Rikkert Frederix (frederix) said :
#22

Dear Steven,

To reply to your questions:

1. No. It is not related to kinematics at all, so applying cuts won't help.
2. There are EW loops at orders alpha^4, alpha^3*alpha_s and alpha^2*alpha_s^2 (for undecayed Z-boson+2jets). Remember to multiply the virtual corrections with a Born process to have the lowest order couplings.
3. It's work in progress in MG5_aMC. Results at fixed order are available in the VBFNLO program. They might have an interface to Herwig7 for NLO+PS predictions, although I'm not 100% sure of the status or whether it works for this process.

Best,
Rikkert
