Errors only when generating a large number of events

Asked by Jingjing Pan on 2020-09-28

Hi MG5 experts,

We've been encountering an error that seems to occur mostly when generating a relatively large number of events (in our case, several thousand).
Additionally, the error doesn't occur every time (we occasionally succeeded simply by retrying), nor for every parameter value.

Specifically, we were varying only two parameters: mzdinput and the number of events.
The error seemed to occur only when we syntactically allowed off-shell bosons (generate g g > h Z Zp > l+ l- l+ l-) and set mzd relatively large; it did not occur when we enforced the bosons to be on-shell.
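
For readers unfamiliar with the two syntaxes, here is a minimal sketch. The first process line is the one from the job options further down, where h, Z and Zp are only required to appear as intermediate particles. The second uses MG5's decay-chain syntax, which is one common way to keep intermediate particles close to their mass shell. The exact on-shell form used in these tests is not shown in this thread, so the second string is an assumption, and `build_proc_card` and `use_decay_chain` are hypothetical names.

```python
# Hypothetical helper: returns the proc_card text for either process syntax.
MODEL_HEADER = """
import model HAHM_variableMW_v3_UFO
define l+ = e+ mu+
define l- = e- mu-
"""


def build_proc_card(use_decay_chain):
    if use_decay_chain:
        # Assumed on-shell variant (MG5 decay-chain syntax); not taken from the thread.
        process = "generate g g > h, (h > Z Zp, Z > l+ l-, Zp > l+ l-)"
    else:
        # Process line as posted: intermediate h, Z, Zp may be off-shell.
        process = "generate g g > h Z Zp > l+ l- l+ l-"
    return MODEL_HEADER + process


off_shell_card = build_proc_card(use_decay_chain=False)
on_shell_card = build_proc_card(use_decay_chain=True)
```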

I always used clean shells and directories, made sure to load the Athena release that has been working (asetup 19.2.5.31,here), and tried different clusters (LXplus, our local university cluster with Singularity, and the BNL cluster), but the error persisted regardless.
The only factor that seemed to increase the frequency of the error was a larger number of events.

The error message is as follows:
---------------------------------------------------------------------------------------------------------------------------------------------------

generate 19:55:59 Command "generate_events -f Test --nb_core=1" interrupted in sub-command:
generate 19:55:59 "set max_npoint_for_channel 0" with error:
generate 19:55:59 Exception :
generate 19:55:59 Please report this bug on https://bugs.launchpad.net/mg5amcnlo
generate 19:55:59 More information is found in 'MG5_debug'.
generate 19:55:59 Please attach this file to your report.
generate 19:55:59 quit
generate 19:55:59
generate 19:55:59
generate 19:55:59 launch in debug mode
generate 19:55:59 Py:MadGraphUtils INFO Restoring original LHAPDF env variables:
generate 19:55:59 Py:MadGraphUtils INFO LHAPATH=/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/lhapdf/6.1.5/x86_64-slc6-gcc47-opt/share/LHAPDF:/cvmfs/sft.cern.ch/lcg/external/lhapdfsets/current/
generate 19:55:59 Py:MadGraphUtils INFO LHAPDF_DATA_PATH=/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/lhapdf/6.1.5/x86_64-slc6-gcc47-opt/share/LHAPDF:/cvmfs/sft.cern.ch/lcg/external/lhapdfsets/current/
generate 19:55:59 Py:MadGraphUtils INFO Finished at Mon Sep 28 19:55:59 2020
generate 19:55:59 Py:MadGraphUtils INFO Unzipping generated events.
generate 19:55:59 gzip: HAHM_ggf_ZZd_4l_mZd105/Events/Test/events.lhe.gz: No such file or directory
generate 19:55:59 Py:MadGraphUtils INFO Putting a copy in place for the transform.
generate 19:55:59 Shortened traceback (most recent user call last):
generate 19:55:59 File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/AtlasProduction/19.2.5.31/InstallArea/jobOptions/EvgenJobTransforms/skeleton.GENtoEVGEN.py", line 225, in <module>
generate 19:55:59 include(jo)
generate 19:55:59 File "./MC15.999999.MadGraphPythia8EvtGen_A14NNPDF23LO_HAHMggfZZd4l_mZd105.pY", line 62, in <module>
generate 19:55:59 include("MC15JobOptions/MadGraphControl_Pythia8_A14_NNPDF23LO_EvtGen_Common.py")
generate 19:55:59 File "_joproxy15/common/MadGraph/MC15JobOptions/MadGraphControl_Pythia8_A14_NNPDF23LO_EvtGen_Common.py", line 193, in <module>
generate 19:55:59 MadGraphUtils.arrange_output(run_name='Test',proc_dir=process_dir,outputDS=stringy+'._00001.events.tar.gz',saveProcDir=save_proc_dir)
generate 19:55:59 File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/AtlasProduction/19.2.5.31/InstallArea/python/MadGraphControl/MadGraphUtils.py", line 1319, in arrange_output
generate 19:55:59 with open(orig_input,'r') as fileobject:
generate 19:55:59 IOError: [Errno 2] No such file or directory: 'HAHM_ggf_ZZd_4l_mZd105/Events/Test/events.lhe'
generate 19:55:59 Py:Athena INFO leaving with code 8: "an unknown exception occurred"
PyJobTransforms.trfExe.execute 2020-09-28 19:55:59,846 INFO generate executor returns 8
PyJobTransforms.trfExe.validate 2020-09-28 19:55:59,848 ERROR Validation of return code failed: Non-zero return code from generate (8) (Error code 65)
PyJobTransforms.trfExe.validate 2020-09-28 19:55:59,872 INFO Scanning logfile log.generate for errors
PyJobTransforms.trfValidation.scanLogFile 2020-09-28 19:55:59,888 WARNING Detected python exception - activating python exception grabber
PyJobTransforms.transform.execute 2020-09-28 19:55:59,889 CRITICAL Transform executor raised TransformValidationException: Non-zero return code from generate (8); Logfile error in log.generate: "IOError: [Errno 2] No such file or directory: 'HAHM_ggf_ZZd_4l_mZd105/Events/Test/events.lhe'"
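
The IOError at the end of this log is a downstream symptom: MadGraph stopped earlier, so events.lhe.gz was never written. A small check like the sketch below can make that explicit before the unzip step. The directory layout (`<proc_dir>/Events/<run_name>/`) is taken from the log; the assumption that the debug file has `MG5_debug` in its name and lives somewhere under the process directory is mine, and `diagnose_missing_events` is a hypothetical helper, not part of MadGraphUtils.

```python
import os


def diagnose_missing_events(proc_dir, run_name):
    """Describe whether the run produced events and, if not, where to look next."""
    events_dir = os.path.join(proc_dir, "Events", run_name)
    for name in ("events.lhe.gz", "events.lhe"):
        candidate = os.path.join(events_dir, name)
        if os.path.isfile(candidate):
            return "events found: %s" % candidate

    # Assumption: any file with "MG5_debug" in its name is a MadGraph debug file.
    debug_files = []
    for root, _dirs, files in os.walk(proc_dir):
        for name in files:
            if "MG5_debug" in name:
                debug_files.append(os.path.join(root, name))

    if debug_files:
        return "no events; MadGraph debug file(s): %s" % ", ".join(sorted(debug_files))
    return "no events and no MG5_debug file under %s" % proc_dir
```

For the run above this would be called as `diagnose_missing_events("HAHM_ggf_ZZd_4l_mZd105", "Test")`.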

The job option script involved is below:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
mzd = 95
evgenConfig.description="MadGraph Hidden Abelian Higgs Model (HAHM): gg -> H -> ZZd -> 4l (l=e,mu) , with mZd=95GeV"
proc_card = """
import model HAHM_variableMW_v3_UFO
define l+ = e+ mu+
define l- = e- mu-
generate g g > h Z Zp > l+ l- l+ l-"""
#else:
# raise RuntimeError("Unrecognised runNumber: %d" % runArgs.runNumber)

proc_name = "HAHM_ggf_ZZd_4l_mZd95"

#modifications to the param_card.dat (generated from the proc_card i.e. the specific model)
#if you want to see the resulting param_card, run Generate_tf with this jobo, and look at the param_card.dat in the cwd
#If you want to see the auto-calculated values of the decay widths, look at the one in <proc_name>/Cards/param_card.dat (again, after running a Generate_tf)
param_card_extras = { "HIDDEN": { 'epsilon': '1e-4', #kinetic mixing parameter
                                 'kap': '1e-10', #higgs mixing parameter
                                 'mzdinput': mzd, #Zd mass
                                 'mhsinput':'200.0' }, #dark higgs mass
                     "HIGGS": { 'mhinput':'125.0'}, #higgs mass
                     "DECAY": { 'wzp':'Auto', 'wh':'Auto', 'wt':'Auto' } #auto-calculate decay widths and BR of Zp, H, t
                  }

run_card_extras = { 'lhe_version':'2.0',
                   'cut_decays':'F',
                   'ptj':'0',
                   'ptb':'0',
                   'pta':'0',
                   'ptl':'0',
                   'etaj':'-1',
                   'etab':'-1',
                   'etaa':'-1',
                   'etal':'-1',
                   'drjj':'0',
                   'drbb':'0',
                   'drll':'0',
                   'draa':'0',
                   'drbj':'0',
                   'draj':'0',
                   'drjl':'0',
                   'drab':'0',
                   'drbl':'0',
                   'dral':'0' }

evgenConfig.keywords+=['exotic','BSMHiggs']
evgenConfig.contact = ['<email address hidden>']
evgenConfig.process = "HAHM_H_ZZd_4l"

include("MC15JobOptions/MadGraphControl_Pythia8_A14_NNPDF23LO_EvtGen_Common.py")
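
The log above refers to mZd105 while this script sets mzd = 95, because the mass is repeated by hand in the description and in proc_name. When scanning mzdinput, one way to keep these in sync is to derive both strings from the single `mzd` variable. This is only a sketch of that idea using the names from the script; the `description` variable is a hypothetical intermediate, and in the job options it would be assigned to evgenConfig.description.

```python
mzd = 95  # Zd mass in GeV, the scanned parameter

# Derived from mzd so the directory name always matches the mass in use.
proc_name = "HAHM_ggf_ZZd_4l_mZd%d" % mzd

# In the job options: evgenConfig.description = description
description = (
    "MadGraph Hidden Abelian Higgs Model (HAHM): "
    "gg -> H -> ZZd -> 4l (l=e,mu) , with mZd=%dGeV" % mzd
)
```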

Many thanks,
jing

Question information

Language: English
Status: Solved
For: MadGraph5_aMC@NLO
Assignee: No assignee
Solved by: Jingjing Pan
Solved: 2020-09-28
Last query: 2020-09-28
Last reply: 2020-09-28

This question was reopened

Olivier Mattelaer said : #1

Hi,

Here I only see the error from Athena...
Can you attach the MG5_debug file generated by MadGraph?

Cheers,

Olivier

Jingjing Pan (jp2555) said : #2

Hi Olivier,

Sorry about that. Below is the MG5_debug file:

#************************************************************
#* MadGraph5_aMC@NLO *
#* *
#* * * *
#* * * * * *
#* * * * * 5 * * * * *
#* * * * * *
#* * * *
#* *
#* *
#* VERSION 2.6.0 2017-08-16 *
#* *
#* The MadGraph5_aMC@NLO Development Team - Find us at *
#* https://server06.fynu.ucl.ac.be/projects/madgraph *
#* *
#************************************************************
#* *
#* Command File for MadGraph5_aMC@NLO *
#* *
#* run as ./bin/mg5_aMC filename *
#* *
#************************************************************
set group_subprocesses Auto
set ignore_six_quark_processes False
set loop_optimized_output True
set loop_color_flows False
set gauge unitary
set complex_mass_scheme False
set max_npoint_for_channel 0
Traceback (most recent call last):
  File "/eos/home-j/jingjing/MCgenH/105_1k/HAHM_ggf_ZZd_4l_mZd105/bin/internal/extended_cmd.py", line 1438, in onecmd
    return self.onecmd_orig(line, **opt)
  File "/eos/home-j/jingjing/MCgenH/105_1k/HAHM_ggf_ZZd_4l_mZd105/bin/internal/extended_cmd.py", line 1392, in onecmd_orig
    return func(arg, **opt)
  File "/eos/home-j/jingjing/MCgenH/105_1k/HAHM_ggf_ZZd_4l_mZd105/bin/internal/madevent_interface.py", line 2083, in do_generate_events
    switch_mode = self.ask_run_configuration(mode, args)
  File "/eos/home-j/jingjing/MCgenH/105_1k/HAHM_ggf_ZZd_4l_mZd105/bin/internal/madevent_interface.py", line 6069, in ask_run_configuration
    self.check_param_card(pjoin(self.me_dir,'Cards','param_card.dat' ))
  File "/eos/home-j/jingjing/MCgenH/105_1k/HAHM_ggf_ZZd_4l_mZd105/bin/internal/common_run_interface.py", line 3147, in check_param_card
    self.do_compute_widths('%s %s' % (' '.join(pdg), path))
  File "/eos/home-j/jingjing/MCgenH/105_1k/HAHM_ggf_ZZd_4l_mZd105/bin/internal/common_run_interface.py", line 2143, in do_compute_widths
    cmd.exec_cmd(line, model=opts['model'])
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/interface/extended_cmd.py", line 1465, in exec_cmd
    stop = Cmd.onecmd_orig(current_interface, line, **opt)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/interface/extended_cmd.py", line 1392, in onecmd_orig
    return func(arg, **opt)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/interface/master_interface.py", line 334, in do_compute_widths
    return self.cmd.do_compute_widths(self, *args, **opts)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/interface/madgraph_interface.py", line 7973, in do_compute_widths
    me_cmd.exec_cmd('combine_events', postcmd=False)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/interface/extended_cmd.py", line 1465, in exec_cmd
    stop = Cmd.onecmd_orig(current_interface, line, **opt)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/interface/extended_cmd.py", line 1392, in onecmd_orig
    return func(arg, **opt)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/interface/madevent_interface.py", line 3272, in do_combine_events
    proc_charac=self.proc_characteristic)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/various/lhe_parser.py", line 1113, in unweight
    return super(MultiEventFile, self).unweight(outputpath, get_wgt_multi, **opts)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/various/lhe_parser.py", line 376, in unweight
    all_wgt, cross, nb_event = self.initialize_unweighting(get_wgt, trunc_error)
  File "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt/19.2.5/sw/lcg/external/MCGenerators_lcgcmt67c/madgraph5amc/2.6.0.atlas/x86_64-slc6-gcc47-opt/madgraph/various/lhe_parser.py", line 1022, in initialize_unweighting
    raise Exception
Exception
                          MadGraph5_aMC@NLO Options
                          ----------------
        complex_mass_scheme : False
                      gauge : unitary
         group_subprocesses : Auto
  ignore_six_quark_processes : False
           loop_color_flows : False
      loop_optimized_output : True
  low_mem_multicore_nlo_generation : False
     max_npoint_for_channel : 0
               stdout_level : 20 (user set)

                         MadEvent Options
                          ----------------
     automatic_html_opening : False (user set)
                    nb_core : None
        notification_center : True
                   run_mode : 2

                      Configuration Options
                      ---------------------
                        OLP : MadLoop
                    amcfast : amcfast-config
                   applgrid : applgrid-config
                auto_update : 7
         cluster_local_path : None
           cluster_nb_retry : 1
              cluster_queue : None (user set)
         cluster_retry_wait : 300
               cluster_size : 100
      cluster_status_update : (600, 30)
          cluster_temp_path : None
               cluster_type : condor
                    collier : /afs/cern.ch/sw/lcg/external/MCGenerators_lcgcmt67c/collier/1.1/x86_64-slc6-gcc47-opt (user set)
               cpp_compiler : None
             crash_on_error : False
               delphes_path : ./Delphes
                 eps_viewer : None
        exrootanalysis_path : ./ExRootAnalysis
              f2py_compiler : None
                    fastjet : None (user set)
           fortran_compiler : None
                      golem : None (user set)
                 hepmc_path : None (user set)
                  hwpp_path : None (user set)
                     lhapdf : /afs/cern.ch/sw/lcg/external/MCGenerators_lcgcmt67c/lhapdf/6.1.4/x86_64-slc6-gcc47-opt/bin/lhapdf-config (user set)
          madanalysis5_path : None (user set)
           madanalysis_path : ./MadAnalysis
  mg5amc_py8_interface_path : None (user set)
                      ninja : /afs/cern.ch/sw/lcg/external/MCGenerators_lcgcmt67c/gosam_contrib/2.0/x86_64-slc6-gcc47-opt/lib (user set)
        output_dependencies : internal (user set)
                      pjfry : None (user set)
            pythia-pgs_path : ./pythia-pgs
               pythia8_path : None (user set)
                    samurai : None
               syscalc_path : /afs/cern.ch/sw/lcg/external/MCGenerators_lcgcmt67c/syscalc/1.1.4/x86_64-slc6-gcc47-opt (user set)
                    td_path : ./td
                text_editor : None
                thepeg_path : None (user set)
                    timeout : 60
                web_browser : None
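
To see at a glance which MadGraph feature the traceback above passes through, the frames can be reduced to their function names. The sketch below does this with a regular expression. The sample text keeps only four of the frames from the debug file and shortens their paths to bare file names for readability; `call_chain` is a hypothetical helper.

```python
import re

# Matches the standard Python traceback frame line: File "...", line N, in func
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\w+)')


def call_chain(traceback_text):
    """Return the function names of all frames, outermost first."""
    return [match.group("func") for match in FRAME_RE.finditer(traceback_text)]


# Abbreviated excerpt of the traceback above (paths shortened to file names).
sample = '''
  File "common_run_interface.py", line 3147, in check_param_card
  File "common_run_interface.py", line 2143, in do_compute_widths
  File "madevent_interface.py", line 3272, in do_combine_events
  File "lhe_parser.py", line 1022, in initialize_unweighting
'''

chain = call_chain(sample)
```

In this excerpt the exception is raised in `initialize_unweighting`, reached through `do_compute_widths`, that is, while the widths were being computed rather than during the main event generation.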

Olivier Mattelaer said : #3

Hi,

This is a three-year-old version... (and a version hacked by ATLAS).
So I am not sure what you expect from me on this. Even if I succeed in producing a patch, are you able to apply it?
It is quite possible that the issue has already been spotted and fixed in those three years. Can you try with the latest version?

The only information that I can provide is that your issue is related to the auto-width feature. So one workaround is to not set the widths to Auto and to compute them in advance.
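
As a rough sketch of that workaround in the job options: compute the widths once (for example from a successful small run, reading them from <proc_name>/Cards/param_card.dat as noted in the script comments) and pass them explicitly instead of 'Auto'. The helper `with_fixed_widths` is hypothetical, and the `FILL_IN_*` strings are deliberate placeholders, not real widths; they must be replaced with the values computed for the chosen mzdinput.

```python
import copy


def with_fixed_widths(param_card_extras, precomputed_widths):
    """Return a copy of param_card_extras with every 'Auto' width replaced."""
    updated = copy.deepcopy(param_card_extras)
    decay = updated.setdefault("DECAY", {})
    for name in list(decay):
        if str(decay[name]).lower() == "auto":
            if name not in precomputed_widths:
                raise KeyError("no precomputed width for %r" % name)
            decay[name] = precomputed_widths[name]
    return updated


param_card_extras = {"DECAY": {'wzp': 'Auto', 'wh': 'Auto', 'wt': 'Auto'}}

# Placeholders only: substitute the widths computed in advance for this mass point.
precomputed_widths = {'wzp': 'FILL_IN_wzp', 'wh': 'FILL_IN_wh', 'wt': 'FILL_IN_wt'}

fixed_extras = with_fixed_widths(param_card_extras, precomputed_widths)
```

Because the Zp and h widths depend on the model parameters, the precomputed values would need to be redone for each mzdinput in the scan.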

Cheers,

Olivier

Jingjing Pan (jp2555) said : #4

Hi Olivier,

Thanks so much for your suggestions.

We were sticking with this older version since we had some problems when trying to adapt the JO to work with the new Gen_tf...

It is still confusing why the generation mostly succeeds with a small number of events, though I understand what you pointed out and will try again to adapt the JO files.

Many thanks,
Jing

Olivier Mattelaer said : #5

What is JO?

Jingjing Pan (jp2555) said : #6

Sorry about the abbreviation and thanks for asking -- should be "job option".
