Running event generation on cluster mode fails

Asked by Jay Sandesara on 2020-06-21

I am trying to generate events in cluster mode, but it keeps failing. The same process runs fine in multicore mode. The error and configuration from the debug output are copied below:

Traceback (most recent call last):
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/interface/extended_cmd.py", line 1515, in onecmd
    return self.onecmd_orig(line, **opt)
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/interface/extended_cmd.py", line 1464, in onecmd_orig
    return func(arg, **opt)
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/interface/madevent_interface.py", line 2469, in do_generate_events
    self.run_generate_events(switch_mode, args)
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/interface/common_run_interface.py", line 6963, in new_fct
    original_fct(obj, *args, **opts)
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/interface/madevent_interface.py", line 2511, in run_generate_events
    self.exec_cmd('refine %s' % nb_event, postcmd=False)
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/interface/extended_cmd.py", line 1544, in exec_cmd
    stop = Cmd.onecmd_orig(current_interface, line, **opt)
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/interface/extended_cmd.py", line 1464, in onecmd_orig
    return func(arg, **opt)
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/interface/madevent_interface.py", line 3469, in do_refine
    x_improve.launch() # create the ajob for the refinment.
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/madevent/gen_ximprove.py", line 860, in launch
    main_dir=pjoin(self.cmd.me_dir,'SubProcesses')) #main_dir is for gridpack readonly mode
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/madevent/sum_html.py", line 747, in collect_result
    P_comb.add_results(os.path.basename(G), path, mfactors[G])
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/madevent/sum_html.py", line 425, in add_results
    oneresult.read_results(filepath)
  File "/scratch/jsandesara/MG5_aMC_v2_7_2/madgraph/madevent/sum_html.py", line 279, in read_results
    finput = open(filepath)
IOError: [Errno 2] No such file or directory: '/scratch/jsandesara/MG5_aMC_v2_7_2/gg4lep0jet/SubProcesses/P0_gg_llll/G1.04/results.dat'
Related File: /scratch/jsandesara/MG5_aMC_v2_7_2/gg4lep0jet/SubProcesses/P0_gg_llll/G1.04/results.dat

                              Run Options
                              -----------
               stdout_level : 20 (user set)

                         MadEvent Options
                         ----------------
     automatic_html_opening : False (user set)
        notification_center : True
          cluster_temp_path : /scratch/jsandesara/run (user set)
             cluster_memory : /scratch/jsandesara/run (user set)
               cluster_size : 150 (user set)
              cluster_queue : tier3 (user set)
                    nb_core : 80 (user set)
               cluster_time : 80 (user set)
                   run_mode : 1 (user set)

                      Configuration Options
                      ---------------------
                text_editor : None
         cluster_local_path : /scratch/jsandesara/run (user set)
      cluster_status_update : (600, 30)
               pythia8_path : /scratch/jsandesara/MG5_aMC_v2_7_2/HEPTools/pythia8 (user set)
                  hwpp_path : None (user set)
            pythia-pgs_path : None (user set)
                    td_path : None (user set)
               delphes_path : None (user set)
                thepeg_path : None (user set)
               cluster_type : sge (user set)
          madanalysis5_path : /scratch/jsandesara/MG5_aMC_v2_7_2/HEPTools/madanalysis5/madanalysis5 (user set)
           cluster_nb_retry : 1
                 eps_viewer : None
                web_browser : None
               syscalc_path : None (user set)
           madanalysis_path : None (user set)
                     lhapdf : /scratch/jsandesara/MG5_aMC_v2_7_2/HEPTools/lhapdf6/bin/lhapdf-config (user set)
              f2py_compiler : None
                 hepmc_path : None (user set)
         cluster_retry_wait : 20 (user set)
           fortran_compiler : None
                auto_update : 7 (user set)
        exrootanalysis_path : /scratch/jsandesara/MG5_aMC_v2_7_2/ExRootAnalysis (user set)
                    timeout : 60
               cpp_compiler : None

Question information

Language: English
Status: Answered
For: MadGraph5_aMC@NLO
Assignee: No assignee
Last query: 2020-06-21
Last reply: 2020-06-22

Can you unset these three parameters?
          cluster_temp_path : /scratch/jsandesara/run (user set)
             cluster_memory : /scratch/jsandesara/run (user set)
         cluster_local_path : /scratch/jsandesara/run (user set)
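
One way to unset them is to comment out the corresponding lines in the configuration file. The snippet below is only a sketch, not part of MadGraph: it assumes the options live in a plain-text file with "name = value" lines and "#" comments (for example Cards/me5_configuration.txt in the process directory, which is an assumption about your setup), and the helper names comment_out_options and unset_in_file are made up for illustration.

```python
import re

# The three options named in the reply above.
PATH_OPTIONS = ("cluster_temp_path", "cluster_memory", "cluster_local_path")


def comment_out_options(text, options=PATH_OPTIONS):
    """Return `text` with every active `option = value` line commented out.

    Lines that are already comments, or that set other options, are kept as-is.
    """
    names = "|".join(re.escape(name) for name in options)
    active = re.compile(r"^\s*(%s)\s*=" % names)
    out = []
    for line in text.splitlines(True):  # True keeps the line endings
        if active.match(line):
            line = "# " + line.lstrip()
        out.append(line)
    return "".join(out)


def unset_in_file(path):
    """Hypothetical helper: rewrite the configuration file at `path` in place."""
    with open(path) as handle:
        original = handle.read()
    with open(path, "w") as handle:
        handle.write(comment_out_options(original))
```

Calling unset_in_file on a backup copy first is probably safer; once the lines are commented out, the options should fall back to their defaults the next time the run is launched.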

Beyond that, every cluster is different, so you might need to customize the cluster class to meet the requirements imposed by your sysadmin.
I do not have an SGE cluster to test on, and that implementation is more than 10 years old, so it might also not work with the latest SGE version.
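
Before touching the cluster class, it may help to see which channel directories the cluster jobs left without output, since the traceback stops on a missing results.dat. The following is a minimal sketch that assumes the SubProcesses/P*/G*/results.dat layout visible in the traceback; the function name missing_results is made up for illustration and is not part of MadGraph.

```python
import glob
import os


def missing_results(subprocesses_dir):
    """Return the sorted G* channel directories under P* that lack results.dat."""
    pattern = os.path.join(subprocesses_dir, "P*", "G*")
    return sorted(
        channel
        for channel in glob.glob(pattern)
        if os.path.isdir(channel)
        and not os.path.isfile(os.path.join(channel, "results.dat"))
    )
```

For the run above this would be called as missing_results("/scratch/jsandesara/MG5_aMC_v2_7_2/gg4lep0jet/SubProcesses"). If every channel is listed, the jobs most likely never ran or never wrote back to the shared disk; if only a few are listed, the scheduler logs for those jobs are the next place to look.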

Cheers,

Olivier
