Error when running second simulation

Asked by Jack Collins

I am trying to run a batch of simulations locally using ./bin/madevent batchrun, where batchrun contains the instructions. The first run always completes fine, but at the end of the second parton-level run I get the following error:

INFO: Combining runs
Error when reading /home/jack/physics/MG5_aMC_v2_3_0/stoptest/SubProcesses/P0_gg_t2t2x/G1a0/results.dat
Command "import command batchrun" interrupted in sub-command:
"generate_events 500_200_160" with error:
IOError : [Errno 2] No such file or directory: '/home/jack/physics/MG5_aMC_v2_3_0/stoptest/SubProcesses/P0_gg_t2t2x/G1a0/results.dat'
Please report this bug on https://bugs.launchpad.net/madgraph5
More information is found in '/home/jack/physics/MG5_aMC_v2_3_0/stoptest/500_200_160_tag_1_debug.log'.
Please attach this file to your report.

After that, every further run in the same directory returns the same error. I can confirm that the results.dat file is not in that folder. The debug.log file contains the following lines, in addition to the cards, which I can provide if you think they are relevant:

generate_events 400_200_160
generate_events 500_200_160
Traceback (most recent call last):
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/extended_cmd.py", line 879, in onecmd
    return self.onecmd_orig(line, **opt)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/extended_cmd.py", line 872, in onecmd_orig
    return func(arg, **opt)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/extended_cmd.py", line 654, in do_import
    self.import_command_file(args[1])
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/extended_cmd.py", line 1042, in import_command_file
    self.exec_cmd(line, precmd=True)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/extended_cmd.py", line 919, in exec_cmd
    stop = Cmd.onecmd_orig(current_interface, line, **opt)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/extended_cmd.py", line 872, in onecmd_orig
    return func(arg, **opt)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/madevent_interface.py", line 1996, in do_generate_events
    self.exec_cmd('refine %s' % nb_event, postcmd=False)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/extended_cmd.py", line 919, in exec_cmd
    stop = Cmd.onecmd_orig(current_interface, line, **opt)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/extended_cmd.py", line 872, in onecmd_orig
    return func(arg, **opt)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/madevent_interface.py", line 2807, in do_refine
    combine_runs.CombineRuns(self.me_dir)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/combine_runs.py", line 78, in __init__
    self.sum_multichannel(channel)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/combine_runs.py", line 101, in sum_multichannel
    filepath=pjoin(path, 'results.dat'))
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/sum_html.py", line 399, in add_results
    oneresult.read_results(filepath)
  File "/home/jack/physics/MG5_aMC_v2_3_0/stoptest/bin/internal/sum_html.py", line 267, in read_results
    finput = open(filepath)
IOError: [Errno 2] No such file or directory: '/home/jack/physics/MG5_aMC_v2_3_0/stoptest/SubProcesses/P0_gg_t2t2x/G1a0/results.dat'

Question information

Language: English
Status: Answered
For: MadGraph5_aMC@NLO
Assignee: No assignee

Jack Collins (jhc296) said :
#1

Some additional information regarding this problem:

This error arises when I simulate (using the mssm model which comes with madgraph):
generate p p > t2 t2~
add process p p > t2 t2~ j
and use the jet matching procedure.

When I generate just p p > t2 t2~ the problem does not seem to arise. When I generate only p p > t2 t2~ j the problem also has not come up. It is only when I generate both processes together that there is a problem.

Jack Collins (jhc296) said :
#2

I also have no problem when I create a directory for the combined processes (t2 t2~ and t2 t2~ j), make six copies of it, and run a separate simulation for each of my six parameter sets, one in each directory. The problem is only arising when I try to do a series of runs in a single directory.

Gauthier (gauthier.d) said :
#3

Hi Jack,
Dear MG Team,

I encountered a similar problem. It arises when a first run requires more multijobs than a later one.

The information about the number of multijobs needed is written in each SubProcess/G<i>/multijob.dat during the first run and doesn't always get reset as it should (the 'reset_multijob' function in madgraph/madevent/gen_ximprove.py:904 is supposed to do that job).

When the results of the second run are being gathered, 'sum_multichannel' in madgraph/madevent/combine_runs.py:80 uses the information in each multijob.dat file to get to know where the various results are to be gathered... and so fails.

I couldn't quite figure out where exactly the 'reset_multijob()' call is missing.
Manually deleting all the SubProcess/G<i>/multijob.dat files after each run provides a temporary fix.

Cheers,
Gauthier
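
The manual cleanup described above could be scripted along these lines. This is only a minimal sketch, not part of MadGraph: the helper name remove_multijob_files is hypothetical, and the SubProcesses/<P...>/<G...> directory layout is assumed from the path in the error message.

```python
import glob
import os


def remove_multijob_files(me_dir):
    """Delete every SubProcesses/<P...>/<G...>/multijob.dat under me_dir.

    me_dir is the process directory (the one containing bin/madevent).
    Returns the sorted list of paths that were removed.
    """
    # Assumed layout: <me_dir>/SubProcesses/<P...>/<G...>/multijob.dat
    pattern = os.path.join(me_dir, "SubProcesses", "*", "*", "multijob.dat")
    removed = []
    for path in sorted(glob.glob(pattern)):
        os.remove(path)
        removed.append(path)
    return removed
```

Calling remove_multijob_files on the process directory between two runs should have the same effect as deleting the files by hand; other files in the channel directories are left untouched.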

Olivier Mattelaer (olivier-mattelaer) said :
#4

Hi Gauthier,

Thanks so much for finding that. I will carefully check the status of those files and the changes made to them in the latest version.

Thanks to you, I hope that I will be able to fix it.

Cheers,

Olivier

Gauthier (gauthier.d) said :
#5

Hi Olivier,

To be a tiny bit more precise, I think it is actually when the second run requires no multijob at all that the problem occurs (otherwise, the multijob.dat file may have been overwritten with the correct value).
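
To check whether stale values are left over from an earlier run, something like the following could list what each multijob.dat currently holds. It is a rough sketch under assumptions: the helper name read_multijob_counts is hypothetical, and each file is assumed to contain a single integer (anything else is reported as None).

```python
import glob
import os


def read_multijob_counts(me_dir):
    """Map each SubProcesses/<P...>/<G...>/multijob.dat path to its stored value.

    Assumption: each file holds one integer. Content that cannot be parsed
    as an integer is reported as None rather than raising.
    """
    pattern = os.path.join(me_dir, "SubProcesses", "*", "*", "multijob.dat")
    counts = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            text = handle.read().strip()
        try:
            counts[path] = int(text)
        except ValueError:
            counts[path] = None
    return counts
```

Any channel that still reports a nonzero value before a new run starts is a candidate for the stale state described here.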

Cheers,
Gauthier

Jack Collins (jhc296) said :
#6

Thanks Gauthier, deleting the multijob.dat files indeed allows me to continue simulating more runs after this error has occurred.

Olivier Mattelaer (olivier-mattelaer) said :
#7

Hi Gauthier/Jack

I think that I have found the problem (still testing) (thanks to Gauthier).

The following patch should solve the issue:

=== modified file 'madgraph/madevent/gen_ximprove.py'
--- madgraph/madevent/gen_ximprove.py 2015-06-28 14:44:22 +0000
+++ madgraph/madevent/gen_ximprove.py 2015-07-30 13:55:34 +0000
@@ -903,7 +903,7 @@

     def reset_multijob(self):

-        for path in glob.glob(pjoin(self.me_dir, 'Subprocesses', '*',
+        for path in glob.glob(pjoin(self.me_dir, 'SubProcesses', '*',
                                                            '*','multijob.dat')):
             open(path,'w').write('0\n')

It also explains why I never ran into it on Mac, since the file system on Mac is case insensitive.
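
The effect of the misspelled directory name can be reproduced outside MadGraph with a small standalone sketch. The helper names count_matches and demo are hypothetical, and the directory tree is a throwaway one built in a temporary folder to mimic the layout seen in the error message.

```python
import glob
import os
import tempfile


def count_matches(me_dir, top_level_name):
    """Count multijob.dat files found when the top directory is spelled top_level_name."""
    pattern = os.path.join(me_dir, top_level_name, "*", "*", "multijob.dat")
    return len(glob.glob(pattern))


def demo():
    """Build a fake process directory and glob it with both spellings.

    Returns (matches with 'SubProcesses', matches with 'Subprocesses').
    """
    with tempfile.TemporaryDirectory() as me_dir:
        channel = os.path.join(me_dir, "SubProcesses", "P0_gg_t2t2x", "G1")
        os.makedirs(channel)
        with open(os.path.join(channel, "multijob.dat"), "w") as handle:
            handle.write("2\n")
        return (count_matches(me_dir, "SubProcesses"),
                count_matches(me_dir, "Subprocesses"))


if __name__ == "__main__":
    print(demo())
```

On a case-sensitive file system the misspelled pattern should match nothing, so the reset loop silently does no work. On a case-insensitive one both spellings are expected to find the file, which is consistent with the bug not showing up there.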
