Fail to read the number of unweighted events in the combine.log file

Asked by Patricia Rebello Teles

Dear Support Team

I am trying to generate events on the Condor batch system (on the Fermilab LPC machines) with MadGraph5 v1.5.12.

My input/mg5_configuration.txt file looks like this:

# Default Running mode
# 0: single machine/ 1: cluster / 2: multicore

run_mode = 1

# Cluster Type [pbs|sge|condor|lsf|ge|slurm] Use for cluster run only
# And cluster queue

cluster_type = condor
cluster_queue = None

# Path to a node directory to avoid direct writting on the central disk
#Note that condor cluster avoid direct writting by default (therefore this
#options didn't modify condor cluster)

cluster_temp_path = None

The jobs are submitted to Condor correctly, as you can see below:

-- Submitter: <email address hidden> : <131.225.190.179:34650> : cmslpc27.fnal.gov
 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
153569.0 prebello 8/21 16:30 0+00:00:03 R 0 0.0 ajob4
153570.0 prebello 8/21 16:30 0+00:00:03 R 0 0.0 ajob5
153571.0 prebello 8/21 16:30 0+00:00:02 R 0 0.0 ajob6
153572.0 prebello 8/21 16:30 0+00:00:02 R 0 0.0 ajob7
153573.0 prebello 8/21 16:30 0+00:00:01 R 0 0.0 ajob8
153574.0 prebello 8/21 16:30 0+00:00:00 I 0 0.0 ajob9
153575.0 prebello 8/21 16:30 0+00:00:00 I 0 0.0 ajob10
153576.0 prebello 8/21 16:30 0+00:00:00 I 0 0.0 ajob11
153577.0 prebello 8/21 16:30 0+00:00:00 I 0 0.0 ajob12

9 jobs; 0 completed, 0 removed, 4 idle, 5 running, 0 held, 0 suspended

Nevertheless, the run ends like this:

INFO: All jobs finished
Combining runs
finish refine
combine_events
Combining Events
Fail to read the number of unweighted events in the combine.log file
cat: events.lhe: No such file or directory
cat: unweighted_events.lhe: No such file or directory
  === Results Summary for run: run_01 tag: tag_1 ===

     Cross-section : 949.6 +- 2.259 pb
     Nb of events : 0

store_events
Storing parton level results
End Parton
quit

Could you help me to fix this problem?

Thank you in advance.

All the best, Patricia.

Question information

Language: English
Status: Answered
For: MadGraph5_aMC@NLO
Assignee: No assignee
Revision history for this message
Olivier Mattelaer (olivier-mattelaer) said :
#1

Hi Patricia,

Do you have a central (shared) disk?
I suspect the problem is that the event-collection step expects the node to have direct access to a (huge) series of files through an NFS-style mount. If that is not the case, then I would expect exactly this kind of crash.

Could you check that point with your IT team before we try to fix the problem in the way the jobs are submitted?

Cheers,

Olivier
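
One way to look into the shared-disk question before asking the IT team is to see which filesystem the run directory lives on. The following is a minimal sketch, assuming a Linux host whose mount table follows the /proc/mounts layout; the set of filesystem types treated as shared and the sample mount entries are assumptions for illustration, not a description of the FNAL or CERN setup.

```python
import os

# Filesystem types treated as "shared" here. This set is an assumption and
# should be adjusted to whatever the site actually uses.
NETWORK_FS_TYPES = {"nfs", "nfs4", "lustre", "gpfs", "afs", "ceph", "beegfs"}


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount point that contains `path`.

    `mounts_text` uses the /proc/mounts layout:
    device mount_point fs_type options dump pass
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # Keep the longest matching mount point (the most specific one).
        if inside and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_shared_disk(path, mounts_text):
    """True if `path` sits on one of the NETWORK_FS_TYPES filesystems."""
    return filesystem_type(path, mounts_text) in NETWORK_FS_TYPES


# Hypothetical mount table, for illustration only.
SAMPLE_MOUNTS = """\
rootfs / ext4 rw 0 0
server:/export /shared nfs rw 0 0
/dev/sdb1 /scratch ext4 rw 0 0
"""

for directory in ("/shared/user/run_01", "/scratch/job_1"):
    kind = "shared" if is_on_shared_disk(directory, SAMPLE_MOUNTS) else "local"
    print(directory, filesystem_type(directory, SAMPLE_MOUNTS), kind)
```

On a real machine the mount table would come from reading /proc/mounts. A network mount on the submit host does not prove that the worker nodes mount the same path, so the check is only informative when it is also run from inside a test job.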

Patricia Rebello Teles (athenafma) said :
#2

Hi Olivier

I have taken a look at the output of one job. With the CERN LSF
setup, this is a STDOUT file, which reads:

@(#)CERN job starter $Date: 2010/06/23 14:22:16 $
Working directory is </pool/lsf/prebello/427048392> on <lxbrg1807.cern.ch>

../madevent: /usr/lib64/libgfortran.so.3: version `GFORTRAN_1.4' not found
(required by ../madevent)

Job finished at Wed Jul 17 18:47:06 CEST 2013 on node
 under linux version Scientific Linux CERN SLC release 5.8 (Boron)

CERN statistics: This job used 0:00:01 NCU hours (1 NCU seconds)

CERN statistics: This job used 0:00:01 KSI2K hours (1 KSI2K seconds)

KSI2K = kilo-SpecInt2000 benchmark units = 1.00 NCU
STDOUT (END)

The question: why does madevent require GFORTRAN version 1.4 here? Could
this error prevent the event writing and the combining steps?

The version we have in the CMSSW 5.3.11 release is
GNU Fortran (GCC) 4.6.2
Copyright (C) 2011 Free Software Foundation, Inc.

Cheers, Patricia
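
The `GFORTRAN_1.4' in that message is a symbol-version tag looked up inside libgfortran.so.3, not a compiler release number. One plausible reading, offered as an assumption rather than a diagnosis, is that madevent was built with the GCC 4.6.2 gfortran while the worker node loaded an older /usr/lib64/libgfortran.so.3 at run time. The sketch below compares the tags named in a binary with the tags present in a library by scanning the raw bytes, much like running `strings` and searching for GFORTRAN_; the helper names are hypothetical.

```python
import re

# Version tags are stored as plain ASCII strings inside ELF files.
VERSION_TAG = re.compile(rb"GFORTRAN_(\d+)\.(\d+)")


def gfortran_tags(data):
    """Return the sorted, unique (major, minor) GFORTRAN_x.y tags in `data`."""
    found = {(int(m.group(1)), int(m.group(2))) for m in VERSION_TAG.finditer(data)}
    return sorted(found)


def missing_tags(binary_bytes, library_bytes):
    """Tags named by the binary that do not appear in the library."""
    provided = set(gfortran_tags(library_bytes))
    return [tag for tag in gfortran_tags(binary_bytes) if tag not in provided]


def read_bytes(path):
    """Read a whole file as bytes (the executable or the shared library)."""
    with open(path, "rb") as handle:
        return handle.read()
```

Run on a worker node, `missing_tags(read_bytes("../madevent"), read_bytes("/usr/lib64/libgfortran.so.3"))` would list the tags the executable asks for that the node's library lacks. A non-empty result points to a run-time library mismatch on that node, in which case no events can be written by that job.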

Patricia Rebello Teles (athenafma) said :
#3

Hi Olivier, I apologize. I first asked you a question about the FNAL
Condor error message and, by mistake, answered your request with a new
question about CERN LSF.

In fact I have problems on both clusters, but with different error messages.

At FNAL I can obtain the cross section but not the events.

At CERN I cannot obtain anything; the jobs stop with that request for GFORTRAN 1.4.

I am trying to solve it with the MadGraph team at CERN at least. If I could
generate these events on LSF it would be wonderful.

If you could help, I would be grateful.

Thank you. Cheers, Patricia.

Patricia Rebello Teles (athenafma) said :
#4

Hi Olivier

Let me know if the instructions on the webpage below help you to
understand what is going on with the FNAL LPC batch system regarding that error message:

http://www.uscms.org/uscms_at_work/computing/setup/batch_systems.shtml#condor_1

Cheers Patricia

Olivier Mattelaer (olivier-mattelaer) said :
#5

Hi,

It looks like you indeed don't have a central disk.
More precisely, some of the disks are shared but not all.

So I think I know how to solve this.

Cheers,

Olivier

Patricia Rebello Teles (athenafma) said :
#6

Hi Olivier, do you know how to solve it? As soon as you have the
solution, please let me know. :-)

Thank you very much.

All the best Patricia

Olivier Mattelaer (olivier-mattelaer) said :
#7

Yes, I know how to solve it.

The job submission needs to specify all the input/output files for each job.
On our Condor cluster this is not needed since we run on a central disk; this doesn't seem to be the case at Fermilab.

The catch is that this is not trivial to fix and requires a couple of hours of work (at least). So I probably can't do this before the end of September (I have a series of deadlines in September).

Cheers,

Olivier
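
To illustrate what listing the input/output files for a job can look like, here is a minimal sketch that writes an HTCondor submit description using Condor's file-transfer commands (`should_transfer_files`, `when_to_transfer_output`, `transfer_input_files`, `transfer_output_files`). It is not the change that went into MadGraph; the function name and all file names except `ajob4` (taken from the queue listing above) are placeholders.

```python
def condor_submit_description(executable, input_files, output_files, arguments=""):
    """Build an HTCondor submit description that ships files explicitly."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        # Ask Condor to copy files instead of relying on a shared disk.
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {','.join(input_files)}",
        f"transfer_output_files = {','.join(output_files)}",
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


# The input and output file names are placeholders, not MadGraph's job layout.
print(condor_submit_description(
    executable="ajob4",
    input_files=["madevent", "input_card.dat"],
    output_files=["events.lhe", "job_results.dat"],
))
```

The difficulty mentioned above is visible here: every file a job reads or writes has to be named in advance, which is presumably why the fix takes more than a one-line change.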

Patricia Rebello Teles (athenafma) said :
#8

No worries, Olivier. Take your time. At least someone can help me.

On the other hand, I have the same problem on CERN LSF. If you could fix
both with a single change, it would be great :-)

Cheers Patricia

Olivier Mattelaer (olivier-mattelaer) said :
#9

The exact same problem at CERN?

That's weird. I know people using it at CERN without any problem…

Cheers,

Olivier

Matthew Low (mattlow) said :
#10

Hi,

I've been having the same issue on a SLURM cluster. Has the fix you've implemented for condor also been included for the SLURM submission code?

Thanks,
- Matthew

Olivier Mattelaer (olivier-mattelaer) said :
#11

Hi,

Which version are you using?

Cheers,

Olivier

Matthew Low (mattlow) said :
#12

Hi Olivier,

Currently 1.5.14. I'm starting to move things to 2.0.2, but I have a lot of scripts that I've written that rely on the old 1.5.14 interface (cp -r Template... etc.) so it will be a while to move over.

Thanks,
- Matthew

Matthew Low (mattlow) said :
#13

Hi Olivier,

One of the computing experts at my institution helped me out with this. It looks like, in madevent_interface.py, if combine.log is read before it has been written, the code doesn't try to re-read combine.log. I described it here: https://bugs.launchpad.net/mg5amcnlo/+bug/1280051.

- Matthew
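
The behaviour described above, reading a log that may not have reached the disk yet, can be handled by polling the file a few times before giving up. This is a minimal sketch of that idea, not the code in madevent_interface.py; the function name, the retry settings and the wording of the log line matched by the pattern are assumptions.

```python
import os
import re
import time

# Assumed wording of the line that carries the event count in combine.log.
EVENT_COUNT = re.compile(r"Unweighting\s+selected\s+(\d+)\s+events")


def read_unweighted_count(log_path, retries=5, delay=2.0, pattern=EVENT_COUNT):
    """Return the event count found in the log, or None if it never shows up.

    The file is re-read up to `retries` times, `delay` seconds apart, so a
    log that appears late (for example over a slow network disk) is still
    picked up.
    """
    for attempt in range(retries):
        if os.path.exists(log_path):
            with open(log_path) as handle:
                match = pattern.search(handle.read())
            if match:
                return int(match.group(1))
        # Do not sleep after the final attempt.
        if attempt < retries - 1:
            time.sleep(delay)
    return None
```

Returning None instead of raising lets the caller decide whether to print a warning or abort. Retrying only helps when the file arrives late; it does nothing if the disk holding combine.log is not visible from the machine doing the reading.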

Alexander Law (law-59) said :
#14

Hello -

I know that this thread is old, but I don't see a resolution, and I believe that I have the same problem: using multi_run on Condor batch with no central disk, and jobs failing with the same behavior described in Patricia's original post.

I can run the same sequence of madevent commands locally with no problem, on a smaller job, but this won't be practical for my full generation.

Has anyone resolved this problem?

- Alexander

Olivier Mattelaer (olivier-mattelaer) said :
#15

No, there is no solution yet.

There are basically two options:
- force this code to run locally
- change the way the job submission handles the cluster in that case.

This is on my to-do list, but I’m very busy at the moment.

Cheers,

Olivier

Patricia Rebello Teles (athenafma) said :
#16

Dear Alexander,

A new version of MG5 is available at

https://launchpad.net/mg5amcnlo/2.0/2.1.0/+download/MG5_aMC_v2.1.1.tar.gz

It works well on my PBS cluster. I haven't tried it on Condor. Could you please check it? :-)

Cheers Patricia

Alexander Law (law-59) said :
#17

I am using version 2.1.1.
