reweighting with gridpacks: how to pre-compile everything

Asked by Matthias M

Hello,

I've recently started to use the reweighting feature, which is saving a great deal of time and book-keeping for me. In order to produce larger samples, I've looked into getting the reweighting to run in gridpacks. It's not difficult to adjust the run script to also run the reweighting step, and everything works fine if I unpack the gridpack on a local machine and run it there. However, I have some trouble when sending the gridpacks to remote machines. It looks like during the reweighting step MadGraph recreates and recompiles the matrix-element code from the information in the input LHE header. As you know, for grid processing one cannot always rely on a usable compiler suite at the remote site, so I'm wondering if there is a way to generate and compile the code used in the reweighting already during the creation of the gridpack.

Yours
  Matthias

Question information

Language: English
Status: Answered
For: MadGraph5_aMC@NLO
Assignee: No assignee

This question was reopened

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) said :
#1

Dear Matthias,

For the moment this is not possible, but I have introduced this functionality in the big update of the reweight function.

The beta version can be downloaded via the following command:
bzr branch lp:~maddevelopers/mg5amcnlo/unleashed_reweighting

I have already updated the instructions on the following page:
https://cp3.irmp.ucl.ac.be/projects/madgraph/wiki/Reweight
See the options section.

Note that, compared to the previous version, the package requires the f2py module to be installed. The easiest way to get it is to install numpy (a pretty standard Python library).

For the gridpack, the idea is that when you generate your gridpack you include the line
change rwgt_dir PATH

and that when you run it, you also have the line "change rwgt_dir PATH".
In that case, all the Fortran routines will be reused and not re-created.

From the interface point of view, the gridpack mode is similar to what is done with MadSpin.

Cheers,

Olivier

Matthias M (mmgdw) said :
#2

Dear Olivier,

thank you very much for your help. I tried your suggestions and progress is clearly visible:
during the creation of the gridpack, the directory for the reweight now gets created and compilation starts.
Unfortunately it does not succeed. The error log is much too long to post here, so I will try to summarize the more noticeable issues:

The error message starts when compiling the reweight subdirectory:
Command "launch" interrupted with error:
MadGraph5Error : A compilation Error occurs when trying to compile rwgt/rw_me/SubProcesses/P0_gg_y_y_z_taptambbx.

The following printout includes what appears to be a long list of successful steps taken in the compilation. The first sign of trouble is a large number of messages along the lines of:
      Reading file 'matrix.f' (format:fix,strict)
     Line #83 in matrix.f:" DATA (NHEL(I, 1),I=1,6) /-1,-1, 1,-1,-1, 1/"
though these do not trigger an immediate abort.

A bit later in the printout, I see that f2py is not finding the correct fortran compiler:
     customize GnuFCompiler
     Found executable /usr/bin/g77
     gnu: no Fortran 90 compiler found
It is supposed to use gfortran, which is available in the executable path. Interestingly, it finds the correct C++ compiler. I tried setting the preferred Fortran compiler in Cards/me5_configuration.dat, but it didn't seem to affect f2py.

The compilation finally crashes with the message:
/usr/bin/g77 -g -Wall -g -Wall -shared /tmp/blabla/src.linux-x86_64-2.7/matrix2pymodule.o /tmp/blabla/src.linux-x86_64-2.7/fortranobject.o /tmp/blabla/src.linux-x86_64-2.7/matrix2py-f2pywrappers.o -L../../lib/ -L. -ldhelas -lmodel -lpython2.7 -lg2c -o ./matrix2py.so
     /cvmfs/cms.cern.ch/slc6_amd64_gcc481/external/gcc/4.8.1/bin/ld: cannot find -lpython2.7
     collect2: ld returned 1 exit status
Things to note:
-the versions of ld and g77 used together here may not be compatible (ld is the correct version)
-the path to libpython2.7.so is included in the $LD_LIBRARY_PATH
-copying libpython2.7.so to either of the two library paths explicitly given in the command line doesn't solve the issue either.
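
A side note on the second point: `LD_LIBRARY_PATH` is read by the dynamic loader at run time, whereas the `-lpython2.7` lookup that fails here happens at link time and goes through the `-L` directories (and, with GCC, `LIBRARY_PATH`). As a hedged diagnostic sketch, the interpreter can report where it was built to expect its own library. `python_lib_info` is a hypothetical helper name, and both values may be empty on builds without a shared libpython.

```python
import sysconfig


def python_lib_info():
    """Return (LIBDIR, LDLIBRARY) as recorded when this Python was built."""
    libdir = sysconfig.get_config_var("LIBDIR")
    ldlibrary = sysconfig.get_config_var("LDLIBRARY")
    return libdir, ldlibrary


if __name__ == "__main__":
    libdir, ldlibrary = python_lib_info()
    print("library directory: %s" % libdir)
    print("library file:      %s" % ldlibrary)
```

Comparing the reported directory with the `-L` options in the failing link command shows whether the linker was ever pointed at the right place. It does not by itself explain why copying the library did not help.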

I'd be grateful for your advice.

Matthias

Olivier Mattelaer (olivier-mattelaer) said :
#3

Dear Matthias,

Thanks for testing those developments.

I have looked on the web and found how to tell f2py which compiler to use.
So now I force f2py to use the compiler detected by MG (or defined in one of the configuration files).

You can get the modification to the code via the command "bzr pull" (to be run from the unleashed_reweighting directory).
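
For readers wondering what such a compiler selection can look like: f2py builds that go through numpy.distutils accept an `--fcompiler` option (for example `gnu95` for gfortran). The sketch below only assembles such a command line. It is not the code used in MadGraph, and `build_f2py_command` and its arguments are hypothetical names chosen for illustration.

```python
import shutil


def build_f2py_command(module, sources, fcompiler="gnu95", f2py="f2py"):
    """Assemble an f2py command line that pins the Fortran compiler."""
    return [f2py, "-c", "--fcompiler=%s" % fcompiler, "-m", module] + list(sources)


if __name__ == "__main__":
    cmd = build_f2py_command("matrix2py", ["matrix.f"])
    print(" ".join(cmd))
    # Checking for the compiler up front gives a clearer message than a
    # failed build much later.
    if shutil.which("gfortran") is None:
        print("warning: gfortran not found in PATH")
```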

Cheers and thanks,

Olivier

(In reply to Matthias M's message #2 of 09 Aug 2015, 17:16.)

Matthias M (mmgdw) said :
#4

Dear Olivier,

Thank you very much for your help. I can now compile the code in the reweight directory, but I still have trouble. There is a strange problem when the weights are applied:

Traceback (most recent call last):
  File "MG5_aMC_v2_3_0/gravtest/bin/internal/extended_cmd.py", line 879, in onecmd
    return self.onecmd_orig(line, **opt)
  File "MG5_aMC_v2_3_0/gravtest/bin/internal/extended_cmd.py", line 872, in onecmd_orig
    return func(arg, **opt)
  File "MG5_aMC_v2_3_0/gravtest/bin/internal/common_run_interface.py", line 1061, in do_reweight
    reweight_cmd.import_command_file(path)
  File "MG5_aMC_v2_3_0/madgraph/interface/extended_cmd.py", line 1038, in import_command_file
    self.exec_cmd(line, precmd=True)
  File "MG5_aMC_v2_3_0/madgraph/interface/extended_cmd.py", line 919, in exec_cmd
    stop = Cmd.onecmd_orig(current_interface, line, **opt)
  File "MG5_aMC_v2_3_0/madgraph/interface/extended_cmd.py", line 872, in onecmd_orig
    return func(arg, **opt)
  File "MG5_aMC_v2_3_0/madgraph/interface/reweight_interface.py", line 502, in do_launch
    weight = self.calculate_weight(event)
  File "MG5_aMC_v2_3_0/madgraph/interface/reweight_interface.py", line 646, in calculate_weight
    w_new = self.calculate_matrix_element(event, 1, space)
  File "MG5_aMC_v2_3_0/madgraph/interface/reweight_interface.py", line 725, in calculate_matrix_element
    mymod = __import__('rw_me.SubProcesses.%s.matrix%spy' % (Pname, 2*metag+1), globals(), locals(), [],-1)
ImportError: dynamic module does not define init function (initmatrix3py)

The module it is failing on is always the first one (but differs depending on the initial state of the first event to be reweighted), so it seems to affect all the libraries.
I also checked the obvious: the matrix3py.so file it's trying to load does in fact exist and 'nm -g matrix3py.so' shows that it contains a function called initmatrix3py. I don't really see why it would fail.

I'm somewhat mystified as to what could cause this issue. I understand that this is probably difficult for you to debug remotely, but maybe you could suggest some way for me to gather more detailed debug information.

Yours
  Matthias

Olivier Mattelaer (olivier-mattelaer) said :
#5

Dear Matthias,

I think I have an idea which might solve the problem.
I have implemented it in the code (you can use "bzr pull" again).
Could you tell me if it works?

Cheers,

Olivier

(In reply to Matthias M's message #4 of 10 Aug 2015, 09:13.)

Matthias M (mmgdw) said :
#6

Dear Olivier,

Unfortunately, it did not solve the ImportError. However, I've played around a bit and possibly found the source of the issue. I tried starting Python on the command line and loading the module by hand. That produces the same error. In contrast, matrix2py.so loads fine. This makes me wonder whether the problem lies with the copy&replace step in the creation of the matrix3py.so library. Maybe something more is needed?

Matthias

Python 2.7.6 (default, Jun 24 2014, 16:51:01)
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> mymod = __import__('rw_me.SubProcesses.P0_gg_y_y_z_taptamssx.matrix2py', globals(), locals(), [],-1)
>>> mymod3 = __import__('rw_me.SubProcesses.P0_gg_y_y_z_taptamssx.matrix3py', globals(), locals(), [],-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dynamic module does not define init function (initmatrix3py)
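
One way to gather more detail on this kind of error: `nm` reads the symbol table directly, while the interpreter looks the init function up through the dynamic loader. The sketch below does the lookup the loader's way via `ctypes`. It assumes the Python 2 naming convention `init<module>` seen in the error message (Python 3 uses `PyInit_<module>`); the helper names are hypothetical, and loading the library may itself fail if its dependencies cannot be resolved.

```python
import ctypes


def init_symbol_name(module_name, python_major=2):
    """Name of the init function the interpreter expects in an extension module."""
    prefix = "init" if python_major == 2 else "PyInit_"
    # Only the last component of a dotted module path enters the symbol name.
    return prefix + module_name.rsplit(".", 1)[-1]


def exports_symbol(library_path, symbol):
    """True if the dynamic loader can resolve `symbol` in the shared library."""
    lib = ctypes.CDLL(library_path)
    try:
        getattr(lib, symbol)  # attribute access triggers the loader lookup
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    name = init_symbol_name("rw_me.SubProcesses.P0_gg_y_y_z_taptamssx.matrix3py")
    print("expected init symbol: %s" % name)
```

If `exports_symbol("matrix3py.so", name)` were to return False while `nm -g` lists the symbol, that would point at the renamed library rather than at the import statement.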

Olivier Mattelaer (olivier-mattelaer) said :
#7

Dear Matthias,

> This
> makes me wonder whether the problem lies with the copy&replace step in
> the creation of the matrix3py.so library. Maybe something more is
> needed?

I believe the same. Such a replacement works fine on my computer (and on that of my collaborator),
but clearly not in your case. My last change was a way to recover by making a standard compilation
instead of using that trick. With a standard compilation, it should work as nicely as matrix2py.

My current guess is that the compilation did not go through at all for matrix3py.so because the file was considered up to date by the makefile.
I have tried to fix that by first removing the old matrix3py.so file and then running the compilation for that file.
So if you do bzr pull, you can try with that fix.
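
The idea described above (delete the old target so that make cannot consider it up to date, then build it) can be sketched as follows. This is an illustration only, not the code in the branch; `force_rebuild` and its injectable `runner` argument are hypothetical.

```python
import os
import subprocess


def force_rebuild(directory, target, runner=subprocess.check_call):
    """Delete a possibly stale build product, then ask make to build it."""
    path = os.path.join(directory, target)
    if os.path.exists(path):
        os.remove(path)  # make can no longer treat the target as up to date
    runner(["make", target], cwd=directory)
```

Passing the runner as an argument keeps the function testable without a makefile; by default it calls `subprocess.check_call`, which raises if make fails.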

Sorry that I am not able to test this problem myself and ensure that it is really fixed.
Thanks a lot for your help and patience.

Olivier

(In reply to Matthias M's message #6 of 10 Aug 2015, 15:56.)

Olivier Mattelaer (olivier-mattelaer) said :
#8

Dear Matthias,

Another developer failed to get it working, with exactly the same error as yours. Together we succeeded in fixing the problem once and for all. So if you do bzr pull, it should go through this time.

Thanks,

Olivier

Matthias M (mmgdw) said :
#9

Dear Olivier,

I tried today on a different machine and I was able to complete the compilation and run the reweight from the gridpack without having to recompile. Once the original machine is up again I'll try there as well.
Thank you very much, you've been extremely helpful.

Matthias

P.S.: While the matter is now basically resolved, you may want to look into streamlining gridpacks with reweights if you find the time. As of right now, it seems that having "TRUE = gridpack" in the run_card.dat blocks the reweighting step, so I have to create the gridpack, unpack it, run a reweight to trigger compilation, and repack the gridpack. It works, but is a bit more convoluted than it has to be.

Matthias M (mmgdw) said :
#10

Thanks Olivier Mattelaer, that solved my question.

Matthias M (mmgdw) said :
#11

Dear Olivier,

I'm sorry that I have to revive this old topic, but the context is useful for the issues I am facing.
Now I'm in the process of integrating the reweighting into the 'official' gridpack generation workflow at my experiment. It works reasonably well, but there are two rough edges I'd like to hear from you about.

1) reweight paths.
There is some trouble with setting the reweight path as a relative path. I've traced the problem to the following lines in reweight_interface.py (line 740):

                 self.rename_f2py_lib(Pdir, 2*metag)
                 with misc.chdir(Pdir):
                     mymod = __import__('rw_me.SubProcesses.%s.matrix%spy' % (Pname, 2*metag), globals(), locals(), [],-1)
                     S = mymod.SubProcesses
                     P = getattr(S, Pname)
                     mymod = getattr(P, 'matrix%spy' % (2*metag))

The reweight-directory name is added to the Python path as is, so it will only be found if __import__ is called from the working directory where it was created. So far we are considering two solutions:
a) not supporting custom reweight directories
b) moving the "__import__" line up to before the "with misc.chdir(Pdir):" line, so that the relative path is still valid.
What is your suggestion on this issue?

2) We still have some trouble with the recompilation of the reweighting code. At the moment we try to reduce the size of the gridpack tarballs that get distributed to the compute nodes by removing as many unneeded files as possible. That includes the various source-code files. This practice collides with the mechanism currently employed to avoid recompiling the code: to me it looks like it's running 'make', which will do nothing if the sources are unchanged. Unfortunately it will crash if the sources have been cleaned up.
For now, we are resetting the makefiles to skirt around this, but it's a bit of a hack.
Is there a possibility to more explicitly inhibit the recompilation (e.g. with a command-line parameter, or an entry in the run_card)?

Matthias

Olivier Mattelaer (olivier-mattelaer) said :
#12

Hi,

> b) moving the "__import__" line up to before the "with misc.chdir(Pdir):" line, so that the relative path is still valid.

If that works then yes, this is perfect. Could you confirm that it works (it should)? In that case, I will make the change.
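
Some background on why the order matters: as described in the question, the reweight directory is put on the Python path exactly as given, so a relative name only resolves from the directory where it was created. Besides option b), another possibility, which is not what the thread settles on, is to make the path absolute at the moment it is added. A self-contained sketch with a throwaway package (`demo_rw`, hypothetical) standing in for the real `rw_me` directory:

```python
import importlib
import os
import sys
import tempfile


def add_to_sys_path(directory):
    """Insert `directory` into sys.path as an absolute path and return it."""
    absolute = os.path.abspath(directory)
    if absolute not in sys.path:
        sys.path.insert(0, absolute)
    return absolute


if __name__ == "__main__":
    start = os.getcwd()
    base = tempfile.mkdtemp()
    package_dir = os.path.join(base, "rwgt", "demo_rw")
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
        handle.write("VALUE = 42\n")

    os.chdir(base)
    add_to_sys_path("rwgt")   # relative name, as a user would give it
    os.chdir(package_dir)     # mimic the later chdir into the process directory
    module = importlib.import_module("demo_rw")
    print(module.VALUE)
    os.chdir(start)
```

Because the entry is absolute, the import no longer depends on the working directory at the time it is executed.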

> 2) We still have some trouble with the recompilation of the reweighting code. At the moment we try to reduce the size of the gridpack tarballs that get distributed to the compute nodes by removing as many unneeded files as possible. That includes the various source-code files. This practice collides with the mechanism currently employed to avoid recompiling the code: to me it looks like it's running 'make', which will do nothing if the sources are unchanged. Unfortunately it will crash if the sources have been cleaned up.
> For now, we are resetting the makefiles to skirt around this, but it's a bit of a hack.
> Is there a possibility to more explicitly inhibit the recompilation (e.g. with a command-line parameter, or an entry in the run_card)?

Might not be simple, since by default we run “make” at every level to always ensure that everything is in sync.

But maybe the following should be enough:
replace
misc.compile(['matrix2py.so'], cwd=Pdir)
by
if not self.self.rwgt_dir or not self.path.exists(join(Pdir, ‘matrix2py.so’):
 misc.compile(['matrix2py.so'], cwd=Pdir)

(and the equivalent for all the misc.compile of that file)

Tell me if it works and if it does, I can include it as well

Cheers,

Olivier

(In reply to Matthias M's message #11 of 12 Oct 2015, 08:57.)

Matthias M (mmgdw) said :
#13

Dear Olivier,

thank you very much for the suggestions.
The first one works right away.
For the second one, I had to fix a few typos:
 self.self.rwgt_dir -> self.rwgt_dir
 self.path -> os.path
 join -> pjoin
but otherwise things appear fine.
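
With the three corrections listed above (plus straight quotes and the missing closing parenthesis), the guard reads roughly as in the sketch below. It is written as a standalone function so that it can be run on its own; in reweight_interface.py the condition wraps the `misc.compile` call instead. `should_compile` is a hypothetical name, and `pjoin` is assumed to be an alias for `os.path.join`.

```python
import os

pjoin = os.path.join  # assumed alias, as in the corrected line


def should_compile(rwgt_dir, pdir, target="matrix2py.so"):
    """Compile unless a reweight directory is set and the library already exists."""
    return not rwgt_dir or not os.path.exists(pjoin(pdir, target))


if __name__ == "__main__":
    # Without a reweight directory the code is always (re)compiled.
    print(should_compile(None, "."))
```

With this guard, a gridpack that ships the compiled library but no sources never reaches make, provided the reweight directory is set.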

One caveat: so far I've only tested using a single example case (using EWdim6), where model parameters are changed for the reweighting. I've not tested the case where the complete model is changed.

Thanks again for the quick reply.

Matthias

Olivier Mattelaer (olivier-mattelaer) said :
#14

Hi,

So many errors in a single line. Sorry for that.

I have pushed the change.

Olivier

(In reply to Matthias M's comment #13 of 12 Oct 2015, 16:57.)
