Question #275409 “MadGraph PDF sampling” : Questions : MadGraph5

Olivier Mattelaer (olivier-mattelaer) said on 2015-11-25:

#1

Hi James,

The main point is that MadGraph does not integrate over x1 and x2 but over \sqrt{\hat s} and another variable (x1/x2), if I'm correct.
The values of x1 and x2 are deduced from those two variables.

We do not generate \sqrt{\hat s} in a flat way, since this is highly inefficient. We use the cuts on the final state (and the information on on-shell particles) in order
to determine a typical guess of the scale for the Feynman diagram that drives the integration.
According to that value, we have an a priori grid that we use for the integration. This grid is then refined automatically by the adaptive integration method.

>
> 1) I noticed that the PDFs (for g g > g g, for example) are mostly sampled around the small x values. This is clearly an importance sampling technique. However, if you never (the last time I checked, the highest values of x probed were about 0.7) sample points with higher x values, how well can I trust the statistical errors that MG gives me?

First, in general you cannot prove that the statistical error is going to converge to the true standard deviation of the function divided by the square root of the number of probed points.
You can only prove this property for functions which are convex or concave.
Now, those methods intend to improve the integration by using a priori knowledge of the function. If this knowledge is wrong, then indeed you are going to have trouble.
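
To illustrate how much the prior knowledge matters, here is a minimal Python sketch. It is a toy example, not MadGraph code: the integrand f(x) = 1/sqrt(x) on (0, 1] and the function names are assumptions chosen for illustration. With flat sampling the variance of this integrand is infinite, so the quoted error is unreliable. With a sampling density that matches the integrand, every weight is identical and the error vanishes.

```python
import math
import random

def mc_estimate(weights):
    """Return (mean, statistical error) for a list of Monte Carlo weights."""
    n = len(weights)
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / (n - 1)
    return mean, math.sqrt(var / n)

def flat_weights(n, rng):
    # x uniform in (0, 1]; the weight is simply f(x) = 1/sqrt(x)
    return [1.0 / math.sqrt(1.0 - rng.random()) for _ in range(n)]

def importance_weights(n, rng):
    # x = u**2 has density p(x) = 1/(2*sqrt(x)); the weight is f(x)/p(x)
    weights = []
    for _ in range(n):
        x = (1.0 - rng.random()) ** 2
        weights.append((1.0 / math.sqrt(x)) / (0.5 / math.sqrt(x)))
    return weights

rng = random.Random(12345)
flat_mean, flat_err = mc_estimate(flat_weights(10000, rng))
imp_mean, imp_err = mc_estimate(importance_weights(10000, rng))
print("flat sampling      :", flat_mean, "+/-", flat_err)
print("importance sampling:", imp_mean, "+/-", imp_err)
```

The exact integral is 2. The same logic works in reverse: a sampling density built from a wrong guess can leave an important region almost unprobed, and the error estimate will not reveal it.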

> If I am, say, looking at g g > g g g and I want to create a histogram binning the differential cross section as a function of the rapidity difference between the most forward and most backward jet, how well can I trust the statistical bands that come in each bin? It seems to me that if you don't sample some of the phase space you can't have a good guess at what the overall stat error should be.

Creating a histogram is the same as computing multiple integrals with various cuts. Since your cuts change the function to integrate, if you want a precise integral in each bin, you need a dedicated phase-space parametrization in order to keep the statistical error under control in each bin.
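
As a rough sketch of that statement (a toy example, not MadGraph code; the function name, the integrand f(y) = 2*y and the bin edges are assumptions), each bin can be treated as its own integral: the integrand is multiplied by an indicator function for the bin, and the bin gets its own Monte Carlo estimate and error.

```python
import math
import random

def binned_estimates(events, edges):
    """events: list of (observable, weight). Returns per-bin (integral, error)."""
    n = len(events)
    nbins = len(edges) - 1
    sum_w = [0.0] * nbins
    sum_w2 = [0.0] * nbins
    for obs, w in events:
        for b in range(nbins):
            if edges[b] <= obs < edges[b + 1]:
                sum_w[b] += w
                sum_w2[b] += w * w
                break
    results = []
    for b in range(nbins):
        # Events outside the bin contribute weight zero to this bin's integral
        mean = sum_w[b] / n
        var = max(sum_w2[b] / n - mean * mean, 0.0)
        results.append((mean, math.sqrt(var / n)))
    return results

rng = random.Random(2015)
events = []
for _ in range(50000):
    y = rng.random()               # flat sampling of the observable
    events.append((y, 2.0 * y))    # weight = toy integrand f(y) = 2*y
bins = binned_estimates(events, [0.0, 0.5, 1.0])
for (value, err) in bins:
    print(value, "+/-", err)
```

The exact bin integrals here are 0.25 and 0.75. A bin that receives few sampled points has a poorly determined error, which is why a sampling tuned for the total cross section does not guarantee reliable error bands in every bin.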

> 2) If I wanted to force MadGraph to sample the whole of x space uniformly, how could I do this? Is there a place in the code where I can essentially turn off this importance sampling?

Well, in order to do that, you need to completely change the phase-space integrator.
This means:
1) change the variable of integration
2) remove the pertaining module
3) remove the adaptive training of the grid (this one is easy)
For such a huge change, I would actually suggest writing your own phase-space integrator rather than making this type of modification to the code.
Since this is basically what you did already, this is actually redundant. Making all those changes in MG means that you are no longer validating against MG but against a second by-hand implementation of the same stuff.

Cheers,

Olivier

> On Nov 25, 2015, at 00:51, James Cockburn <email address hidden> wrote:
>
> New question #275409 on MadGraph5_aMC@NLO:
> https://answers.launchpad.net/mg5amcnlo/+question/275409
>
> Hi,
>
> I am currently trying to check the validity of a phase space generator we are developing. Specifically, we evaluate the MG matrix elements but the momenta that go into this evaluation are generated in a different way (i.e. not with the MadGraph generator). There is currently some tension in both the distributions and the cross-section between this result and the MadGraph result. It looks like it might have something to do with the PDF sampling in MadGraph, so I wanted to ask a few questions about it.
>
> 1) I noticed that the PDFs (for g g > g g, for example) are mostly sampled around the small x values. This is clearly an importance sampling technique. However, if you never (the last time I checked, the highest values of x probed were about 0.7) sample points with higher x values, how well can I trust the statistical errors that MG gives me? If I am, say, looking at g g > g g g and I want to create a histogram binning the differential cross section as a function of the rapidity difference between the most forward and most backward jet, how well can I trust the statistical bands that come in each bin? It seems to me that if you don't sample some of the phase space you can't have a good guess at what the overall stat error should be.
>
> 2) If I wanted to force MadGraph to sample the whole of x space uniformly, how could I do this? Is there a place in the code where I can essentially turn off this importance sampling?
>
> Cheers,
> James

James Cockburn (j-d-cockburn) said on 2015-11-25:

#2

Hi Olivier,

Thanks for your reply - much appreciated! It is a bit clearer to me now how you guys do the integration. One other question though:

> The main point is that MadGraph does not integrate over x1 and x2 but over \sqrt{\hat s} and another variable (x1/x2), if I'm correct.
> The values of x1 and x2 are deduced from those two variables.

From some scribbles I've just done now, it seems in order to include everything one should in principle generate random values of x1/x2 between 0 and infinity. However, this is too naive an approach and I imagine one should instead use a cut off for the smallest value of x1 or x2 you are willing to accept and then you can generate a random number between two finite numbers. Am I understanding correctly and is this what you do? If so, what is the lowest value of x you allow in MG?

Olivier Mattelaer (olivier-mattelaer) said on 2015-11-25:

#3

Hi James,

> Am I understanding correctly and is this
> what you do? If so, what is the lowest value of x you allow in MG?

Actually, after checking the code, the second variable is exactly
ETA = .5*LOG(X1/X2)
For the min/max on this variable, you can check the routine gencms in the file SubProcesses/genps.f
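
For orientation, here is a small Python sketch of how x1 and x2 follow from these two variables. It assumes tau = \hat s / s = x1*x2 and uses only the kinematic requirement x1, x2 <= 1 for the limits on ETA. The function names are made up for illustration, and the limits actually applied in gencms may differ (for example because of cuts), so check the routine itself.

```python
import math

def x_from_tau_eta(tau, eta):
    """Invert tau = x1*x2 and eta = 0.5*log(x1/x2)."""
    root = math.sqrt(tau)
    return root * math.exp(eta), root * math.exp(-eta)

def eta_limits(tau):
    """Limits on eta that follow from x1 <= 1 and x2 <= 1 alone."""
    eta_max = -0.5 * math.log(tau)
    return -eta_max, eta_max

tau = 0.04
eta_lo, eta_hi = eta_limits(tau)
x1, x2 = x_from_tau_eta(tau, 0.5 * eta_hi)
x1_edge, x2_edge = x_from_tau_eta(tau, eta_hi)
print("eta range:", eta_lo, eta_hi)
print("x1, x2 at half the maximum eta:", x1, x2)
print("x1, x2 at the maximum eta:", x1_edge, x2_edge)
```

So for a fixed tau the range of ETA is finite, and the ratio x1/x2 never has to run from 0 to infinity: at the upper limit x1 reaches 1 and x2 equals tau.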

Cheers,

Olivier

James Cockburn (j-d-cockburn) said on 2015-11-26:

#4

Hi Olivier,

Okay, all of this seems fine now, thanks! I do have another request though - in order to take physics completely out of the equation when comparing the generators, I want to just set all evaluations of the matrix element to be 1 in both codes. I can do this very simply in my code but, given my lack of FORTRAN experience, I'm having trouble with doing it in MG. What is the easiest way to do this?

Cheers,
James

Olivier Mattelaer (olivier-mattelaer) said on 2015-11-26:

#5

Hi,

You can do this in dsample.f,
where you comment out the call to the matrix element.
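
With the integrand set to 1, each generator should simply reproduce the volume of the sampled region, which gives a physics-free cross-check. The Python sketch below is a toy version of that idea, not MadGraph code: it assumes flat sampling in tau = x1*x2 and eta = 0.5*log(x1/x2), no PDFs, no flux factor and no final-state phase space, and the names unit_integrand, estimate_volume and tau_min are hypothetical. The Jacobian between (x1, x2) and (tau, eta) is 1, so the region x1*x2 > tau_min inside the unit square has the known area 1 - tau_min + tau_min*log(tau_min).

```python
import math
import random

def unit_integrand(x1, x2):
    # Stand-in for the matrix element: always 1
    return 1.0

def estimate_volume(tau_min, n, rng):
    """Monte Carlo estimate of the area of x1*x2 > tau_min in the unit square."""
    total = 0.0
    total2 = 0.0
    for _ in range(n):
        tau = tau_min + (1.0 - tau_min) * rng.random()
        eta_max = -0.5 * math.log(tau)
        eta = eta_max * (2.0 * rng.random() - 1.0)
        x1 = math.sqrt(tau) * math.exp(eta)
        x2 = math.sqrt(tau) * math.exp(-eta)
        # Volume of the sampled (tau, eta) range; dx1 dx2 = dtau deta
        w = (1.0 - tau_min) * 2.0 * eta_max * unit_integrand(x1, x2)
        total += w
        total2 += w * w
    mean = total / n
    var = max(total2 / n - mean * mean, 0.0)
    return mean, math.sqrt(var / n)

tau_min = 0.01
rng = random.Random(275409)
volume, error = estimate_volume(tau_min, 100000, rng)
exact = 1.0 - tau_min + tau_min * math.log(tau_min)
print("estimate:", volume, "+/-", error)
print("exact   :", exact)
```

If two generators disagree on such a unit-integrand volume, the difference comes from the phase-space sampling or its Jacobians rather than from the matrix element.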

Cheers,

Olivier

James Cockburn (j-d-cockburn) said on 2015-11-27:

#6

Brilliant. Thank you so much for your help!

- James
