# Different weights for different colour structures in LHE files?

I am trying to understand how MadGraph generates colour structures for Les-Houches files.

I consider the process g g > g g.
I found that different colour combinations appear in the LHE file with different frequencies.
I have written a simple Python script that finds all the distinct combinations and counts how often they appear.
Here is the output:
combination = 504 501 503 502 503 501 504 502 with weight = 0.318418
combination = 503 501 504 502 503 502 504 501 with weight = 0.327642
combination = 502 501 504 502 503 501 504 503 with weight = 0.089904
combination = 503 501 501 502 503 504 504 502 with weight = 0.086785
combination = 502 501 503 502 503 504 504 501 with weight = 0.090514
combination = 504 501 501 502 503 502 504 503 with weight = 0.086737
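
A counting script of this kind could look roughly like the sketch below. This is not the script used for the numbers above. It assumes the standard LHE particle-line layout, in which the two colour tags ICOLUP(1) and ICOLUP(2) are the fifth and sixth columns, and the function names are made up for illustration.

```python
import re
from collections import Counter

def colour_combinations(lhe_text):
    """Yield, for each event, the colour tags of all particles as one flat tuple."""
    for block in re.findall(r"<event(?:\s[^>]*)?>(.*?)</event>", lhe_text, flags=re.S):
        lines = [line.split() for line in block.strip().splitlines()]
        # The first line of an event is its header; its first entry is the
        # number of particles (NUP). Anything after the particle lines
        # (reweighting blocks, comments) is ignored.
        n_particles = int(lines[0][0])
        combination = []
        for columns in lines[1:1 + n_particles]:
            # Columns 5 and 6 (indices 4 and 5) hold ICOLUP(1) and ICOLUP(2).
            combination.extend((int(columns[4]), int(columns[5])))
        yield tuple(combination)

def combination_frequencies(lhe_text):
    """Return {combination: relative frequency} over all events in the text."""
    counts = Counter(colour_combinations(lhe_text))
    total = sum(counts.values())
    return {combination: n / total for combination, n in counts.items()}
```

For a file on disk one would pass the file contents, for example `combination_frequencies(open(path).read())`, where `path` points to an uncompressed LHE file.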

So my questions are: 1) Why are these combinations generated with different probabilities?
Is there a hit-or-miss generation algorithm involved?

2) If possible, could you please give me some references
that explain how MG generates these structures?

3) Are colour structures and four-momenta generated independently?
Or do certain combinations of colours and four-momenta appear more often than others?

Thank you!
With best regards,
Oleh

## Question information

Language:
English
Status:
Solved
For:
Assignee:
No assignee
Solved by:
oleh
Solved:
2018-07-17
Last query:
2018-07-17
2018-07-13

## This question was reopened

• 2018-07-13 by oleh
 Olivier Mattelaer (olivier-mattelaer) said on 2018-04-25: #1

> 2) If possible, could you please give me some references
> that explain how MG generates these structures?

For each phase-space point, we evaluate the amplitude contribution of each color flow independently
and square each of them against itself. We then discard all the non-leading-color (NLC) contributions and pick
one of the remaining leading-color (LC) flows by drawing a random number, each LC flow being weighted by its squared color amplitude.
Note that interference terms are neglected here; they are NLC anyway.
We then write the associated color structure into the event.

> 1) Why are these combinations generated with different probabilities?
> Is there a hit-or-miss generation algorithm involved?

This is a random process, but not really hit-or-miss, since we simply pick one flow in a weighted way.

> 3) Are colour structures and four-momenta generated independently?
> Or do certain combinations of colours and four-momenta appear more often than others?

As you can see from the above, they are not independent: the weights used to pick the color are evaluated at each phase-space point.
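
To illustrate why this ties the colour assignment to the kinematics, here is a toy sketch. The weight functions are entirely made up and are not MadGraph's amplitudes. The only point is that when the flow weights depend on the phase-space point, the frequency of each flow in the final sample depends on where the events sit in phase space.

```python
import random
from collections import Counter

def toy_flow_weights(x):
    """Made-up LC weights of three colour flows at a toy phase-space point x in (0, 1)."""
    return [1.0 / x**2, 1.0 / (1.0 - x)**2, 1.0]

def simulate(n_events, seed=1):
    """Count how often each flow is picked in each half of the toy phase space."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_events):
        x = rng.uniform(0.1, 0.9)  # stand-in for an already unweighted phase-space point
        weights = toy_flow_weights(x)
        # Weighted pick: flow i is chosen with probability w_i / sum_j w_j.
        flow = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        region = "x<0.5" if x < 0.5 else "x>=0.5"
        counts[(flow, region)] += 1
    return counts
```

In this toy model flow 0 is picked far more often for small x and flow 1 for large x, so the overall frequencies in a file are an average of the point-by-point probabilities over the event sample.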

Cheers,

Olivier

> On 25 Apr 2018, at 11:08, oleh <email address hidden> wrote:
>
> New question #668241 on MadGraph5_aMC@NLO:

 oleh (fedkevych) said on 2018-04-25: #2

Dear Olivier,

many thanks for finding the time to answer my question!

> This is a random process, but not really hit-or-miss, since we simply pick one flow in a weighted way.
If possible, could you please briefly explain how MG turns weighted events into unweighted ones?

With best regards,
Oleh

 Olivier Mattelaer (olivier-mattelaer) said on 2018-04-25: #3

Hi,

> If possible, could you please briefly explain how MG turns weighted events into unweighted ones?

That part is a standard hit-and-miss.
The part which is not hit-and-miss is the pick of a particular color for a particular event.

For the color we have n valid choices, each with an associated weight w_i.

Then we throw a random number R and assign the color i such that
\sum_{j=0}^{i} w_j < R*\sum w_j < \sum_{j=0}^{i+1} w_j
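
A minimal sketch of this selection rule, reading the upper limits of the sums Python-style (upper bound excluded), as clarified later in this thread. The function name `select_index` is made up for illustration and this is not MadGraph's actual code; one of the two strict inequalities is taken as "lower or equal" so that R = 0 is handled.

```python
import random

def select_index(weights, r):
    """Return i such that sum(weights[:i]) <= r * sum(weights) < sum(weights[:i + 1])."""
    target = r * sum(weights)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if target < running:
            return i
    # Only reachable through floating-point round-off when r is extremely close to 1:
    # fall back to the last entry with a non-zero weight.
    return max(i for i, w in enumerate(weights) if w > 0)

# One draw for an event with three leading-colour flows (made-up weights):
flow = select_index([0.32, 0.09, 0.09], random.random())
```

An entry with zero weight is never returned, because the running sum does not move when it is added.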

Cheers,

Olivier

 oleh (fedkevych) said on 2018-04-25: #4

Dear Olivier,
thank you very much!

 oleh (fedkevych) said on 2018-04-25: #5

Thanks Olivier Mattelaer, that solved my question.

 oleh (fedkevych) said on 2018-07-13: #6

Dear Olivier,

sorry for reopening this thread after three months, but I found that I do not completely understand the way MadGraph unweights events generated with coloured amplitudes. I also have some additional questions on colour structures in LHE files, so I would be very grateful if someone could help me answer them.

1) The first question is about the unweighting of the events.

Olivier Mattelaer wrote in this topic:
> That part is a standard hit-and-miss.
> The part which is not hit-and-miss is the pick of a particular color for a particular event.

> For the color we have n valid choices, each with an associated weight w_i.

> Then we throw a random number R and assign the color i such that
> \sum_{j=0}^{i} w_j < R*\sum w_j < \sum_{j=0}^{i+1} w_j

If I understand it correctly, I have to fix the phase-space point, take all possible combinations of colour indices that give a non-zero contribution, evaluate the corresponding weights w_i, and then use the formula
to decide whether to keep or reject the event.

However, there are two things in this formula that I do not understand: what are the boundaries of the second sum (R*\sum w_j),
and what should I do if i = n? (Should I just leave the last sum out in this case?)

2) I also have a question on the evaluation of the total cross section. When evaluating the total cross sections for 2 to 2 LO QCD processes, does MG compute the colour structures exactly (keeping the 1 / (Nc * Nc) terms)?

3) When one reweights the events according to the formula \sum_{j=0}^{i} w_j < R*\sum w_j < \sum_{j=0}^{i+1} w_j,
does one consider each channel separately? Say, if I have a process with t and u channels (like uu > uu),
do I have to consider |M_t| and |M_u| separately (since, if I understand it correctly, we neglect the interference term)?

4) The last question is about the way MadGraph transforms the set of colour indices into the colour-flow information in LHE files.
Assume that I have a very simple process, uu to uu at LO. Then there are two colour-flow diagrams
(the first diagram consists of two parallel lines and the second one consists of two crossed lines).
Each of these diagrams stands for a combination of Kronecker delta functions (which carry colour indices). Out of 15 non-zero combinations of colour indices, there are 6 combinations for which only the first colour-flow diagram has a non-zero weight, 6 combinations for which only the second colour-flow diagram has a non-zero weight, and 3 combinations for which both diagrams have a non-zero weight (the case when all colours of the initial- and final-state quarks are the same).

The question is: what should one write in the LHE file for this combination of colours (when all quarks have the same colour)?
In the LHE file generated by MG for this process one has
either       or
502 0        501 0
501 0        502 0
501 0        501 0
502 0        502 0
so, if I understand it properly, these two combinations correspond to the two different colour-flow diagrams one has in this particular process. However, if I have a combination where all colours are equal, the two flow diagrams become indistinguishable, and I am confused about which combination one has to write in the LHE file.
Should one just pick one of the two combinations randomly with 50% probability?

Thank you very much!

With best regards,
Oleh

 Olivier Mattelaer (olivier-mattelaer) said on 2018-07-13: #7

Hi,

> If I understand it correctly, I have to fix the phase-space point, take all possible combinations of colour indices that give a non-zero contribution, evaluate the corresponding weights w_i, and then use the formula
> to decide whether to keep or reject the event.

The notion of LC depends on the integration channel that we consider.
For example, for u u~ > t t~,
when we integrate according to the u u~ > g > t t~ diagram,
the color flow corresponding to a scalar (color-singlet) gluon is not included (as it is sub-leading),
while when we integrate according to u u~ > Z > t t~,
we do include that color flow.

> However, there are two things in this formula that I do not understand

This is just a formula stating that we select element i with probability
P_i = w_i / \sum_j w_j
The formula above corresponds to the actual implementation that we use,
but it is not important to understand that algorithm as long as you have a method
that reproduces this probability.
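
A short worked step showing how the inequality leads to this probability, under the assumption that R is uniform on (0, 1). Here S_i denotes the partial sum of the first i weights, so S_0 = 0 and S_n is the full sum; this notation is introduced only for this derivation.

```latex
\[
S_i \equiv \sum_{j=0}^{i-1} w_j ,
\qquad
P_i \;=\; \Pr\!\left[\, S_i < R\, S_n < S_{i+1} \,\right]
     \;=\; \frac{S_{i+1} - S_i}{S_n}
     \;=\; \frac{w_i}{\sum_{j=0}^{n-1} w_j},
\qquad i = 0, \dots, n-1 .
\]
```

The intervals (S_i, S_{i+1}) tile (0, S_n) without overlap, so exactly one i is selected for each R, and the P_i sum to one.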

> what are the boundaries of the second sum (R*\sum w_j),

That is the sum over all terms (all LC terms, to be precise),
so technically j=0 to n-1,
and therefore (following the Python convention, where the upper bound is excluded)
\sum_{j=0}^{n}

> and what should I do if i = n? (Should I just leave the last sum out in this case?)

Since all sums start at zero, i=n does not correspond to any physical element.
The highest possible value for i is n-1,
so if i=n-1
we have for the last two terms
R*\sum_{j=0}^{n} w_j < \sum_{j=0}^{n} w_j
which is always true since 0 < R < 1
-- In principle, one of the two "strictly lower" signs should be replaced by "lower or equal", but this only complicates the discussion and does not really matter --

> and then use the formula
> to decide whether to keep or reject the event.

No, this formula is used to decide which color to write; there is no rejection at this stage.
100% of the events are kept. Some of them are assigned to one color flow and others to another color flow.

> 2) I also have a question on the evaluation of the total cross
> section. When evaluating the total cross sections for 2 to 2 LO QCD
> processes, does MG compute the colour structures exactly (keeping the 1 /
> (Nc * Nc) terms)?

Yes, of course. The LC approximation is used only to determine the color flow (CF).

> 3) When one reweights the events according to the formula \sum_{j=0}^{i} w_j < R*\sum w_j < \sum_{j=0}^{i+1} w_j,
> does one consider each channel separately? Say, if I have a process with t and u channels (like uu > uu),
> do I have to consider |M_t| and |M_u| separately (since, if I understand it correctly, we neglect the interference term)?

This is done channel by channel (since the code is organised that way), and which terms enter the sum depends on the channel. However, the computation is done for all the diagrams, not only for the one related to the channel (i.e. the decomposition in the color-flow basis is done for |M_t+M_u|).

> 4) The last question is about the way MadGraph transforms the set of colour indices into the colour-flow information in LHE files.

Please read the LHE convention document, where this is described in detail.

> Assume that I have a very simple process, uu to uu at LO. Then there are two colour-flow diagrams
> (the first diagram consists of two parallel lines and the second one consists of two crossed lines).
> Each of these diagrams stands for a combination of Kronecker delta functions (which carry colour indices). Out of 15 non-zero combinations of colour indices, there are 6 combinations for which only the first colour-flow diagram has a non-zero weight, 6 combinations for which only the second colour-flow diagram has a non-zero weight, and 3 combinations for which both diagrams have a non-zero weight (the case when all colours of the initial- and final-state quarks are the same).
>
> The question is: what should one write in the LHE file for this combination of colours (when all quarks have the same colour)?
> In the LHE file generated by MG for this process one has
> either       or
> 502 0        501 0
> 501 0        502 0
> 501 0        501 0
> 502 0        502 0

So the first column corresponds to the case where the color flows are crossed
and the second to the case where they are parallel. (This depends, of course, on which particle is put in third position.)

> so, if I understand it properly, these two combinations correspond to the two different colour-flow diagrams one has in this particular process. However, if I have a combination where all colours are equal, the two flow diagrams become indistinguishable, and I am confused about which combination one has to write in the LHE file.
> Should one just pick one of the two combinations randomly with 50% probability?

I do not follow your argument, but to come back to the formalism above:
if you have two color flows with w_1 = w_2, then yes, in that case
P_1 = 50% = P_2

Cheers,

Olivier

 oleh (fedkevych) said on 2018-07-13: #8

Dear Olivier,

As I understand from your answer, the generation algorithm has two steps:
the generation of phase-space points (with the hit-or-miss procedure) and the assignment of the colours according to the formula \sum_{j=0}^{i} w_j < R*\sum w_j < \sum_{j=0}^{i+1} w_j.

If this is the case, I have two additional questions:

1) Am I right that the selection of a phase-space point is done by a hit-and-miss procedure,
where one has to find the phase-space point that gives the highest contribution to the matrix element squared, then compare the ratio |M|^2 (at the given point) / |M|^2(max) with a random number, and then discard or keep the phase-space point?

If this is the case, does one evaluate |M|^2 for a given set of colours, or does one use the colour-averaged |M|^2?

2) And am I right that, after a given phase-space point has been selected, one evaluates |M|^2 in the LC approximation for each channel separately, evaluates w_i for all possible combinations of colours, then generates a random number R and
evaluates \sum_{j=0}^{i} w_j < R*\sum_{j=0}^{n} w_j?

If this is the case, could you please explain once more how to use this formula?
Should I generate one random number R, then evaluate the expression \sum_{j=0}^{i} w_j < R*\sum_{j=0}^{n} w_j for all values of i (from 0 to n), and then
pick the set of colours for which the formula \sum_{j=0}^{i} w_j < R*\sum_{j=0}^{n} w_j holds?

But what should I do if there are several different colour combinations that fulfil this condition?
Should one then pick the one with the highest weight w_i?

Thank you again and have a nice weekend.

With best regards,
Oleh

 Olivier Mattelaer (olivier-mattelaer) said on 2018-07-13: #9

Hi,

> 1) Am I right that the selection of a phase-space point is done by a hit-and-miss procedure,
> where one has to find the phase-space point that gives the highest contribution to the matrix element squared, then compare the ratio |M|^2 (at the given point) / |M|^2(max) with a random number, and then discard or keep the phase-space point?

This is over-simplified, but yes, basically we use a hit-and-miss procedure for that.
However, the hit-and-miss is not done on the ratio of the matrix elements alone: you forget the parton distribution functions,
the phase-space factor and the various Jacobians that are used for the optimization (which can technically be considered as phase-space factors).

> If this is the case, does one evaluate |M|^2 for a given set of colours,
> or does one use the colour-averaged |M|^2?

The colour-averaged one. (Otherwise the method of choosing the colour in a second step would not make any sense.)
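
As a rough sketch of the hit-and-miss step described above: the full weight of a phase-space point multiplies the colour-averaged |M|^2 by the other factors Olivier lists, and a point is kept with probability weight / max_weight. The function and parameter names are hypothetical labels, and the sketch assumes the maximum weight is already known, which a real generator has to estimate.

```python
import random

def event_weight(me_squared, pdf_1, pdf_2, phase_space_factor, jacobian):
    """Full weight of a phase-space point (all arguments are illustrative inputs)."""
    return me_squared * pdf_1 * pdf_2 * phase_space_factor * jacobian

def unweight(weighted_events, max_weight, rng=random):
    """Keep each (event, weight) pair with probability weight / max_weight."""
    kept = []
    for event, weight in weighted_events:
        # Hit-and-miss: accept if a uniform draw in [0, max_weight) falls below the weight.
        if rng.random() * max_weight < weight:
            kept.append(event)
    return kept
```

The colour is not part of this step; it is assigned afterwards to each kept event.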

> 2) And am I right that, after a given phase-space point has been selected, one evaluates |M|^2 in the LC approximation for each channel separately, evaluates w_i for all possible combinations of colours, then generates a random number R and
> evaluates \sum_{j=0}^{i} w_j < R*\sum_{j=0}^{n} w_j?

1) In our code, we first select the channel and then generate the phase-space point accordingly.
(This allows a lot of optimization, but requires feedback on how many times you need to select each channel, or more exactly on how many events you need to generate for each channel.)

2) We never calculate |M|^2 in the LC approximation. In order to calculate |M|^2 we have to evaluate the w_j anyway
(actually, here also the non-leading-color ones), so we do not have to compute the w_j again; this is already done. We just have to book-keep the information and reuse it.
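
The book-keeping can be pictured with the sketch below. It assumes the usual colour decomposition, in which the colour-summed |M|^2 is a quadratic form of the per-flow amplitudes J_i with a colour matrix C; the names `jamps` and `colour_matrix` and the function itself are illustrative, not MadGraph's interface.

```python
def squared_amplitude_and_flow_weights(jamps, colour_matrix):
    """Return (|M|^2, [|J_i|^2]) computed from the same per-flow amplitudes."""
    n = len(jamps)
    # Full colour sum: |M|^2 = sum_ij conj(J_i) * C_ij * J_j, interference included.
    me_squared = sum(
        jamps[i].conjugate() * colour_matrix[i][j] * jamps[j]
        for i in range(n)
        for j in range(n)
    ).real
    # Per-flow weights: each amplitude squared against itself, kept for the colour pick.
    flow_weights = [abs(amplitude) ** 2 for amplitude in jamps]
    return me_squared, flow_weights
```

The first return value is what enters the event weight; the second list is what the colour selection reuses, after dropping the non-leading-colour entries.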

> If this is the case, could you please explain once more how to use this formula?
> Should I generate one random number R, then evaluate the expression \sum_{j=0}^{i} w_j < R*\sum_{j=0}^{n} w_j for all values of i (from 0 to n), and then
> pick the set of colours for which the formula \sum_{j=0}^{i} w_j < R*\sum_{j=0}^{n} w_j holds?

That's correct.

> But what should I do if there are several different colour combinations that fulfil this condition?
> Should one then pick the one with the highest weight w_i?

That's not possible, because all w_j are positive.
(If some w_i are zero, you have some border effects to handle, but that is fine.)

Cheers,

Olivier

 oleh (fedkevych) said on 2018-07-17: #10

Dear Olivier,
Now I see how it works.

With best regards,
Oleh