Number of events in results file vs weighted events file

Asked by Matthew Klimek on 2020-12-27

I want to be sure I understand the meaning of the "Events (K)" column in the results webpage.

I have some process for which I request 100k events. After generation the results webpage tells me that 2495k events were generated. The "Unwgt" column reads about 141k. These are unweighted events from which 100k were stored in unweighted_events.lhe.

I know that as of version 2.4.3, the events.lhe is no longer saved, and I know that you do not support older versions. However, I ran this process in version 2.4.2 just to see what the old events.lhe would have contained. I am only asking this as a question of principle, so that I understand what is happening in the MadGraph unweighting procedure. The events.lhe file contains 181k events.

My understanding was that events.lhe contained unweighted events, yet the number of events in that file and the number of events reported in the results webpage differ by more than an order of magnitude.

May I ask what these numbers mean? I feel that this is a very basic question, but I can't find any mention of it on the help pages or any documentation. I apologize in advance if this is a trivial question.

Question information

Language: English
Status: Solved
For: MadGraph5_aMC@NLO
Assignee: No assignee
Solved by: Olivier Mattelaer
Solved: 2020-12-28
Last query: 2020-12-28
Last reply: 2020-12-28

Hi,

MG5aMC performs three layers of unweighting.
The "Events (K)" column is the number of non-vanishing phase-space points evaluated.

The first unweighting is purely an efficiency measure, so let us ignore it. (It is actually only a partial unweighting, meant to reduce the pressure on /tmp and to avoid writing events with small weights to disk.)

The second corresponds to the unweighting within a given channel of integration, and its result was the (weighted) file events.lhe.
Each channel performs the unweighting for its own channel, so the events are unweighted within that channel. Since each channel uses a different maximal weight, the events are still weighted with respect to the full sample (all events coming from the same channel simply share the same weight).
This is the "141k" and/or "181k" events that you report.

The final unweighting combines all those channels into a single unweighted event sample.
In the past this was simply unweighting events.lhe. Now we bypass the writing/reading of that file for higher efficiency. It looks like your process has at least one channel of integration with not enough events (and therefore large weights), which will kill the unweighting efficiency and lead to only 2495 events after unweighting.

So to sum up:
"Events (K)": number of times we evaluated the matrix element (all of them passed the cuts)
events.lhe: a weighted file where all the events from a given channel have the same weight
unweighted_events.lhe: the real unweighted file (all events have the same weight)
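
To make the last two layers concrete, here is a minimal toy sketch in Python. It is not MG5aMC code: the channel names "G1" and "G2", the flat toy weight distributions, and all function names are assumptions made up for illustration. It only shows the idea of hit-or-miss unweighting per channel (each surviving event then carries its channel's maximal weight), followed by a second hit-or-miss pass against the global maximum.

```python
import random


def unweight(weights, w_max, rng):
    """Hit-or-miss: keep index i with probability weights[i] / w_max."""
    return [i for i, w in enumerate(weights) if rng.random() < w / w_max]


def per_channel_unweight(channels, rng):
    """Second layer (toy): unweight each channel against its own maximum.

    Returns one (channel_name, common_weight) pair per kept event; every
    kept event of a channel carries that channel's maximal weight.
    """
    kept = []
    for name, weights in channels.items():
        w_max = max(weights)
        kept.extend((name, w_max) for _ in unweight(weights, w_max, rng))
    return kept


def combine_channels(events, rng):
    """Third layer (toy): unweight the merged sample against the global maximum."""
    weights = [w for _, w in events]
    w_max = max(weights)
    return [events[i] for i in unweight(weights, w_max, rng)]


rng = random.Random(1)
# Hypothetical channels with the same number of phase-space points each.
channels = {
    "G1": [rng.uniform(0.0, 1.0) for _ in range(20000)],
    "G2": [rng.uniform(0.0, 4.0) for _ in range(20000)],
}
stage2 = per_channel_unweight(channels, rng)  # analogue of events.lhe
stage3 = combine_channels(stage2, rng)        # analogue of unweighted_events.lhe
print(len(stage2), len(stage3))
```

In this toy, the channel with the largest maximal weight keeps all of its second-layer events in the final pass, while the other channel is thinned by the ratio of the two maxima. A single channel with a very large maximal weight therefore lowers the efficiency of the final combination for every other channel.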

Cheers,

Olivier

PS: you can order the channels by luminosity to identify which channel is killing your event efficiency
PPS: 2.8.0 starts to include nice improvements in phase-space integration, and the future 2.9.0 will introduce additional tricks to fix this type of issue (mainly for VBF processes)

Dear Olivier,
Thanks so much for this thorough answer.

Just to clarify my original question, the 2495k was the number in the "Events (K)" column. I didn't mean that only 2495 final unweighted events were generated. It's a simple process that doesn't have any particularly troublesome channels.

The 181k events are in events.lhe, which thanks to you, I now understand are events which are unweighted in each channel, but not unweighted relative to other channels.
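
One way to check this on a file is to count how many events carry each distinct weight. The sketch below assumes the standard Les Houches Event layout, in which the first line after each <event> tag holds NUP IDPRUP XWGTUP SCALUP AQEDUP AQCDUP, so the event weight is the third field. The function name and the embedded sample are made up for illustration, and the sample is a toy, not real generator output.

```python
from collections import Counter
import io


def lhe_weight_counts(lines):
    """Count events per distinct XWGTUP value in an LHE text stream."""
    counts = Counter()
    expect_header = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("<event"):
            expect_header = True
        elif expect_header and stripped:
            # Header fields: NUP IDPRUP XWGTUP SCALUP AQEDUP AQCDUP
            counts[float(stripped.split()[2])] += 1
            expect_header = False
    return counts


# Toy LHE fragment (hypothetical values, one dummy particle line per event).
SAMPLE = """\
<LesHouchesEvents version="3.0">
<event>
 1 1 5.0000000E-01 9.1E+01 7.8E-03 1.2E-01
 21 -1 0 0 501 502 0.0 0.0 100.0 100.0 0.0 0.0 1.0
</event>
<event>
 1 1 5.0000000E-01 9.1E+01 7.8E-03 1.2E-01
 21 -1 0 0 501 502 0.0 0.0 120.0 120.0 0.0 0.0 1.0
</event>
<event>
 1 1 2.0000000E+00 9.1E+01 7.8E-03 1.2E-01
 21 -1 0 0 501 502 0.0 0.0 80.0 80.0 0.0 0.0 1.0
</event>
</LesHouchesEvents>
"""

counts = lhe_weight_counts(io.StringIO(SAMPLE))
for weight, n_events in sorted(counts.items()):
    print(weight, n_events)
```

For a real file, pass an open file object instead of the StringIO (for a gzipped file, gzip.open(path, "rt") from the standard library). A file that is unweighted per channel should show a handful of distinct weights, while a fully unweighted file should show a single one, up to sign if negative weights are present.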

The 141k events reported in the "Unwgt" column of the webpage are how many were left after fully unweighting. I only asked for 100k, so I assume the 100k in unweighted_events.lhe are just randomly or sequentially chosen from the 141k. In any case, this doesn't introduce any bias because they were all properly unweighted.
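
As a sanity check of that last statement, here is a toy Python sketch. It says nothing about how MG5aMC actually picks the 100k events; the Gaussian "observable" and all names are assumptions for illustration. One caveat: taking the first N events is only unbiased if the order of the events is uncorrelated with their content (for example, not grouped by channel), whereas a random draw is safe for any ordering.

```python
import random

rng = random.Random(7)

# Toy stand-in for a fully unweighted sample: one observable value per event.
pool = [rng.gauss(50.0, 10.0) for _ in range(141_000)]

first = pool[:100_000]               # sequential choice (order is random here)
picked = rng.sample(pool, 100_000)   # random choice without replacement


def mean(values):
    return sum(values) / len(values)


print(mean(pool), mean(first), mean(picked))
```

All three means agree within the statistical fluctuation expected for samples of this size, which is what "no bias" means in practice.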

Just one additional question, then:
Is there any way for a user to access raw weighted events that are generated throughout the various stages of the code? It seems, for example, that in https://arxiv.org/abs/1308.1636 such raw events were obtained, but only by hacking the code.

Hi,

> Just one additional question, then:
> Is there any way for a user to access raw weighted events that are generated throughout the various stages of the code? It seems, for example, that in https://arxiv.org/abs/1308.1636 such raw events were obtained, but only by hacking the code.

We do not have any option for that indeed.
The closest is the following approach:
https://cp3.irmp.ucl.ac.be/projects/madgraph/wiki/LOEventGenerationBias

Hacking only the "third layer of unweighting" (i.e. the equivalent of regenerating events.lhe.gz) should be pretty simple. Deactivating layer 2 should not be too complicated but might create an IO bottleneck (and maybe slow down the code).
Deactivating layer 3 is likely unreasonable: it will slow down the code and will kill your cluster IO.

Cheers,

Olivier

Thanks Olivier Mattelaer, that solved my question.