Incoherent results for interference process with zero cross section

Asked by Andrey Popov

Dear experts,

I study production of a heavy Higgs boson in the process pp -> H -> tt, for which I generate the resonant (Breit-Wigner) part and the interference with the SM pp -> tt independently. I'm facing what seems to be a numeric instability for the interference part when its overall cross section approaches zero. Namely, the ratio between the cross section, as reported by MadGraph, and the mean weight computed from the generated LHE file deviates from unity significantly, in some cases more than a factor of two. For the resonant part this ratio is always one, as expected. The issue is seen in MadGraph_aMC@NLO 2.5.1 and also 2.5.4.

What is the reason for such a behaviour? Naively, I thought that the cross section and the mean weight are supposed to be equal by construction. How the rates should be reproduced in this case? In order to construct a differential cross section, I would normally fill a histogram using weights sigma * w_i / (N * <w>), where w_i is the weight of event i, <w> is the mean weight in a sample of size N, and sigma is the corresponding cross section. Is the absolute value of the event weights correct in the sense that the histogram can be filled using weights w_i / N (i.e. imposing sigma = <w>)?

Best regards,
Andrey

Question information

Language:
English Edit question
Status:
Answered
For:
MadGraph5_aMC@NLO Edit question
Assignee:
No assignee Edit question
Last query:
Last reply:
Revision history for this message
Olivier Mattelaer (olivier-mattelaer) said :
#1

Dear Andrey,

Are you computing the pure interference term? (i.e. a non positive definite quantity)
Did you get the number of events that you asked for?

Can you send me your process definition and benchmark such that I can check myself?
I can see reason why this happens but I would like to investigate more.

Cheers,

Olivier

> On 28 Apr 2017, at 16:28, Andrey Popov <email address hidden> wrote:
>
> New question #629926 on MadGraph5_aMC@NLO:
> https://answers.launchpad.net/mg5amcnlo/+question/629926
>
> Dear experts,
>
> I study production of a heavy Higgs boson in the process pp -> H -> tt, for which I generate the resonant (Breit-Wigner) part and the interference with the SM pp -> tt independently. I'm facing what seems to be a numeric instability for the interference part when its overall cross section approaches zero. Namely, the ratio between the cross section, as reported by MadGraph, and the mean weight computed from the generated LHE file deviates from unity significantly, in some cases more than a factor of two. For the resonant part this ratio is always one, as expected. The issue is seen in MadGraph_aMC@NLO 2.5.1 and also 2.5.4.
>
> What is the reason for such a behaviour? Naively, I thought that the cross section and the mean weight are supposed to be equal by construction. How the rates should be reproduced in this case? In order to construct a differential cross section, I would normally fill a histogram using weights sigma * w_i / (N * <w>), where w_i is the weight of event i, <w> is the mean weight in a sample of size N, and sigma is the corresponding cross section. Is the absolute value of the event weights correct in the sense that the histogram can be filled using weights w_i / N (i.e. imposing sigma = <w>)?
>
>
> Best regards,
> Andrey
>
>
> --
> You received this question notification because you are an answer
> contact for MadGraph5_aMC@NLO.

Revision history for this message
Andrey Popov (rainfinder) said :
#2

Hello Olivier,

Thank you for your reply.

> Are you computing the pure interference term? (i.e. a non positive definite quantity)
I would guess so, although I'm not totally sure I understand what you mean. The matrix element is built from a product of amplitudes for SM and s-channel Higgs boson, and it is negative in some regions of the phase space.

> Did you get the number of events that you asked for?
Yes.

We use a custom model and also hack into matrix.f, so this is not too easy to share. I'm going to try to create a simplistic example that reproduces the problem.

Cheers,
Andrey

Revision history for this message
Andrey Popov (rainfinder) said :
#3

Hello Olivier,

I haven't managed to prepare the example with one of the models included in the distribution. If you are willing to give it a try with our models, you can find it here [1] (created by Sébastien Brochet). To reproduce the issue, do
  import Massive_Higgs_UFO
  generate g g > t t~ /h0 HIG=2 HIG^2==2
and set MA0 = 600 GeV and A0width = 10 GeV in the param_card. When generating 10k events with random-number seed 17 (though, I guess results would depend also on the architecture, right?), I get a cross section of -0.02974 +- 0.02027 pb. The mean weight is -0.05993. The absolute value or the weight is 6.116, and the fraction of events with positive weights is 0.4951.

Running the same with a different random-number seed, I get cross section -0.080437 pb and mean weight -0.1536. But the absolute value of the weight and the fraction of events with positive weights stay similar (6.138 and 0.4875 in this example), which gives me a hope that they are correct.

[1] https://cernbox.cern.ch/index.php/s/3VZpEEuoWvXGFHd

Please let me know if you'd like to know any other details.

Cheers,
Andrey

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) said :
#4

Hi Andrey,

I have take a look at multiple run:
Here is my results: (everything below is output via the command check_events)

Run_01_0
=========

INFO: total cross-section: -994.3884
INFO: total abs cross-section: 61382.0
INFO: fraction of negative event 0.5081
INFO: total number of events 10000
INFO: negative event 5081
INFO: Original cross-section: -0.080437 +- 0.02045044 pb

Run_01_1
========
INFO: total cross-section: -503.01752
INFO: total abs cross-section: 61343.6
INFO: fraction of negative event 0.5041
INFO: total number of events 10000
INFO: negative event 5041
INFO: Original cross-section: -0.059256 +- 0.02407925 pb

Run_01_2:
=========
INFO: total cross-section: -1351.5656
INFO: total abs cross-section: 61434.8
INFO: fraction of negative event 0.511
INFO: total number of events 10000
INFO: negative event 5110
INFO: Original cross-section: -0.083195 +- 0.01912786 pb

Run_01_3
========
INFO: total cross-section: -1020.11648
INFO: total abs cross-section: 61452.8
INFO: fraction of negative event 0.5083
INFO: total number of events 10000
INFO: negative event 5083
INFO: Original cross-section: -0.07244 +- 0.0174325 pb
INFO: Computed cross-section:

You can observe that
1) they are a bug with the command check_events related to the normalisation (I did not update that routine when I changed the default normalisation for LO process) but ok this is irrelevant. (You can divide those number by 10k, the number of events)

2) The total absolute cross-section is very stable and quite precise. (This explains that you get the correct number of events)
3) the total cross-section (either from the sum of the events or via the more refine estimator) are quite unstable
4) the number of events with negative weights is very close to 50%

Point 3 and 4 explains why you have those difference. The procedure of unweighting preserve the absolute cross-section but not the “real” cross-section.
Since you have close to 50% negative weights, you indeed have large difference between the two method of computation and large error associated to each of those method. So I do not see anything wrong here.

I have made 23 of the above run, and the combination of cross-section gives (for 230k events generated in 23k sample of 10k each):
-0.0678382 +- 0.00387938 pb

To be more sure, I have also generated samples with higher stats (250k):
Run_02_1:
INFO: total cross-section: -20807.111
INFO: total abs cross-section: 1538987.50001
INFO: fraction of negative event 0.50676
INFO: total number of events 250000
INFO: negative event 126690
INFO: Original cross-section: -0.063962 +- 0.004357236 pb
INFO: Computed cross-section:

Run_02_2:
INFO: total cross-section: -20535.6226
INFO: total abs cross-section: 1537097.5
INFO: fraction of negative event 0.50668
INFO: total number of events 250000
INFO: negative event 126670
INFO: Original cross-section: -0.061585 +- 0.003982865 pb
INFO: Computed cross-section:

Run_03_3:
INFO: total cross-section: -16798.98604
INFO: total abs cross-section: 1537242.5
INFO: fraction of negative event 0.505464
INFO: total number of events 250000
INFO: negative event 126366
INFO: Original cross-section: -0.066561 +- 0.003939775 pb
INFO: Computed cross-section:

On this run, you can see that this is going in the correct direction, smaller error and more stable results.

So to sum up, I do not see any real problem in this case, but indeed you need to go to large number of events in order to have an accurate results when the results
Is actually close to be zero.

Cheers,

Olivier

> On 28 Apr 2017, at 18:52, Andrey Popov <email address hidden> wrote:
>
> Question #629926 on MadGraph5_aMC@NLO changed:
> https://answers.launchpad.net/mg5amcnlo/+question/629926
>
> Andrey Popov posted a new comment:
> Hello Olivier,
>
> I haven't managed to prepare the example with one of the models included in the distribution. If you are willing to give it a try with our models, you can find it here [1] (created by Sébastien Brochet). To reproduce the issue, do
> import Massive_Higgs_UFO
> generate g g > t t~ /h0 HIG=2 HIG^2==2
> and set MA0 = 600 GeV and A0width = 10 GeV in the param_card. When generating 10k events with random-number seed 17 (though, I guess results would depend also on the architecture, right?), I get a cross section of -0.02974 +- 0.02027 pb. The mean weight is -0.05993. The absolute value or the weight is 6.116, and the fraction of events with positive weights is 0.4951.
>
> Running the same with a different random-number seed, I get cross
> section -0.080437 pb and mean weight -0.1536. But the absolute value of
> the weight and the fraction of events with positive weights stay similar
> (6.138 and 0.4875 in this example), which gives me a hope that they are
> correct.
>
> [1] https://cernbox.cern.ch/index.php/s/3VZpEEuoWvXGFHd
>
>
> Please let me know if you'd like to know any other details.
>
>
> Cheers,
> Andrey
>
> --
> You received this question notification because you are an answer
> contact for MadGraph5_aMC@NLO.

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) said :
#5

Just want to add one comment, the combination of all the 250k samples gives an estimate of
     Cross-section : -0.06623 +- 0.00155 pb
     Nb of events : 1250000

Cheers,

Olivier

Revision history for this message
Andrey Popov (rainfinder) said :
#6

Hi Olivier,

Thank you for the investigation. I did not try to suggest that there is a bug in MadGraph, but I would like to figure out how one could reproduce correct rates for such a process in an analysis. The cross section can be made arbitrarily close to zero by playing with parameters (there are points at which the physical cross section crosses zero). Because of this, the solution of generating more events is not quite scalable.

I had hoped that the absolute value of the weight is correct nonetheless and the event weights can be used in normalization as I outlined in the first message. Is it not true? What is the absolute cross section you mentioned (apparently it is directly related with the absolute value of the weights)? Is it documented somewhere?

Cheers,
Andrey

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) said :
#7

Hi Andrey,

> Thank you for the investigation. I did not try to suggest that there is
> a bug in MadGraph, but I would like to figure out how one could
> reproduce correct rates for such a process in an analysis. The cross
> section can be made arbitrarily close to zero by playing with parameters
> (there are points at which the physical cross section crosses zero).
> Because of this, the solution of generating more events is not quite
> scalable.

I think that you try to push our code too far. If you really want to have a quick estimation of the (nearly vanishing) integral.
You have to find a semi-analytical method. Making an integral of a non positive function is indeed not scalable. (Whatever the kind of estimator you are using)

What I mean is that the integral that you try to estimate is basically A - (A-epsilon), if you try to estimate numerically A and A-epsilon, you obviously need to have each term very precise in order to have the difference precise as well (this is call a catastrophic cancelation). The solution is to find a way to compute directly epsilon.

> I had hoped that the absolute value of the weight is correct nonetheless

The absolute value of the weight is indeed correct since they are directly related to A+A-epsilon, so you do not need that much precision on those to have the absolute value to be trustable. Actually, what is not trustable is the number of positive/negative events which is a random number related to a Poisson (in this limit gaussian) distribution

> and the event weights can be used in normalization as I outlined in the
> first message. Is it not true?

This is a second estimator of the integral, you can use it if you want, but I expect that estimator to be less precise that the one used for reporting the result.

> What is the absolute cross section you
> mentioned (apparently it is directly related with the absolute value of
> the weights)? Is it documented somewhere?

The absolute cross-section is nothing else that the |A| +|A-epsilon| or more precisely \int |f(x)|
For documentation, you can take a look at any of the MCnet school (or any of the mad graph school)

Cheers,

Olivier

> On 3 May 2017, at 12:58, Andrey Popov <email address hidden> wrote:
>
> Question #629926 on MadGraph5_aMC@NLO changed:
> https://answers.launchpad.net/mg5amcnlo/+question/629926
>
> Status: Answered => Open
>
> Andrey Popov is still having a problem:
> Hi Olivier,
>
> Thank you for the investigation. I did not try to suggest that there is
> a bug in MadGraph, but I would like to figure out how one could
> reproduce correct rates for such a process in an analysis. The cross
> section can be made arbitrarily close to zero by playing with parameters
> (there are points at which the physical cross section crosses zero).
> Because of this, the solution of generating more events is not quite
> scalable.
>
> I had hoped that the absolute value of the weight is correct nonetheless
> and the event weights can be used in normalization as I outlined in the
> first message. Is it not true? What is the absolute cross section you
> mentioned (apparently it is directly related with the absolute value of
> the weights)? Is it documented somewhere?
>
>
> Cheers,
> Andrey
>
> --
> You received this question notification because you are an answer
> contact for MadGraph5_aMC@NLO.

Revision history for this message
Andrey Popov (rainfinder) said :
#8

Hi Oliver,

I understand that computing the overall cross section to a high precision is difficult numerically. But this is not what I'm trying to accomplish. I'd like to build the differential cross section in mtt. It has a characteristic peak-dip structure. Even when its integral is zero, the differential cross section is well-defined, and one can speak about the “rate” of the peak or the dip.

Thanks, I'll search materials of MCnet schools for the absolute cross section.

Cheers,
Andrey

Revision history for this message
Olivier Mattelaer (olivier-mattelaer) said :
#9

Hi,

> . But this is not what I'm trying to
> accomplish. I'd like to build the differential cross section in mtt. It
> has a characteristic peak-dip structure. Even when its integral is zero,
> the differential cross section is well-defined, and one can speak about
> the “rate” of the peak or the dip.

Sure you can use those events for the differential distributions.
Since for the mtt, you have this nice structure you should have in each bin close to 100% of the events having the same sign.
This should allow to have a quite good accuracy on each bin.

Cheers,

Olivier

> On 3 May 2017, at 14:28, Andrey Popov <email address hidden> wrote:
>
> Question #629926 on MadGraph5_aMC@NLO changed:
> https://answers.launchpad.net/mg5amcnlo/+question/629926
>
> Andrey Popov posted a new comment:
> Hi Oliver,
>
> I understand that computing the overall cross section to a high
> precision is difficult numerically. But this is not what I'm trying to
> accomplish. I'd like to build the differential cross section in mtt. It
> has a characteristic peak-dip structure. Even when its integral is zero,
> the differential cross section is well-defined, and one can speak about
> the “rate” of the peak or the dip.
>
> Thanks, I'll search materials of MCnet schools for the absolute cross
> section.
>
>
> Cheers,
> Andrey
>
> --
> You received this question notification because you are an answer
> contact for MadGraph5_aMC@NLO.

Revision history for this message
Andrey Popov (rainfinder) said :
#10

Hi Olivier,

> Sure you can use those events for the differential distributions.
> Since for the mtt, you have this nice structure you should have in each bin close to 100% of the events having the same sign.
> This should allow to have a quite good accuracy on each bin.

Thanks for reassuring that the shape of the differential distribution in mtt is reproduced correctly. Do I read your message right in that it does not say anything about the rates (which has been being my question)?

Cheers,
Andrey

Can you help with this problem?

Provide an answer of your own, or ask Andrey Popov for more information if necessary.

To post a message you must log in.