What is the limit for events simulated with good randomness

Asked by David Wang on 2020-09-17

Hi Whizard Creator,

I am wondering how good our TAO random number generator is. I want to simulate 100k events but am not sure whether the random number sequence will start to repeat itself at some point. If this does happen, do you have any suggestions?

Question information

Language: English
Status: Solved
For: WHIZARD
Assignee: Simon Braß
Solved by: Simon Braß
Solved: 2020-09-20
Last query: 2020-09-20
Last reply: 2020-09-19

Juergen Reuter (j.r.reuter) said : #1

Please consult the WHIZARD manual, Chap. 6, and arXiv:1811.09711, Sec. 4.4.

David Wang (david-mhw) said : #2

Thanks! Sorry for not stating my question clearly. I already knew that the random number generator has a period of about 2^30 ≈ 10^9. But I am not sure how this random number sequence is used in our Monte Carlo simulations. To simulate one event, does the program need only one random number, or does it need multiple random numbers?

If simulating 1 event needs 1000 random numbers, then we need 10^8 random numbers in total to simulate 100k events. So I guess we are fine under this scenario. Nevertheless, this is just my naive guess, and I am not sure how many random numbers we need to simulate one event in WHIZARD.

I also guess the number of random numbers needed for one event might depend on how many diagrams there are. If my process has 1000 diagrams, does it then need 1000 random numbers?

Thank you!

Thorsten Ohl (thomega) said : #3

The number of diagrams is irrelevant for the number of random numbers consumed.

Instead, you need one random number for each independent momentum component (i.e. 3*(n+1) for an on-shell 2 → n process), multiplied by the inverse of the rejection efficiency in the generation step.

David Wang (david-mhw) said : #4

Thanks, Thorsten! Just to double check: is this the number of random numbers consumed for 1 event? Does the consumption of random numbers increase with the number of events generated? Is the rejection efficiency you mentioned the one shown in the Eff[%] column during the integration?

If I have a 2->5 process, then one event would consume 3*(n+1) = 18 random numbers. If the efficiency, as indicated by Eff[%], is 0.01 (i.e. 1e-4), then I need 18/1e-4 = 1.8e5 random numbers for just 1 event. So if I want to simulate 100k events, I need 1.8e10 random numbers.
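
The arithmetic above can be checked with a short Python sketch. The helper name `random_numbers_needed` and its parameters are purely illustrative and not part of WHIZARD. It takes the 3*(n+1) count from reply #3 at face value and assumes, as this post does, that the Eff[%] value is the rejection efficiency in percent.

```python
def random_numbers_needed(n_final, eff_percent, n_events):
    """Rough estimate following reply #3 (hypothetical helper, not a WHIZARD API).

    Returns (numbers per trial, numbers per accepted event, numbers for all events).
    """
    per_trial = 3 * (n_final + 1)          # one number per momentum component
    efficiency = eff_percent / 100.0       # Eff[%] = 0.01 corresponds to 1e-4
    per_event = per_trial / efficiency     # trials needed per accepted event
    return per_trial, per_event, per_event * n_events


# 2 -> 5 process, Eff[%] = 0.01, 100k events
per_trial, per_event, total = random_numbers_needed(5, 0.01, 100_000)
print(per_trial, f"{per_event:.3g}", f"{total:.3g}")
```

This only reproduces the naive count; it does not include the extra draws for structure functions or internal decisions.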

Meanwhile, it seems that the TAO generator is meant for a personal computer (with OpenMP), and RNGstream with VAMP2 should be used if we compute things on a cluster. Is that correct?

Simon Braß (sbrass) said : #5

Dear David,

WHIZARD uses up to N random numbers per event, where N is made up of the number of independent variables for phase space, the structure-function variables, and WHIZARD-internal draws, e.g. for accepting/rejecting events, selecting the particle content of an event (iff there is an ambiguity) and so on.

In principle, either of the random number generators is sufficient for typical applications; however, only VAMP2 supports both TAO and RNGstream.
In particular, if you want to use the MPI facility of VAMP2, we strongly recommend using RNGstream, because it allows us to deterministically assign independent random number sequences to different executions of WHIZARD during a parallel run, whereas the TAO RNG could lead to correlated artefacts during event generation.
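
To illustrate what deterministically assigned, independent sequences can look like, here is a minimal Python sketch of the stream idea in L'Ecuyer's RngStreams package (the MRG32k3a generator, period near 2^191, streams spaced 2^127 draws apart). It assumes that WHIZARD's RNGstream follows this design; it is not WHIZARD code, and the names `jump`, `next_uniform`, `stream_start` and the seed value are illustrative only.

```python
# MRG32k3a constants as published by L'Ecuyer.
M1, M2 = 4294967087, 4294944443
A12, A13N = 1403580, 810728
A21, A23N = 527612, 1370589

# One-step transition matrices of the two recurrence components.
A1 = [[0, 1, 0], [0, 0, 1], [-A13N, A12, 0]]
A2 = [[0, 1, 0], [0, 0, 1], [-A23N, 0, A21]]


def mat_mul(a, b, m):
    """3x3 matrix product modulo m."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) % m for j in range(3)]
            for i in range(3)]


def mat_pow(a, e, m):
    """Compute a**e modulo m by repeated squaring."""
    result = [[int(i == j) for j in range(3)] for i in range(3)]
    base = [[x % m for x in row] for row in a]
    while e:
        if e & 1:
            result = mat_mul(result, base, m)
        base = mat_mul(base, base, m)
        e >>= 1
    return result


def mat_vec(a, v, m):
    """3x3 matrix times 3-vector modulo m."""
    return [sum(a[i][k] * v[k] for k in range(3)) % m for i in range(3)]


def jump(state, steps):
    """Return the 6-word state reached after `steps` draws, without drawing them."""
    s1 = mat_vec(mat_pow(A1, steps, M1), state[:3], M1)
    s2 = mat_vec(mat_pow(A2, steps, M2), state[3:], M2)
    return s1 + s2


def next_uniform(state):
    """Draw one number in (0, 1) and update `state` in place."""
    p1 = (A12 * state[1] - A13N * state[0]) % M1
    p2 = (A21 * state[5] - A23N * state[3]) % M2
    state[:] = [state[1], state[2], p1, state[4], state[5], p2]
    return ((p1 - p2) % M1 or M1) / (M1 + 1)


SEED = [12345] * 6            # arbitrary example seed (assumption)
STREAM_SPACING = 2 ** 127     # distance between consecutive streams


def stream_start(worker):
    """Starting state for a given worker index; depends only on SEED and index."""
    return jump(SEED, worker * STREAM_SPACING)


# Four hypothetical parallel workers, each with its own non-overlapping stream.
streams = [stream_start(w) for w in range(4)]
first_draws = [next_uniform(list(s)) for s in streams]
```

Because each starting state is a pure function of the seed and the worker index, a parallel run is reproducible, and two workers would have to draw 2^127 numbers before their sequences overlap.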

Back to your fear of an insufficient length of the random number sequence: it is always preferable to have a pool of random numbers much larger than the number of random numbers needed.
However, the order in which each single random number is applied also matters.
For WHIZARD, a rather small pool (2^30) is already sufficient to produce good results (WHIZARD internally dissects the overall TAO random number sequence into different accessible sequences and selects from these sequences).

You will only get *interesting* results if you reuse the exact same random number again and again in the same places, which will seldom happen (and then only due to some bug in our program).

But keep in mind that the biggest random number consumers come later in the process of event generation: shower, hadronization and detector simulation.

David Wang (david-mhw) said : #6

Thanks, Simon! Now, I am clear about the parallel side.

But I still feel a bit shaky on the random number issue, since I am trying to simulate 100K events and need to make sure they are faithful. Take the example above: a 2->5 process with a rejection efficiency of 0.01%. If I understand correctly, I do in fact need 1.8e10 random numbers, which is more than 2^30. So, for that process with that rejection efficiency, I guess the result is not trustworthy.
Would it then be better to simulate two runs of 50K events with different seeds?
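
As a rough cross-check of this comparison, the sketch below counts how many full periods of length 2^30 the estimated demand would span, for one 100K run and for a single 50K run. It is a naive count under the assumptions of this post; the names are illustrative, and it ignores the internal splitting of the TAO sequence mentioned in reply #5.

```python
TAO_PERIOD = 2 ** 30      # period quoted earlier in this thread
TOTAL_NEEDED = 1.8e10     # naive estimate from this post for 100K events


def periods_spanned(needed, period=TAO_PERIOD):
    """Number of full periods a demand of `needed` draws would cover."""
    return needed / period


full_run = periods_spanned(TOTAL_NEEDED)        # one run of 100K events
half_run = periods_spanned(TOTAL_NEEDED / 2)    # one run of 50K events
print(f"100K events: {full_run:.1f} periods; 50K events: {half_run:.1f} periods")
```

Under this naive count, a single 50K run would still ask for more than 2^30 numbers, so halving the run length alone would not bring the demand below the quoted period.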

Simon Braß (sbrass) said : #7

Hi David,

you can always replace VAMP with VAMP2, which you can apply for both serial and parallel runs.
One of the advantages of VAMP2 is that you can use RNGstream with it (as you already know).

RNGstream should provide a sufficiently large pool of random numbers.

Cheers,
Simon

David Wang (david-mhw) said : #8

Thanks Simon Braß, that solved my question.