How many threads should be selected to achieve the best speed?

Asked by Wang Yu on 2020-09-16

I am confused about OpenMP threads in Yade.

1. In my understanding, cores and threads are different things, with two threads per core, but the doc suggests that -j and --cores are equivalent. Shouldn't -j be double --cores?

2. For example, my CPU is an i9-10900 with 10 cores and 20 threads. What number should I use for "-j", "--cores" and "OMP_NUM_THREADS" respectively?

Thank you for your help,

Jan Stránský (honzik) said : #1

The "best speed" very much depends on your simulation. The best approach is to try several -j values and measure the run time of each.
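
A minimal sketch of such a timing loop, under some assumptions: the executable is called "yade" on your system, "myScript.py" is a hypothetical script that runs a fixed number of iterations and then finishes, and the -n (no GUI) and -x (exit when the script finishes) options behave as listed in yade -h. Check these against your own installation before relying on it.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd to completion and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def benchmark(build_cmd, thread_counts):
    """Time one run per thread count and return {threads: seconds}."""
    return {n: time_command(build_cmd(n)) for n in thread_counts}


def yade_cmd(n):
    # Assumptions: "yade" is the executable name and "myScript.py" is a
    # hypothetical script that stops by itself after a fixed amount of work.
    return ["yade", "-j", str(n), "-n", "-x", "myScript.py"]


# Example usage (commented out because it needs Yade and a real script):
# for n, seconds in benchmark(yade_cmd, [1, 2, 4, 8, 10, 20]).items():
#     print(f"-j {n}: {seconds:.1f} s")
```

Wall-clock time for a fixed number of iterations is the simplest measure. Repeating each run a few times would help smooth out noise from other processes on the machine.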

Wang Yu (wangyu93) said : #2

Thanks Jan,

I have just tried some simulations with different "-j" and "--cores" parameters and found the following:
With "-j 20" the speed is the fastest. But with '--cores=20', '--cores=10' or any other number, the computing speed remains mediocre and shows little acceleration.
Why does this differ from the doc's description, and why are the speeds different with -j and --cores?

Also, as you said, why isn't the speed best when using the maximum number of threads? I hope you can give me a brief explanation.

Best regards

Jérôme Duriez (jduriez) said : #3

As for 1, the "2 threads per core" idea assumes hyperthreading, which is hardware-dependent.

I myself tend to see "thread" and (logical, if hyperthreading is enabled) "core" as interchangeable, assuming that the user will never launch an OpenMP simulation deploying more threads than there are cores available.
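
To see what the machine itself reports, a small Python sketch can help. os.cpu_count() returns the number of logical CPUs, so a 10-core CPU with hyperthreading enabled would typically show 20. The helper candidate_thread_counts is a hypothetical name introduced here, and its use of half the logical count as a guess at the physical core count assumes two-way hyperthreading.

```python
import os


def candidate_thread_counts(logical_cpus):
    """Return a sorted list of thread counts worth trying, up to logical_cpus."""
    counts = {1, logical_cpus}
    if logical_cpus >= 2:
        # Guess at the physical core count, assuming two threads per core.
        counts.add(logical_cpus // 2)
    n = 2
    while n < logical_cpus:
        counts.add(n)  # powers of two below the logical CPU count
        n *= 2
    return sorted(counts)


logical = os.cpu_count() or 1  # cpu_count() may return None on some platforms
print("logical CPUs:", logical)
print("thread counts to try:", candidate_thread_counts(logical))
```

The list never exceeds the logical CPU count, in line with the assumption above that one does not deploy more threads than there are cores available.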

This being said, I'd be curious about more details regarding possible differences between the --cores and -j / --threads options when deploying such an OpenMP simulation. Note that the doc in yade* -h suggests that --cores includes an extra assignment of affinity, compared with -j / --threads.
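
For readers unfamiliar with the term, affinity means restricting a process to a given set of CPUs. The sketch below only illustrates the concept with Python's standard library on Linux. It is not how Yade implements --cores, and the function names are hypothetical.

```python
import os


def current_affinity():
    """Return the set of CPU ids this process may run on, or None if unsupported."""
    if hasattr(os, "sched_getaffinity"):  # available on Linux, not on all platforms
        return os.sched_getaffinity(0)
    return None


def pin_to(cpus):
    """Restrict this process to the given CPU ids (Linux only) and return the new set."""
    os.sched_setaffinity(0, cpus)
    return os.sched_getaffinity(0)


print("allowed CPUs:", current_affinity())
```

A process pinned to a set of CPUs cannot be moved to other CPUs by the scheduler, which could explain timing differences between a pinned and an unpinned run with the same thread count.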

Robert Caulk (rcaulk) said : #4

>why isn't speed the best when using the maximum number of threads?

There could be quite a few reasons for this. Impossible to answer without an MWE [1].



