Intel hardware; hwloc-related?
Quick question. Most of my workstations are Threadripper, but I have inherited two workstations that house multiple RTX4900s and are worth using. They have 12th-generation Intel processors, and I cannot, for the life of me, keep the nodes out of the [drain] state when slurm.conf describes the hardware correctly.
The simpler case is the i5, whose correct topology is 10 cores (6 P-cores, 4 E-cores) and thus 16 threads. I can only run it with CPU=6, thread=2. The problem scales with the i9. I think I set the e_core and config_overrides options correctly.
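To make the mismatch concrete, here is a minimal Python sketch of the arithmetic. It assumes the node description has to be uniform (every core carrying the same number of threads), which is my reading of the Sockets/CoresPerSocket/ThreadsPerCore style of definition. The function names are made up for illustration and are not part of Slurm or hwloc.

```python
def hybrid_threads(p_cores, e_cores, threads_per_p_core=2):
    """Hardware threads on a hybrid CPU: P-cores are hyperthreaded, E-cores are not."""
    return p_cores * threads_per_p_core + e_cores


def uniform_layouts(total_cores, max_threads_per_core=2):
    """Thread totals reachable when every core has the same thread count."""
    return {t: total_cores * t for t in range(1, max_threads_per_core + 1)}


# The i5 from this question: 6 P-cores and 4 E-cores.
p_cores, e_cores = 6, 4

actual = hybrid_threads(p_cores, e_cores)        # 6*2 + 4
uniform = uniform_layouts(p_cores + e_cores)     # 10 cores at 1 or 2 threads each
fits = actual in uniform.values()                # can a uniform layout express it?
p_only = uniform_layouts(p_cores)[2]             # the layout that does run: 6 cores x 2

print(f"real threads: {actual}")
print(f"uniform 10-core options: {uniform}")
print(f"expressible as a uniform layout: {fits}")
print(f"P-cores only (6 x 2): {p_only}")
```

Under that assumption, 10 cores can only be declared as 10 or 20 threads, never 16, while the 6 x 2 layout is uniform and covers only the P-cores.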
lscpu correctly identifies the core and thread counts, hence my guess that hwloc is at fault. Is this a bug or a 'feature'?
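For anyone who wants to compare what lscpu sees against what Slurm reports, below is a rough Python sketch that counts hardware threads per core from `lscpu -p=CPU,CORE` output. The sample text is synthetic, built to match the 6 P-core / 4 E-core layout described above rather than captured from the actual machine, and the helper name `threads_per_core` is hypothetical.

```python
from collections import Counter


def threads_per_core(lscpu_parse_output):
    """Map each core ID to its hardware-thread count from `lscpu -p=CPU,CORE` text."""
    counts = Counter()
    for line in lscpu_parse_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header lscpu emits
        cpu, core = line.split(",")[:2]
        counts[int(core)] += 1
    return dict(counts)


# Synthetic sample (an assumption, not real output): cores 0-5 carry two
# logical CPUs each (P-cores), cores 6-9 carry one each (E-cores).
rows = []
for core in range(6):
    rows += [f"{2 * core},{core}", f"{2 * core + 1},{core}"]
for j in range(4):
    rows.append(f"{12 + j},{6 + j}")
SAMPLE = "# CPU,Core\n" + "\n".join(rows)

per_core = threads_per_core(SAMPLE)
shape = Counter(per_core.values())  # how many cores have 1 thread, how many have 2

print(f"cores: {len(per_core)}, threads: {sum(per_core.values())}")
print(f"cores by thread count: {dict(shape)}")
```

On a real node the same function could be fed the output of `subprocess.run(["lscpu", "-p=CPU,CORE"], capture_output=True, text=True).stdout`. A mixed result (some cores with two threads, some with one) is the non-uniform layout I suspect is the problem.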
Any advice? Thank you for all your help.
Question information
- Language: English
- Status: Open
- For: Ubuntu slurm-wlm
- Assignee: No assignee